Automapping

Snow Owl's automapping feature can provide candidates for mapping targets. Map target candidates are searched based on lexical or semantic similarities. The returned candidates are ordered based on their similarity ranking and the candidate with the highest ranking is populated as the target concept. If required and supported, source terms can be automatically translated to English for lexical candidate search.

Definitions

  • Term: The entire string representing the lexical meaning of a concept.

  • Term word: A segment of the term, separated from the others by a space or any of the following delimiters: ()[]/,.:;%#&+-*~'^><="`

  • Stop words: A list of words ignored when performing lexical matches: "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"

  • Fuzzy: Returns matches that contain words similar to the search term, as measured by a Levenshtein edit distance.

    An edit distance is the number of one-character changes needed to turn one term into another. These changes can include:

    • Changing a character (box → fox)

    • Removing a character (black → lack)

    • Inserting a character (sic → sick)

    • Transposing two adjacent characters (act → cat)
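The four edit types above can be checked with a short sketch. The function below is an illustration, not Snow Owl's implementation: it computes the optimal string alignment variant of the Damerau-Levenshtein distance, which is an assumption made so that swapping two adjacent characters counts as a single edit, as in the list above. The name `edit_distance` is hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Edit distance counting substitutions, deletions, insertions and
    adjacent transpositions as one change each (illustrative sketch)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # remove a character
                d[i][j - 1] + 1,         # insert a character
                d[i - 1][j - 1] + cost,  # change a character (or keep it)
            )
            # Transpose two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


for source, target in [("box", "fox"), ("black", "lack"), ("sic", "sick"), ("act", "cat")]:
    print(source, "->", target, edit_distance(source, target))
```

Each of the four pairs has a distance of 1, so all of them would qualify as a fuzzy match that tolerates a one-character difference.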

Automap Process

The main parameters are:

  • Target code system: The code system to search map target candidates from.

  • Target code system version: The version of the selected code system to search map target candidates from. The default value is the selected code system's active version, which can be a released version, HEAD (the latest state, often referred to as MAIN), or the ID of the currently active task.

Automap correlation threshold (%): Defines the accuracy of the lexical term search algorithm as a percentage, where 100% is the most accurate and 1% is the least accurate. The range is divided into four levels:

  • 91% - 100%: Case-insensitive prefix match for every source term word regardless of order.

  • 75% - 90%: Case-insensitive prefix match for every source term word regardless of order, excluding stop words.

  • 61% - 74%: Case-insensitive prefix match for every source term word regardless of order, excluding stop words, with misspellings allowed (fuzzy match for a difference of 1 character per word).

  • 1% - 60%: Case-insensitive prefix match regardless of order, excluding stop words, for a minimum number of source term words. For example:

    • Mapping the term "Non carious lesion at cervical margin of tooth" with the threshold set to 60% (minimum number of source term words: 5)

    • Mapping the term "Non carious lesion at cervical margin of tooth" with the threshold set to 30% (minimum number of source term words: 3)
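The four threshold levels can be summarized as a lookup from the threshold value to the matching rules in force. This is a minimal sketch for illustration only: `MatchStrategy` and `strategy_for` are hypothetical names, not part of Snow Owl's API, and the flags simply restate the level descriptions above.

```python
from typing import NamedTuple


class MatchStrategy(NamedTuple):
    """Hypothetical summary of the rules applied at a given threshold."""
    ignore_stop_words: bool
    allow_fuzzy: bool    # misspellings of 1 character per word tolerated
    allow_partial: bool  # only a minimum number of words has to match


def strategy_for(threshold: int) -> MatchStrategy:
    """Map a correlation threshold (1 - 100) to one of the four levels."""
    if not 1 <= threshold <= 100:
        raise ValueError("threshold must be between 1 and 100")
    if threshold >= 91:
        return MatchStrategy(False, False, False)
    if threshold >= 75:
        return MatchStrategy(True, False, False)
    if threshold >= 61:
        return MatchStrategy(True, True, False)
    # The description of the 1% - 60% level does not mention fuzzy matching,
    # so it is left disabled here.
    return MatchStrategy(True, False, True)


print(strategy_for(80))
```

Every level uses a case-insensitive prefix match on term words in any order; the flags only capture what differs between the levels.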

Minimum term match selection:

If the threshold is set to 60%, all source term words except one must match. If the threshold is 1%, a single matching word is enough. For values in between, the minimum number of matching words is calculated by counting the term's words (split on the listed delimiters, with stop words ignored), multiplying the count by the threshold percentage (rescaled from the 1% - 60% band to a normal 1 - 100 scale), and rounding the result down.

Example

Mapping the term "Suspected rupture of membranes not found for normal first pregnancy" with the threshold set to 25% requires at least 3 words to match. The term contains 8 words once the stop words ("of" and "for") are ignored. Within the 1% - 60% band, 25% corresponds to roughly 42% on a normal 1 - 100 scale (25 / 60 ≈ 0.42). 8 * 0.42 = 3.36, which is rounded down to 3 words.
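The calculation in this example can be reproduced with a short sketch. It is an approximation built from the description on this page, not Snow Owl's actual code: the function names are hypothetical, the rescaling is assumed to be threshold / 60, the result is rounded down as in the example, and it is clamped so that at least one word and at most all words except one are required. The backslash in the delimiter list is assumed to be an escape character and is not treated as a delimiter.

```python
# Stop words as listed in the Definitions section.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "of", "on", "or", "such", "that", "the",
    "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
}

# Space plus the delimiters from the "Term word" definition.
DELIMITERS = " ()[]/,.:;%#&+-*~'^><=\"`"
_SPLIT_TABLE = str.maketrans({char: " " for char in DELIMITERS})


def term_words(term: str) -> list[str]:
    """Split a term into term words and drop stop words (case-insensitive)."""
    words = term.translate(_SPLIT_TABLE).split()
    return [word for word in words if word.lower() not in STOP_WORDS]


def minimum_matching_words(term: str, threshold: int) -> int:
    """Minimum number of term words that must match in the 1% - 60% band."""
    if not 1 <= threshold <= 60:
        raise ValueError("this rule only applies to thresholds from 1 to 60")
    count = len(term_words(term))
    # Integer form of floor(count * threshold / 60), avoiding float rounding.
    required = (count * threshold) // 60
    # At least one word, and at most every word except one.
    return max(1, min(required, count - 1))


pregnancy = "Suspected rupture of membranes not found for normal first pregnancy"
tooth = "Non carious lesion at cervical margin of tooth"
print(minimum_matching_words(pregnancy, 25))  # 3
print(minimum_matching_words(tooth, 60))      # 5
print(minimum_matching_words(tooth, 30))      # 3
```

Under these assumptions the sketch agrees with all three figures quoted on this page: 3 words for the pregnancy term at 25%, and 5 and 3 words for the tooth term at 60% and 30%.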

If SNOMED CT is selected as the target code system, the user can constrain the set of map target candidates to a particular domain (e.g. pharmaceutical products or diseases) using an ECL expression.

The default value is '*', which matches any concept.

If more than one candidate is available, the user can manually pick a better replacement for the given mapping.

Optional candidates are only available within the editor until the editor is closed. After the editor is re-opened, automapping must be executed again to re-populate the map target candidates.

Miscellaneous

Translation

If language translation is required and supported by the server, the automap wizard can automatically detect the source concept's language and translate the terms to English in order to find lexical matches among the available target concepts.

Selecting sources to automap

In order to restrict the process to a subset of concepts, the user needs to select the mappings within the editor before starting the process.

If no mappings are selected in the editor, the automapping process attempts to find target candidates for all source concepts in the Mapping Set that have an UNSPECIFIED target.
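The selection rule described above can be pictured as follows. This is only a sketch of the behaviour as documented here: the `Mapping` class and the `mappings_to_automap` function are hypothetical, and an UNSPECIFIED target is represented by `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mapping:
    """Hypothetical stand-in for one row of a Mapping Set."""
    source: str
    target: Optional[str] = None  # None stands in for an UNSPECIFIED target


def mappings_to_automap(all_mappings: list[Mapping],
                        selected: list[Mapping]) -> list[Mapping]:
    """Pick the mappings the automap process should work on."""
    if selected:
        # An explicit selection restricts the process to those mappings.
        return list(selected)
    # Otherwise fall back to every mapping whose target is UNSPECIFIED.
    return [mapping for mapping in all_mappings if mapping.target is None]


mapping_set = [Mapping("A", "T1"), Mapping("B"), Mapping("C")]
print([m.source for m in mappings_to_automap(mapping_set, [])])
print([m.source for m in mappings_to_automap(mapping_set, mapping_set[:1])])
```

With no selection, only the mappings for "B" and "C" are processed; with the first row selected, only "A" is.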

Exclusion

The Automapping Wizard has a dedicated page where certain Map Type Reference Sets can be excluded from the automap process, meaning the selected Reference Sets are not taken into consideration by the SNOMED CT Reference Set matching provider. The wizard lists the available Map Type Reference Sets and allows multi-selection using the checkbox in front of the listed items.