Snow Owl's automapping feature can provide candidates for mapping targets. Map target candidates are searched based on lexical or semantic similarities. The returned candidates are ordered based on their similarity ranking and the candidate with the highest ranking is populated as the target concept. If required and supported, source terms can be automatically translated to English for lexical candidate search.
- Term: The entire string representing the lexical meaning of a concept.
- Term word: The segments in the term separated by space or any of the following delimiters: ()/,.:;%#&+-*~'^><=\"`.
- Stop words: A list of words ignored when performing lexical matches: "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"
- Fuzzy: Returns matches that contain words similar to the search term, as measured by a Levenshtein edit distance. An edit distance is the number of one-character changes needed to turn one term into another. These changes can include:
  - Changing a character (box → fox)
  - Removing a character (black → lack)
  - Inserting a character (sic → sick)
  - Transposing two adjacent characters (act → cat)
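The four edit types listed above can be illustrated with a small distance function. This is a minimal sketch, not Snow Owl's implementation: because a transposition counts as a single edit, it uses the optimal string alignment variant of the Levenshtein distance, and the function name `edit_distance` is a hypothetical choice for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Changing, removing or inserting a character, and transposing two
    adjacent characters, each count as one edit.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # remove every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # removal
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # change (or exact match)
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


for source, target in [("box", "fox"), ("black", "lack"), ("sic", "sick"), ("act", "cat")]:
    print(source, "→", target, edit_distance(source, target))  # each pair is 1 edit apart
```

Each of the four example pairs from the list is exactly one edit apart under this definition.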
The Automapping Wizard is the main means of setting parameters for the automapping process. For a given Mapping Set, it can be invoked by pressing the Automapping Wizard button in the upper-right corner of the Mapping Set Editor.
The main parameters are:
- Target code system: The code system to search map target candidates from.
- Target code system version: The version of the selected code system to search map target candidates from. The default value is the selected code system's active version, which can be a released version of the code system, HEAD (the latest state, often referred to as MAIN), or the currently active task's ID.
- Automap correlation threshold (%): Defines the accuracy of the lexical term search algorithm as a percentage, where 100% is the most accurate and 1% is the least accurate. The range is divided into four levels:
  - 91% - 100%: Case-insensitive prefix match for every source term word, regardless of order.
    - Cell to cell distinctive relationship → Cell to cell relationship, distinctive - Every term word of the source is present in the target, regardless of order
    - Abnormal flush and sweat → Abnormal flushing and sweating - Every source term word is a prefix of a target term word
    - Instability of joint → Joint instability - Not matched in this range, because stop words are not filtered here. It matches once the threshold is lowered to a level that filters stop words, as in the 75% - 90% level
  - 75% - 90%: Case-insensitive prefix match for every source term word, regardless of order, excluding stop words.
    - Instability of joint → Joint instability - Every term word of the source is present in the target after the stop word "of" is filtered
    - Left foot deformity → Deformity of left foot - Every term word of the source is present in the target, regardless of order, after the stop word "of" is filtered
  - 61% - 74%: Case-insensitive prefix match for every source term word, regardless of order, excluding stop words, with misspellings allowed (fuzzy match for 1 character/word differences).
    - Bod structure → Body structure - Inserting a character
    - Boyd structure → Body structure - Transposing two adjacent characters
    - Body structure → Bodz structure - Changing a character
    - Body structures → Body structure - Removing a character
    - Boyd structuer → Body structure - Not matched: requires more than 1 edit (two transpositions)
    - Bod structures → Body structure - Not matched: requires more than 1 edit (one insertion and one removal)
  - 1% - 60%: Case-insensitive prefix match for a minimum number of source term words, regardless of order, excluding stop words.
    - Mapping the term "Non carious lesion at cervical margin of tooth" with the threshold set to 60% (minimum number of source term words: 5):
      - Non carious lesion at cervical margin of tooth - 5 source term words can be matched against the target term words
    - Mapping the term "Non carious lesion at cervical margin of tooth" with the threshold set to 30% (minimum number of source term words: 3):
      - Non carious lesion at cervical margin of tooth - At least 3 source term words can be matched against the target term words
      - Nonrestorable carious tooth - 3 source term word prefixes can be matched against the target term words
      - Entire cervical margin of tooth - 3 source term words can be matched against the target term words
      - Caries of cervical margin of tooth - 3 source term words can be matched against the target term words
      - Margin of tooth - Not matched: stop words are not counted as words to match, so only 2 source term words match
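The word-level rules behind these levels can be sketched in a few lines. This is an illustration under stated assumptions, not Snow Owl's actual search code: the helper names (`term_words`, `matched_word_count`, `prefix_match`) are hypothetical, the backslash in the delimiter list is assumed to be an escape character rather than a delimiter, and fuzzy matching (the 61% - 74% level) is left out.

```python
import re

# Delimiters and stop words as listed in the glossary above.
DELIMITERS = "()/,.:;%#&+-*~'^><=\"`"
STOP_WORDS = frozenset(
    "a an and are as at be but by for if in into is it of on or such that "
    "the their then there these they this to was will with".split()
)

# Split on whitespace or any run of the listed delimiters.
_SPLITTER = re.compile("[" + re.escape(DELIMITERS) + r"\s]+")


def term_words(term, drop_stop_words):
    """Lower-cased term words, optionally with stop words removed."""
    words = [w for w in _SPLITTER.split(term.lower()) if w]
    if drop_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    return words


def matched_word_count(source, target, drop_stop_words):
    """Number of source term words that are a prefix of some target term word."""
    target_words = term_words(target, drop_stop_words)
    return sum(
        any(t.startswith(s) for t in target_words)
        for s in term_words(source, drop_stop_words)
    )


def prefix_match(source, target, drop_stop_words):
    """True if every source term word is a prefix of some target term word."""
    source_words = term_words(source, drop_stop_words)
    return matched_word_count(source, target, drop_stop_words) == len(source_words)


# 91% - 100%: stop words are kept, so "of" has nothing to match.
print(prefix_match("Instability of joint", "Joint instability", False))  # False
# 75% - 90%: stop words are filtered first.
print(prefix_match("Instability of joint", "Joint instability", True))   # True

# 1% - 60%: count the matching source term words and compare with the minimum.
source = "Non carious lesion at cervical margin of tooth"
print(matched_word_count(source, "Nonrestorable carious tooth", True))   # 3
print(matched_word_count(source, "Margin of tooth", True))               # 2
```

In the last two calls, a candidate would qualify at the 30% threshold only when the count reaches the minimum of 3 source term words given in the example above.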
Minimum term match selection:
If the threshold is set to 60%, all words except one must match. If the threshold is 1%, a single matching word is enough. Between these two values, the minimum term match is calculated by counting the term's words (as separated by the listed delimiters, with stop words ignored), multiplying that count by the threshold percentage, and rounding the result down.
Mapping the term "Suspected rupture of membranes not found for normal first pregnancy" with the threshold set to 25% requires at least 3 matching words. The term contains 8 words with stop words ignored. A threshold of 25% corresponds to roughly 42% on a normal 1 - 100 scale (25 / 60 ≈ 0.42). 8 * 0.42 = 3.36, which is rounded down to 3 words to match.
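The calculation above can be written as a short function. This is a sketch based only on the description and worked examples in this section: dividing the threshold by 60 to rescale it is inferred from the "25% would be around 42%" step, the rounding follows the worked example (down), and the function name `minimum_term_match` is hypothetical.

```python
def minimum_term_match(word_count: int, threshold: int) -> int:
    """Minimum number of source term words that must match.

    word_count: source term words left after stop words are ignored.
    threshold:  automap correlation threshold in percent, from 1 to 60.
    """
    if word_count < 1:
        raise ValueError("word_count must be at least 1")
    if not 1 <= threshold <= 60:
        raise ValueError("this rule only covers thresholds from 1 to 60")
    # Assumption: the 1 - 60 range is rescaled to a 1 - 100 scale by dividing
    # by 60, and the result is rounded down (integer division).
    scaled = (word_count * threshold) // 60
    # At 60% all words except one must match; at least one word always must.
    return max(1, min(word_count - 1, scaled))


print(minimum_term_match(8, 25))  # 3, the "Suspected rupture of membranes ..." example
print(minimum_term_match(6, 30))  # 3, "Non carious lesion ..." at 30%
print(minimum_term_match(6, 60))  # 5, "Non carious lesion ..." at 60%
print(minimum_term_match(8, 1))   # 1, a single word is enough
```

Integer division avoids floating-point rounding surprises and gives the same results as the examples in this section.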
Automapping wizard's parameters page
If SNOMED CT is selected as the target code system, the user can constrain the set of map target candidates to a particular domain (e.g. pharmaceutical products or diseases) using an ECL expression.
The default value is '*' which means everything.
Default ECL Expression
Upon completion of the automapping process, the highest-ranked candidate for each source concept is populated as the target concept in the Mapping Set Editor. Clicking on the term of a populated map target reveals a button that provides access to the complete list of candidates, ordered by rank.
If available, the user can manually pick a better replacement for the given mapping.
Optional candidates are only available within the editor until the editor is closed. Upon re-opening, the automapping feature needs to be executed again to re-populate map target candidates.
If language translation is required and supported by the server, the Automapping Wizard can automatically detect the source concept's language and translate the terms to English in order to find lexical matches among the available target concepts.
In order to restrict the process to a subset of concepts, the user needs to select the mappings within the editor before starting the process.
If no mappings are selected in the editor, the automapping process attempts to find target candidates for all source concepts in the Mapping Set that have an UNSPECIFIED target.
The Automapping Wizard has a dedicated page where certain Map Type Reference Sets can be excluded from the automap process, meaning the selected Reference Sets are not taken into consideration by the SNOMED CT Reference Set matching provider.
Reference Set Exclusion from Automapping