Matches in SemOpenAlex for { <https://semopenalex.org/work/W2015326825> ?p ?o ?g. }
- W2015326825 endingPage "127" @default.
- W2015326825 startingPage "127" @default.
- W2015326825 abstract "Motivation and Objectives Biomedical terminologies play important roles in clinical data capture, annotation, reporting, information integration, indexing and retrieval. In particular, genomic terminologies and ontologies are very useful for indexing genomic information. Several sources of information and terminologies have already been developed, for instance: the Gene Ontology (GO, http://www.geneontology.org/, last accessed on July 17, 2012), a controlled vocabulary widely used for the annotation of gene products; the Human Phenotype Ontology (HPO, http://www.human-phenotype-ontology.org/, last accessed on July 17, 2012), whose terms describe phenotypic abnormalities encountered in human disease, such as “atrial septal defect”; and ORPHANET (http://www.orpha.net/consor/www/cgi-bin/index.php?lng=FR, last accessed on July 17, 2012), the portal for rare diseases and orphan drugs. These knowledge sources mostly have different formats and purposes. For example, ORPHANET is a rare disease database whereas HPO is an ontology that supports the description of phenotypic information. Given this situation and the need to allow cooperation between various health actors and their related health information systems, it appeared necessary to link these terminologies by developing a semantic repository to integrate them. The best-known repository is the Unified Medical Language System (UMLS) (Lindberg et al., 1993). Several works have relied on the UMLS to align terminologies in French (Merabti et al., 2012) and in English (Bodenreider et al., 1998; Milicic Brandt et al., 2011; Mougin et al., 2011). However, HPO and ORPHANET are not yet included in the UMLS. Thus, another solution is to find correspondences between these terminologies in French and in English using automatic methods. In (Merabti et al., 2012) we proposed a lexical method to map biomedical terminologies whether or not they are included in the UMLS.
Nevertheless, these methods remain very dependent on the language of the terminologies, since they use NLP tools such as stemming or normalization. We propose in this study a string-based method to find correspondences between a subset of terminologies for an easier access to biomedical information. It is based on the combination of several string metrics, and it is neither based on the UMLS nor language dependent. Combined with the lexical or conceptual approaches developed in previous studies (Merabti et al., 2012), it could increase the number of correspondences between terminologies while keeping a high precision. Semantic methods are also envisaged to complete this study. Methods To map biomedical terminologies, we used string matching methods in which concept names, terms and their labels are considered as sequences of characters. A string distance is computed to determine a degree of similarity. Some of these methods can ignore the order of characters. In this paper, the union of three metrics was used: (i) Dice (Dice, 1945), (ii) Levenshtein (Levenshtein, 1965) and (iii) Stoilos (Stoilos et al., 2005). Dice’s coefficient is the ratio between twice the number of character bigrams common to both strings x and y and the total number of bigrams of the two strings. It is defined by the following equation, where big(x) is the collection of bigrams of x and nb-big(x) is the number of bigrams of x: $\mathrm{Dice}(x,y) = \frac{2\,|\mathrm{big}(x) \cap \mathrm{big}(y)|}{\mathrm{nb\text{-}big}(x) + \mathrm{nb\text{-}big}(y)}$. The Levenshtein distance between two strings x and y is defined as the minimum number of elementary operations required to transform the string x into the string y. There are three possible operations: replacing a character with another, deleting a character and adding a character. This measure takes its values in the interval [0, ∞[. The Normalized Levenshtein (Yujian and Bo, 2007) (LevNorm), in the range [0, 1], is obtained by dividing the Levenshtein distance Lev(x,y) by the length of the longest string, and it is defined by: $\mathrm{LevNorm}(x,y) = \frac{\mathrm{Lev}(x,y)}{\max(|x|,|y|)}$. LevNorm(x,y) ∈ [0, 1] since Lev(x,y) ≤ Max(|x|,|y|), where |x| is the length of the string x.
The Stoilos distance was developed specifically for strings that are labels of concepts in ontologies. It is based on the idea that the similarity between two entities is related to their commonalities as well as to their differences. Thus, the similarity should be a function of both these features. It is defined by: $\mathrm{Sim}(x,y) = \mathrm{Comm}(x,y) - \mathrm{Diff}(x,y) + \mathrm{Winkler}(x,y)$, where Comm(x,y) stands for the commonality between the strings x and y, Diff(x,y) for the difference between x and y, and Winkler(x,y) for the improvement of the result using the method introduced by Winkler (Winkler, 1999). The commonality function is determined by the substring function. The longest common substring between the two strings (MaxComSubString) is computed. This process is then repeated by removing the common substring and searching again for the next longest common substring until none can be identified. The commonality function is given by the equation: $\mathrm{Comm}(x,y) = \frac{2 \sum_i |\mathrm{MaxComSubString}_i|}{|x| + |y|}$. The difference function is defined by the following equation, where p ∈ [0, ∞[ (usually p = 0.6) and |ux| and |uy| represent the lengths of the unmatched substrings of x and y, each scaled by the length of its string: $\mathrm{Diff}(x,y) = \frac{|u_x|\,|u_y|}{p + (1 - p)(|u_x| + |u_y| - |u_x|\,|u_y|)}$. The Winkler parameter Winkler(x,y) is defined by the equation: $\mathrm{Winkler}(x,y) = L \cdot P \cdot (1 - \mathrm{Comm}(x,y))$, where L is the length of the common prefix at the start of the strings x and y, up to a maximum of 4 characters, and P is a constant scaling factor for how much the score is adjusted upwards for having common prefixes. The standard value for this constant in Winkler’s work is P = 0.1. To evaluate the correspondences between the terminologies found using the proposed method, we calculated the precision on a manually evaluated sample set, defined as: $\mathrm{Precision} = \frac{\text{number of correct correspondences}}{\text{number of evaluated correspondences}}$. Results and Discussion In this study we presented a combination of three string matching methods to align several biomedical terminologies.
The results showed that combining these methods on general terminologies such as MeSH and SNOMED provided more correspondences than any single method, with good results (precision > 99%). Aligning genomic terminologies also provided good results with high precision. However, we evaluated here only “exact” correspondences and rated them as “correct” or “not correct”. Correspondences such as “broader-narrower” or “sibling” relations between terms were not considered. For example, when a correspondence is found between two terms where one string is included in the other, the included term is in most cases more general than the other, and a “broader-narrower” correspondence could exist (for example, between the term “insuffisance surrenale” (adrenal insufficiency) and terms such as “insuffisance surrenale aigue” (acute adrenal insufficiency) and “insuffisance surrenale primaire” (primary adrenal insufficiency)). These good preliminary results encouraged us to apply the combination of these string matching methods to other health terminologies. The correspondences found between two terminologies in their French versions may be projected onto their versions in other languages. As future work, these methods will be completed with normalization techniques, and the validation of the correspondences, manual here, will be done according to the UMLS semantic types for the terminologies included in it, as in (Mougin et al., 2011). References Bodenreider O, Nelson SJ, et al. (1998) Beyond synonymy: exploiting the UMLS semantics in mapping vocabularies. In Proc. AMIA Symp. 1998, pp. 815–819. Dice LR (1945) Measures of the amount of ecologic association between species. Ecology 26, pp. 297–302. Levenshtein VI (1965) Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Dokl. 10, pp. 707–710. Lindberg DA, Humphreys BL, et al. (1993) The Unified Medical Language System. Methods Inf Med 32(4): 281–291.
Merabti T, Soualmia LF, et al. (2012) Aligning Biomedical Terminologies in French: Towards Semantic Interoperability in Medical Applications. In Book Medical informatics, InTech, pp. 41–68. Milicic Brandt M, Rath A, et al. (2011) Mapping Orphanet terminology to UMLS. In Proc. AIME, LNAI 6747, pp. 194–203. Mougin F, Dupuch M, et al. (2011) Improving the mapping between MedDRA and SNOMED CT. In Proc. AIME, LNAI 6747, pp. 220–224. Stoilos G, Stamou G, et al. (2005) A String Metric for Ontology Alignment. In Proc. ISWC, pp. 624–637. Winkler W (1999) The state of record linkage and current research problems. Technical report: Statistics of Income Division, Internal Revenue Service Publication. Yujian L, Bo L (2007) A normalized Levenshtein distance metric. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6): 1091–1095. Note: Figures, tables and equations are available in the PDF version only." @default.
- W2015326825 created "2016-06-24" @default.
- W2015326825 creator A5015805321 @default.
- W2015326825 creator A5045417015 @default.
- W2015326825 creator A5047697321 @default.
- W2015326825 date "2012-11-09" @default.
- W2015326825 modified "2023-09-26" @default.
- W2015326825 title "Extracting correspondences between terminologies for an easier access to biomedical information" @default.
- W2015326825 cites W110953263 @default.
- W2015326825 cites W150479364 @default.
- W2015326825 cites W1524770959 @default.
- W2015326825 cites W1647671624 @default.
- W2015326825 cites W1987869189 @default.
- W2015326825 cites W2100684772 @default.
- W2015326825 cites W2101747459 @default.
- W2015326825 cites W2156279557 @default.
- W2015326825 cites W2162311237 @default.
- W2015326825 doi "https://doi.org/10.14806/ej.18.b.576" @default.
- W2015326825 hasPublicationYear "2012" @default.
- W2015326825 type Work @default.
- W2015326825 sameAs 2015326825 @default.
- W2015326825 citedByCount "0" @default.
- W2015326825 crossrefType "journal-article" @default.
- W2015326825 hasAuthorship W2015326825A5015805321 @default.
- W2015326825 hasAuthorship W2015326825A5045417015 @default.
- W2015326825 hasAuthorship W2015326825A5047697321 @default.
- W2015326825 hasBestOaLocation W20153268251 @default.
- W2015326825 hasConcept C110615152 @default.
- W2015326825 hasConcept C111472728 @default.
- W2015326825 hasConcept C136764020 @default.
- W2015326825 hasConcept C137982476 @default.
- W2015326825 hasConcept C138885662 @default.
- W2015326825 hasConcept C154945302 @default.
- W2015326825 hasConcept C2129575 @default.
- W2015326825 hasConcept C23123220 @default.
- W2015326825 hasConcept C25810664 @default.
- W2015326825 hasConcept C2776321320 @default.
- W2015326825 hasConcept C2777601683 @default.
- W2015326825 hasConcept C41008148 @default.
- W2015326825 hasConcept C41895202 @default.
- W2015326825 hasConcept C50971890 @default.
- W2015326825 hasConcept C69505689 @default.
- W2015326825 hasConcept C75165309 @default.
- W2015326825 hasConcept C78726541 @default.
- W2015326825 hasConceptScore W2015326825C110615152 @default.
- W2015326825 hasConceptScore W2015326825C111472728 @default.
- W2015326825 hasConceptScore W2015326825C136764020 @default.
- W2015326825 hasConceptScore W2015326825C137982476 @default.
- W2015326825 hasConceptScore W2015326825C138885662 @default.
- W2015326825 hasConceptScore W2015326825C154945302 @default.
- W2015326825 hasConceptScore W2015326825C2129575 @default.
- W2015326825 hasConceptScore W2015326825C23123220 @default.
- W2015326825 hasConceptScore W2015326825C25810664 @default.
- W2015326825 hasConceptScore W2015326825C2776321320 @default.
- W2015326825 hasConceptScore W2015326825C2777601683 @default.
- W2015326825 hasConceptScore W2015326825C41008148 @default.
- W2015326825 hasConceptScore W2015326825C41895202 @default.
- W2015326825 hasConceptScore W2015326825C50971890 @default.
- W2015326825 hasConceptScore W2015326825C69505689 @default.
- W2015326825 hasConceptScore W2015326825C75165309 @default.
- W2015326825 hasConceptScore W2015326825C78726541 @default.
- W2015326825 hasIssue "B" @default.
- W2015326825 hasLocation W20153268251 @default.
- W2015326825 hasOpenAccess W2015326825 @default.
- W2015326825 hasPrimaryLocation W20153268251 @default.
- W2015326825 hasRelatedWork W1566018662 @default.
- W2015326825 hasRelatedWork W176211028 @default.
- W2015326825 hasRelatedWork W1822004795 @default.
- W2015326825 hasRelatedWork W1869958975 @default.
- W2015326825 hasRelatedWork W2087361167 @default.
- W2015326825 hasRelatedWork W2171313960 @default.
- W2015326825 hasRelatedWork W2171690228 @default.
- W2015326825 hasRelatedWork W2189133605 @default.
- W2015326825 hasRelatedWork W2294912395 @default.
- W2015326825 hasRelatedWork W2332730167 @default.
- W2015326825 hasRelatedWork W2401961673 @default.
- W2015326825 hasRelatedWork W2585620645 @default.
- W2015326825 hasRelatedWork W2611386781 @default.
- W2015326825 hasRelatedWork W2884813381 @default.
- W2015326825 hasRelatedWork W2887442125 @default.
- W2015326825 hasRelatedWork W3092008332 @default.
- W2015326825 hasRelatedWork W3130971878 @default.
- W2015326825 hasRelatedWork W2123859802 @default.
- W2015326825 hasRelatedWork W2183609570 @default.
- W2015326825 hasRelatedWork W44286267 @default.
- W2015326825 hasVolume "18" @default.
- W2015326825 isParatext "false" @default.
- W2015326825 isRetracted "false" @default.
- W2015326825 magId "2015326825" @default.
- W2015326825 workType "article" @default.
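
The abstract of this work describes Dice's coefficient over character bigrams and a Levenshtein distance normalized by the length of the longest string. The following is a minimal Python sketch of those two metrics, written from the prose description only, since the original equations are in the PDF. The function names (`bigrams`, `dice`, `levenshtein`, `lev_norm`) are hypothetical, and counting repeated bigrams as a multiset and treating two bigram-free strings as identical only when they are equal are assumptions. The Stoilos metric and the way the three metrics are combined are not covered here.

```python
from collections import Counter


def bigrams(s: str) -> Counter:
    """Multiset of character bigrams of s (assumption: repeats are counted)."""
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def dice(x: str, y: str) -> float:
    """Dice's coefficient: 2 * common bigrams / (bigrams of x + bigrams of y)."""
    bx, by = bigrams(x), bigrams(y)
    total = sum(bx.values()) + sum(by.values())
    if total == 0:
        # Assumption: strings too short to have bigrams match only if equal.
        return 1.0 if x == y else 0.0
    common = sum((bx & by).values())  # multiset intersection
    return 2.0 * common / total


def levenshtein(x: str, y: str) -> int:
    """Minimum number of substitutions, deletions and insertions turning x into y."""
    prev = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        curr = [i]
        for j, cy in enumerate(y, start=1):
            cost = 0 if cx == cy else 1
            curr.append(min(prev[j] + 1,          # delete cx
                            curr[j - 1] + 1,      # insert cy
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def lev_norm(x: str, y: str) -> float:
    """Levenshtein distance divided by the length of the longest string."""
    longest = max(len(x), len(y))
    return levenshtein(x, y) / longest if longest else 0.0


if __name__ == "__main__":
    a = "insuffisance surrenale"
    b = "insuffisance surrenale aigue"
    print(dice(a, b), lev_norm(a, b))
```

Note that `dice` returns a similarity (1.0 for identical strings) while `lev_norm` returns a distance (0.0 for identical strings), so any combination of the two would need thresholds oriented accordingly; the thresholds used in the paper are not stated in the abstract.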