Matches in SemOpenAlex for { <https://semopenalex.org/work/W4311219784> ?p ?o ?g. }
- W4311219784 abstract "Homologous series are groups of related compounds that share the same core structure attached to a motif that repeats to different degrees. Compounds forming homologous series are of interest in multiple domains, including natural products, environmental chemistry, and drug design. However, many homologous compounds remain unannotated as such in compound datasets, which poses obstacles to understanding chemical diversity and their analytical identification via database matching. To overcome these challenges, an algorithm to detect homologous series within compound datasets was developed and implemented using the RDKit. The algorithm takes a list of molecules as SMILES strings and a monomer (i.e., repeating unit) encoded as SMARTS as its main inputs. In an iterative process, substructure matching of repeating units, molecule fragmentation, and core detection lead to homologous series classification through grouping of identical cores. Three open compound datasets from environmental chemistry (NORMAN Suspect List Exchange, NORMAN-SLE), exposomics (PubChemLite for Exposomics), and natural products (the COlleCtion of Open NatUral producTs, COCONUT) were subject to homologous series classification using the algorithm. Over 2000, 12,000, and 5000 series with CH2 repeating units were classified in the NORMAN-SLE, PubChemLite, and COCONUT respectively. Validation of classified series was performed using published homologous series and structure categories, including a comparison with a similar existing method for categorising PFAS compounds. The OngLai algorithm and its implementation for classifying homologues are openly available at: https://github.com/adelenelai/onglai-classify-homologues ." @default.
- W4311219784 created "2022-12-24" @default.
- W4311219784 creator A5001536528 @default.
- W4311219784 creator A5016319274 @default.
- W4311219784 creator A5039069854 @default.
- W4311219784 creator A5040194803 @default.
- W4311219784 date "2022-12-13" @default.
- W4311219784 modified "2023-09-25" @default.
- W4311219784 title "An algorithm to classify homologous series within compound datasets" @default.
- W4311219784 cites W1937338252 @default.
- W4311219784 cites W1976892175 @default.
- W4311219784 cites W1982870104 @default.
- W4311219784 cites W1988037271 @default.
- W4311219784 cites W1999440281 @default.
- W4311219784 cites W2018622518 @default.
- W4311219784 cites W2033542580 @default.
- W4311219784 cites W2044525777 @default.
- W4311219784 cites W2044834685 @default.
- W4311219784 cites W2051760084 @default.
- W4311219784 cites W2055102854 @default.
- W4311219784 cites W2060531713 @default.
- W4311219784 cites W2071234745 @default.
- W4311219784 cites W2071653911 @default.
- W4311219784 cites W2078432427 @default.
- W4311219784 cites W2103626206 @default.
- W4311219784 cites W2118086604 @default.
- W4311219784 cites W2145885671 @default.
- W4311219784 cites W2179066627 @default.
- W4311219784 cites W2179324695 @default.
- W4311219784 cites W2213735563 @default.
- W4311219784 cites W2315565293 @default.
- W4311219784 cites W2318794083 @default.
- W4311219784 cites W2591002382 @default.
- W4311219784 cites W2783024199 @default.
- W4311219784 cites W2804460419 @default.
- W4311219784 cites W2900920506 @default.
- W4311219784 cites W2903331367 @default.
- W4311219784 cites W2963613589 @default.
- W4311219784 cites W2964237212 @default.
- W4311219784 cites W2968362538 @default.
- W4311219784 cites W2969475047 @default.
- W4311219784 cites W2977052496 @default.
- W4311219784 cites W3023313568 @default.
- W4311219784 cites W3043559002 @default.
- W4311219784 cites W3096321011 @default.
- W4311219784 cites W3097145107 @default.
- W4311219784 cites W3097605476 @default.
- W4311219784 cites W3111515949 @default.
- W4311219784 cites W3118695441 @default.
- W4311219784 cites W3120715532 @default.
- W4311219784 cites W3121899439 @default.
- W4311219784 cites W3133965623 @default.
- W4311219784 cites W3159789740 @default.
- W4311219784 cites W3163850647 @default.
- W4311219784 cites W3196865634 @default.
- W4311219784 cites W3211416309 @default.
- W4311219784 cites W3216996724 @default.
- W4311219784 cites W4200417966 @default.
- W4311219784 cites W4205683904 @default.
- W4311219784 cites W4210869613 @default.
- W4311219784 cites W4213287388 @default.
- W4311219784 cites W4280641045 @default.
- W4311219784 cites W4286497146 @default.
- W4311219784 cites W4294534758 @default.
- W4311219784 cites W4297256900 @default.
- W4311219784 cites W4306986298 @default.
- W4311219784 cites W4308683713 @default.
- W4311219784 doi "https://doi.org/10.1186/s13321-022-00663-y" @default.
- W4311219784 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/36510332" @default.
- W4311219784 hasPublicationYear "2022" @default.
- W4311219784 type Work @default.
- W4311219784 citedByCount "1" @default.
- W4311219784 countsByYear W43112197842023 @default.
- W4311219784 crossrefType "journal-article" @default.
- W4311219784 hasAuthorship W4311219784A5001536528 @default.
- W4311219784 hasAuthorship W4311219784A5016319274 @default.
- W4311219784 hasAuthorship W4311219784A5039069854 @default.
- W4311219784 hasAuthorship W4311219784A5040194803 @default.
- W4311219784 hasBestOaLocation W43112197841 @default.
- W4311219784 hasConcept C104317684 @default.
- W4311219784 hasConcept C105795698 @default.
- W4311219784 hasConcept C11413529 @default.
- W4311219784 hasConcept C124101348 @default.
- W4311219784 hasConcept C143724316 @default.
- W4311219784 hasConcept C151730666 @default.
- W4311219784 hasConcept C154945302 @default.
- W4311219784 hasConcept C165064840 @default.
- W4311219784 hasConcept C185592680 @default.
- W4311219784 hasConcept C23123220 @default.
- W4311219784 hasConcept C2776482079 @default.
- W4311219784 hasConcept C33923547 @default.
- W4311219784 hasConcept C41008148 @default.
- W4311219784 hasConcept C54355233 @default.
- W4311219784 hasConcept C64894306 @default.
- W4311219784 hasConcept C70721500 @default.
- W4311219784 hasConcept C8010536 @default.
- W4311219784 hasConcept C86803240 @default.
- W4311219784 hasConceptScore W4311219784C104317684 @default.
- W4311219784 hasConceptScore W4311219784C105795698 @default.
- W4311219784 hasConceptScore W4311219784C11413529 @default.