Matches in SemOpenAlex for { <https://semopenalex.org/work/W3045522182> ?p ?o ?g. }
- W3045522182 endingPage "3446" @default.
- W3045522182 startingPage "3446" @default.
- W3045522182 abstract "Molecular similarity is an elusive but core “unsupervised” cheminformatics concept, yet different “fingerprint” encodings of molecular structures return very different similarity values, even when using the same similarity metric. Each encoding may be of value when applied to other problems with objective or target functions, implying that a priori none are “better” than the others, nor than encoding-free metrics such as maximum common substructure (MCSS). We here introduce a novel approach to molecular similarity, in the form of a variational autoencoder (VAE). This learns the joint distribution p(z|x) where z is a latent vector and x are the (same) input/output data. It takes the form of a “bowtie”-shaped artificial neural network. In the middle is a “bottleneck layer” or latent vector in which inputs are transformed into, and represented as, a vector of numbers (encoding), with a reverse process (decoding) seeking to return the SMILES string that was the input. We train a VAE on over six million druglike molecules and natural products (including over one million in the final holdout set). The VAE vector distances provide a rapid and novel metric for molecular similarity that is both easily and rapidly calculated. We describe the method and its application to a typical similarity problem in cheminformatics." @default.
- W3045522182 created "2020-08-03" @default.
- W3045522182 creator A5040686060 @default.
- W3045522182 creator A5051816210 @default.
- W3045522182 creator A5062241736 @default.
- W3045522182 creator A5077370611 @default.
- W3045522182 creator A5091484246 @default.
- W3045522182 date "2020-07-29" @default.
- W3045522182 modified "2023-10-18" @default.
- W3045522182 title "VAE-Sim: A Novel Molecular Similarity Measure Based on a Variational Autoencoder" @default.
- W3045522182 cites W1013989637 @default.
- W3045522182 cites W1534441087 @default.
- W3045522182 cites W1928390080 @default.
- W3045522182 cites W1973301610 @default.
- W3045522182 cites W1975147762 @default.
- W3045522182 cites W1981988682 @default.
- W3045522182 cites W1984673823 @default.
- W3045522182 cites W1988037271 @default.
- W3045522182 cites W1988115241 @default.
- W3045522182 cites W1990961668 @default.
- W3045522182 cites W1992578437 @default.
- W3045522182 cites W2000747708 @default.
- W3045522182 cites W2008127220 @default.
- W3045522182 cites W2016979469 @default.
- W3045522182 cites W2018288400 @default.
- W3045522182 cites W2021748110 @default.
- W3045522182 cites W2030909788 @default.
- W3045522182 cites W2032008870 @default.
- W3045522182 cites W2041567094 @default.
- W3045522182 cites W2042007894 @default.
- W3045522182 cites W2064661612 @default.
- W3045522182 cites W2071551353 @default.
- W3045522182 cites W2080754816 @default.
- W3045522182 cites W2109630522 @default.
- W3045522182 cites W2119512897 @default.
- W3045522182 cites W2124290836 @default.
- W3045522182 cites W2132842501 @default.
- W3045522182 cites W2133098435 @default.
- W3045522182 cites W2135920081 @default.
- W3045522182 cites W2137983211 @default.
- W3045522182 cites W2143759264 @default.
- W3045522182 cites W2151554678 @default.
- W3045522182 cites W2155741020 @default.
- W3045522182 cites W2163922914 @default.
- W3045522182 cites W2168480393 @default.
- W3045522182 cites W2169863228 @default.
- W3045522182 cites W2171658832 @default.
- W3045522182 cites W2174991771 @default.
- W3045522182 cites W2200017991 @default.
- W3045522182 cites W2290847742 @default.
- W3045522182 cites W2320034101 @default.
- W3045522182 cites W2412446857 @default.
- W3045522182 cites W2472085920 @default.
- W3045522182 cites W2517981151 @default.
- W3045522182 cites W2558461359 @default.
- W3045522182 cites W2578240541 @default.
- W3045522182 cites W2590907701 @default.
- W3045522182 cites W2594247694 @default.
- W3045522182 cites W2765224015 @default.
- W3045522182 cites W2769484073 @default.
- W3045522182 cites W2777416523 @default.
- W3045522182 cites W2790608062 @default.
- W3045522182 cites W2792130717 @default.
- W3045522182 cites W2793051516 @default.
- W3045522182 cites W2809927589 @default.
- W3045522182 cites W2810417098 @default.
- W3045522182 cites W2889326414 @default.
- W3045522182 cites W2891868449 @default.
- W3045522182 cites W2901476322 @default.
- W3045522182 cites W2906697496 @default.
- W3045522182 cites W2915175970 @default.
- W3045522182 cites W2925830236 @default.
- W3045522182 cites W2950838202 @default.
- W3045522182 cites W2963445908 @default.
- W3045522182 cites W2991736596 @default.
- W3045522182 cites W2992072991 @default.
- W3045522182 cites W2992613109 @default.
- W3045522182 cites W2995422623 @default.
- W3045522182 cites W2998473774 @default.
- W3045522182 cites W3003906593 @default.
- W3045522182 cites W3006714852 @default.
- W3045522182 cites W3011286504 @default.
- W3045522182 cites W3030068589 @default.
- W3045522182 cites W3030727442 @default.
- W3045522182 cites W3098269892 @default.
- W3045522182 cites W955320968 @default.
- W3045522182 doi "https://doi.org/10.3390/molecules25153446" @default.
- W3045522182 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/7435890" @default.
- W3045522182 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/32751155" @default.
- W3045522182 hasPublicationYear "2020" @default.
- W3045522182 type Work @default.
- W3045522182 sameAs 3045522182 @default.
- W3045522182 citedByCount "21" @default.
- W3045522182 countsByYear W30455221822020 @default.
- W3045522182 countsByYear W30455221822021 @default.
- W3045522182 countsByYear W30455221822022 @default.
- W3045522182 countsByYear W30455221822023 @default.
- W3045522182 crossrefType "journal-article" @default.