Matches in SemOpenAlex for { <https://semopenalex.org/work/W2165897980> ?p ?o ?g. }
- W2165897980 endingPage "383" @default.
- W2165897980 startingPage "370" @default.
- W2165897980 abstract "Words and phrases acquire meaning from the way they are used in society, from their relative semantics to other words and phrases. For computers, the equivalent of society is a database, and the equivalent of use is a way to search the database. We present a new theory of similarity between words and phrases based on information distance and Kolmogorov complexity. To fix thoughts, we use the World Wide Web (WWW) as the database, and Google as the search engine. The method is also applicable to other search engines and databases. This theory is then applied to construct a method to automatically extract similarity, the Google similarity distance, of words and phrases from the WWW using Google page counts. The WWW is the largest database on earth, and the context information entered by millions of independent users averages out to provide automatic semantics of useful quality. We give applications in hierarchical clustering, classification, and language translation. We give examples to distinguish between colors and numbers, cluster names of paintings by 17th century Dutch masters and names of books by English novelists, the ability to understand emergencies and primes, and we demonstrate the ability to do a simple automatic English-Spanish translation. Finally, we use the WordNet database as an objective baseline against which to judge the performance of our method. We conduct a massive randomized trial in binary classification using support vector machines to learn categories based on our Google distance, resulting in a mean agreement of 87 percent with the expert-crafted WordNet categories." @default.
- W2165897980 created "2016-06-24" @default.
- W2165897980 creator A5008815615 @default.
- W2165897980 creator A5029628745 @default.
- W2165897980 date "2007-03-01" @default.
- W2165897980 modified "2023-10-13" @default.
- W2165897980 title "The Google Similarity Distance" @default.
- W2165897980 cites W1575912334 @default.
- W2165897980 cites W1983578042 @default.
- W2165897980 cites W1995875735 @default.
- W2165897980 cites W2014706780 @default.
- W2165897980 cites W2017739727 @default.
- W2165897980 cites W2036373663 @default.
- W2165897980 cites W2066277072 @default.
- W2165897980 cites W2088930249 @default.
- W2165897980 cites W2093641143 @default.
- W2165897980 cites W2099111195 @default.
- W2165897980 cites W2107658650 @default.
- W2165897980 cites W2124416056 @default.
- W2165897980 cites W2125598538 @default.
- W2165897980 cites W2128859735 @default.
- W2165897980 cites W2129620300 @default.
- W2165897980 cites W2139212933 @default.
- W2165897980 cites W2144221002 @default.
- W2165897980 cites W2153635508 @default.
- W2165897980 cites W2166064672 @default.
- W2165897980 cites W2295278941 @default.
- W2165897980 cites W3101660565 @default.
- W2165897980 cites W3104365962 @default.
- W2165897980 cites W4230960895 @default.
- W2165897980 doi "https://doi.org/10.1109/tkde.2007.48" @default.
- W2165897980 hasPublicationYear "2007" @default.
- W2165897980 type Work @default.
- W2165897980 sameAs 2165897980 @default.
- W2165897980 citedByCount "1615" @default.
- W2165897980 countsByYear W21658979802012 @default.
- W2165897980 countsByYear W21658979802013 @default.
- W2165897980 countsByYear W21658979802014 @default.
- W2165897980 countsByYear W21658979802015 @default.
- W2165897980 countsByYear W21658979802016 @default.
- W2165897980 countsByYear W21658979802017 @default.
- W2165897980 countsByYear W21658979802018 @default.
- W2165897980 countsByYear W21658979802019 @default.
- W2165897980 countsByYear W21658979802020 @default.
- W2165897980 countsByYear W21658979802021 @default.
- W2165897980 countsByYear W21658979802022 @default.
- W2165897980 countsByYear W21658979802023 @default.
- W2165897980 crossrefType "journal-article" @default.
- W2165897980 hasAuthorship W2165897980A5008815615 @default.
- W2165897980 hasAuthorship W2165897980A5029628745 @default.
- W2165897980 hasConcept C103278499 @default.
- W2165897980 hasConcept C115961682 @default.
- W2165897980 hasConcept C130318100 @default.
- W2165897980 hasConcept C151730666 @default.
- W2165897980 hasConcept C154945302 @default.
- W2165897980 hasConcept C157659113 @default.
- W2165897980 hasConcept C184337299 @default.
- W2165897980 hasConcept C188338183 @default.
- W2165897980 hasConcept C199360897 @default.
- W2165897980 hasConcept C204321447 @default.
- W2165897980 hasConcept C23123220 @default.
- W2165897980 hasConcept C2776224158 @default.
- W2165897980 hasConcept C2779343474 @default.
- W2165897980 hasConcept C2780403423 @default.
- W2165897980 hasConcept C2780801425 @default.
- W2165897980 hasConcept C34736171 @default.
- W2165897980 hasConcept C41008148 @default.
- W2165897980 hasConcept C73555534 @default.
- W2165897980 hasConcept C86803240 @default.
- W2165897980 hasConceptScore W2165897980C103278499 @default.
- W2165897980 hasConceptScore W2165897980C115961682 @default.
- W2165897980 hasConceptScore W2165897980C130318100 @default.
- W2165897980 hasConceptScore W2165897980C151730666 @default.
- W2165897980 hasConceptScore W2165897980C154945302 @default.
- W2165897980 hasConceptScore W2165897980C157659113 @default.
- W2165897980 hasConceptScore W2165897980C184337299 @default.
- W2165897980 hasConceptScore W2165897980C188338183 @default.
- W2165897980 hasConceptScore W2165897980C199360897 @default.
- W2165897980 hasConceptScore W2165897980C204321447 @default.
- W2165897980 hasConceptScore W2165897980C23123220 @default.
- W2165897980 hasConceptScore W2165897980C2776224158 @default.
- W2165897980 hasConceptScore W2165897980C2779343474 @default.
- W2165897980 hasConceptScore W2165897980C2780403423 @default.
- W2165897980 hasConceptScore W2165897980C2780801425 @default.
- W2165897980 hasConceptScore W2165897980C34736171 @default.
- W2165897980 hasConceptScore W2165897980C41008148 @default.
- W2165897980 hasConceptScore W2165897980C73555534 @default.
- W2165897980 hasConceptScore W2165897980C86803240 @default.
- W2165897980 hasIssue "3" @default.
- W2165897980 hasLocation W21658979801 @default.
- W2165897980 hasOpenAccess W2165897980 @default.
- W2165897980 hasPrimaryLocation W21658979801 @default.
- W2165897980 hasRelatedWork W1020482675 @default.
- W2165897980 hasRelatedWork W2160266024 @default.
- W2165897980 hasRelatedWork W2529667321 @default.
- W2165897980 hasRelatedWork W2564015900 @default.
- W2165897980 hasRelatedWork W2963829519 @default.
- W2165897980 hasRelatedWork W3037446375 @default.
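
The abstract above describes extracting a similarity distance between words and phrases from search-engine page counts. As a minimal sketch, the normalized form of that distance is commonly written as NGD(x, y) = (max{log f(x), log f(y)} - log f(x, y)) / (log N - min{log f(x), log f(y)}), where f(x) and f(y) are the page counts for each term, f(x, y) is the count of pages containing both, and N is the total number of indexed pages. The function name, argument names, and the choice to return infinity when any count is zero are assumptions made for this illustration; they do not come from this record. The sketch also assumes N is strictly larger than both single-term counts, since otherwise the denominator is zero.

```python
import math


def normalized_google_distance(fx: float, fy: float, fxy: float, n: float) -> float:
    """Sketch of the normalized Google distance from raw page counts.

    fx, fy -- page counts for terms x and y (hypothetical inputs)
    fxy    -- page count for pages containing both x and y
    n      -- total number of pages indexed (assumed > max(fx, fy))
    """
    # Assumed convention: terms that never occur, or never co-occur,
    # are treated as infinitely far apart.
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf

    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    numerator = max(log_fx, log_fy) - log_fxy
    denominator = math.log(n) - min(log_fx, log_fy)
    return numerator / denominator
```

With this sketch, two terms that always appear together (fx = fy = fxy) get distance 0, and the distance grows as the co-occurrence count shrinks relative to the individual counts. Page counts would have to be supplied by whatever search engine or database is being used; no particular search API is assumed here.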