Matches in SemOpenAlex for { <https://semopenalex.org/work/W2397807780> ?p ?o ?g. }
Showing items 1 to 73 of 73 with 100 items per page.
- W2397807780 abstract "Current-day crawlers retrieve content only from the publicly indexable Web, i.e., the set of Web pages reachable purely by following hypertext links, ignoring search forms and pages that require authorization or prior registration. In particular, they ignore the tremendous amount of high quality content “hidden” behind search forms, in large searchable electronic databases. In this paper, we address the problem of designing a crawler capable of extracting content from this hidden Web. We introduce a generic operational model of a hidden Web crawler and describe how this model is realized in HiWE (Hidden Web Exposer), a prototype crawler built at Stanford. We introduce a new Layout-based Information Extraction Technique (LITE) and demonstrate its use in automatically extracting semantic information from search forms and response pages. We also present results from experiments conducted to test and validate our techniques." @default.
- W2397807780 created "2016-06-24" @default.
- W2397807780 creator A5015609009 @default.
- W2397807780 creator A5055883336 @default.
- W2397807780 date "2001-01-01" @default.
- W2397807780 modified "2023-09-25" @default.
- W2397807780 title "Crawling the Hidden Web." @default.
- W2397807780 cites W1659541576 @default.
- W2397807780 cites W2170188121 @default.
- W2397807780 hasPublicationYear "2001" @default.
- W2397807780 type Work @default.
- W2397807780 sameAs 2397807780 @default.
- W2397807780 citedByCount "0" @default.
- W2397807780 crossrefType "journal-article" @default.
- W2397807780 hasAuthorship W2397807780A5015609009 @default.
- W2397807780 hasAuthorship W2397807780A5055883336 @default.
- W2397807780 hasConcept C100368936 @default.
- W2397807780 hasConcept C105702510 @default.
- W2397807780 hasConcept C136764020 @default.
- W2397807780 hasConcept C13743948 @default.
- W2397807780 hasConcept C162215914 @default.
- W2397807780 hasConcept C173576120 @default.
- W2397807780 hasConcept C177264268 @default.
- W2397807780 hasConcept C199360897 @default.
- W2397807780 hasConcept C2129575 @default.
- W2397807780 hasConcept C21959979 @default.
- W2397807780 hasConcept C23123220 @default.
- W2397807780 hasConcept C41008148 @default.
- W2397807780 hasConcept C61096286 @default.
- W2397807780 hasConcept C71924100 @default.
- W2397807780 hasConcept C73340581 @default.
- W2397807780 hasConceptScore W2397807780C100368936 @default.
- W2397807780 hasConceptScore W2397807780C105702510 @default.
- W2397807780 hasConceptScore W2397807780C136764020 @default.
- W2397807780 hasConceptScore W2397807780C13743948 @default.
- W2397807780 hasConceptScore W2397807780C162215914 @default.
- W2397807780 hasConceptScore W2397807780C173576120 @default.
- W2397807780 hasConceptScore W2397807780C177264268 @default.
- W2397807780 hasConceptScore W2397807780C199360897 @default.
- W2397807780 hasConceptScore W2397807780C2129575 @default.
- W2397807780 hasConceptScore W2397807780C21959979 @default.
- W2397807780 hasConceptScore W2397807780C23123220 @default.
- W2397807780 hasConceptScore W2397807780C41008148 @default.
- W2397807780 hasConceptScore W2397807780C61096286 @default.
- W2397807780 hasConceptScore W2397807780C71924100 @default.
- W2397807780 hasConceptScore W2397807780C73340581 @default.
- W2397807780 hasLocation W23978077801 @default.
- W2397807780 hasOpenAccess W2397807780 @default.
- W2397807780 hasPrimaryLocation W23978077801 @default.
- W2397807780 hasRelatedWork W1006696051 @default.
- W2397807780 hasRelatedWork W10239605 @default.
- W2397807780 hasRelatedWork W141834718 @default.
- W2397807780 hasRelatedWork W14800140 @default.
- W2397807780 hasRelatedWork W1483775839 @default.
- W2397807780 hasRelatedWork W1964698713 @default.
- W2397807780 hasRelatedWork W2012804264 @default.
- W2397807780 hasRelatedWork W2034085627 @default.
- W2397807780 hasRelatedWork W2101810693 @default.
- W2397807780 hasRelatedWork W2105389965 @default.
- W2397807780 hasRelatedWork W2152072793 @default.
- W2397807780 hasRelatedWork W2256906548 @default.
- W2397807780 hasRelatedWork W2315628598 @default.
- W2397807780 hasRelatedWork W2371725684 @default.
- W2397807780 hasRelatedWork W2383712336 @default.
- W2397807780 hasRelatedWork W2570716469 @default.
- W2397807780 hasRelatedWork W258608066 @default.
- W2397807780 hasRelatedWork W2888930960 @default.
- W2397807780 hasRelatedWork W2920184028 @default.
- W2397807780 hasRelatedWork W757864652 @default.
- W2397807780 isParatext "false" @default.
- W2397807780 isRetracted "false" @default.
- W2397807780 magId "2397807780" @default.
- W2397807780 workType "article" @default.