Matches in SemOpenAlex for { <https://semopenalex.org/work/W2279447319> ?p ?o ?g. }
Showing items 1 to 61 of 61 with 100 items per page.
- W2279447319 abstract "Digital collections are ubiquitous. However, not all digital collections are the same. While most digital collections have limited forms of change (primarily creation and deletion of additional resources), there exists a class of digital collections that undergo additional kinds of change. These collections are made up of resources that are distributed across the Internet and brought together into the collection via hyperlinking. This means the underlying collection members are not controlled by the curator of the collection. Resources can be expected to change as time goes on. To further complicate matters, these collections can be hard to maintain when they are large, highly dynamic, or lacking active curation. Part of the difficulty in maintaining these collections is determining if a changed page is still a valid member of the collection. While others have tried to address this problem by measuring change and defining a maximum allowed threshold of change, these methods treat all change as a potential problem and treat web content as a static document despite its intrinsically dynamic nature. Instead, I approach the problem of determining significance of change on the web by embracing it as a normal part of a web document’s lifecycle. Instead of using thresholds to identify abnormal changes, I determine the difference between what a maintainer expects a page to do and what it actually does. These models are created using a variety of feature extractors to find pertinent information in a page, a Kalman filter to model the history of a page and predict a next version, and finally classification of results into either expected or unexpected change. I evaluate the different options for extractors and analyzers to determine the best options from my suite of possibilities.
This work is informed by a series of studies on both web pages and potential collection maintainers, observations of the NSDL Pathways, and a ground-truth set of blog changes tagged with a human judgment of the kind of change. The results of this work showed a statistically significant improvement over a range of traditional threshold techniques when applied to the collection of tagged blog changes." @default.
- W2279447319 created "2016-06-24" @default.
- W2279447319 creator A5036207141 @default.
- W2279447319 creator A5065474822 @default.
- W2279447319 creator A5077860280 @default.
- W2279447319 date "2011-01-01" @default.
- W2279447319 modified "2023-10-16" @default.
- W2279447319 title "Intelligent information interaction for managing distributed collections of web documents" @default.
- W2279447319 hasPublicationYear "2011" @default.
- W2279447319 type Work @default.
- W2279447319 sameAs 2279447319 @default.
- W2279447319 citedByCount "0" @default.
- W2279447319 crossrefType "journal-article" @default.
- W2279447319 hasAuthorship W2279447319A5036207141 @default.
- W2279447319 hasAuthorship W2279447319A5065474822 @default.
- W2279447319 hasAuthorship W2279447319A5077860280 @default.
- W2279447319 hasConcept C110875604 @default.
- W2279447319 hasConcept C136197465 @default.
- W2279447319 hasConcept C136764020 @default.
- W2279447319 hasConcept C154945302 @default.
- W2279447319 hasConcept C21959979 @default.
- W2279447319 hasConcept C23123220 @default.
- W2279447319 hasConcept C2522767166 @default.
- W2279447319 hasConcept C30088001 @default.
- W2279447319 hasConcept C41008148 @default.
- W2279447319 hasConceptScore W2279447319C110875604 @default.
- W2279447319 hasConceptScore W2279447319C136197465 @default.
- W2279447319 hasConceptScore W2279447319C136764020 @default.
- W2279447319 hasConceptScore W2279447319C154945302 @default.
- W2279447319 hasConceptScore W2279447319C21959979 @default.
- W2279447319 hasConceptScore W2279447319C23123220 @default.
- W2279447319 hasConceptScore W2279447319C2522767166 @default.
- W2279447319 hasConceptScore W2279447319C30088001 @default.
- W2279447319 hasConceptScore W2279447319C41008148 @default.
- W2279447319 hasLocation W22794473191 @default.
- W2279447319 hasOpenAccess W2279447319 @default.
- W2279447319 hasPrimaryLocation W22794473191 @default.
- W2279447319 hasRelatedWork W1589596834 @default.
- W2279447319 hasRelatedWork W1595922228 @default.
- W2279447319 hasRelatedWork W199772383 @default.
- W2279447319 hasRelatedWork W2074419020 @default.
- W2279447319 hasRelatedWork W2244087409 @default.
- W2279447319 hasRelatedWork W2253295712 @default.
- W2279447319 hasRelatedWork W2254639862 @default.
- W2279447319 hasRelatedWork W2292017106 @default.
- W2279447319 hasRelatedWork W2321285851 @default.
- W2279447319 hasRelatedWork W2341503649 @default.
- W2279447319 hasRelatedWork W2362290660 @default.
- W2279447319 hasRelatedWork W2395052005 @default.
- W2279447319 hasRelatedWork W2483887866 @default.
- W2279447319 hasRelatedWork W2763368886 @default.
- W2279447319 hasRelatedWork W2952908938 @default.
- W2279447319 hasRelatedWork W2992170752 @default.
- W2279447319 hasRelatedWork W3112196104 @default.
- W2279447319 hasRelatedWork W3123765219 @default.
- W2279447319 hasRelatedWork W3150975563 @default.
- W2279447319 hasRelatedWork W323252830 @default.
- W2279447319 isParatext "false" @default.
- W2279447319 isRetracted "false" @default.
- W2279447319 magId "2279447319" @default.
- W2279447319 workType "article" @default.
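
The abstract above describes modelling a page's history with a Kalman filter, predicting the next version, and classifying the observed change as expected or unexpected. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions: a single numeric feature per page version (for example a word count), a random-walk state model, made-up noise variances `q` and `r`, and a simple gate on the normalized prediction error standing in for the classifier. The names `ScalarKalman` and `classify_versions` are hypothetical and do not come from the work itself.

```python
from dataclasses import dataclass
import math


@dataclass
class ScalarKalman:
    """Random-walk Kalman filter over one numeric page feature (illustrative)."""
    x: float          # current estimate of the feature value
    p: float = 1.0    # variance of that estimate
    q: float = 1.0    # process noise variance (assumed value)
    r: float = 4.0    # measurement noise variance (assumed value)

    def predict(self):
        """Return the predicted next measurement and its variance."""
        return self.x, self.p + self.q + self.r

    def update(self, z: float) -> None:
        """Fold the observed measurement z into the state estimate."""
        p_prior = self.p + self.q
        gain = p_prior / (p_prior + self.r)
        self.x = self.x + gain * (z - self.x)
        self.p = (1.0 - gain) * p_prior


def classify_versions(values, gate=3.0):
    """Label each version after the first as 'expected' or 'unexpected'.

    A version is 'unexpected' when its distance from the filter's prediction
    exceeds `gate` predicted standard deviations (an assumed decision rule).
    """
    if not values:
        return []
    kf = ScalarKalman(x=float(values[0]))
    labels = []
    for z in values[1:]:
        predicted, variance = kf.predict()
        score = abs(z - predicted) / math.sqrt(variance)
        labels.append("unexpected" if score > gate else "expected")
        kf.update(float(z))
    return labels


if __name__ == "__main__":
    # Hypothetical word counts for six successive versions of one page.
    print(classify_versions([100, 102, 101, 103, 102, 400]))
```

Unlike a fixed threshold on raw change, the gate here is relative to what the filter predicted from the page's own history, which is the distinction the abstract draws; a real system would presumably use several extracted features and a trained classifier rather than a single gate.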