Matches in SemOpenAlex for { <https://semopenalex.org/work/W1965672806> ?p ?o ?g. }
Showing items 1 to 80 of 80 with 100 items per page.
- W1965672806 abstract "Compression of large text corpora has the potential to drastically reduce both storage requirements and per-document access costs. Adaptive methods used for general-purpose compression are ineffective for this application, and historically the most successful methods have been based on word-based dictionaries, which allow use of global properties of the text. However, these are dependent on the text complying with assumptions about content and lead to dictionaries of unpredictable size. In recent work we have described an LZ-like approach in which sampled blocks of a corpus are used as a dictionary against which the complete corpus is compressed, giving compression twice as effective as that of zlib. Here we explore how pre-processing can be used to eliminate redundancy in our sampled dictionary. Our experiments show that dictionary size can be reduced by 50% or more (less than 0.1% of the collection size) with no significant effect on compression or access speed." @default.
- W1965672806 created "2016-06-24" @default.
- W1965672806 creator A5012026692 @default.
- W1965672806 creator A5021097696 @default.
- W1965672806 creator A5041495909 @default.
- W1965672806 date "2011-07-24" @default.
- W1965672806 modified "2023-10-14" @default.
- W1965672806 title "Sample selection for dictionary-based corpus compression" @default.
- W1965672806 cites W1970765735 @default.
- W1965672806 cites W1981420413 @default.
- W1965672806 cites W2135494327 @default.
- W1965672806 doi "https://doi.org/10.1145/2009916.2010087" @default.
- W1965672806 hasPublicationYear "2011" @default.
- W1965672806 type Work @default.
- W1965672806 sameAs 1965672806 @default.
- W1965672806 citedByCount "9" @default.
- W1965672806 countsByYear W19656728062014 @default.
- W1965672806 countsByYear W19656728062015 @default.
- W1965672806 countsByYear W19656728062016 @default.
- W1965672806 crossrefType "proceedings-article" @default.
- W1965672806 hasAuthorship W1965672806A5012026692 @default.
- W1965672806 hasAuthorship W1965672806A5021097696 @default.
- W1965672806 hasAuthorship W1965672806A5041495909 @default.
- W1965672806 hasConcept C111919701 @default.
- W1965672806 hasConcept C127413603 @default.
- W1965672806 hasConcept C152124472 @default.
- W1965672806 hasConcept C154945302 @default.
- W1965672806 hasConcept C159985019 @default.
- W1965672806 hasConcept C171146098 @default.
- W1965672806 hasConcept C180016635 @default.
- W1965672806 hasConcept C192562407 @default.
- W1965672806 hasConcept C204321447 @default.
- W1965672806 hasConcept C23123220 @default.
- W1965672806 hasConcept C2474386 @default.
- W1965672806 hasConcept C2524010 @default.
- W1965672806 hasConcept C25797200 @default.
- W1965672806 hasConcept C28490314 @default.
- W1965672806 hasConcept C33923547 @default.
- W1965672806 hasConcept C41008148 @default.
- W1965672806 hasConcept C511840579 @default.
- W1965672806 hasConcept C78548338 @default.
- W1965672806 hasConcept C81917197 @default.
- W1965672806 hasConcept C90805587 @default.
- W1965672806 hasConceptScore W1965672806C111919701 @default.
- W1965672806 hasConceptScore W1965672806C127413603 @default.
- W1965672806 hasConceptScore W1965672806C152124472 @default.
- W1965672806 hasConceptScore W1965672806C154945302 @default.
- W1965672806 hasConceptScore W1965672806C159985019 @default.
- W1965672806 hasConceptScore W1965672806C171146098 @default.
- W1965672806 hasConceptScore W1965672806C180016635 @default.
- W1965672806 hasConceptScore W1965672806C192562407 @default.
- W1965672806 hasConceptScore W1965672806C204321447 @default.
- W1965672806 hasConceptScore W1965672806C23123220 @default.
- W1965672806 hasConceptScore W1965672806C2474386 @default.
- W1965672806 hasConceptScore W1965672806C2524010 @default.
- W1965672806 hasConceptScore W1965672806C25797200 @default.
- W1965672806 hasConceptScore W1965672806C28490314 @default.
- W1965672806 hasConceptScore W1965672806C33923547 @default.
- W1965672806 hasConceptScore W1965672806C41008148 @default.
- W1965672806 hasConceptScore W1965672806C511840579 @default.
- W1965672806 hasConceptScore W1965672806C78548338 @default.
- W1965672806 hasConceptScore W1965672806C81917197 @default.
- W1965672806 hasConceptScore W1965672806C90805587 @default.
- W1965672806 hasLocation W19656728061 @default.
- W1965672806 hasOpenAccess W1965672806 @default.
- W1965672806 hasPrimaryLocation W19656728061 @default.
- W1965672806 hasRelatedWork W1490458715 @default.
- W1965672806 hasRelatedWork W1670812416 @default.
- W1965672806 hasRelatedWork W1869650270 @default.
- W1965672806 hasRelatedWork W2056256749 @default.
- W1965672806 hasRelatedWork W2375179150 @default.
- W1965672806 hasRelatedWork W2378514695 @default.
- W1965672806 hasRelatedWork W2936485217 @default.
- W1965672806 hasRelatedWork W3107474891 @default.
- W1965672806 hasRelatedWork W3199939423 @default.
- W1965672806 hasRelatedWork W4293584489 @default.
- W1965672806 isParatext "false" @default.
- W1965672806 isRetracted "false" @default.
- W1965672806 magId "1965672806" @default.
- W1965672806 workType "article" @default.
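
The abstract above describes compressing each document against a dictionary built from sampled blocks of the corpus. The following is a minimal sketch of that general idea, not the authors' method: it uses the preset-dictionary feature of Python's standard `zlib` module as a stand-in for their LZ-like scheme. The function names, block size, and stride are hypothetical choices made for illustration, and zlib's 32 KB window means only the tail of a larger dictionary would actually be used.

```python
import zlib


def build_sampled_dictionary(corpus: bytes, block_size: int, stride: int) -> bytes:
    """Concatenate fixed-size blocks taken every `stride` bytes of the corpus.

    Hypothetical sampling rule for illustration; the paper's own sampling
    and redundancy-elimination steps are not reproduced here.
    """
    return b"".join(
        corpus[start:start + block_size] for start in range(0, len(corpus), stride)
    )


def compress_doc(doc: bytes, dictionary: bytes) -> bytes:
    """Compress one document using the sampled blocks as a preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(doc) + compressor.flush()


def decompress_doc(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress one document; the same dictionary must be supplied."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()
```

Because every document is compressed independently against the shared dictionary, any single document can be decoded without touching its neighbours, which is the per-document access property the abstract refers to.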