Matches in SemOpenAlex for { <https://semopenalex.org/work/W50614250> ?p ?o ?g. }
Showing items 1 to 78 of 78 with 100 items per page.
- W50614250 endingPage "490" @default.
- W50614250 startingPage "482" @default.
- W50614250 abstract "We describe a compression model for semistructured documents, called Structural Contexts Model, which takes advantage of the context information usually implicit in the structure of the text. The idea is to use a separate semiadaptive model to compress the text that lies inside each different structure type (e.g., different XML tag). The intuition behind the idea is that the distribution of all the texts that belong to a given structure type should be similar, and different from that of other structure types. We test our idea using a word-based Huffman coding, which is the standard for compressing large natural language textual databases, and show that our compression method obtains significant improvements in compression ratios. We also analyze the possibility that storing separate models may not pay off if the distribution of different structure types is not different enough, and present a heuristic to merge models with the aim of minimizing the total size of the compressed database. This technique gives an additional improvement over the plain technique. The comparison against existing prototypes shows that our method is a competitive choice for compressed text databases. Keywords: Text Compression; Compression Model; Semistructured Documents; Text Databases" @default.
- W50614250 created "2016-06-24" @default.
- W50614250 creator A5034122674 @default.
- W50614250 creator A5080743153 @default.
- W50614250 creator A5087465110 @default.
- W50614250 date "2003-01-01" @default.
- W50614250 modified "2023-09-27" @default.
- W50614250 title "Compressing Semistructured Text Databases" @default.
- W50614250 cites W1546738280 @default.
- W50614250 cites W1566656449 @default.
- W50614250 cites W1975965284 @default.
- W50614250 cites W2013849299 @default.
- W50614250 cites W2036407715 @default.
- W50614250 cites W2135494327 @default.
- W50614250 cites W2887107689 @default.
- W50614250 cites W4235095233 @default.
- W50614250 doi "https://doi.org/10.1007/3-540-36618-0_34" @default.
- W50614250 hasPublicationYear "2003" @default.
- W50614250 type Work @default.
- W50614250 sameAs 50614250 @default.
- W50614250 citedByCount "0" @default.
- W50614250 crossrefType "book-chapter" @default.
- W50614250 hasAuthorship W50614250A5034122674 @default.
- W50614250 hasAuthorship W50614250A5080743153 @default.
- W50614250 hasAuthorship W50614250A5087465110 @default.
- W50614250 hasConcept C105795698 @default.
- W50614250 hasConcept C111472728 @default.
- W50614250 hasConcept C132010649 @default.
- W50614250 hasConcept C136764020 @default.
- W50614250 hasConcept C138885662 @default.
- W50614250 hasConcept C154945302 @default.
- W50614250 hasConcept C179518139 @default.
- W50614250 hasConcept C197129107 @default.
- W50614250 hasConcept C204321447 @default.
- W50614250 hasConcept C23123220 @default.
- W50614250 hasConcept C33923547 @default.
- W50614250 hasConcept C41008148 @default.
- W50614250 hasConcept C46900642 @default.
- W50614250 hasConcept C77088390 @default.
- W50614250 hasConcept C78548338 @default.
- W50614250 hasConcept C80444323 @default.
- W50614250 hasConcept C8797682 @default.
- W50614250 hasConceptScore W50614250C105795698 @default.
- W50614250 hasConceptScore W50614250C111472728 @default.
- W50614250 hasConceptScore W50614250C132010649 @default.
- W50614250 hasConceptScore W50614250C136764020 @default.
- W50614250 hasConceptScore W50614250C138885662 @default.
- W50614250 hasConceptScore W50614250C154945302 @default.
- W50614250 hasConceptScore W50614250C179518139 @default.
- W50614250 hasConceptScore W50614250C197129107 @default.
- W50614250 hasConceptScore W50614250C204321447 @default.
- W50614250 hasConceptScore W50614250C23123220 @default.
- W50614250 hasConceptScore W50614250C33923547 @default.
- W50614250 hasConceptScore W50614250C41008148 @default.
- W50614250 hasConceptScore W50614250C46900642 @default.
- W50614250 hasConceptScore W50614250C77088390 @default.
- W50614250 hasConceptScore W50614250C78548338 @default.
- W50614250 hasConceptScore W50614250C80444323 @default.
- W50614250 hasConceptScore W50614250C8797682 @default.
- W50614250 hasLocation W506142501 @default.
- W50614250 hasOpenAccess W50614250 @default.
- W50614250 hasPrimaryLocation W506142501 @default.
- W50614250 hasRelatedWork W1605476942 @default.
- W50614250 hasRelatedWork W1667554445 @default.
- W50614250 hasRelatedWork W1775720859 @default.
- W50614250 hasRelatedWork W1788528807 @default.
- W50614250 hasRelatedWork W188567224 @default.
- W50614250 hasRelatedWork W1988859703 @default.
- W50614250 hasRelatedWork W2101955803 @default.
- W50614250 hasRelatedWork W2101966962 @default.
- W50614250 hasRelatedWork W2361533086 @default.
- W50614250 hasRelatedWork W3107474891 @default.
- W50614250 isParatext "false" @default.
- W50614250 isRetracted "false" @default.
- W50614250 magId "50614250" @default.
- W50614250 workType "book-chapter" @default.
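
The idea in the abstract above (one semiadaptive, word-based Huffman model per structure type instead of a single model for all text) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `huffman_code_lengths`, `encoded_bits`, and `compare_models` are hypothetical, the input is assumed to be pre-tokenized `(tag, word)` pairs, and the cost of storing each model's vocabulary is deliberately ignored, although the abstract notes that this cost is what can make separate models not pay off.

```python
import heapq
from collections import Counter, defaultdict
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        # A lone symbol still needs one bit per occurrence.
        return {next(iter(freqs)): 1}
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), {sym: 0}) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in d1.items()}
        merged.update({s: depth + 1 for s, depth in d2.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


def encoded_bits(words):
    """Bits to encode words with a two-pass (semiadaptive) Huffman model.

    Assumption: only the encoded text is counted, not the stored model.
    """
    freqs = Counter(words)
    lengths = huffman_code_lengths(freqs)
    return sum(freqs[w] * lengths[w] for w in freqs)


def compare_models(tagged_words):
    """Compare one global model against one model per structure type.

    tagged_words: iterable of (tag, word) pairs (hypothetical input format).
    Returns (single_model_bits, per_tag_model_bits).
    """
    by_tag = defaultdict(list)
    all_words = []
    for tag, word in tagged_words:
        by_tag[tag].append(word)
        all_words.append(word)
    single = encoded_bits(all_words)
    per_tag = sum(encoded_bits(ws) for ws in by_tag.values())
    return single, per_tag


if __name__ == "__main__":
    # Toy document: each tag uses its own small vocabulary.
    doc = [("title", "a"), ("title", "b"), ("title", "a"), ("title", "b"),
           ("body", "c"), ("body", "d"), ("body", "c"), ("body", "d")]
    print(compare_models(doc))
```

On this toy input the per-tag models need fewer bits for the text because each tag's vocabulary is smaller than the combined one. A fuller treatment would add the size of each stored vocabulary to the totals and merge models whose distributions are too similar, as the abstract's merging heuristic suggests.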