Matches in SemOpenAlex for { <https://semopenalex.org/work/W1578487916> ?p ?o ?g. }
- W1578487916 abstract "The increasing amount of available semistructured data demands efficient mechanisms to store, process, and search an enormous corpus of data to encourage its global adoption. Current techniques to store semistructured documents either map them to relational databases, or use a combination of flat files and indexes. These two approaches result in a mismatch between the tree-structure of semistructured data and the access characteristics of the underlying storage devices. Furthermore, the inefficiency of XML parsing methods has slowed down the large-scale adoption of XML into actual system implementations. The recent development of lazy parsing techniques is a major step towards improving this situation, but lazy parsers still have significant drawbacks that undermine the massive adoption of XML. Once the processing (storage and parsing) issues for semistructured data have been addressed, another key challenge to leverage semistructured data is to perform effective information discovery on such data. Previous works have addressed this problem in a generic (i.e. domain independent) way, but this process can be improved if knowledge about the specific domain is taken into consideration. This dissertation had two general goals: The first goal was to devise novel techniques to efficiently store and process semistructured documents. This goal had two specific aims: We proposed a method for storing semistructured documents that maps the physical characteristics of the documents to the geometrical layout of hard drives. We developed a Double-Lazy Parser for semistructured documents which introduces lazy behavior in both the pre-parsing and progressive parsing phases of the standard Document Object Model's parsing mechanism. The second goal was to construct a user-friendly and efficient engine for performing Information Discovery over domain-specific semistructured documents. This goal also had two aims: We presented a framework that exploits the domain-specific knowledge to improve the quality of the information discovery process by incorporating domain ontologies. We also proposed meaningful evaluation metrics to compare the results of search systems over semistructured documents." @default.
- W1578487916 created "2016-06-24" @default.
- W1578487916 creator A5068181614 @default.
- W1578487916 date "2017-11-13" @default.
- W1578487916 modified "2023-09-26" @default.
- W1578487916 title "Efficient Storage and Domain-Specific Information Discovery on Semistructured Documents" @default.
- W1578487916 cites W135761774 @default.
- W1578487916 cites W1486753193 @default.
- W1578487916 cites W1488257135 @default.
- W1578487916 cites W1519066168 @default.
- W1578487916 cites W1519310261 @default.
- W1578487916 cites W1529377943 @default.
- W1578487916 cites W1538246304 @default.
- W1578487916 cites W1539585011 @default.
- W1578487916 cites W1544610736 @default.
- W1578487916 cites W1548569630 @default.
- W1578487916 cites W1554870463 @default.
- W1578487916 cites W1555563750 @default.
- W1578487916 cites W1571308995 @default.
- W1578487916 cites W1592239731 @default.
- W1578487916 cites W1601749143 @default.
- W1578487916 cites W1628571627 @default.
- W1578487916 cites W1655092483 @default.
- W1578487916 cites W1659833910 @default.
- W1578487916 cites W1660390307 @default.
- W1578487916 cites W172004165 @default.
- W1578487916 cites W1720090307 @default.
- W1578487916 cites W1759313566 @default.
- W1578487916 cites W1767299350 @default.
- W1578487916 cites W1769314122 @default.
- W1578487916 cites W1833785989 @default.
- W1578487916 cites W1860420205 @default.
- W1578487916 cites W1873894770 @default.
- W1578487916 cites W1964013848 @default.
- W1578487916 cites W1964013940 @default.
- W1578487916 cites W1965014786 @default.
- W1578487916 cites W1965452491 @default.
- W1578487916 cites W1970348112 @default.
- W1578487916 cites W1973828215 @default.
- W1578487916 cites W1975009259 @default.
- W1578487916 cites W1976373002 @default.
- W1578487916 cites W1977097246 @default.
- W1578487916 cites W1977222363 @default.
- W1578487916 cites W1978478796 @default.
- W1578487916 cites W1979459060 @default.
- W1578487916 cites W1981387614 @default.
- W1578487916 cites W1985720360 @default.
- W1578487916 cites W1986029135 @default.
- W1578487916 cites W1986727175 @default.
- W1578487916 cites W1996468759 @default.
- W1578487916 cites W2000672666 @default.
- W1578487916 cites W2000775051 @default.
- W1578487916 cites W2014415866 @default.
- W1578487916 cites W2021589354 @default.
- W1578487916 cites W2023182948 @default.
- W1578487916 cites W2027752285 @default.
- W1578487916 cites W2036014579 @default.
- W1578487916 cites W2036216970 @default.
- W1578487916 cites W2039819454 @default.
- W1578487916 cites W2040374677 @default.
- W1578487916 cites W2042281172 @default.
- W1578487916 cites W2043035982 @default.
- W1578487916 cites W2046020929 @default.
- W1578487916 cites W2047342127 @default.
- W1578487916 cites W2051834357 @default.
- W1578487916 cites W2055043387 @default.
- W1578487916 cites W2056729522 @default.
- W1578487916 cites W2065094868 @default.
- W1578487916 cites W2065290081 @default.
- W1578487916 cites W2066636486 @default.
- W1578487916 cites W2067566391 @default.
- W1578487916 cites W2070895734 @default.
- W1578487916 cites W2074593398 @default.
- W1578487916 cites W2077911025 @default.
- W1578487916 cites W2084243240 @default.
- W1578487916 cites W2087739686 @default.
- W1578487916 cites W2089005131 @default.
- W1578487916 cites W2092883046 @default.
- W1578487916 cites W2094260906 @default.
- W1578487916 cites W2096064090 @default.
- W1578487916 cites W2096945916 @default.
- W1578487916 cites W2098388305 @default.
- W1578487916 cites W2100674109 @default.
- W1578487916 cites W2102695221 @default.
- W1578487916 cites W2104476081 @default.
- W1578487916 cites W2105484782 @default.
- W1578487916 cites W2105819430 @default.
- W1578487916 cites W2106759927 @default.
- W1578487916 cites W2107412086 @default.
- W1578487916 cites W2108014443 @default.
- W1578487916 cites W2108382881 @default.
- W1578487916 cites W2108572131 @default.
- W1578487916 cites W2110459974 @default.
- W1578487916 cites W2110474297 @default.
- W1578487916 cites W2110631345 @default.
- W1578487916 cites W2110918591 @default.
- W1578487916 cites W2111110587 @default.
- W1578487916 cites W2111625757 @default.
- W1578487916 cites W2112528798 @default.
- W1578487916 cites W2113112851 @default.