Matches in SemOpenAlex for { <https://semopenalex.org/work/W1567376665> ?p ?o ?g. }
- W1567376665 abstract "Given the phenomenal rate at which the World Wide Web is changing, retrieval methods and quality assurance have become bottleneck issues for many information retrieval services on the Internet, e.g. Web search engine designs. In this thesis, approaches that increase the efficiency of information retrieval methods, and provide quality assurance of information obtained from the Web, are developed through the implementation of a quality-focused information retrieval system. A novel approach to the retrieval of quality information from the Internet is introduced. Implemented as a component of a vertical search application, this results in a focused crawler which is capable of retrieving quality information from the Internet. The three main contributions of this research are: (1) An effective and flexible crawling application that is well-suited for information retrieval tasks on the dynamic World Wide Web (WWW) is implemented. The resulting crawling application (crawler) is designed after having observed the dynamics of web evolution through regular monitoring of the WWW; it also addresses the shortcomings of some existing crawlers, therefore presenting itself as a practical implementation. (2) A mechanism is developed that converts human quality judgements, gathered through user surveys, into an algorithm, so that user perceptions of a set of criteria that may determine the quality of web page content can be applied to a large number of Web documents with minimal manual effort. The mechanism was derived from a relatively large user survey conducted in collaboration with Dr Shirlee-Ann Knight of Edith Cowan University. The survey was conducted to determine which criteria a Web document is perceived to need to meet to qualify as a quality document.
This results in an aggregate numeric score between 0 and 1 for each web page, where 0 indicates that the page does not meet any quality criteria and 1 indicates that it meets all quality criteria perfectly. (3) This research proposes an approach to predict the quality of a web page before it is retrieved by a crawler. The approach can be incorporated into a vertical search application which focuses on the retrieval of quality information. Experimental results on real-world data show that the proposed approach is more effective than the brute-force approaches which have been published so far. The proposed methods produce a numerical quality score for any text-based Web document. This thesis will show that such a score can also be used as a web page ranking criterion for horizontal search engines. As part of this research project, this ranking scheme has been implemented and embedded into a working search engine. The observed user feedback confirms that search" @default.
- W1567376665 created "2016-06-24" @default.
- W1567376665 creator A5043925408 @default.
- W1567376665 date "2009-01-01" @default.
- W1567376665 modified "2023-09-27" @default.
- W1567376665 title "Building a prototype for quality information retrieval from the World Wide Web" @default.
- W1567376665 cites W144900376 @default.
- W1567376665 cites W1486167854 @default.
- W1567376665 cites W1505358417 @default.
- W1567376665 cites W1526456992 @default.
- W1567376665 cites W1537691903 @default.
- W1567376665 cites W1553999085 @default.
- W1567376665 cites W1565362320 @default.
- W1567376665 cites W1566984846 @default.
- W1567376665 cites W1567491469 @default.
- W1567376665 cites W1572605355 @default.
- W1567376665 cites W1573985123 @default.
- W1567376665 cites W1574586036 @default.
- W1567376665 cites W1589025345 @default.
- W1567376665 cites W1598759141 @default.
- W1567376665 cites W1613836731 @default.
- W1567376665 cites W1624977 @default.
- W1567376665 cites W1650267389 @default.
- W1567376665 cites W1660540534 @default.
- W1567376665 cites W1679913846 @default.
- W1567376665 cites W1963825835 @default.
- W1567376665 cites W1964579932 @default.
- W1567376665 cites W1971605801 @default.
- W1567376665 cites W1981202432 @default.
- W1567376665 cites W1992921287 @default.
- W1567376665 cites W1994129769 @default.
- W1567376665 cites W1995199550 @default.
- W1567376665 cites W2000273502 @default.
- W1567376665 cites W2005200579 @default.
- W1567376665 cites W2006119904 @default.
- W1567376665 cites W2010463775 @default.
- W1567376665 cites W2011482581 @default.
- W1567376665 cites W2014123073 @default.
- W1567376665 cites W2014478203 @default.
- W1567376665 cites W2014754265 @default.
- W1567376665 cites W2014976296 @default.
- W1567376665 cites W2018928332 @default.
- W1567376665 cites W2021008035 @default.
- W1567376665 cites W2026875114 @default.
- W1567376665 cites W2028014545 @default.
- W1567376665 cites W2035314340 @default.
- W1567376665 cites W2036543443 @default.
- W1567376665 cites W2037858832 @default.
- W1567376665 cites W2039499764 @default.
- W1567376665 cites W2044102057 @default.
- W1567376665 cites W2046325278 @default.
- W1567376665 cites W2047008218 @default.
- W1567376665 cites W2056206543 @default.
- W1567376665 cites W2066055909 @default.
- W1567376665 cites W2070204737 @default.
- W1567376665 cites W2076452960 @default.
- W1567376665 cites W2077513286 @default.
- W1567376665 cites W2080676333 @default.
- W1567376665 cites W2087918852 @default.
- W1567376665 cites W2092815217 @default.
- W1567376665 cites W2095976990 @default.
- W1567376665 cites W2100655604 @default.
- W1567376665 cites W2102942431 @default.
- W1567376665 cites W2105396147 @default.
- W1567376665 cites W2109840511 @default.
- W1567376665 cites W2114874879 @default.
- W1567376665 cites W2116707358 @default.
- W1567376665 cites W2118009636 @default.
- W1567376665 cites W2118131693 @default.
- W1567376665 cites W2120308175 @default.
- W1567376665 cites W2121715962 @default.
- W1567376665 cites W2124575832 @default.
- W1567376665 cites W2124776405 @default.
- W1567376665 cites W2127188638 @default.
- W1567376665 cites W2129632406 @default.
- W1567376665 cites W2133357323 @default.
- W1567376665 cites W2134308336 @default.
- W1567376665 cites W2136219771 @default.
- W1567376665 cites W2136248485 @default.
- W1567376665 cites W2137175628 @default.
- W1567376665 cites W2141825421 @default.
- W1567376665 cites W2142887429 @default.
- W1567376665 cites W2143079043 @default.
- W1567376665 cites W2144200917 @default.
- W1567376665 cites W2145780906 @default.
- W1567376665 cites W2145990704 @default.
- W1567376665 cites W2146206110 @default.
- W1567376665 cites W2146373084 @default.
- W1567376665 cites W2147809846 @default.
- W1567376665 cites W2150468298 @default.
- W1567376665 cites W2151007976 @default.
- W1567376665 cites W2151691570 @default.
- W1567376665 cites W2153252192 @default.
- W1567376665 cites W2153732552 @default.
- W1567376665 cites W2154324327 @default.
- W1567376665 cites W2154610494 @default.
- W1567376665 cites W2156037541 @default.
- W1567376665 cites W2157748587 @default.
- W1567376665 cites W2157943203 @default.
- W1567376665 cites W2164052363 @default.