Matches in SemOpenAlex for { <https://semopenalex.org/work/W99689262> ?p ?o ?g. }
- W99689262 abstract "An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. Learning to use the user profiles effectively is one of the most challenging tasks in developing an IF system. With the document selection criteria better defined based on the users’ needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing semantic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. These approaches also have to deal with low-frequency pattern issues. The measures used by data mining techniques (for example, “support” and “confidence”) to learn the profile have turned out to be unsuitable for filtering and can lead to a mismatch problem. This thesis combines rough set-based reasoning (term-based) and pattern mining in a unified framework for information filtering to overcome the aforementioned problems. The system consists of two stages: topic filtering and pattern mining. The topic filtering stage is intended to minimize information overload by filtering out the most likely irrelevant information based on the user profiles. A novel user-profile learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory.
The second stage (pattern mining) aims at solving the problem of information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The ranking function assigns higher scores to the documents most likely to be relevant. Because relatively few documents are left after the first stage, the computational cost is markedly reduced; at the same time, pattern discovery yields more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on well-known IR benchmarking processes, using the latest version of the Reuters dataset, namely the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system significantly outperforms other IF systems, such as the traditional Rocchio IF model; state-of-the-art term-based models, including BM25 and Support Vector Machines (SVM); and the Pattern Taxonomy Model (PTM)." @default.
- W99689262 created "2016-06-24" @default.
- W99689262 creator A5056799282 @default.
- W99689262 date "2008-01-01" @default.
- W99689262 modified "2023-09-27" @default.
- W99689262 title "Rough set-based reasoning and pattern mining for information filtering" @default.
- W99689262 cites W110175884 @default.
- W99689262 cites W1480827513 @default.
- W99689262 cites W1483813729 @default.
- W99689262 cites W1495821370 @default.
- W99689262 cites W1497613135 @default.
- W99689262 cites W1506285740 @default.
- W99689262 cites W1506668827 @default.
- W99689262 cites W1515778651 @default.
- W99689262 cites W1520890006 @default.
- W99689262 cites W1523969031 @default.
- W99689262 cites W1529248392 @default.
- W99689262 cites W1533960661 @default.
- W99689262 cites W1537336823 @default.
- W99689262 cites W1553682320 @default.
- W99689262 cites W1554394657 @default.
- W99689262 cites W1557757161 @default.
- W99689262 cites W1557923305 @default.
- W99689262 cites W1570860624 @default.
- W99689262 cites W1585646276 @default.
- W99689262 cites W1605873790 @default.
- W99689262 cites W1608194207 @default.
- W99689262 cites W1610836425 @default.
- W99689262 cites W1641039719 @default.
- W99689262 cites W1660390307 @default.
- W99689262 cites W1771010287 @default.
- W99689262 cites W1813925448 @default.
- W99689262 cites W1838515187 @default.
- W99689262 cites W1952843833 @default.
- W99689262 cites W1956559956 @default.
- W99689262 cites W1969463949 @default.
- W99689262 cites W1969572066 @default.
- W99689262 cites W1973098080 @default.
- W99689262 cites W1975614788 @default.
- W99689262 cites W1976682404 @default.
- W99689262 cites W1978394996 @default.
- W99689262 cites W1983078185 @default.
- W99689262 cites W1985048767 @default.
- W99689262 cites W1986913017 @default.
- W99689262 cites W1988931981 @default.
- W99689262 cites W1993934121 @default.
- W99689262 cites W1994556419 @default.
- W99689262 cites W1994661362 @default.
- W99689262 cites W1997600320 @default.
- W99689262 cites W1997841190 @default.
- W99689262 cites W2000672666 @default.
- W99689262 cites W2002857471 @default.
- W99689262 cites W2006551346 @default.
- W99689262 cites W2009155701 @default.
- W99689262 cites W2010652031 @default.
- W99689262 cites W2011516515 @default.
- W99689262 cites W2011685732 @default.
- W99689262 cites W2014728371 @default.
- W99689262 cites W2015911632 @default.
- W99689262 cites W2018804365 @default.
- W99689262 cites W2019976352 @default.
- W99689262 cites W2020316999 @default.
- W99689262 cites W2022679416 @default.
- W99689262 cites W2024932032 @default.
- W99689262 cites W2027827699 @default.
- W99689262 cites W2027876745 @default.
- W99689262 cites W2028095785 @default.
- W99689262 cites W2029526940 @default.
- W99689262 cites W2034701578 @default.
- W99689262 cites W2035470233 @default.
- W99689262 cites W2036482831 @default.
- W99689262 cites W2039242902 @default.
- W99689262 cites W2039735811 @default.
- W99689262 cites W2043772506 @default.
- W99689262 cites W2043909051 @default.
- W99689262 cites W2044306058 @default.
- W99689262 cites W2046705420 @default.
- W99689262 cites W2048045485 @default.
- W99689262 cites W2053463056 @default.
- W99689262 cites W2054340289 @default.
- W99689262 cites W2056646133 @default.
- W99689262 cites W2060216474 @default.
- W99689262 cites W2061310592 @default.
- W99689262 cites W2061503185 @default.
- W99689262 cites W2064853889 @default.
- W99689262 cites W2069356553 @default.
- W99689262 cites W2070620842 @default.
- W99689262 cites W2071664212 @default.
- W99689262 cites W2073722401 @default.
- W99689262 cites W2075006521 @default.
- W99689262 cites W2077019270 @default.
- W99689262 cites W2078064242 @default.
- W99689262 cites W2082729696 @default.
- W99689262 cites W2082941690 @default.
- W99689262 cites W2083021953 @default.
- W99689262 cites W2085030399 @default.
- W99689262 cites W2093392641 @default.
- W99689262 cites W2097670688 @default.
- W99689262 cites W2098162425 @default.
- W99689262 cites W2098766443 @default.