Matches in SemOpenAlex for { <https://semopenalex.org/work/W2642364513> ?p ?o ?g. }
- W2642364513 abstract "Recent advances in sensing and storage technology have created many high-volume, high-dimensional data sets in pattern recognition, machine learning, and data mining. Unsupervised learning can provide generic tools for analyzing and summarizing these data sets when there is no well-defined notion of classes. The purpose of this thesis is to study some of the open problems in two main areas of unsupervised learning, namely clustering and (unsupervised) dimensionality reduction. Instance-level constraint on objects, an example of side-information, is also considered to improve the clustering results. Our first contribution is a modification to the isometric feature mapping (ISOMAP) algorithm when the input data, instead of being all available simultaneously, arrive sequentially from a data stream. ISOMAP is representative of a class of nonlinear dimensionality reduction algorithms that are based on the notion of a manifold. Both the standard ISOMAP and the landmark version of ISOMAP are considered. Experimental results on synthetic data as well as real world images demonstrate that the modified algorithm can maintain an accurate low-dimensional representation of the data in an efficient manner. We study the problem of feature selection in model-based clustering when the number of clusters is unknown. We propose the concept of feature saliency and introduce an expectation-maximization (EM) algorithm for its estimation. By using the minimum message length (MML) model selection criterion, the saliency of irrelevant features is driven towards zero, which corresponds to performing feature selection. The use of MML can also determine the number of clusters automatically by pruning away the weak clusters. The proposed algorithm is validated on both synthetic data and data sets from the UCI machine learning repository. We have also developed a new algorithm for incorporating instance-level constraints in model-based clustering. Its main idea is that we require the cluster label of an object to be determined only by its feature vector and the cluster parameters. In particular, the constraints should not have any direct influence. This consideration leads to a new objective function that considers both the fit to the data and the satisfaction of the constraints simultaneously. The line-search Newton algorithm is used to find the cluster parameter vector that optimizes this objective function. This approach is extended to simultaneously perform feature extraction and clustering under constraints. Comparison of the proposed algorithm with competitive algorithms over eighteen data sets from different domains, including text categorization, low-level image segmentation, appearance-based vision, and benchmark data sets from the UCI machine learning repository, shows the superiority of the proposed approach." @default.
- W2642364513 created "2017-06-30" @default.
- W2642364513 creator A5007029060 @default.
- W2642364513 creator A5064310689 @default.
- W2642364513 date "2006-01-01" @default.
- W2642364513 modified "2023-09-24" @default.
- W2642364513 title "Clustering, dimensionality reduction, and side information" @default.
- W2642364513 cites W1487720485 @default.
- W2642364513 cites W1489608363 @default.
- W2642364513 cites W1493217831 @default.
- W2642364513 cites W1498782169 @default.
- W2642364513 cites W1499673022 @default.
- W2642364513 cites W1500752107 @default.
- W2642364513 cites W1511160855 @default.
- W2642364513 cites W1515277467 @default.
- W2642364513 cites W1526719228 @default.
- W2642364513 cites W1548390947 @default.
- W2642364513 cites W1549267002 @default.
- W2642364513 cites W1553244859 @default.
- W2642364513 cites W1554067589 @default.
- W2642364513 cites W1554663460 @default.
- W2642364513 cites W1560107318 @default.
- W2642364513 cites W1578099820 @default.
- W2642364513 cites W1578316706 @default.
- W2642364513 cites W1579271636 @default.
- W2642364513 cites W1581499779 @default.
- W2642364513 cites W1588347817 @default.
- W2642364513 cites W1590246454 @default.
- W2642364513 cites W1591385104 @default.
- W2642364513 cites W1592785605 @default.
- W2642364513 cites W1595132914 @default.
- W2642364513 cites W1595613095 @default.
- W2642364513 cites W1606637515 @default.
- W2642364513 cites W1651008648 @default.
- W2642364513 cites W1661871015 @default.
- W2642364513 cites W1673310716 @default.
- W2642364513 cites W1676212923 @default.
- W2642364513 cites W1679913846 @default.
- W2642364513 cites W1762853090 @default.
- W2642364513 cites W1792341371 @default.
- W2642364513 cites W1808644423 @default.
- W2642364513 cites W1821148229 @default.
- W2642364513 cites W1827261456 @default.
- W2642364513 cites W1840019730 @default.
- W2642364513 cites W1849729440 @default.
- W2642364513 cites W190437827 @default.
- W2642364513 cites W1942764728 @default.
- W2642364513 cites W1966347620 @default.
- W2642364513 cites W1966949944 @default.
- W2642364513 cites W1969423031 @default.
- W2642364513 cites W1971784203 @default.
- W2642364513 cites W1983297873 @default.
- W2642364513 cites W1991448974 @default.
- W2642364513 cites W1992402718 @default.
- W2642364513 cites W1992419399 @default.
- W2642364513 cites W1997648776 @default.
- W2642364513 cites W2001141328 @default.
- W2642364513 cites W2002626680 @default.
- W2642364513 cites W2002645541 @default.
- W2642364513 cites W2002857471 @default.
- W2642364513 cites W2005422315 @default.
- W2642364513 cites W2008567927 @default.
- W2642364513 cites W2014915963 @default.
- W2642364513 cites W2015245929 @default.
- W2642364513 cites W2016381774 @default.
- W2642364513 cites W2017337590 @default.
- W2642364513 cites W2017512208 @default.
- W2642364513 cites W2022686119 @default.
- W2642364513 cites W202383464 @default.
- W2642364513 cites W2024765216 @default.
- W2642364513 cites W2026513874 @default.
- W2642364513 cites W2029051343 @default.
- W2642364513 cites W2030991373 @default.
- W2642364513 cites W2032407804 @default.
- W2642364513 cites W2033709196 @default.
- W2642364513 cites W2034331023 @default.
- W2642364513 cites W2040884411 @default.
- W2642364513 cites W2041837861 @default.
- W2642364513 cites W2042383323 @default.
- W2642364513 cites W2048695473 @default.
- W2642364513 cites W2049694710 @default.
- W2642364513 cites W2050834445 @default.
- W2642364513 cites W2053186076 @default.
- W2642364513 cites W2055323881 @default.
- W2642364513 cites W2055522016 @default.
- W2642364513 cites W2060314721 @default.
- W2642364513 cites W2060542593 @default.
- W2642364513 cites W2062102668 @default.
- W2642364513 cites W2063532964 @default.
- W2642364513 cites W2064580901 @default.
- W2642364513 cites W2067191022 @default.
- W2642364513 cites W2077658674 @default.
- W2642364513 cites W2077990749 @default.
- W2642364513 cites W2078626246 @default.
- W2642364513 cites W2082612735 @default.
- W2642364513 cites W2087327261 @default.
- W2642364513 cites W2087542520 @default.
- W2642364513 cites W2091886411 @default.
- W2642364513 cites W2092611630 @default.
- W2642364513 cites W2095869646 @default.