Matches in SemOpenAlex for { <https://semopenalex.org/work/W3013039448> ?p ?o ?g. }
Showing items 1 to 69 of 69 with 100 items per page.
- W3013039448 abstract "Author(s): Spaen, Quico Pepijn | Advisor(s): Hochbaum, Dorit S | Abstract: Similarity-based machine learning methods differ from traditional machine learning methods in that they also use pairwise similarity relations between objects to infer the labels of unlabeled objects. A recent comparative study for classification problems by Baumann et al. [2019] demonstrated that similarity-based techniques have superior performance and robustness when compared to well-established machine learning techniques. Similarity-based machine learning methods benefit from two advantages that could explain their superior performance: they can make use of the pairwise relations between unlabeled objects, and they are robust due to the transitive property of pairwise similarities. A challenge for similarity-based machine learning methods on large datasets is that the number of pairwise similarities grows quadratically in the size of the dataset. For large datasets, it thus becomes practically impossible to compute all possible pairwise similarities. In 2016, Hochbaum and Baumann proposed the technique of sparse computation to address this growth by computing only those pairwise similarities that are relevant. Their proposed implementation of sparse computation is still difficult to scale to millions of objects. This dissertation focuses on advancing the practical implementations of sparse computation to larger datasets and on two applications for which similarity-based machine learning was particularly effective. The applications that are studied here are cell identification in calcium-imaging movies and detecting aberrant linking behavior in directed networks. For sparse computation, we present faster, geometric algorithms and a technique, named sparse-reduced computation, that combines sparse computation with compression. 
The geometric algorithms compute the exact same output as the original implementation of sparse computation, but identify the relevant pairwise similarities faster by using the concept of data shifting for identifying objects in the same or neighboring blocks. Empirical results on datasets with up to 10 million objects show a significant reduction in running time. Sparse-reduced computation combines sparse computation with a technique for compressing highly similar or identical objects, enabling the use of similarity-based machine learning on massively large datasets. The computational results demonstrate that sparse-reduced computation provides a significant reduction in running time with a minute loss in accuracy. A major problem facing neuroscientists today is cell identification in calcium-imaging movies. These movies are in vivo recordings of thousands of neurons at cellular resolution. There is a great need for automated approaches to extract the activity of single neurons from these movies since manual post-processing takes tens of hours per dataset. We present the HNCcorr algorithm for cell identification in calcium-imaging movies. The name HNCcorr is derived from its use of the similarity-based Hochbaum's Normalized Cut (HNC) model with pairwise similarities derived from correlation. In HNCcorr, the task of cell detection is approached as a clustering problem. HNCcorr utilizes HNC to detect cells in these movies as coherent clusters of pixels that are highly distinct from the remaining pixels. HNCcorr guarantees, unlike existing methodologies for cell identification, a globally optimal solution to the underlying optimization problem. Of independent interest is a novel method, named similarity-squared, that we devised for measuring similarity between pixels. 
We provide an experimental study and demonstrate that HNCcorr is a top performer on the Neurofinder cell identification benchmark and that it improves over algorithms based on matrix factorization. The second application is detecting aberrant agents, such as fake news sources or spam websites, based on their link behavior in networks. Across contexts, a distinguishing characteristic between normal and aberrant agents is that normal agents rarely link to aberrant ones. We refer to this phenomenon as aberrant linking behavior. We present a Markov Random Field (MRF) formulation, with links as the pairwise similarities, that detects aberrant agents based on aberrant linking behavior and any prior information (if given). This MRF formulation is solved optimally and in polynomial time. We compare the optimal solution for the MRF formulation to well-known algorithms based on random walks. In our empirical experiment with twenty-three different datasets, the MRF method outperforms the other detection algorithms. This work represents the first use of optimization methods for detecting aberrant agents as well as the first time that MRF is applied to directed graphs." @default.
- W3013039448 created "2020-04-03" @default.
- W3013039448 creator A5032198398 @default.
- W3013039448 date "2019-01-01" @default.
- W3013039448 modified "2023-09-26" @default.
- W3013039448 title "Applications and Advances in Similarity-based Machine Learning" @default.
- W3013039448 cites W2080375462 @default.
- W3013039448 cites W2117138276 @default.
- W3013039448 cites W2169278451 @default.
- W3013039448 cites W2469594461 @default.
- W3013039448 hasPublicationYear "2019" @default.
- W3013039448 type Work @default.
- W3013039448 sameAs 3013039448 @default.
- W3013039448 citedByCount "0" @default.
- W3013039448 crossrefType "journal-article" @default.
- W3013039448 hasAuthorship W3013039448A5032198398 @default.
- W3013039448 hasConcept C103278499 @default.
- W3013039448 hasConcept C104317684 @default.
- W3013039448 hasConcept C11413529 @default.
- W3013039448 hasConcept C115961682 @default.
- W3013039448 hasConcept C119857082 @default.
- W3013039448 hasConcept C153180895 @default.
- W3013039448 hasConcept C154945302 @default.
- W3013039448 hasConcept C184898388 @default.
- W3013039448 hasConcept C185592680 @default.
- W3013039448 hasConcept C41008148 @default.
- W3013039448 hasConcept C45374587 @default.
- W3013039448 hasConcept C55493867 @default.
- W3013039448 hasConcept C63479239 @default.
- W3013039448 hasConceptScore W3013039448C103278499 @default.
- W3013039448 hasConceptScore W3013039448C104317684 @default.
- W3013039448 hasConceptScore W3013039448C11413529 @default.
- W3013039448 hasConceptScore W3013039448C115961682 @default.
- W3013039448 hasConceptScore W3013039448C119857082 @default.
- W3013039448 hasConceptScore W3013039448C153180895 @default.
- W3013039448 hasConceptScore W3013039448C154945302 @default.
- W3013039448 hasConceptScore W3013039448C184898388 @default.
- W3013039448 hasConceptScore W3013039448C185592680 @default.
- W3013039448 hasConceptScore W3013039448C41008148 @default.
- W3013039448 hasConceptScore W3013039448C45374587 @default.
- W3013039448 hasConceptScore W3013039448C55493867 @default.
- W3013039448 hasConceptScore W3013039448C63479239 @default.
- W3013039448 hasLocation W30130394481 @default.
- W3013039448 hasOpenAccess W3013039448 @default.
- W3013039448 hasPrimaryLocation W30130394481 @default.
- W3013039448 hasRelatedWork W129965406 @default.
- W3013039448 hasRelatedWork W2028337737 @default.
- W3013039448 hasRelatedWork W2294263464 @default.
- W3013039448 hasRelatedWork W2337835375 @default.
- W3013039448 hasRelatedWork W2407332982 @default.
- W3013039448 hasRelatedWork W2765199202 @default.
- W3013039448 hasRelatedWork W2787536574 @default.
- W3013039448 hasRelatedWork W2795021089 @default.
- W3013039448 hasRelatedWork W2801446373 @default.
- W3013039448 hasRelatedWork W2807587809 @default.
- W3013039448 hasRelatedWork W2907574043 @default.
- W3013039448 hasRelatedWork W2949118751 @default.
- W3013039448 hasRelatedWork W2981433620 @default.
- W3013039448 hasRelatedWork W2990771307 @default.
- W3013039448 hasRelatedWork W3104876597 @default.
- W3013039448 hasRelatedWork W3109275379 @default.
- W3013039448 hasRelatedWork W3165527726 @default.
- W3013039448 hasRelatedWork W3172344385 @default.
- W3013039448 hasRelatedWork W3184609690 @default.
- W3013039448 hasRelatedWork W2166986236 @default.
- W3013039448 isParatext "false" @default.
- W3013039448 isRetracted "false" @default.
- W3013039448 magId "3013039448" @default.
- W3013039448 workType "article" @default.