Matches in SemOpenAlex for { <https://semopenalex.org/work/W2342385238> ?p ?o ?g. }
- W2342385238 abstract "An important problem that appears often when analyzing data involves identifying irregular or abnormal data points called outliers. This problem broadly arises under two scenarios: when outliers are to be removed from the data before analysis, and when useful information or knowledge can be extracted by the outliers themselves. Outlier Detection in the context of the second scenario is a research field that has attracted significant attention in a broad range of useful applications. For example, in credit card transaction data, outliers might indicate potential fraud; in network traffic data, outliers might represent potential intrusion attempts. The basis of deciding if a data point is an outlier is often some measure or notion of dissimilarity between the data point under consideration and the rest. Traditional outlier detection methods assume numerical or ordinal data, and compute pair-wise distances between data points. However, the notion of distance or similarity for categorical data is more difficult to define. Moreover, the size of currently available data sets dictates the need for fast and scalable outlier detection methods, thus precluding distance computations. Additionally, these methods must be applicable to data which might be distributed among different locations. In this work, we propose novel strategies to efficiently deal with large distributed data containing mixed-type attributes. Specifically, we first propose a fast and scalable algorithm for categorical data (AVF), and its parallel version based on MapReduce (MR-AVF). We extend AVF and introduce a fast outlier detection algorithm for large distributed data with mixed-type attributes (ODMAD). Finally, we modify ODMAD in order to deal with very high-dimensional categorical data. Experiments with large real-world and synthetic data show that the proposed methods exhibit large performance gains and high scalability compared to the state-of-the-art, while achieving similar accuracy detection rates." @default.
- W2342385238 created "2016-06-24" @default.
- W2342385238 creator A5035774745 @default.
- W2342385238 creator A5036799400 @default.
- W2342385238 date "2009-01-01" @default.
- W2342385238 modified "2023-09-24" @default.
- W2342385238 title "Scalable and efficient outlier detection in large distributed data sets with mixed-type attributes" @default.
- W2342385238 cites W126266486 @default.
- W2342385238 cites W1484413656 @default.
- W2342385238 cites W1506285740 @default.
- W2342385238 cites W1515158606 @default.
- W2342385238 cites W1542978828 @default.
- W2342385238 cites W1552339598 @default.
- W2342385238 cites W1565377632 @default.
- W2342385238 cites W1570748171 @default.
- W2342385238 cites W1585646276 @default.
- W2342385238 cites W1602011302 @default.
- W2342385238 cites W1672197616 @default.
- W2342385238 cites W1874523587 @default.
- W2342385238 cites W1876967670 @default.
- W2342385238 cites W1887038067 @default.
- W2342385238 cites W1903475782 @default.
- W2342385238 cites W1944649467 @default.
- W2342385238 cites W1966147156 @default.
- W2342385238 cites W1970655212 @default.
- W2342385238 cites W1995875735 @default.
- W2342385238 cites W2008298724 @default.
- W2342385238 cites W2023075593 @default.
- W2342385238 cites W2030969394 @default.
- W2342385238 cites W2037965136 @default.
- W2342385238 cites W2049058890 @default.
- W2342385238 cites W2061122559 @default.
- W2342385238 cites W2061240327 @default.
- W2342385238 cites W2064596869 @default.
- W2342385238 cites W2069356553 @default.
- W2342385238 cites W2079557269 @default.
- W2342385238 cites W2098682382 @default.
- W2342385238 cites W2099019567 @default.
- W2342385238 cites W2105309441 @default.
- W2342385238 cites W2109722477 @default.
- W2342385238 cites W2110784166 @default.
- W2342385238 cites W2115482638 @default.
- W2342385238 cites W2117891917 @default.
- W2342385238 cites W2118843309 @default.
- W2342385238 cites W2119565742 @default.
- W2342385238 cites W2122646361 @default.
- W2342385238 cites W2124316743 @default.
- W2342385238 cites W2124536999 @default.
- W2342385238 cites W2127005418 @default.
- W2342385238 cites W2129249398 @default.
- W2342385238 cites W2129312546 @default.
- W2342385238 cites W2131626681 @default.
- W2342385238 cites W2135689168 @default.
- W2342385238 cites W2137130182 @default.
- W2342385238 cites W2140740533 @default.
- W2342385238 cites W2144182447 @default.
- W2342385238 cites W2146225536 @default.
- W2342385238 cites W2153039034 @default.
- W2342385238 cites W2158454296 @default.
- W2342385238 cites W2160643507 @default.
- W2342385238 cites W2161033190 @default.
- W2342385238 cites W2165047624 @default.
- W2342385238 cites W2166785535 @default.
- W2342385238 cites W2168196587 @default.
- W2342385238 cites W2169217090 @default.
- W2342385238 cites W2170314592 @default.
- W2342385238 cites W2171479472 @default.
- W2342385238 cites W2171612826 @default.
- W2342385238 cites W2173213060 @default.
- W2342385238 cites W2396903734 @default.
- W2342385238 cites W3041834803 @default.
- W2342385238 cites W3112596628 @default.
- W2342385238 cites W33383766 @default.
- W2342385238 cites W42722137 @default.
- W2342385238 cites W52010771 @default.
- W2342385238 cites W69415620 @default.
- W2342385238 cites W80917968 @default.
- W2342385238 hasPublicationYear "2009" @default.
- W2342385238 type Work @default.
- W2342385238 sameAs 2342385238 @default.
- W2342385238 citedByCount "3" @default.
- W2342385238 countsByYear W23423852382014 @default.
- W2342385238 countsByYear W23423852382016 @default.
- W2342385238 countsByYear W23423852382020 @default.
- W2342385238 crossrefType "journal-article" @default.
- W2342385238 hasAuthorship W2342385238A5035774745 @default.
- W2342385238 hasAuthorship W2342385238A5036799400 @default.
- W2342385238 hasConcept C119857082 @default.
- W2342385238 hasConcept C124101348 @default.
- W2342385238 hasConcept C127722929 @default.
- W2342385238 hasConcept C138958017 @default.
- W2342385238 hasConcept C151730666 @default.
- W2342385238 hasConcept C154945302 @default.
- W2342385238 hasConcept C199360897 @default.
- W2342385238 hasConcept C2779343474 @default.
- W2342385238 hasConcept C41008148 @default.
- W2342385238 hasConcept C48044578 @default.
- W2342385238 hasConcept C5274069 @default.
- W2342385238 hasConcept C739882 @default.
- W2342385238 hasConcept C75949130 @default.