Matches in SemOpenAlex for { <https://semopenalex.org/work/W3200919809> ?p ?o ?g. }
- W3200919809 endingPage "64" @default.
- W3200919809 startingPage "54" @default.
- W3200919809 abstract "The paper aims to examine various approaches to improving the quality of prediction and classification on unbalanced data, with a view to improving the accuracy of rare event classification. When predicting the onset of rare events using machine learning techniques, researchers face the problem of inconsistency between the quality of trained models and their actual ability to correctly predict the occurrence of a rare event. The paper examines model training under unbalanced initial data. The subject of research is the information on incidents and hazardous events at railway power supply facilities. The problem of unbalanced data manifests as a noticeable imbalance between the numbers of instances of the observed event types. Methods. When handling unbalanced data, various Data Science-based techniques for improving the quality of classification and prediction models are used, depending on the nature of the problem at hand and the quality and size of the initial data. Some of those methods focus on the attributes and parameters of classification models. Those include FAST, CFS, fuzzy classifiers, GridSearchCV, etc. Another group of methods is oriented towards generating representative subsets, i.e., samples, out of the initial datasets. Data sampling techniques allow examining the effect of class proportions on the quality of machine learning. In particular, this paper considers the NearMiss method in detail. Results. The problem of class imbalance with respect to the analysis of the number of incidents at railway facilities has existed since 2015. Despite the decreasing share of hazardous events at railway power supply facilities in the three years since 2018, an increase in the number of such events cannot be ruled out. Monthly statistics of hazardous event distribution exhibit no trend in declines and peaks. In this context, the optimal period of observation of the number of incidents and hazardous events is a month.
A visualization of the class ratio has shown the absence of a clear boundary between the members of the majority class (incidents) and those of the minority class (hazardous events). The class ratio was studied in two and three dimensions, in actual values and using principal component analysis. Such “proximity” of classes is one of the causes of wrong predictions. In this paper, the authors reviewed past research on ways of improving the quality of machine learning based on unbalanced data. The terms that describe the degree of class imbalance have been defined and clarified. The strengths and weaknesses of 50 different methods of handling such data were studied and set forth. Out of the set of methods for adjusting the numbers of class members as part of the classification (prediction of the occurrence) of rare hazardous events in railway transportation, the NearMiss method was chosen. It allows experimenting with the ratios and methods of selecting class members. As a result of a series of experiments, the accuracy of rare hazardous event classification was improved from 0 to 70-90%." @default.
- W3200919809 created "2021-09-27" @default.
- W3200919809 creator A5036158793 @default.
- W3200919809 creator A5058152482 @default.
- W3200919809 date "2021-09-21" @default.
- W3200919809 modified "2023-10-18" @default.
- W3200919809 title "Intelligent methods for improving the accuracy of prediction of rare hazardous events in railway transportation" @default.
- W3200919809 cites W1533320424 @default.
- W3200919809 cites W1548505798 @default.
- W3200919809 cites W1563938718 @default.
- W3200919809 cites W1581587400 @default.
- W3200919809 cites W1591261915 @default.
- W3200919809 cites W1958433089 @default.
- W3200919809 cites W1974079881 @default.
- W3200919809 cites W1975916424 @default.
- W3200919809 cites W1978514892 @default.
- W3200919809 cites W1980960407 @default.
- W3200919809 cites W1994410331 @default.
- W3200919809 cites W1997653576 @default.
- W3200919809 cites W2012759654 @default.
- W3200919809 cites W2020667369 @default.
- W3200919809 cites W2023450550 @default.
- W3200919809 cites W2023473787 @default.
- W3200919809 cites W2032867948 @default.
- W3200919809 cites W2085661130 @default.
- W3200919809 cites W2090239801 @default.
- W3200919809 cites W2096945460 @default.
- W3200919809 cites W2103614420 @default.
- W3200919809 cites W2105340608 @default.
- W3200919809 cites W2106479238 @default.
- W3200919809 cites W2106756598 @default.
- W3200919809 cites W2107657787 @default.
- W3200919809 cites W2107686700 @default.
- W3200919809 cites W2120814856 @default.
- W3200919809 cites W2124685890 @default.
- W3200919809 cites W2131391419 @default.
- W3200919809 cites W2138776277 @default.
- W3200919809 cites W2145962650 @default.
- W3200919809 cites W2148143831 @default.
- W3200919809 cites W2156530876 @default.
- W3200919809 cites W2166045895 @default.
- W3200919809 cites W2187523974 @default.
- W3200919809 cites W2254005892 @default.
- W3200919809 cites W2532050031 @default.
- W3200919809 cites W2809658264 @default.
- W3200919809 cites W3005484410 @default.
- W3200919809 cites W3099514962 @default.
- W3200919809 cites W728297 @default.
- W3200919809 doi "https://doi.org/10.21683/1729-2646-2021-21-3-54-65" @default.
- W3200919809 hasPublicationYear "2021" @default.
- W3200919809 type Work @default.
- W3200919809 sameAs 3200919809 @default.
- W3200919809 citedByCount "2" @default.
- W3200919809 countsByYear W32009198092022 @default.
- W3200919809 crossrefType "journal-article" @default.
- W3200919809 hasAuthorship W3200919809A5036158793 @default.
- W3200919809 hasAuthorship W3200919809A5058152482 @default.
- W3200919809 hasBestOaLocation W32009198091 @default.
- W3200919809 hasConcept C105795698 @default.
- W3200919809 hasConcept C111472728 @default.
- W3200919809 hasConcept C119857082 @default.
- W3200919809 hasConcept C121332964 @default.
- W3200919809 hasConcept C124101348 @default.
- W3200919809 hasConcept C127413603 @default.
- W3200919809 hasConcept C138885662 @default.
- W3200919809 hasConcept C144024400 @default.
- W3200919809 hasConcept C154945302 @default.
- W3200919809 hasConcept C176217482 @default.
- W3200919809 hasConcept C21547014 @default.
- W3200919809 hasConcept C22507642 @default.
- W3200919809 hasConcept C24756922 @default.
- W3200919809 hasConcept C2777212361 @default.
- W3200919809 hasConcept C2777317252 @default.
- W3200919809 hasConcept C2779304628 @default.
- W3200919809 hasConcept C2779530757 @default.
- W3200919809 hasConcept C2779662365 @default.
- W3200919809 hasConcept C33923547 @default.
- W3200919809 hasConcept C36289849 @default.
- W3200919809 hasConcept C41008148 @default.
- W3200919809 hasConcept C548081761 @default.
- W3200919809 hasConcept C58166 @default.
- W3200919809 hasConcept C62520636 @default.
- W3200919809 hasConceptScore W3200919809C105795698 @default.
- W3200919809 hasConceptScore W3200919809C111472728 @default.
- W3200919809 hasConceptScore W3200919809C119857082 @default.
- W3200919809 hasConceptScore W3200919809C121332964 @default.
- W3200919809 hasConceptScore W3200919809C124101348 @default.
- W3200919809 hasConceptScore W3200919809C127413603 @default.
- W3200919809 hasConceptScore W3200919809C138885662 @default.
- W3200919809 hasConceptScore W3200919809C144024400 @default.
- W3200919809 hasConceptScore W3200919809C154945302 @default.
- W3200919809 hasConceptScore W3200919809C176217482 @default.
- W3200919809 hasConceptScore W3200919809C21547014 @default.
- W3200919809 hasConceptScore W3200919809C22507642 @default.
- W3200919809 hasConceptScore W3200919809C24756922 @default.
- W3200919809 hasConceptScore W3200919809C2777212361 @default.
- W3200919809 hasConceptScore W3200919809C2777317252 @default.
- W3200919809 hasConceptScore W3200919809C2779304628 @default.