Matches in SemOpenAlex for { <https://semopenalex.org/work/W2895423624> ?p ?o ?g. }
Showing items 1 to 80 of 80 with 100 items per page.
- W2895423624 abstract "Variable selection is an essential part of the process of model-building for classification or prediction. Some of the challenges of variable selection are heterogeneous variance-covariance matrices, differing scales of variables, non-normally distributed data and missing data. Statistical methods exist for variable selection; however, these are often univariate, make restrictive assumptions about the distribution of data or are expensive in terms of the computational power required. In this thesis I focus on filter methods of variable selection that are computationally fast and propose a metric of discrimination. The main objectives of this thesis are (1) to propose a novel Signal-to-Noise Ratio (SNR) discrimination metric accommodating heterogeneous variance-covariance matrices, (2) to develop a multiple forward selection (MFS) algorithm employing the novel SNR metric, (3) to assess the performance of the MFS-SNR algorithm compared to alternative methods of variable selection, (4) to investigate the ability of the MFS-SNR algorithm to carry out variable selection when data are not normally distributed and (5) to apply the MFS-SNR algorithm to the task of variable selection from real datasets. The MFS-SNR algorithm was implemented in the R programming environment. It calculates the SNR for subsets of variables, identifying the optimal variable during each round of selection as whichever causes the largest increase in SNR. A dataset was simulated comprising 10 variables: 2 discriminating variables, 7 non-discriminating variables and one non-discriminating variable which enhanced the discriminatory performance of other variables. In simulations the frequency of each variable’s selection was recorded. The probability of correct classification (PCC) and area under the curve (AUC) were calculated for sets of selected variables.
I assessed the ability of the MFS-SNR algorithm to select variables when data are not normally distributed using simulated data. I compared the MFS-SNR algorithm to filter methods utilising information gain, chi-square statistics and the Relief-F algorithm as well as support vector machines and an embedded method using random forests. A version of the MFS algorithm utilising Hotelling’s T² statistic (MFS-T2) was included in this comparison. The MFS-SNR algorithm selected all 3 variables relevant to discrimination with higher or equivalent frequencies to competing methods in all scenarios. Following non-normal variable transformation the MFS-SNR algorithm still selected the variables known to be relevant to discrimination in the simulated scenarios. Finally, I studied the ability of both the MFS-SNR and MFS-T2 algorithms to carry out variable selection for disease classification using several clinical datasets from ophthalmology. These datasets represented a spectrum of quality issues such as missingness, imbalanced group sizes, heterogeneous variance-covariance matrices and differing variable scales. In 3 out of 4 datasets the MFS-SNR algorithm outperformed the MFS-T2 algorithm. In the fourth study both MFS-T2 and MFS-SNR produced the same variable selection results. In conclusion, I have demonstrated that the novel SNR is an extension of Hotelling’s T² statistic accommodating heterogeneity of variance-covariance matrices. The MFS-SNR algorithm is capable of selecting the relevant variables whether data are normally distributed or not. In the simulated scenarios the MFS-SNR algorithm performs at least as well as competing methods and outperforms the MFS-T2 algorithm when selecting variables from real clinical datasets." @default.
- W2895423624 created "2018-10-12" @default.
- W2895423624 creator A5061225231 @default.
- W2895423624 date "2017-09-27" @default.
- W2895423624 modified "2023-09-26" @default.
- W2895423624 title "Variable selection for classification in complex ophthalmic data : a multivariate statistical framework" @default.
- W2895423624 doi "https://doi.org/10.17638/03019718" @default.
- W2895423624 hasPublicationYear "2017" @default.
- W2895423624 type Work @default.
- W2895423624 sameAs 2895423624 @default.
- W2895423624 citedByCount "0" @default.
- W2895423624 crossrefType "dissertation" @default.
- W2895423624 hasAuthorship W2895423624A5061225231 @default.
- W2895423624 hasConcept C105795698 @default.
- W2895423624 hasConcept C11413529 @default.
- W2895423624 hasConcept C121955636 @default.
- W2895423624 hasConcept C124101348 @default.
- W2895423624 hasConcept C127413603 @default.
- W2895423624 hasConcept C134306372 @default.
- W2895423624 hasConcept C144133560 @default.
- W2895423624 hasConcept C148483581 @default.
- W2895423624 hasConcept C153180895 @default.
- W2895423624 hasConcept C154945302 @default.
- W2895423624 hasConcept C161584116 @default.
- W2895423624 hasConcept C176217482 @default.
- W2895423624 hasConcept C178650346 @default.
- W2895423624 hasConcept C182365436 @default.
- W2895423624 hasConcept C196083921 @default.
- W2895423624 hasConcept C199163554 @default.
- W2895423624 hasConcept C21547014 @default.
- W2895423624 hasConcept C33923547 @default.
- W2895423624 hasConcept C41008148 @default.
- W2895423624 hasConcept C81917197 @default.
- W2895423624 hasConceptScore W2895423624C105795698 @default.
- W2895423624 hasConceptScore W2895423624C11413529 @default.
- W2895423624 hasConceptScore W2895423624C121955636 @default.
- W2895423624 hasConceptScore W2895423624C124101348 @default.
- W2895423624 hasConceptScore W2895423624C127413603 @default.
- W2895423624 hasConceptScore W2895423624C134306372 @default.
- W2895423624 hasConceptScore W2895423624C144133560 @default.
- W2895423624 hasConceptScore W2895423624C148483581 @default.
- W2895423624 hasConceptScore W2895423624C153180895 @default.
- W2895423624 hasConceptScore W2895423624C154945302 @default.
- W2895423624 hasConceptScore W2895423624C161584116 @default.
- W2895423624 hasConceptScore W2895423624C176217482 @default.
- W2895423624 hasConceptScore W2895423624C178650346 @default.
- W2895423624 hasConceptScore W2895423624C182365436 @default.
- W2895423624 hasConceptScore W2895423624C196083921 @default.
- W2895423624 hasConceptScore W2895423624C199163554 @default.
- W2895423624 hasConceptScore W2895423624C21547014 @default.
- W2895423624 hasConceptScore W2895423624C33923547 @default.
- W2895423624 hasConceptScore W2895423624C41008148 @default.
- W2895423624 hasConceptScore W2895423624C81917197 @default.
- W2895423624 hasLocation W28954236241 @default.
- W2895423624 hasOpenAccess W2895423624 @default.
- W2895423624 hasPrimaryLocation W28954236241 @default.
- W2895423624 hasRelatedWork W1567784974 @default.
- W2895423624 hasRelatedWork W162175776 @default.
- W2895423624 hasRelatedWork W190208287 @default.
- W2895423624 hasRelatedWork W1955399487 @default.
- W2895423624 hasRelatedWork W1966399223 @default.
- W2895423624 hasRelatedWork W1976151630 @default.
- W2895423624 hasRelatedWork W1980755946 @default.
- W2895423624 hasRelatedWork W2013148675 @default.
- W2895423624 hasRelatedWork W2018471977 @default.
- W2895423624 hasRelatedWork W2058193528 @default.
- W2895423624 hasRelatedWork W2100214682 @default.
- W2895423624 hasRelatedWork W2137565990 @default.
- W2895423624 hasRelatedWork W2758158523 @default.
- W2895423624 hasRelatedWork W2940418775 @default.
- W2895423624 hasRelatedWork W2963200634 @default.
- W2895423624 hasRelatedWork W2982581102 @default.
- W2895423624 hasRelatedWork W3028308428 @default.
- W2895423624 hasRelatedWork W3080249277 @default.
- W2895423624 hasRelatedWork W3126269962 @default.
- W2895423624 hasRelatedWork W3181823153 @default.
- W2895423624 isParatext "false" @default.
- W2895423624 isRetracted "false" @default.
- W2895423624 magId "2895423624" @default.
- W2895423624 workType "dissertation" @default.
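
The abstract above describes a forward selection loop that, in each round, adds whichever variable gives the largest increase in a discrimination metric. A minimal Python sketch of that loop follows. It is an illustration only: the thesis implementation was in R and is not reproduced here, and the abstract does not give the SNR formula. The function `separation` is therefore a stand-in assumption (a two-group quadratic form using the unpooled covariance S1/n1 + S2/n2), not the author's metric, and all function names and the simulated data are hypothetical.

```python
import random


def mean_vec(rows, cols):
    """Mean of each chosen column."""
    return [sum(r[c] for r in rows) / len(rows) for c in cols]


def cov_matrix(rows, cols):
    """Sample covariance matrix (divisor n - 1) of the chosen columns."""
    n, mu = len(rows), mean_vec(rows, cols)
    return [[sum((r[a] - mu[i]) * (r[b] - mu[j]) for r in rows) / (n - 1)
             for j, b in enumerate(cols)]
            for i, a in enumerate(cols)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def separation(g1, g2, cols):
    """ASSUMED stand-in metric, not the thesis's SNR:
    d^T (S1/n1 + S2/n2)^(-1) d, where d is the difference in group means."""
    d = [a - b for a, b in zip(mean_vec(g1, cols), mean_vec(g2, cols))]
    S1, S2 = cov_matrix(g1, cols), cov_matrix(g2, cols)
    k = len(cols)
    S = [[S1[i][j] / len(g1) + S2[i][j] / len(g2) for j in range(k)]
         for i in range(k)]
    return sum(di * wi for di, wi in zip(d, solve(S, d)))


def forward_select(g1, g2, n_vars, k):
    """Greedy forward selection: each round adds the variable that
    maximises the metric of the enlarged subset."""
    selected = []
    while len(selected) < k:
        candidates = [c for c in range(n_vars) if c not in selected]
        best = max(candidates, key=lambda c: separation(g1, g2, selected + [c]))
        selected.append(best)
    return selected


# Hypothetical simulated data: 5 variables, of which 0 and 3 discriminate.
# Group 2 has a larger variance, so the covariance matrices are heterogeneous.
random.seed(0)
g1 = [[random.gauss(0, 1) for _ in range(5)] for _ in range(200)]
g2 = [[random.gauss(3 if v in (0, 3) else 0, 2) for v in range(5)]
      for _ in range(200)]
selected = forward_select(g1, g2, n_vars=5, k=2)
print(selected)
```

Because the metric is passed the whole candidate subset rather than one variable at a time, a variable that is uninformative alone can still be chosen if it improves the joint separation, which is the situation the simulated "enhancing" variable in the abstract is designed to probe.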