Matches in SemOpenAlex for { <https://semopenalex.org/work/W3086596315> ?p ?o ?g. }
Showing items 1 to 82 of 82 with 100 items per page.
- W3086596315 endingPage "A03" @default.
- W3086596315 startingPage "A03" @default.
- W3086596315 abstract "Abstract Background: The precision medicine initiative calls for the study of genes, behaviors, and environment to improve disease prevention. There is a growing body of research supporting the role of social environment (i.e., the neighborhood in which one lives) in cancer health disparities. However, recent efforts have focused on applying empiric, high-dimensional computing approaches to genetic data, with less of an emphasis on environment. In this study, we adapted and applied empiric machine learning approaches to identify which method would be most effective at evaluating the effects of social environment on advanced prostate cancer in a simulated dataset. As is common in high-dimensional data, we encountered (and will present) statistical challenges that arose during analysis, specifically related to multicollinearity. Methods: Pennsylvania Prostate Cancer Registry data from 1995-2005 were linked to publicly available social environmental data from the 2000 U.S. Census via a geocode at the census tract level using ArcGIS software. These primary data consisted of 86,629 prostate cancer cases and 14,663 census variables. U.S. Census variables, which are defined in terms of neighborhood socioeconomic variables, such as education, income, employment, etc., are known to be highly correlated. A simulated dataset was created using the data structure of our primary dataset, where a set of 10 prespecified variables were independent predictors of a binary outcome, and the remaining 990 variables had no effect. Test and training sets were created and various machine learning approaches were applied and compared: standard regression models (REG), Lasso penalized regression (LASSO), elastic net regression (ELNET), and random forest (RF). The most successful method at identifying “true” variables (or highly correlated surrogates), limiting false-positive results, and consistently replicating findings was considered the most effective approach. Simulations were repeated 500 times, and results summarized. Results: Over the 500 simulations, the methods identified 6.3 (REG), 6.4 (LASSO), 8.2 (ELNET), and 10 (RF) of the 10 true (or highly correlated surrogate) variables. In addition, 38.8 (REG), 13.3 (LASSO), 49.9 (ELNET), and 65 (RF) false positive variables were identified. RF consistently replicated the selection of all 10 variables across simulations 100% of the time, whereas LASSO was consistently unable to identify 2 of the 10 true variables. Conclusions: Preliminary findings suggest a combination of RF and LASSO may be the most effective approach; LASSO has the best overall ability to identify true variables while avoiding false positives; RF identifies true variables consistently. Given that LASSO was unable to detect 2 of the true variables, we will also present findings from multivariate models to allow for adjustment due to residual confounding. Final results should be tested in a real data setting where additional considerations for multicollinearity would need to be explored. Citation Format: Shannon M. Lynch, Yinuo Yin, Elizabeth Handorf. Applying machine learning approaches to social environmental data from the U.S. Census in cancer studies: Challenges and considerations [abstract]. In: Proceedings of the AACR Special Conference on Modernizing Population Sciences in the Digital Age; 2019 Feb 19-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Epidemiol Biomarkers Prev 2020;29(9 Suppl):Abstract nr A03." @default.
- W3086596315 created "2020-09-21" @default.
- W3086596315 creator A5033650513 @default.
- W3086596315 creator A5067738211 @default.
- W3086596315 creator A5078914789 @default.
- W3086596315 date "2020-09-01" @default.
- W3086596315 modified "2023-09-27" @default.
- W3086596315 title "Abstract A03: Applying machine learning approaches to social environmental data from the U.S. Census in cancer studies: Challenges and considerations" @default.
- W3086596315 doi "https://doi.org/10.1158/1538-7755.modpop19-a03" @default.
- W3086596315 hasPublicationYear "2020" @default.
- W3086596315 type Work @default.
- W3086596315 sameAs 3086596315 @default.
- W3086596315 citedByCount "0" @default.
- W3086596315 crossrefType "journal-article" @default.
- W3086596315 hasAuthorship W3086596315A5033650513 @default.
- W3086596315 hasAuthorship W3086596315A5067738211 @default.
- W3086596315 hasAuthorship W3086596315A5078914789 @default.
- W3086596315 hasConcept C119857082 @default.
- W3086596315 hasConcept C121608353 @default.
- W3086596315 hasConcept C124101348 @default.
- W3086596315 hasConcept C126322002 @default.
- W3086596315 hasConcept C136764020 @default.
- W3086596315 hasConcept C152877465 @default.
- W3086596315 hasConcept C154945302 @default.
- W3086596315 hasConcept C17744445 @default.
- W3086596315 hasConcept C189285262 @default.
- W3086596315 hasConcept C199539241 @default.
- W3086596315 hasConcept C205649164 @default.
- W3086596315 hasConcept C2522767166 @default.
- W3086596315 hasConcept C2776356880 @default.
- W3086596315 hasConcept C2780192828 @default.
- W3086596315 hasConcept C2908647359 @default.
- W3086596315 hasConcept C37616216 @default.
- W3086596315 hasConcept C41008148 @default.
- W3086596315 hasConcept C42629822 @default.
- W3086596315 hasConcept C52130261 @default.
- W3086596315 hasConcept C58640448 @default.
- W3086596315 hasConcept C71924100 @default.
- W3086596315 hasConcept C99454951 @default.
- W3086596315 hasConceptScore W3086596315C119857082 @default.
- W3086596315 hasConceptScore W3086596315C121608353 @default.
- W3086596315 hasConceptScore W3086596315C124101348 @default.
- W3086596315 hasConceptScore W3086596315C126322002 @default.
- W3086596315 hasConceptScore W3086596315C136764020 @default.
- W3086596315 hasConceptScore W3086596315C152877465 @default.
- W3086596315 hasConceptScore W3086596315C154945302 @default.
- W3086596315 hasConceptScore W3086596315C17744445 @default.
- W3086596315 hasConceptScore W3086596315C189285262 @default.
- W3086596315 hasConceptScore W3086596315C199539241 @default.
- W3086596315 hasConceptScore W3086596315C205649164 @default.
- W3086596315 hasConceptScore W3086596315C2522767166 @default.
- W3086596315 hasConceptScore W3086596315C2776356880 @default.
- W3086596315 hasConceptScore W3086596315C2780192828 @default.
- W3086596315 hasConceptScore W3086596315C2908647359 @default.
- W3086596315 hasConceptScore W3086596315C37616216 @default.
- W3086596315 hasConceptScore W3086596315C41008148 @default.
- W3086596315 hasConceptScore W3086596315C42629822 @default.
- W3086596315 hasConceptScore W3086596315C52130261 @default.
- W3086596315 hasConceptScore W3086596315C58640448 @default.
- W3086596315 hasConceptScore W3086596315C71924100 @default.
- W3086596315 hasConceptScore W3086596315C99454951 @default.
- W3086596315 hasIssue "9_Supplement" @default.
- W3086596315 hasLocation W30865963151 @default.
- W3086596315 hasOpenAccess W3086596315 @default.
- W3086596315 hasPrimaryLocation W30865963151 @default.
- W3086596315 hasRelatedWork W2550368658 @default.
- W3086596315 hasRelatedWork W2940473938 @default.
- W3086596315 hasRelatedWork W3086596315 @default.
- W3086596315 hasRelatedWork W3129804828 @default.
- W3086596315 hasRelatedWork W3160348567 @default.
- W3086596315 hasRelatedWork W3163003763 @default.
- W3086596315 hasRelatedWork W3174196512 @default.
- W3086596315 hasRelatedWork W4312562135 @default.
- W3086596315 hasRelatedWork W4317475364 @default.
- W3086596315 hasRelatedWork W4365397292 @default.
- W3086596315 hasVolume "29" @default.
- W3086596315 isParatext "false" @default.
- W3086596315 isRetracted "false" @default.
- W3086596315 magId "3086596315" @default.
- W3086596315 workType "article" @default.