Matches in SemOpenAlex for { <https://semopenalex.org/work/W3102889971> ?p ?o ?g. }
- W3102889971 endingPage "578" @default.
- W3102889971 startingPage "565" @default.
- W3102889971 abstract "Abstract. The number of samples used in the calibration data set affects the quality of the generated predictive models using visible, near and shortwave infrared (VIS–NIR–SWIR) spectroscopy for soil attributes. Recently, the convolutional neural network (CNN) has been regarded as a highly accurate model for predicting soil properties on a large database. However, it has not yet been ascertained how large the sample size should be for CNN model to be effective. This paper investigates the effect of the training sample size on the accuracy of deep learning and machine learning models. It aims at providing an estimate of how many calibration samples are needed to improve the model performance of soil properties predictions with CNN as compared to conventional machine learning models. In addition, this paper also looks at a way to interpret the CNN models, which are commonly labelled as a black box. It is hypothesised that the performance of machine learning models will increase with an increasing number of training samples, but it will plateau when it reaches a certain number, while the performance of CNN will keep improving. The performances of two machine learning models (partial least squares regression – PLSR; Cubist) are compared against the CNN model. A VIS–NIR–SWIR spectra library from Brazil, containing 4251 unique sites with averages of two to three samples per depth (a total of 12 044 samples), was divided into calibration (3188 sites) and validation (1063 sites) sets. A subset of the calibration data set was then created to represent a smaller calibration data set ranging from 125, 300, 500, 1000, 1500, 2000, 2500 and 2700 unique sites, which is equivalent to a sample size of approximately 350, 840, 1400, 2800, 4200, 5600, 7000 and 7650. All three models (PLSR, Cubist and CNN) were generated for each sample size of the unique sites for the prediction of five different soil properties, i.e. cation exchange capacity, organic carbon, sand, silt and clay content. These calibration subset sampling processes and modelling were repeated 10 times to provide a better representation of the model performances. Learning curves showed that the accuracy increased with an increasing number of training samples. At a lower number of samples (< 1000), PLSR and Cubist performed better than CNN. The performance of CNN outweighed the PLSR and Cubist model at a sample size of 1500 and 1800, respectively. It can be recommended that deep learning is most efficient for spectra modelling for sample sizes above 2000. The accuracy of the PLSR and Cubist model seems to reach a plateau above sample sizes of 4200 and 5000, respectively, while the accuracy of CNN has not plateaued. A sensitivity analysis of the CNN model demonstrated its ability to determine important wavelengths region that affected the predictions of various soil attributes." @default.
- W3102889971 created "2020-11-23" @default.
- W3102889971 creator A5051455406 @default.
- W3102889971 creator A5056213122 @default.
- W3102889971 creator A5061651690 @default.
- W3102889971 creator A5066817671 @default.
- W3102889971 date "2020-11-17" @default.
- W3102889971 modified "2023-10-15" @default.
- W3102889971 title "The influence of training sample size on the accuracy of deep learning models for the prediction of soil properties with near-infrared spectroscopy data" @default.
- W3102889971 cites W1592341820 @default.
- W3102889971 cites W1973273412 @default.
- W3102889971 cites W1998053851 @default.
- W3102889971 cites W2007365900 @default.
- W3102889971 cites W2012358846 @default.
- W3102889971 cites W2016090370 @default.
- W3102889971 cites W2027368520 @default.
- W3102889971 cites W2064345732 @default.
- W3102889971 cites W2074146906 @default.
- W3102889971 cites W2091160252 @default.
- W3102889971 cites W2101113206 @default.
- W3102889971 cites W2101546648 @default.
- W3102889971 cites W2109606373 @default.
- W3102889971 cites W2125949024 @default.
- W3102889971 cites W2132886902 @default.
- W3102889971 cites W2142606975 @default.
- W3102889971 cites W2158551840 @default.
- W3102889971 cites W2165993842 @default.
- W3102889971 cites W2194775991 @default.
- W3102889971 cites W2414889478 @default.
- W3102889971 cites W2564339002 @default.
- W3102889971 cites W2602766315 @default.
- W3102889971 cites W2784062288 @default.
- W3102889971 cites W2804146571 @default.
- W3102889971 cites W2883273084 @default.
- W3102889971 cites W2889458880 @default.
- W3102889971 cites W2891747104 @default.
- W3102889971 cites W2895473117 @default.
- W3102889971 cites W2903091095 @default.
- W3102889971 cites W2913854917 @default.
- W3102889971 cites W2919115771 @default.
- W3102889971 cites W2951230751 @default.
- W3102889971 cites W2952266823 @default.
- W3102889971 cites W3010310735 @default.
- W3102889971 cites W3015778985 @default.
- W3102889971 cites W3125817276 @default.
- W3102889971 cites W3644042 @default.
- W3102889971 doi "https://doi.org/10.5194/soil-6-565-2020" @default.
- W3102889971 hasPublicationYear "2020" @default.
- W3102889971 type Work @default.
- W3102889971 sameAs 3102889971 @default.
- W3102889971 citedByCount "57" @default.
- W3102889971 countsByYear W31028899712021 @default.
- W3102889971 countsByYear W31028899712022 @default.
- W3102889971 countsByYear W31028899712023 @default.
- W3102889971 crossrefType "journal-article" @default.
- W3102889971 hasAuthorship W3102889971A5051455406 @default.
- W3102889971 hasAuthorship W3102889971A5056213122 @default.
- W3102889971 hasAuthorship W3102889971A5061651690 @default.
- W3102889971 hasAuthorship W3102889971A5066817671 @default.
- W3102889971 hasBestOaLocation W31028899711 @default.
- W3102889971 hasConcept C105795698 @default.
- W3102889971 hasConcept C119857082 @default.
- W3102889971 hasConcept C120665830 @default.
- W3102889971 hasConcept C121332964 @default.
- W3102889971 hasConcept C129848803 @default.
- W3102889971 hasConcept C153180895 @default.
- W3102889971 hasConcept C154945302 @default.
- W3102889971 hasConcept C165838908 @default.
- W3102889971 hasConcept C177264268 @default.
- W3102889971 hasConcept C185592680 @default.
- W3102889971 hasConcept C198531522 @default.
- W3102889971 hasConcept C199360897 @default.
- W3102889971 hasConcept C205649164 @default.
- W3102889971 hasConcept C22354355 @default.
- W3102889971 hasConcept C33923547 @default.
- W3102889971 hasConcept C41008148 @default.
- W3102889971 hasConcept C43571822 @default.
- W3102889971 hasConcept C43617362 @default.
- W3102889971 hasConcept C45804977 @default.
- W3102889971 hasConcept C50644808 @default.
- W3102889971 hasConcept C58489278 @default.
- W3102889971 hasConcept C62649853 @default.
- W3102889971 hasConcept C81363708 @default.
- W3102889971 hasConceptScore W3102889971C105795698 @default.
- W3102889971 hasConceptScore W3102889971C119857082 @default.
- W3102889971 hasConceptScore W3102889971C120665830 @default.
- W3102889971 hasConceptScore W3102889971C121332964 @default.
- W3102889971 hasConceptScore W3102889971C129848803 @default.
- W3102889971 hasConceptScore W3102889971C153180895 @default.
- W3102889971 hasConceptScore W3102889971C154945302 @default.
- W3102889971 hasConceptScore W3102889971C165838908 @default.
- W3102889971 hasConceptScore W3102889971C177264268 @default.
- W3102889971 hasConceptScore W3102889971C185592680 @default.
- W3102889971 hasConceptScore W3102889971C198531522 @default.
- W3102889971 hasConceptScore W3102889971C199360897 @default.
- W3102889971 hasConceptScore W3102889971C205649164 @default.
- W3102889971 hasConceptScore W3102889971C22354355 @default.
- W3102889971 hasConceptScore W3102889971C33923547 @default.