Matches in SemOpenAlex for { <https://semopenalex.org/work/W2182091212> ?p ?o ?g. }
- W2182091212 abstract "One approach to the study of learning in neural networks within the physics community has been to use statistical mechanics to calculate the expected error that a network will make on a typical novel example, termed the generalisation error. Such average case analyses have been mainly carried out with recourse to the thermodynamic limit, in which the size of the network is taken to infinity. For the case of a finite sized network, however, the error is not self-averaging, i.e., it remains dependent upon the actual set of examples used to train and test the network. The error estimated on a specific test set realisation, termed the test error, forms a finite sample approximation to the generalisation error. We present in this thesis a systematic examination of test error variances in finite sized networks trained by stochastic learning algorithms. Beginning with simple single layer systems, in particular the linear perceptron, we calculate the test error variance arising from randomness in both the training examples and the stochastic Gibbs learning algorithm. This quantity enables us to examine the performance of networks in a limited data scenario, including the optimal partitioning of a data set into a training and testing set in order to minimise the average error that the network makes, whilst remaining confident that the average test error is representative. A detailed study of the variance of cross-validation errors is carried out, and a comparison made between different cross-validation schemes. We also examine the test error variance of the binary perceptron, comparing the results to the linear case. In addition, we study the effect of a finite system size on the on-line training of multilayer networks, in which we track the dynamic evolution of the error variance under the stochastic gradient descent algorithm used to train the network on an increasing amount of data. We find that the hidden unit symmetries of the multilayer network give rise to relatively large finite size effects around the point at which the symmetries are broken. As the degree of symmetry in the initial conditions is increased, divergent finite size effects herald a phase transition in which the average case analysis breaks down. By including an easily implementable extra constraint on the training dynamics to encourage hidden unit asymmetry, we show that one can generically reduce both" @default.
- W2182091212 created "2016-06-24" @default.
- W2182091212 creator A5053299181 @default.
- W2182091212 date "1996-01-01" @default.
- W2182091212 modified "2023-09-28" @default.
- W2182091212 title "Finite size effects in neural network algorithms" @default.
- W2182091212 cites W130159071 @default.
- W2182091212 cites W1572401739 @default.
- W2182091212 cites W1594031697 @default.
- W2182091212 cites W1948236408 @default.
- W2182091212 cites W1964862779 @default.
- W2182091212 cites W1973030322 @default.
- W2182091212 cites W1973103003 @default.
- W2182091212 cites W1976422186 @default.
- W2182091212 cites W1987864296 @default.
- W2182091212 cites W1990359011 @default.
- W2182091212 cites W1990756387 @default.
- W2182091212 cites W1995431682 @default.
- W2182091212 cites W1995842804 @default.
- W2182091212 cites W1999545727 @default.
- W2182091212 cites W2007431958 @default.
- W2182091212 cites W2010581677 @default.
- W2182091212 cites W2019363670 @default.
- W2182091212 cites W2024325264 @default.
- W2182091212 cites W2029538739 @default.
- W2182091212 cites W2030450972 @default.
- W2182091212 cites W2033872649 @default.
- W2182091212 cites W2037985840 @default.
- W2182091212 cites W2038390905 @default.
- W2182091212 cites W2038774192 @default.
- W2182091212 cites W2040615655 @default.
- W2182091212 cites W2049387919 @default.
- W2182091212 cites W2058047059 @default.
- W2182091212 cites W2069129925 @default.
- W2182091212 cites W2077383040 @default.
- W2182091212 cites W2080531309 @default.
- W2182091212 cites W2080792322 @default.
- W2182091212 cites W2086631578 @default.
- W2182091212 cites W2090614046 @default.
- W2182091212 cites W2091574922 @default.
- W2182091212 cites W2100508793 @default.
- W2182091212 cites W2109049531 @default.
- W2182091212 cites W2112081648 @default.
- W2182091212 cites W2118649542 @default.
- W2182091212 cites W2133259161 @default.
- W2182091212 cites W2133671888 @default.
- W2182091212 cites W2143684265 @default.
- W2182091212 cites W2150872430 @default.
- W2182091212 cites W2200630309 @default.
- W2182091212 cites W22297218 @default.
- W2182091212 cites W2293063825 @default.
- W2182091212 cites W2322002063 @default.
- W2182091212 cites W2336612000 @default.
- W2182091212 cites W2400843461 @default.
- W2182091212 cites W2751862591 @default.
- W2182091212 cites W2912633461 @default.
- W2182091212 cites W2913713654 @default.
- W2182091212 cites W3085162807 @default.
- W2182091212 cites W3160562331 @default.
- W2182091212 cites W2154415584 @default.
- W2182091212 hasPublicationYear "1996" @default.
- W2182091212 type Work @default.
- W2182091212 sameAs 2182091212 @default.
- W2182091212 citedByCount "0" @default.
- W2182091212 crossrefType "dissertation" @default.
- W2182091212 hasAuthorship W2182091212A5053299181 @default.
- W2182091212 hasConcept C105795698 @default.
- W2182091212 hasConcept C11413529 @default.
- W2182091212 hasConcept C119857082 @default.
- W2182091212 hasConcept C121955636 @default.
- W2182091212 hasConcept C125112378 @default.
- W2182091212 hasConcept C144133560 @default.
- W2182091212 hasConcept C154945302 @default.
- W2182091212 hasConcept C169903167 @default.
- W2182091212 hasConcept C177264268 @default.
- W2182091212 hasConcept C196083921 @default.
- W2182091212 hasConcept C199360897 @default.
- W2182091212 hasConcept C33923547 @default.
- W2182091212 hasConcept C41008148 @default.
- W2182091212 hasConcept C50644808 @default.
- W2182091212 hasConcept C60908668 @default.
- W2182091212 hasConceptScore W2182091212C105795698 @default.
- W2182091212 hasConceptScore W2182091212C11413529 @default.
- W2182091212 hasConceptScore W2182091212C119857082 @default.
- W2182091212 hasConceptScore W2182091212C121955636 @default.
- W2182091212 hasConceptScore W2182091212C125112378 @default.
- W2182091212 hasConceptScore W2182091212C144133560 @default.
- W2182091212 hasConceptScore W2182091212C154945302 @default.
- W2182091212 hasConceptScore W2182091212C169903167 @default.
- W2182091212 hasConceptScore W2182091212C177264268 @default.
- W2182091212 hasConceptScore W2182091212C196083921 @default.
- W2182091212 hasConceptScore W2182091212C199360897 @default.
- W2182091212 hasConceptScore W2182091212C33923547 @default.
- W2182091212 hasConceptScore W2182091212C41008148 @default.
- W2182091212 hasConceptScore W2182091212C50644808 @default.
- W2182091212 hasConceptScore W2182091212C60908668 @default.
- W2182091212 hasLocation W21820912121 @default.
- W2182091212 hasOpenAccess W2182091212 @default.
- W2182091212 hasPrimaryLocation W21820912121 @default.
- W2182091212 hasRelatedWork W2896466118 @default.
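The listing above is the result set of the SPARQL basic graph pattern shown in the header, evaluated for the work `W2182091212`. As a minimal sketch, the query can be reconstructed in plain Python as below; note that the endpoint address is an assumption (SemOpenAlex advertises a public SPARQL endpoint, but verify the exact URL against its documentation before use).

```python
# Sketch: rebuild the SPARQL query behind the listing above.
# Assumption: the public endpoint is https://semopenalex.org/sparql
# (not called here; verify against the SemOpenAlex documentation).
WORK_URI = "https://semopenalex.org/work/W2182091212"
ENDPOINT = "https://semopenalex.org/sparql"  # assumed, unverified

def build_query(work_uri: str) -> str:
    """Return a SPARQL query listing every predicate/object pair for
    the given work, together with the named graph it appears in,
    matching the pattern { <work> ?p ?o ?g } from the header line."""
    return (
        "SELECT ?p ?o ?g WHERE {\n"
        f"  GRAPH ?g {{ <{work_uri}> ?p ?o . }}\n"
        "}"
    )

query = build_query(WORK_URI)
print(query)
```

The query string can then be sent to the endpoint with any SPARQL client (for example, an HTTP GET with a `query` parameter and an `Accept: application/sparql-results+json` header), which would return the predicate/object pairs listed above.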