Matches in SemOpenAlex for { <https://semopenalex.org/work/W4285347673> ?p ?o ?g. }
Showing items 1 to 70 of 70 with 100 items per page.
- W4285347673 abstract "Gradient descent has been a central training principle for artificial neural networks from the early beginnings to today's deep learning networks. The most common implementation is the backpropagation algorithm for training feed-forward neural networks in a supervised fashion. A drawback of backpropagation has been the search required to find optimal values of two important training parameters, learning rate and momentum weight. The learning rate specifies the step size towards a minimum of the loss function when following the gradient, while the momentum weight considers previous weight changes when updating current weights. Using both parameters in conjunction with each other generally improves training, although their specific values do not follow immediately from standard backpropagation theory. This paper proposes a new information-theoretical loss function based on cross-entropy for which it derives a specific learning rate and momentum weight. Many training procedures based on backpropagation use cross-entropy directly as their loss function. Instead, this paper investigates a dual process model with two processes, in which one process minimizes the Kullback-Leibler divergence while its dual counterpart minimizes the Shannon entropy. The golden ratio plays an important role here, allowing to derive theoretical values for the learning rate and momentum weight, matching closely the values traditionally used in the literature, which are determined empirically. To validate this information-theoretical approach further, classification results for a handwritten digit recognition task are presented, showing that the proposed loss function, in conjunction with the derived learning rate and momentum weight, works in practice." @default.
- W4285347673 created "2022-07-14" @default.
- W4285347673 creator A5035100758 @default.
- W4285347673 date "2021-10-12" @default.
- W4285347673 modified "2023-10-16" @default.
- W4285347673 title "The Golden Ratio in Machine Learning" @default.
- W4285347673 cites W1498436455 @default.
- W4285347673 cites W1806891645 @default.
- W4285347673 cites W1965555277 @default.
- W4285347673 cites W1985940938 @default.
- W4285347673 cites W1995875735 @default.
- W4285347673 cites W2160208155 @default.
- W4285347673 cites W2194775991 @default.
- W4285347673 cites W2565654137 @default.
- W4285347673 cites W2919115771 @default.
- W4285347673 cites W4250767313 @default.
- W4285347673 cites W4255949318 @default.
- W4285347673 cites W4361979926 @default.
- W4285347673 doi "https://doi.org/10.1109/aipr52630.2021.9762080" @default.
- W4285347673 hasPublicationYear "2021" @default.
- W4285347673 type Work @default.
- W4285347673 citedByCount "2" @default.
- W4285347673 countsByYear W42853476732022 @default.
- W4285347673 countsByYear W42853476732023 @default.
- W4285347673 crossrefType "proceedings-article" @default.
- W4285347673 hasAuthorship W4285347673A5035100758 @default.
- W4285347673 hasConcept C10138342 @default.
- W4285347673 hasConcept C106301342 @default.
- W4285347673 hasConcept C11413529 @default.
- W4285347673 hasConcept C119857082 @default.
- W4285347673 hasConcept C121332964 @default.
- W4285347673 hasConcept C153258448 @default.
- W4285347673 hasConcept C154945302 @default.
- W4285347673 hasConcept C155032097 @default.
- W4285347673 hasConcept C162324750 @default.
- W4285347673 hasConcept C33923547 @default.
- W4285347673 hasConcept C41008148 @default.
- W4285347673 hasConcept C50644808 @default.
- W4285347673 hasConcept C60718061 @default.
- W4285347673 hasConcept C62520636 @default.
- W4285347673 hasConceptScore W4285347673C10138342 @default.
- W4285347673 hasConceptScore W4285347673C106301342 @default.
- W4285347673 hasConceptScore W4285347673C11413529 @default.
- W4285347673 hasConceptScore W4285347673C119857082 @default.
- W4285347673 hasConceptScore W4285347673C121332964 @default.
- W4285347673 hasConceptScore W4285347673C153258448 @default.
- W4285347673 hasConceptScore W4285347673C154945302 @default.
- W4285347673 hasConceptScore W4285347673C155032097 @default.
- W4285347673 hasConceptScore W4285347673C162324750 @default.
- W4285347673 hasConceptScore W4285347673C33923547 @default.
- W4285347673 hasConceptScore W4285347673C41008148 @default.
- W4285347673 hasConceptScore W4285347673C50644808 @default.
- W4285347673 hasConceptScore W4285347673C60718061 @default.
- W4285347673 hasConceptScore W4285347673C62520636 @default.
- W4285347673 hasLocation W42853476731 @default.
- W4285347673 hasOpenAccess W4285347673 @default.
- W4285347673 hasPrimaryLocation W42853476731 @default.
- W4285347673 hasRelatedWork W1183256782 @default.
- W4285347673 hasRelatedWork W1489050811 @default.
- W4285347673 hasRelatedWork W2018863220 @default.
- W4285347673 hasRelatedWork W2082482750 @default.
- W4285347673 hasRelatedWork W2156376770 @default.
- W4285347673 hasRelatedWork W2362189222 @default.
- W4285347673 hasRelatedWork W2391384657 @default.
- W4285347673 hasRelatedWork W3123071383 @default.
- W4285347673 hasRelatedWork W3159389381 @default.
- W4285347673 hasRelatedWork W3170244987 @default.
- W4285347673 isParatext "false" @default.
- W4285347673 isRetracted "false" @default.
- W4285347673 workType "article" @default.
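
The abstract above describes the learning rate as the step size taken along the gradient and the momentum weight as the share of the previous weight change carried into the current update. The sketch below illustrates that generic update rule on a toy one-dimensional quadratic loss. It is not the paper's loss function, and the function name `momentum_descent` and the values `learning_rate=0.1` and `momentum=0.5` are illustrative assumptions, not the golden-ratio values the paper derives (which this record does not state).

```python
def momentum_descent(grad, w0, learning_rate, momentum, steps):
    """Minimize a 1-D function by gradient descent with momentum.

    grad: callable returning the gradient of the loss at w.
    All parameter values passed in are chosen by the caller.
    """
    w = w0
    delta = 0.0  # previous weight change, zero before the first step
    for _ in range(steps):
        # Step against the gradient, plus a fraction of the previous change.
        delta = -learning_rate * grad(w) + momentum * delta
        w += delta
    return w


# Toy loss L(w) = (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3.
# learning_rate and momentum are placeholder values (assumptions).
w_final = momentum_descent(
    lambda w: 2.0 * (w - 3.0),
    w0=0.0,
    learning_rate=0.1,
    momentum=0.5,
    steps=200,
)
print(w_final)
```

With these placeholder settings the iterate settles at the minimum of the toy loss; other values can be substituted to see how the two parameters interact.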