Matches in SemOpenAlex for { <https://semopenalex.org/work/W3093668913> ?p ?o ?g. }
- W3093668913 abstract "Adam is a widely used optimization method for training deep learning models. It computes individual adaptive learning rates for different parameters. In this paper, we propose a generalization of Adam, called Adambs, that allows us to also adapt to different training examples based on their importance in the model's convergence. To achieve this, we maintain a distribution over all examples, selecting a mini-batch in each iteration by sampling according to this distribution, which we update using a multi-armed bandit algorithm. This ensures that examples that are more beneficial to the model training are sampled with higher probabilities. We theoretically show that Adambs improves the convergence rate of Adam: $O(\sqrt{\frac{\log n}{T}})$ instead of $O(\sqrt{\frac{n}{T}})$ in some cases. Experiments on various models and datasets demonstrate Adambs's fast convergence in practice." @default.
- W3093668913 created "2020-10-29" @default.
- W3093668913 creator A5007101740 @default.
- W3093668913 creator A5060287971 @default.
- W3093668913 creator A5081675173 @default.
- W3093668913 date "2020-10-24" @default.
- W3093668913 modified "2023-09-23" @default.
- W3093668913 title "Adam with Bandit Sampling for Deep Learning" @default.
- W3093668913 cites W1825512540 @default.
- W3093668913 cites W1842094663 @default.
- W3093668913 cites W1918179283 @default.
- W3093668913 cites W2049934117 @default.
- W3093668913 cites W2093717447 @default.
- W3093668913 cites W2116067849 @default.
- W3093668913 cites W2135106139 @default.
- W3093668913 cites W2146502635 @default.
- W3093668913 cites W2162287622 @default.
- W3093668913 cites W2164075197 @default.
- W3093668913 cites W2168020168 @default.
- W3093668913 cites W2168405694 @default.
- W3093668913 cites W2177410802 @default.
- W3093668913 cites W2265846598 @default.
- W3093668913 cites W2296073425 @default.
- W3093668913 cites W2402144811 @default.
- W3093668913 cites W2428862780 @default.
- W3093668913 cites W2741107118 @default.
- W3093668913 cites W2750384547 @default.
- W3093668913 cites W2772169551 @default.
- W3093668913 cites W2775062428 @default.
- W3093668913 cites W2784570262 @default.
- W3093668913 cites W2785523195 @default.
- W3093668913 cites W2794302998 @default.
- W3093668913 cites W2891128925 @default.
- W3093668913 cites W2903382683 @default.
- W3093668913 cites W2904671658 @default.
- W3093668913 cites W2926516160 @default.
- W3093668913 cites W2963341956 @default.
- W3093668913 cites W2964121744 @default.
- W3093668913 cites W2967845286 @default.
- W3093668913 cites W2970617576 @default.
- W3093668913 cites W3005560345 @default.
- W3093668913 cites W3013571468 @default.
- W3093668913 cites W3118608800 @default.
- W3093668913 cites W6908809 @default.
- W3093668913 hasPublicationYear "2020" @default.
- W3093668913 type Work @default.
- W3093668913 sameAs 3093668913 @default.
- W3093668913 citedByCount "2" @default.
- W3093668913 countsByYear W30936689132020 @default.
- W3093668913 countsByYear W30936689132021 @default.
- W3093668913 crossrefType "posted-content" @default.
- W3093668913 hasAuthorship W3093668913A5007101740 @default.
- W3093668913 hasAuthorship W3093668913A5060287971 @default.
- W3093668913 hasAuthorship W3093668913A5081675173 @default.
- W3093668913 hasConcept C106131492 @default.
- W3093668913 hasConcept C110121322 @default.
- W3093668913 hasConcept C11413529 @default.
- W3093668913 hasConcept C126255220 @default.
- W3093668913 hasConcept C134306372 @default.
- W3093668913 hasConcept C140779682 @default.
- W3093668913 hasConcept C154945302 @default.
- W3093668913 hasConcept C162324750 @default.
- W3093668913 hasConcept C177148314 @default.
- W3093668913 hasConcept C26517878 @default.
- W3093668913 hasConcept C2777303404 @default.
- W3093668913 hasConcept C31972630 @default.
- W3093668913 hasConcept C33923547 @default.
- W3093668913 hasConcept C38652104 @default.
- W3093668913 hasConcept C41008148 @default.
- W3093668913 hasConcept C50522688 @default.
- W3093668913 hasConcept C57869625 @default.
- W3093668913 hasConceptScore W3093668913C106131492 @default.
- W3093668913 hasConceptScore W3093668913C110121322 @default.
- W3093668913 hasConceptScore W3093668913C11413529 @default.
- W3093668913 hasConceptScore W3093668913C126255220 @default.
- W3093668913 hasConceptScore W3093668913C134306372 @default.
- W3093668913 hasConceptScore W3093668913C140779682 @default.
- W3093668913 hasConceptScore W3093668913C154945302 @default.
- W3093668913 hasConceptScore W3093668913C162324750 @default.
- W3093668913 hasConceptScore W3093668913C177148314 @default.
- W3093668913 hasConceptScore W3093668913C26517878 @default.
- W3093668913 hasConceptScore W3093668913C2777303404 @default.
- W3093668913 hasConceptScore W3093668913C31972630 @default.
- W3093668913 hasConceptScore W3093668913C33923547 @default.
- W3093668913 hasConceptScore W3093668913C38652104 @default.
- W3093668913 hasConceptScore W3093668913C41008148 @default.
- W3093668913 hasConceptScore W3093668913C50522688 @default.
- W3093668913 hasConceptScore W3093668913C57869625 @default.
- W3093668913 hasLocation W30936689131 @default.
- W3093668913 hasOpenAccess W3093668913 @default.
- W3093668913 hasPrimaryLocation W30936689131 @default.
- W3093668913 hasRelatedWork W1835900096 @default.
- W3093668913 hasRelatedWork W2099471337 @default.
- W3093668913 hasRelatedWork W2106164082 @default.
- W3093668913 hasRelatedWork W2142774925 @default.
- W3093668913 hasRelatedWork W2616001803 @default.
- W3093668913 hasRelatedWork W2897463648 @default.
- W3093668913 hasRelatedWork W2963094593 @default.
- W3093668913 hasRelatedWork W2966236865 @default.
- W3093668913 hasRelatedWork W2987639737 @default.
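
The abstract above describes keeping a distribution over training examples, sampling each mini-batch from it, and updating it with a multi-armed bandit algorithm. The following is a minimal sketch of that loop under stated assumptions: it uses a generic EXP3-style update, which may differ from the rule used in the paper, and the function names, the `eta` and `gamma` parameters, and the constant rewards are all hypothetical placeholders rather than anything taken from this record.

```python
import math
import random


def to_probs(weights, gamma):
    """Mix normalized weights with a uniform distribution (exploration)."""
    n = len(weights)
    total = sum(weights)
    return [(1 - gamma) * w / total + gamma / n for w in weights]


def sample_batch(probs, batch_size, rng):
    """Draw example indices (with replacement) according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


def exp3_update(weights, batch, rewards, probs, eta):
    """EXP3-style update: scale each observed reward by 1 / p_i."""
    new_weights = list(weights)
    for i, r in zip(batch, rewards):
        new_weights[i] *= math.exp(eta * r / probs[i])
    return new_weights


rng = random.Random(0)
n = 5                                   # number of training examples (toy size)
weights = [1.0] * n                     # start from a uniform distribution
probs = to_probs(weights, gamma=0.1)

batch = sample_batch(probs, batch_size=2, rng=rng)
# Hypothetical rewards: in practice these would come from a per-example
# signal such as a gradient norm; a constant is used here for illustration.
rewards = [1.0 for _ in batch]

weights = exp3_update(weights, batch, rewards, probs, eta=0.05)
probs = to_probs(weights, gamma=0.1)
```

After one step, the sampled examples hold a larger share of the distribution than the unsampled ones. In a real training loop the sampled indices would select the mini-batch passed to the Adam step, and the weights would need periodic rescaling to avoid overflow.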