Matches in SemOpenAlex for { <https://semopenalex.org/work/W3093012717> ?p ?o ?g. }
- W3093012717 abstract "We propose a novel and efficient training method for RNNs by iteratively seeking a local minima on the loss surface within a small region, and leverage this directional vector for the update, in an outer-loop. We propose to utilize the Frank-Wolfe (FW) algorithm in this context. Although, FW implicitly involves normalized gradients, which can lead to a slow convergence rate, we develop a novel RNN training method that, surprisingly, even with the additional cost, the overall training cost is empirically observed to be lower than back-propagation. Our method leads to a new Frank-Wolfe method, that is in essence an SGD algorithm with a restart scheme. We prove that under certain conditions our algorithm has a sublinear convergence rate of $O(1/\epsilon)$ for $\epsilon$ error. We then conduct empirical experiments on several benchmark datasets including those that exhibit long-term dependencies, and show significant performance improvement. We also experiment with deep RNN architectures and show efficient training performance. Finally, we demonstrate that our training method is robust to noisy data." @default.
- W3093012717 created "2020-10-22" @default.
- W3093012717 creator A5021384155 @default.
- W3093012717 creator A5037164369 @default.
- W3093012717 creator A5048506332 @default.
- W3093012717 creator A5048704387 @default.
- W3093012717 date "2020-10-11" @default.
- W3093012717 modified "2023-09-23" @default.
- W3093012717 title "RNN Training along Locally Optimal Trajectories via Frank-Wolfe Algorithm" @default.
- W3093012717 cites W1482804306 @default.
- W3093012717 cites W1522301498 @default.
- W3093012717 cites W1574851760 @default.
- W3093012717 cites W1771410628 @default.
- W3093012717 cites W1800356822 @default.
- W3093012717 cites W1815076433 @default.
- W3093012717 cites W1889624880 @default.
- W3093012717 cites W1981039871 @default.
- W3093012717 cites W2008308152 @default.
- W3093012717 cites W2016589492 @default.
- W3093012717 cites W2038904236 @default.
- W3093012717 cites W2040055413 @default.
- W3093012717 cites W2062210787 @default.
- W3093012717 cites W2064675550 @default.
- W3093012717 cites W2087831875 @default.
- W3093012717 cites W2108563286 @default.
- W3093012717 cites W2112246162 @default.
- W3093012717 cites W2128084896 @default.
- W3093012717 cites W2135473110 @default.
- W3093012717 cites W2136885855 @default.
- W3093012717 cites W2140575957 @default.
- W3093012717 cites W2243249356 @default.
- W3093012717 cites W2547718140 @default.
- W3093012717 cites W2581719241 @default.
- W3093012717 cites W2619808423 @default.
- W3093012717 cites W2620263509 @default.
- W3093012717 cites W2749590927 @default.
- W3093012717 cites W2752813595 @default.
- W3093012717 cites W2906125418 @default.
- W3093012717 cites W2914623577 @default.
- W3093012717 cites W2918846640 @default.
- W3093012717 cites W2942751004 @default.
- W3093012717 cites W2950059857 @default.
- W3093012717 cites W2950297649 @default.
- W3093012717 cites W2958796454 @default.
- W3093012717 cites W2962840019 @default.
- W3093012717 cites W2963010571 @default.
- W3093012717 cites W2963042606 @default.
- W3093012717 cites W2963052924 @default.
- W3093012717 cites W2963167419 @default.
- W3093012717 cites W2963174729 @default.
- W3093012717 cites W2963241221 @default.
- W3093012717 cites W2963276946 @default.
- W3093012717 cites W2963570896 @default.
- W3093012717 cites W2963755523 @default.
- W3093012717 cites W2963983719 @default.
- W3093012717 cites W2963992014 @default.
- W3093012717 cites W2964199361 @default.
- W3093012717 cites W2964269252 @default.
- W3093012717 cites W2964304579 @default.
- W3093012717 cites W2964347220 @default.
- W3093012717 cites W2970418991 @default.
- W3093012717 cites W2991290085 @default.
- W3093012717 cites W2994747431 @default.
- W3093012717 cites W2999543292 @default.
- W3093012717 cites W905619 @default.
- W3093012717 doi "https://doi.org/10.48550/arxiv.2010.05397" @default.
- W3093012717 hasPublicationYear "2020" @default.
- W3093012717 type Work @default.
- W3093012717 sameAs 3093012717 @default.
- W3093012717 citedByCount "0" @default.
- W3093012717 crossrefType "posted-content" @default.
- W3093012717 hasAuthorship W3093012717A5021384155 @default.
- W3093012717 hasAuthorship W3093012717A5037164369 @default.
- W3093012717 hasAuthorship W3093012717A5048506332 @default.
- W3093012717 hasAuthorship W3093012717A5048704387 @default.
- W3093012717 hasBestOaLocation W30930127171 @default.
- W3093012717 hasConcept C11413529 @default.
- W3093012717 hasConcept C117160843 @default.
- W3093012717 hasConcept C119857082 @default.
- W3093012717 hasConcept C121332964 @default.
- W3093012717 hasConcept C126255220 @default.
- W3093012717 hasConcept C13280743 @default.
- W3093012717 hasConcept C134306372 @default.
- W3093012717 hasConcept C151730666 @default.
- W3093012717 hasConcept C153083717 @default.
- W3093012717 hasConcept C153294291 @default.
- W3093012717 hasConcept C154945302 @default.
- W3093012717 hasConcept C162324750 @default.
- W3093012717 hasConcept C185798385 @default.
- W3093012717 hasConcept C186633575 @default.
- W3093012717 hasConcept C205649164 @default.
- W3093012717 hasConcept C26517878 @default.
- W3093012717 hasConcept C2777211547 @default.
- W3093012717 hasConcept C2777303404 @default.
- W3093012717 hasConcept C2779343474 @default.
- W3093012717 hasConcept C33923547 @default.
- W3093012717 hasConcept C38652104 @default.
- W3093012717 hasConcept C41008148 @default.
- W3093012717 hasConcept C50522688 @default.
- W3093012717 hasConcept C57869625 @default.