Matches in SemOpenAlex for { <https://semopenalex.org/work/W3214406912> ?p ?o ?g. }
Showing items 1 to 72 of 72 with 100 items per page.
- W3214406912 abstract "Ensembles of machine learning models yield improved system performance as well as robust and interpretable uncertainty estimates; however, their inference costs may often be prohibitively high. Ensemble Distribution Distillation is an approach that allows a single model to efficiently capture both the predictive performance and uncertainty estimates of an ensemble. For classification, this is achieved by training a Dirichlet distribution over the ensemble members' output distributions via the maximum likelihood criterion. Although theoretically principled, this criterion exhibits poor convergence when applied to large-scale tasks where the number of classes is very high. In our work, we analyze this effect and show that, under the Dirichlet log-likelihood criterion, classes with low probability induce larger gradients than high-probability classes. This forces the model to focus on the distribution of the ensemble tail-class probabilities. We propose a new training objective that minimizes the reverse KL-divergence to a Proxy-Dirichlet target derived from the ensemble. This loss resolves the gradient issues of Ensemble Distribution Distillation, as we demonstrate both theoretically and empirically on the ImageNet and WMT17 En-De datasets containing 1,000 and 40,000 classes, respectively." @default.
- W3214406912 created "2021-11-22" @default.
- W3214406912 creator A5050766679 @default.
- W3214406912 creator A5080001860 @default.
- W3214406912 creator A5085980816 @default.
- W3214406912 date "2021-12-06" @default.
- W3214406912 modified "2023-10-18" @default.
- W3214406912 title "Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets" @default.
- W3214406912 hasPublicationYear "2021" @default.
- W3214406912 type Work @default.
- W3214406912 sameAs 3214406912 @default.
- W3214406912 citedByCount "0" @default.
- W3214406912 crossrefType "proceedings-article" @default.
- W3214406912 hasAuthorship W3214406912A5050766679 @default.
- W3214406912 hasAuthorship W3214406912A5080001860 @default.
- W3214406912 hasAuthorship W3214406912A5085980816 @default.
- W3214406912 hasConcept C105795698 @default.
- W3214406912 hasConcept C119857082 @default.
- W3214406912 hasConcept C126255220 @default.
- W3214406912 hasConcept C134306372 @default.
- W3214406912 hasConcept C149441793 @default.
- W3214406912 hasConcept C154945302 @default.
- W3214406912 hasConcept C169214877 @default.
- W3214406912 hasConcept C182310444 @default.
- W3214406912 hasConcept C2524010 @default.
- W3214406912 hasConcept C2776214188 @default.
- W3214406912 hasConcept C33923547 @default.
- W3214406912 hasConcept C41008148 @default.
- W3214406912 hasConcept C45942800 @default.
- W3214406912 hasConcept C99844830 @default.
- W3214406912 hasConceptScore W3214406912C105795698 @default.
- W3214406912 hasConceptScore W3214406912C119857082 @default.
- W3214406912 hasConceptScore W3214406912C126255220 @default.
- W3214406912 hasConceptScore W3214406912C134306372 @default.
- W3214406912 hasConceptScore W3214406912C149441793 @default.
- W3214406912 hasConceptScore W3214406912C154945302 @default.
- W3214406912 hasConceptScore W3214406912C169214877 @default.
- W3214406912 hasConceptScore W3214406912C182310444 @default.
- W3214406912 hasConceptScore W3214406912C2524010 @default.
- W3214406912 hasConceptScore W3214406912C2776214188 @default.
- W3214406912 hasConceptScore W3214406912C33923547 @default.
- W3214406912 hasConceptScore W3214406912C41008148 @default.
- W3214406912 hasConceptScore W3214406912C45942800 @default.
- W3214406912 hasConceptScore W3214406912C99844830 @default.
- W3214406912 hasLocation W32144069121 @default.
- W3214406912 hasOpenAccess W3214406912 @default.
- W3214406912 hasPrimaryLocation W32144069121 @default.
- W3214406912 hasRelatedWork W1689407543 @default.
- W3214406912 hasRelatedWork W2062665066 @default.
- W3214406912 hasRelatedWork W2082574365 @default.
- W3214406912 hasRelatedWork W2167306325 @default.
- W3214406912 hasRelatedWork W2284561687 @default.
- W3214406912 hasRelatedWork W2437817353 @default.
- W3214406912 hasRelatedWork W2908620745 @default.
- W3214406912 hasRelatedWork W2912371996 @default.
- W3214406912 hasRelatedWork W2920079995 @default.
- W3214406912 hasRelatedWork W2947651699 @default.
- W3214406912 hasRelatedWork W2963424904 @default.
- W3214406912 hasRelatedWork W3034711337 @default.
- W3214406912 hasRelatedWork W3035158267 @default.
- W3214406912 hasRelatedWork W3098430511 @default.
- W3214406912 hasRelatedWork W3099446623 @default.
- W3214406912 hasRelatedWork W3111934013 @default.
- W3214406912 hasRelatedWork W3161663089 @default.
- W3214406912 hasRelatedWork W3181674477 @default.
- W3214406912 hasRelatedWork W3194745876 @default.
- W3214406912 hasRelatedWork W62412192 @default.
- W3214406912 hasVolume "34" @default.
- W3214406912 isParatext "false" @default.
- W3214406912 isRetracted "false" @default.
- W3214406912 magId "3214406912" @default.
- W3214406912 workType "article" @default.