Matches in SemOpenAlex for { <https://semopenalex.org/work/W2898061370> ?p ?o ?g. }
- W2898061370 abstract "Mixture-of-Experts (MoE) is a widely popular model for ensemble learning and is a basic building block of highly successful modern neural networks as well as a component in Gated Recurrent Units (GRU) and Attention networks. However, present algorithms for learning MoE, including the EM algorithm and gradient descent, are known to get stuck in local optima. From a theoretical viewpoint, finding an efficient and provably consistent algorithm to learn the parameters has remained a long-standing open problem for more than two decades. In this paper, we introduce the first algorithm that learns the true parameters of a MoE model for a wide class of non-linearities with global consistency guarantees. While existing algorithms jointly or iteratively estimate the expert parameters and the gating parameters in the MoE, we propose a novel algorithm that breaks the deadlock and can directly estimate the expert parameters by sensing its echo in a carefully designed cross-moment tensor between the inputs and the output. Once the experts are known, the recovery of gating parameters still requires an EM algorithm; however, we show that the EM algorithm for this simplified problem, unlike the joint EM algorithm, converges to the true parameters. We empirically validate our algorithm on both synthetic and real data sets in a variety of settings, and show superior performance to standard baselines." @default.
- W2898061370 created "2018-10-26" @default.
- W2898061370 creator A5028243041 @default.
- W2898061370 creator A5050142356 @default.
- W2898061370 creator A5053980484 @default.
- W2898061370 creator A5060138971 @default.
- W2898061370 date "2018-02-21" @default.
- W2898061370 modified "2023-09-22" @default.
- W2898061370 title "Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms" @default.
- W2898061370 cites W110304591 @default.
- W2898061370 cites W1486307571 @default.
- W2898061370 cites W1581152950 @default.
- W2898061370 cites W1625941627 @default.
- W2898061370 cites W1839868949 @default.
- W2898061370 cites W1873595945 @default.
- W2898061370 cites W1924770834 @default.
- W2898061370 cites W2011277999 @default.
- W2898061370 cites W2025653905 @default.
- W2898061370 cites W2057503509 @default.
- W2898061370 cites W2061933243 @default.
- W2898061370 cites W2066334462 @default.
- W2898061370 cites W2068238590 @default.
- W2898061370 cites W2097039814 @default.
- W2898061370 cites W2105724942 @default.
- W2898061370 cites W2125126592 @default.
- W2898061370 cites W2127771408 @default.
- W2898061370 cites W2130031702 @default.
- W2898061370 cites W2138967244 @default.
- W2898061370 cites W2150884987 @default.
- W2898061370 cites W2245816615 @default.
- W2898061370 cites W2345886625 @default.
- W2898061370 cites W2518479542 @default.
- W2898061370 cites W2550703763 @default.
- W2898061370 cites W2625063094 @default.
- W2898061370 cites W2766371994 @default.
- W2898061370 cites W2952339051 @default.
- W2898061370 cites W2962737134 @default.
- W2898061370 cites W2963164444 @default.
- W2898061370 cites W2963254338 @default.
- W2898061370 cites W2963383839 @default.
- W2898061370 cites W2963403868 @default.
- W2898061370 cites W2963519230 @default.
- W2898061370 cites W2963744427 @default.
- W2898061370 cites W2964207716 @default.
- W2898061370 cites W2966327190 @default.
- W2898061370 hasPublicationYear "2018" @default.
- W2898061370 type Work @default.
- W2898061370 sameAs 2898061370 @default.
- W2898061370 citedByCount "0" @default.
- W2898061370 crossrefType "posted-content" @default.
- W2898061370 hasAuthorship W2898061370A5028243041 @default.
- W2898061370 hasAuthorship W2898061370A5050142356 @default.
- W2898061370 hasAuthorship W2898061370A5053980484 @default.
- W2898061370 hasAuthorship W2898061370A5060138971 @default.
- W2898061370 hasConcept C105795698 @default.
- W2898061370 hasConcept C11413529 @default.
- W2898061370 hasConcept C119857082 @default.
- W2898061370 hasConcept C121332964 @default.
- W2898061370 hasConcept C136197465 @default.
- W2898061370 hasConcept C141934464 @default.
- W2898061370 hasConcept C153258448 @default.
- W2898061370 hasConcept C154945302 @default.
- W2898061370 hasConcept C165064840 @default.
- W2898061370 hasConcept C168167062 @default.
- W2898061370 hasConcept C17744445 @default.
- W2898061370 hasConcept C179254644 @default.
- W2898061370 hasConcept C199539241 @default.
- W2898061370 hasConcept C2524010 @default.
- W2898061370 hasConcept C2776436953 @default.
- W2898061370 hasConcept C2777210771 @default.
- W2898061370 hasConcept C2778558725 @default.
- W2898061370 hasConcept C33923547 @default.
- W2898061370 hasConcept C41008148 @default.
- W2898061370 hasConcept C50644808 @default.
- W2898061370 hasConcept C74650414 @default.
- W2898061370 hasConcept C94625758 @default.
- W2898061370 hasConcept C97355855 @default.
- W2898061370 hasConceptScore W2898061370C105795698 @default.
- W2898061370 hasConceptScore W2898061370C11413529 @default.
- W2898061370 hasConceptScore W2898061370C119857082 @default.
- W2898061370 hasConceptScore W2898061370C121332964 @default.
- W2898061370 hasConceptScore W2898061370C136197465 @default.
- W2898061370 hasConceptScore W2898061370C141934464 @default.
- W2898061370 hasConceptScore W2898061370C153258448 @default.
- W2898061370 hasConceptScore W2898061370C154945302 @default.
- W2898061370 hasConceptScore W2898061370C165064840 @default.
- W2898061370 hasConceptScore W2898061370C168167062 @default.
- W2898061370 hasConceptScore W2898061370C17744445 @default.
- W2898061370 hasConceptScore W2898061370C179254644 @default.
- W2898061370 hasConceptScore W2898061370C199539241 @default.
- W2898061370 hasConceptScore W2898061370C2524010 @default.
- W2898061370 hasConceptScore W2898061370C2776436953 @default.
- W2898061370 hasConceptScore W2898061370C2777210771 @default.
- W2898061370 hasConceptScore W2898061370C2778558725 @default.
- W2898061370 hasConceptScore W2898061370C33923547 @default.
- W2898061370 hasConceptScore W2898061370C41008148 @default.
- W2898061370 hasConceptScore W2898061370C50644808 @default.
- W2898061370 hasConceptScore W2898061370C74650414 @default.
- W2898061370 hasConceptScore W2898061370C94625758 @default.
- W2898061370 hasConceptScore W2898061370C97355855 @default.