Matches in SemOpenAlex for { <https://semopenalex.org/work/W3177558017> ?p ?o ?g. }
Showing items 1 to 77 of 77, with 100 items per page.
- W3177558017 endingPage "108151" @default.
- W3177558017 startingPage "108151" @default.
- W3177558017 abstract "Deep neural network models owe their representational power and high performance in classification tasks to their high number of learnable parameters. Running deep neural network models in limited-resource environments is a problematic task. Models employing conditional computing aim to reduce the computational burden while retaining performance on par with more complex neural network models. This paper proposes a new model, Conditional Information Gain Networks as Sparse Mixture of Experts (sMoE-CIGNs). A CIGN model is a neural tree that allows conditionally skipping parts of the tree based on routing mechanisms inserted into the architecture. These routing mechanisms are based on differentiable Information Gain objectives. CIGN groups semantically similar samples in the leaves, enabling simpler classifiers to focus on differentiating between similar classes. This lets the CIGN model attain high classification performance with lighter models. We further improve the basic CIGN model by proposing a sparse mixture of experts model for difficult-to-classify samples that may get routed to suboptimal branches. If a sample has routing confidence higher than a specific threshold, the sample may be routed towards multiple child nodes. The classification decision can then be taken as a mixture of these expert decisions. We learn the optimal routing thresholds by Bayesian Optimization over a validation set, minimizing a weighted loss that combines classification accuracy and the number of multiply-accumulate operations (MACs). We show the effectiveness of CIGN models enhanced with the Sparse Mixture of Experts approach through extensive tests on the MNIST, Fashion MNIST, CIFAR 100, and UCI-USPS datasets, as well as comparisons with methods from the literature. sMoE-CIGN models can retain high generalization performance, on par with a thick unconditional model, while keeping the computational burden at the level of a much thinner model." @default.
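The abstract's routing rule (visit multiple child branches when a sample's routing confidence clears a threshold, then mix the visited experts' decisions) can be sketched as follows. This is a minimal illustration of one plausible reading, not the paper's implementation: the function names, the per-node threshold `tau`, and the probability-weighted mixture are all assumptions for illustration; the paper learns its thresholds via Bayesian Optimization over a validation set.

```python
import numpy as np

def route_sample(child_probs, tau):
    """Select which child branches a sample visits at a routing node.

    child_probs: routing softmax over the node's children (sums to 1).
    tau: hypothetical per-node routing threshold.
    Every child whose routing probability reaches tau is visited;
    if none do, fall back to the single most likely child.
    """
    selected = np.flatnonzero(child_probs >= tau)
    if selected.size == 0:
        selected = np.array([int(np.argmax(child_probs))])
    return selected

def smoe_predict(child_probs, expert_logits, tau):
    """Combine the visited experts as a simple sparse mixture:
    each visited expert's class logits are weighted by its
    renormalized routing probability and summed."""
    idx = route_sample(child_probs, tau)
    weights = child_probs[idx] / child_probs[idx].sum()
    return (weights[:, None] * expert_logits[idx]).sum(axis=0)
```

Lowering `tau` trades computation for accuracy: more branches are evaluated per sample (more MACs), which is exactly the accuracy/MAC trade-off the paper's Bayesian Optimization objective balances.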
- W3177558017 created "2021-07-19" @default.
- W3177558017 creator A5006939827 @default.
- W3177558017 creator A5064788653 @default.
- W3177558017 date "2021-12-01" @default.
- W3177558017 modified "2023-09-27" @default.
- W3177558017 title "Conditional information gain networks as sparse mixture of experts" @default.
- W3177558017 cites W1510052597 @default.
- W3177558017 cites W2025653905 @default.
- W3177558017 cites W2100659887 @default.
- W3177558017 cites W2112796928 @default.
- W3177558017 cites W2142859438 @default.
- W3177558017 cites W2604319603 @default.
- W3177558017 cites W2898886170 @default.
- W3177558017 cites W2952179044 @default.
- W3177558017 cites W2963096189 @default.
- W3177558017 cites W4236137412 @default.
- W3177558017 doi "https://doi.org/10.1016/j.patcog.2021.108151" @default.
- W3177558017 hasPublicationYear "2021" @default.
- W3177558017 type Work @default.
- W3177558017 sameAs 3177558017 @default.
- W3177558017 citedByCount "0" @default.
- W3177558017 crossrefType "journal-article" @default.
- W3177558017 hasAuthorship W3177558017A5006939827 @default.
- W3177558017 hasAuthorship W3177558017A5064788653 @default.
- W3177558017 hasConcept C113174947 @default.
- W3177558017 hasConcept C119857082 @default.
- W3177558017 hasConcept C124101348 @default.
- W3177558017 hasConcept C134306372 @default.
- W3177558017 hasConcept C154945302 @default.
- W3177558017 hasConcept C162324750 @default.
- W3177558017 hasConcept C177264268 @default.
- W3177558017 hasConcept C187736073 @default.
- W3177558017 hasConcept C190502265 @default.
- W3177558017 hasConcept C199360897 @default.
- W3177558017 hasConcept C2780451532 @default.
- W3177558017 hasConcept C31258907 @default.
- W3177558017 hasConcept C33923547 @default.
- W3177558017 hasConcept C41008148 @default.
- W3177558017 hasConcept C50644808 @default.
- W3177558017 hasConcept C74172769 @default.
- W3177558017 hasConceptScore W3177558017C113174947 @default.
- W3177558017 hasConceptScore W3177558017C119857082 @default.
- W3177558017 hasConceptScore W3177558017C124101348 @default.
- W3177558017 hasConceptScore W3177558017C134306372 @default.
- W3177558017 hasConceptScore W3177558017C154945302 @default.
- W3177558017 hasConceptScore W3177558017C162324750 @default.
- W3177558017 hasConceptScore W3177558017C177264268 @default.
- W3177558017 hasConceptScore W3177558017C187736073 @default.
- W3177558017 hasConceptScore W3177558017C190502265 @default.
- W3177558017 hasConceptScore W3177558017C199360897 @default.
- W3177558017 hasConceptScore W3177558017C2780451532 @default.
- W3177558017 hasConceptScore W3177558017C31258907 @default.
- W3177558017 hasConceptScore W3177558017C33923547 @default.
- W3177558017 hasConceptScore W3177558017C41008148 @default.
- W3177558017 hasConceptScore W3177558017C50644808 @default.
- W3177558017 hasConceptScore W3177558017C74172769 @default.
- W3177558017 hasLocation W31775580171 @default.
- W3177558017 hasOpenAccess W3177558017 @default.
- W3177558017 hasPrimaryLocation W31775580171 @default.
- W3177558017 hasRelatedWork W1509467138 @default.
- W3177558017 hasRelatedWork W2358841807 @default.
- W3177558017 hasRelatedWork W2597787948 @default.
- W3177558017 hasRelatedWork W2762452805 @default.
- W3177558017 hasRelatedWork W2904174853 @default.
- W3177558017 hasRelatedWork W2951786554 @default.
- W3177558017 hasRelatedWork W2978290780 @default.
- W3177558017 hasRelatedWork W3186840088 @default.
- W3177558017 hasRelatedWork W4287064118 @default.
- W3177558017 hasRelatedWork W1629725936 @default.
- W3177558017 hasVolume "120" @default.
- W3177558017 isParatext "false" @default.
- W3177558017 isRetracted "false" @default.
- W3177558017 magId "3177558017" @default.
- W3177558017 workType "article" @default.