Matches in SemOpenAlex for { <https://semopenalex.org/work/W2951210602> ?p ?o ?g. }
Showing items 1 to 80 of 80 with 100 items per page.
- W2951210602 abstract "Softmax is an output activation function for modeling categorical probability distributions in many applications of deep learning. However, a recent study revealed that softmax can be a bottleneck of representational capacity of neural networks in language modeling (the softmax bottleneck). In this paper, we propose an output activation function for breaking the softmax bottleneck without additional parameters. We re-analyze the softmax bottleneck from the perspective of the output set of log-softmax and identify the cause of the softmax bottleneck. On the basis of this analysis, we propose sigsoftmax, which is composed of a multiplication of an exponential function and sigmoid function. Sigsoftmax can break the softmax bottleneck. The experiments on language modeling demonstrate that sigsoftmax and mixture of sigsoftmax outperform softmax and mixture of softmax, respectively." @default.
- W2951210602 created "2019-06-27" @default.
- W2951210602 creator A5006786260 @default.
- W2951210602 creator A5010327448 @default.
- W2951210602 creator A5055297022 @default.
- W2951210602 creator A5075849177 @default.
- W2951210602 date "2018-05-28" @default.
- W2951210602 modified "2023-09-28" @default.
- W2951210602 title "Sigsoftmax: Reanalysis of the Softmax Bottleneck" @default.
- W2951210602 cites W1591801644 @default.
- W2951210602 cites W1632114991 @default.
- W2951210602 cites W183625566 @default.
- W2951210602 cites W2041530929 @default.
- W2951210602 cites W2140766383 @default.
- W2951210602 cites W2143612262 @default.
- W2951210602 cites W2194775991 @default.
- W2951210602 cites W2525332836 @default.
- W2951210602 cites W2526800167 @default.
- W2951210602 cites W2750562598 @default.
- W2951210602 cites W2757047188 @default.
- W2951210602 cites W2949117887 @default.
- W2951210602 cites W2949888546 @default.
- W2951210602 cites W2962876041 @default.
- W2951210602 cites W2963123301 @default.
- W2951210602 cites W2963537482 @default.
- W2951210602 hasPublicationYear "2018" @default.
- W2951210602 type Work @default.
- W2951210602 sameAs 2951210602 @default.
- W2951210602 citedByCount "4" @default.
- W2951210602 countsByYear W29512106022019 @default.
- W2951210602 countsByYear W29512106022020 @default.
- W2951210602 countsByYear W29512106022021 @default.
- W2951210602 crossrefType "posted-content" @default.
- W2951210602 hasAuthorship W2951210602A5006786260 @default.
- W2951210602 hasAuthorship W2951210602A5010327448 @default.
- W2951210602 hasAuthorship W2951210602A5055297022 @default.
- W2951210602 hasAuthorship W2951210602A5075849177 @default.
- W2951210602 hasConcept C14036430 @default.
- W2951210602 hasConcept C149635348 @default.
- W2951210602 hasConcept C154945302 @default.
- W2951210602 hasConcept C188441871 @default.
- W2951210602 hasConcept C2780513914 @default.
- W2951210602 hasConcept C41008148 @default.
- W2951210602 hasConcept C50644808 @default.
- W2951210602 hasConcept C78458016 @default.
- W2951210602 hasConcept C86803240 @default.
- W2951210602 hasConceptScore W2951210602C14036430 @default.
- W2951210602 hasConceptScore W2951210602C149635348 @default.
- W2951210602 hasConceptScore W2951210602C154945302 @default.
- W2951210602 hasConceptScore W2951210602C188441871 @default.
- W2951210602 hasConceptScore W2951210602C2780513914 @default.
- W2951210602 hasConceptScore W2951210602C41008148 @default.
- W2951210602 hasConceptScore W2951210602C50644808 @default.
- W2951210602 hasConceptScore W2951210602C78458016 @default.
- W2951210602 hasConceptScore W2951210602C86803240 @default.
- W2951210602 hasOpenAccess W2951210602 @default.
- W2951210602 hasRelatedWork W1546411676 @default.
- W2951210602 hasRelatedWork W2268150819 @default.
- W2951210602 hasRelatedWork W2524137671 @default.
- W2951210602 hasRelatedWork W2594407953 @default.
- W2951210602 hasRelatedWork W2742102274 @default.
- W2951210602 hasRelatedWork W2766557427 @default.
- W2951210602 hasRelatedWork W2786456373 @default.
- W2951210602 hasRelatedWork W2804323070 @default.
- W2951210602 hasRelatedWork W2898744876 @default.
- W2951210602 hasRelatedWork W2915685186 @default.
- W2951210602 hasRelatedWork W2915887922 @default.
- W2951210602 hasRelatedWork W2963902346 @default.
- W2951210602 hasRelatedWork W2995862373 @default.
- W2951210602 hasRelatedWork W3018363485 @default.
- W2951210602 hasRelatedWork W3023331847 @default.
- W2951210602 hasRelatedWork W3026936447 @default.
- W2951210602 hasRelatedWork W3029433960 @default.
- W2951210602 hasRelatedWork W3084008324 @default.
- W2951210602 hasRelatedWork W3096040360 @default.
- W2951210602 hasRelatedWork W3176904855 @default.
- W2951210602 isParatext "false" @default.
- W2951210602 isRetracted "false" @default.
- W2951210602 magId "2951210602" @default.
- W2951210602 workType "article" @default.
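
The abstract above describes sigsoftmax as the product of an exponential function and a sigmoid function, used as an output activation in place of softmax. A minimal NumPy sketch of that idea follows. The function name `sigsoftmax`, the `axis` parameter, and the log-space evaluation are assumptions made for illustration, and the normalization over the chosen axis is assumed to mirror softmax; none of this is taken from the authors' code.

```python
import numpy as np

def sigsoftmax(z, axis=-1):
    """Sketch: normalize exp(z) * sigmoid(z) along `axis` (assumed form)."""
    z = np.asarray(z, dtype=float)
    # log(exp(z) * sigmoid(z)) = z + log(sigmoid(z)) = z - log(1 + exp(-z)).
    # np.logaddexp(0, -z) computes log(1 + exp(-z)) without overflow.
    log_unnorm = z - np.logaddexp(0.0, -z)
    # Subtract the maximum before exponentiating, as is usual for softmax.
    log_unnorm = log_unnorm - np.max(log_unnorm, axis=axis, keepdims=True)
    unnorm = np.exp(log_unnorm)
    return unnorm / np.sum(unnorm, axis=axis, keepdims=True)

if __name__ == "__main__":
    logits = np.array([1.0, 2.0, 3.0])
    print(sigsoftmax(logits))
```

Working in log space is a design choice of this sketch rather than something stated in the abstract: it keeps the result finite for large-magnitude logits, where evaluating `exp(z)` directly would overflow.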