Matches in SemOpenAlex for { <https://semopenalex.org/work/W3212856952> ?p ?o ?g. }
- W3212856952 abstract "Classification problems having thousands or more classes naturally occur in NLP, for example language models or document classification. A softmax or one-vs-all classifier naturally handles many classes, but it is very slow at inference time, because every class score must be calculated to find the top class. We propose the “softmax tree”, consisting of a binary tree having sparse hyperplanes at the decision nodes (which make hard, not soft, decisions) and small softmax classifiers at the leaves. This is much faster at inference because the input instance follows a single path to a leaf (whose length is logarithmic on the number of leaves) and the softmax classifier at each leaf operates on a small subset of the classes. Although learning accurate tree-based models has proven difficult in the past, we are able to overcome this by using a variation of a recent algorithm, tree alternating optimization (TAO). Compared to a softmax and other classifiers, the resulting softmax trees are both more accurate in prediction and faster in inference, as shown in NLP problems having from one thousand to one hundred thousand classes." @default.
- W3212856952 created "2021-11-22" @default.
- W3212856952 creator A5009367143 @default.
- W3212856952 creator A5024657565 @default.
- W3212856952 creator A5057957982 @default.
- W3212856952 date "2021-01-01" @default.
- W3212856952 modified "2023-10-14" @default.
- W3212856952 title "Softmax Tree: An Accurate, Fast Classifier When the Number of Classes Is Large" @default.
- W3212856952 cites W1554944419 @default.
- W3212856952 cites W1594031697 @default.
- W3212856952 cites W1614298861 @default.
- W3212856952 cites W1663973292 @default.
- W3212856952 cites W1676820704 @default.
- W3212856952 cites W179875071 @default.
- W3212856952 cites W1878054055 @default.
- W3212856952 cites W2068074736 @default.
- W3212856952 cites W2100714283 @default.
- W3212856952 cites W2101234009 @default.
- W3212856952 cites W2106854428 @default.
- W3212856952 cites W2118585731 @default.
- W3212856952 cites W2125055259 @default.
- W3212856952 cites W2131462252 @default.
- W3212856952 cites W2132339004 @default.
- W3212856952 cites W2135482703 @default.
- W3212856952 cites W2148771442 @default.
- W3212856952 cites W2153579005 @default.
- W3212856952 cites W2155144632 @default.
- W3212856952 cites W2183087644 @default.
- W3212856952 cites W2185726469 @default.
- W3212856952 cites W2250539671 @default.
- W3212856952 cites W2493916176 @default.
- W3212856952 cites W2593236586 @default.
- W3212856952 cites W2757291580 @default.
- W3212856952 cites W2891895300 @default.
- W3212856952 cites W2963156201 @default.
- W3212856952 cites W2963390429 @default.
- W3212856952 cites W2964205997 @default.
- W3212856952 cites W2990176236 @default.
- W3212856952 cites W3035751413 @default.
- W3212856952 cites W3093489024 @default.
- W3212856952 cites W3106569781 @default.
- W3212856952 cites W3147278129 @default.
- W3212856952 cites W3163041779 @default.
- W3212856952 cites W3175939087 @default.
- W3212856952 cites W3194213524 @default.
- W3212856952 cites W3200924559 @default.
- W3212856952 cites W36903255 @default.
- W3212856952 cites W756166754 @default.
- W3212856952 doi "https://doi.org/10.18653/v1/2021.emnlp-main.838" @default.
- W3212856952 hasPublicationYear "2021" @default.
- W3212856952 type Work @default.
- W3212856952 sameAs 3212856952 @default.
- W3212856952 citedByCount "3" @default.
- W3212856952 countsByYear W32128569522021 @default.
- W3212856952 countsByYear W32128569522022 @default.
- W3212856952 countsByYear W32128569522023 @default.
- W3212856952 crossrefType "proceedings-article" @default.
- W3212856952 hasAuthorship W3212856952A5009367143 @default.
- W3212856952 hasAuthorship W3212856952A5024657565 @default.
- W3212856952 hasAuthorship W3212856952A5057957982 @default.
- W3212856952 hasBestOaLocation W32128569521 @default.
- W3212856952 hasConcept C113174947 @default.
- W3212856952 hasConcept C119857082 @default.
- W3212856952 hasConcept C134306372 @default.
- W3212856952 hasConcept C153180895 @default.
- W3212856952 hasConcept C154945302 @default.
- W3212856952 hasConcept C188441871 @default.
- W3212856952 hasConcept C2776214188 @default.
- W3212856952 hasConcept C33923547 @default.
- W3212856952 hasConcept C41008148 @default.
- W3212856952 hasConcept C50644808 @default.
- W3212856952 hasConcept C84525736 @default.
- W3212856952 hasConcept C95623464 @default.
- W3212856952 hasConceptScore W3212856952C113174947 @default.
- W3212856952 hasConceptScore W3212856952C119857082 @default.
- W3212856952 hasConceptScore W3212856952C134306372 @default.
- W3212856952 hasConceptScore W3212856952C153180895 @default.
- W3212856952 hasConceptScore W3212856952C154945302 @default.
- W3212856952 hasConceptScore W3212856952C188441871 @default.
- W3212856952 hasConceptScore W3212856952C2776214188 @default.
- W3212856952 hasConceptScore W3212856952C33923547 @default.
- W3212856952 hasConceptScore W3212856952C41008148 @default.
- W3212856952 hasConceptScore W3212856952C50644808 @default.
- W3212856952 hasConceptScore W3212856952C84525736 @default.
- W3212856952 hasConceptScore W3212856952C95623464 @default.
- W3212856952 hasLocation W32128569521 @default.
- W3212856952 hasOpenAccess W3212856952 @default.
- W3212856952 hasPrimaryLocation W32128569521 @default.
- W3212856952 hasRelatedWork W2743258233 @default.
- W3212856952 hasRelatedWork W2771515600 @default.
- W3212856952 hasRelatedWork W2807311372 @default.
- W3212856952 hasRelatedWork W2900180889 @default.
- W3212856952 hasRelatedWork W2997969508 @default.
- W3212856952 hasRelatedWork W3004093983 @default.
- W3212856952 hasRelatedWork W3120400911 @default.
- W3212856952 hasRelatedWork W3208883981 @default.
- W3212856952 hasRelatedWork W4307834408 @default.
- W3212856952 hasRelatedWork W4320925816 @default.
- W3212856952 isParatext "false" @default.
- W3212856952 isRetracted "false" @default.
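
The abstract of W3212856952 describes inference in a softmax tree: an input follows a single root-to-leaf path through hard decisions at sparse hyperplanes, and a small softmax classifier at the leaf scores only that leaf's subset of classes. The sketch below illustrates that inference path only. It is not the authors' implementation and does not cover training with tree alternating optimization (TAO). The names `Node`, `Leaf`, and `predict`, the sign convention (non-negative goes right), and the toy weights are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    classes: List[int]          # global class ids this leaf can predict
    weights: List[List[float]]  # one dense weight row per local class
    biases: List[float]         # one bias per local class


@dataclass
class Node:
    weights: Dict[int, float]   # sparse hyperplane: feature index -> weight
    bias: float
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def predict(tree: Union[Node, Leaf], x: List[float]) -> Tuple[int, List[float]]:
    """Route x down one path with hard decisions, then apply the leaf softmax."""
    node = tree
    while isinstance(node, Node):
        # Only the nonzero hyperplane weights are touched (assumed convention:
        # a non-negative value sends the input to the right child).
        side = node.bias + sum(w * x[i] for i, w in node.weights.items())
        node = node.right if side >= 0 else node.left

    # Small softmax over the leaf's own classes only.
    scores = [
        b + sum(w * xi for w, xi in zip(row, x))
        for row, b in zip(node.weights, node.biases)
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return node.classes[best], probs


# Toy tree (hypothetical values): 3 features, 4 classes, 2 leaves.
toy_tree = Node(
    weights={0: 1.0},  # splits on feature 0 only
    bias=0.0,
    left=Leaf(classes=[0, 1],
              weights=[[0.0, 1.0, 0.0], [0.0, -1.0, 0.0]],
              biases=[0.0, 0.0]),
    right=Leaf(classes=[2, 3],
               weights=[[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]],
               biases=[0.0, 0.0]),
)

label_a, probs_a = predict(toy_tree, [-1.0, 2.0, 0.0])
label_b, probs_b = predict(toy_tree, [1.0, 0.0, -3.0])
```

In this toy tree each prediction evaluates one hyperplane and a two-class softmax instead of scoring all four classes, which is the source of the inference speedup the abstract claims for large label sets.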