Matches in SemOpenAlex for { <https://semopenalex.org/work/W2896227908> ?p ?o ?g. }
Showing items 1 to 82 of 82 with 100 items per page.
- W2896227908 abstract "Nonlinearity is crucial to the performance of a deep (neural) network (DN). To date there has been little progress understanding the menagerie of available nonlinearities, but recently progress has been made on understanding the role played by piecewise affine and convex nonlinearities like the ReLU and absolute value activation functions and max-pooling. In particular, DN layers constructed from these operations can be interpreted as max-affine spline operators (MASOs) that have an elegant link to vector quantization (VQ) and $K$-means. While this is good theoretical progress, the entire MASO approach is predicated on the requirement that the nonlinearities be piecewise affine and convex, which precludes important activation functions like the sigmoid, hyperbolic tangent, and softmax. This paper extends the MASO framework to these and an infinitely large class of new nonlinearities by linking deterministic MASOs with probabilistic Gaussian Mixture Models (GMMs). We show that, under a GMM, piecewise affine, convex nonlinearities like ReLU, absolute value, and max-pooling can be interpreted as solutions to certain natural hard VQ inference problems, while sigmoid, hyperbolic tangent, and softmax can be interpreted as solutions to corresponding soft VQ inference problems. We further extend the framework by hybridizing the hard and soft VQ optimizations to create a $\beta$-VQ inference that interpolates between hard, soft, and linear VQ inference. A prime example of a $\beta$-VQ DN nonlinearity is the swish nonlinearity, which offers state-of-the-art performance in a range of computer vision tasks but was developed ad hoc by experimentation. Finally, we validate with experiments an important assertion of our theory, namely that DN performance can be significantly improved by enforcing orthogonality in its linear filters." @default.
- W2896227908 created "2018-10-26" @default.
- W2896227908 creator A5047293370 @default.
- W2896227908 creator A5072713767 @default.
- W2896227908 date "2018-10-22" @default.
- W2896227908 modified "2023-09-23" @default.
- W2896227908 title "From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference" @default.
- W2896227908 cites W1533861849 @default.
- W2896227908 cites W1634005169 @default.
- W2896227908 cites W1663973292 @default.
- W2896227908 cites W2095222360 @default.
- W2896227908 cites W2098929365 @default.
- W2896227908 cites W2103183297 @default.
- W2896227908 cites W2105280352 @default.
- W2896227908 cites W2109820980 @default.
- W2896227908 cites W2162931300 @default.
- W2896227908 cites W2792643794 @default.
- W2896227908 cites W2884447957 @default.
- W2896227908 cites W2887953126 @default.
- W2896227908 cites W2962834855 @default.
- W2896227908 cites W2963325419 @default.
- W2896227908 cites W2963543570 @default.
- W2896227908 cites W2963568027 @default.
- W2896227908 cites W2964144352 @default.
- W2896227908 hasPublicationYear "2018" @default.
- W2896227908 type Work @default.
- W2896227908 sameAs 2896227908 @default.
- W2896227908 citedByCount "0" @default.
- W2896227908 crossrefType "posted-content" @default.
- W2896227908 hasAuthorship W2896227908A5047293370 @default.
- W2896227908 hasAuthorship W2896227908A5072713767 @default.
- W2896227908 hasConcept C11413529 @default.
- W2896227908 hasConcept C153180895 @default.
- W2896227908 hasConcept C154945302 @default.
- W2896227908 hasConcept C188441871 @default.
- W2896227908 hasConcept C199833920 @default.
- W2896227908 hasConcept C202444582 @default.
- W2896227908 hasConcept C2776214188 @default.
- W2896227908 hasConcept C28826006 @default.
- W2896227908 hasConcept C33923547 @default.
- W2896227908 hasConcept C41008148 @default.
- W2896227908 hasConcept C50644808 @default.
- W2896227908 hasConcept C81388566 @default.
- W2896227908 hasConcept C92757383 @default.
- W2896227908 hasConceptScore W2896227908C11413529 @default.
- W2896227908 hasConceptScore W2896227908C153180895 @default.
- W2896227908 hasConceptScore W2896227908C154945302 @default.
- W2896227908 hasConceptScore W2896227908C188441871 @default.
- W2896227908 hasConceptScore W2896227908C199833920 @default.
- W2896227908 hasConceptScore W2896227908C202444582 @default.
- W2896227908 hasConceptScore W2896227908C2776214188 @default.
- W2896227908 hasConceptScore W2896227908C28826006 @default.
- W2896227908 hasConceptScore W2896227908C33923547 @default.
- W2896227908 hasConceptScore W2896227908C41008148 @default.
- W2896227908 hasConceptScore W2896227908C50644808 @default.
- W2896227908 hasConceptScore W2896227908C81388566 @default.
- W2896227908 hasConceptScore W2896227908C92757383 @default.
- W2896227908 hasOpenAccess W2896227908 @default.
- W2896227908 hasRelatedWork W1562284614 @default.
- W2896227908 hasRelatedWork W2189037515 @default.
- W2896227908 hasRelatedWork W2735426806 @default.
- W2896227908 hasRelatedWork W2890223728 @default.
- W2896227908 hasRelatedWork W2899686093 @default.
- W2896227908 hasRelatedWork W2958769143 @default.
- W2896227908 hasRelatedWork W2962930448 @default.
- W2896227908 hasRelatedWork W2963455631 @default.
- W2896227908 hasRelatedWork W2963516679 @default.
- W2896227908 hasRelatedWork W2965677819 @default.
- W2896227908 hasRelatedWork W2981523199 @default.
- W2896227908 hasRelatedWork W3013087590 @default.
- W2896227908 hasRelatedWork W3032954808 @default.
- W2896227908 hasRelatedWork W3080682497 @default.
- W2896227908 hasRelatedWork W3094346332 @default.
- W2896227908 hasRelatedWork W3100852746 @default.
- W2896227908 hasRelatedWork W3127586331 @default.
- W2896227908 hasRelatedWork W3171388751 @default.
- W2896227908 hasRelatedWork W3174622107 @default.
- W2896227908 hasRelatedWork W3189554725 @default.
- W2896227908 isParatext "false" @default.
- W2896227908 isRetracted "false" @default.
- W2896227908 magId "2896227908" @default.
- W2896227908 workType "article" @default.
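
The abstract of W2896227908 describes the swish nonlinearity as interpolating between hard, soft, and linear behaviour. A minimal Python sketch of that interpolation follows, using the commonly cited form swish(x) = x * sigmoid(scale * x). The `scale` parameter and all function names are assumptions made for illustration; this is not necessarily the parametrization of $\beta$ used in the paper.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x: float, scale: float = 1.0) -> float:
    """x * sigmoid(scale * x).

    `scale` is a hypothetical sharpness knob (an assumption, not the
    paper's beta): 0 gives the linear map x / 2, and large values
    approach ReLU.
    """
    return x * sigmoid(scale * x)


def relu(x: float) -> float:
    """Piecewise affine, convex nonlinearity used as the 'hard' reference."""
    return max(0.0, x)


if __name__ == "__main__":
    # Columns: x, linear limit, standard swish, near-hard limit, ReLU.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, swish(x, 0.0), swish(x, 1.0), swish(x, 50.0), relu(x))
```

With `scale = 0` the gate is constant at 0.5, so the output is linear in x; as `scale` grows the gate approaches a step function and the output approaches ReLU.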