Matches in SemOpenAlex for { <https://semopenalex.org/work/W4376988605> ?p ?o ?g. }
Showing items 1 to 66 of 66 with 100 items per page.
- W4376988605 abstract "Toeplitz Neural Networks (TNNs) (Qin et al. 2023) are a recent sequence model with impressive results. They require O(n log n) computational complexity and O(n) relative positional encoder (RPE) multi-layer perceptron (MLP) and decay bias calls. We aim to reduce both. We first note that the RPE is a non-SPD (symmetric positive definite) kernel and the Toeplitz matrices are pseudo-Gram matrices. Further, 1) the learned kernels display spiky behavior near the main diagonals with otherwise smooth behavior; 2) the RPE MLP is slow. For bidirectional models, this motivates a sparse plus low-rank Toeplitz matrix decomposition. For the sparse component's action, we do a small 1D convolution. For the low-rank component, we replace the RPE MLP with linear interpolation and use asymmetric Structured Kernel Interpolation (SKI) (Wilson et al. 2015) for O(n) complexity: we provide rigorous error analysis. For causal models, fast causal masking (Katharopoulos et al. 2020) negates SKI's benefits. Working in the frequency domain, we avoid an explicit decay bias. To enforce causality, we represent the kernel via the real part of its frequency response using the RPE and compute the imaginary part via a Hilbert transform. This maintains O(n log n) complexity but achieves an absolute speedup. Modeling the frequency response directly is also competitive for bidirectional training, using one fewer FFT. We set a speed state of the art on Long Range Arena (Tay et al. 2020) with minimal score degradation." @default.
- W4376988605 created "2023-05-18" @default.
- W4376988605 creator A5007321215 @default.
- W4376988605 creator A5047511271 @default.
- W4376988605 creator A5061866399 @default.
- W4376988605 date "2023-05-15" @default.
- W4376988605 modified "2023-09-27" @default.
- W4376988605 title "SKI to go Faster: Accelerating Toeplitz Neural Networks via Asymmetric Kernels" @default.
- W4376988605 doi "https://doi.org/10.48550/arxiv.2305.09028" @default.
- W4376988605 hasPublicationYear "2023" @default.
- W4376988605 type Work @default.
- W4376988605 citedByCount "0" @default.
- W4376988605 crossrefType "posted-content" @default.
- W4376988605 hasAuthorship W4376988605A5007321215 @default.
- W4376988605 hasAuthorship W4376988605A5047511271 @default.
- W4376988605 hasAuthorship W4376988605A5061866399 @default.
- W4376988605 hasBestOaLocation W43769886051 @default.
- W4376988605 hasConcept C11413529 @default.
- W4376988605 hasConcept C114614502 @default.
- W4376988605 hasConcept C118615104 @default.
- W4376988605 hasConcept C121332964 @default.
- W4376988605 hasConcept C147710293 @default.
- W4376988605 hasConcept C158693339 @default.
- W4376988605 hasConcept C164226766 @default.
- W4376988605 hasConcept C173608175 @default.
- W4376988605 hasConcept C202444582 @default.
- W4376988605 hasConcept C33923547 @default.
- W4376988605 hasConcept C41008148 @default.
- W4376988605 hasConcept C49712288 @default.
- W4376988605 hasConcept C62520636 @default.
- W4376988605 hasConcept C68339613 @default.
- W4376988605 hasConcept C74193536 @default.
- W4376988605 hasConcept C75172450 @default.
- W4376988605 hasConceptScore W4376988605C11413529 @default.
- W4376988605 hasConceptScore W4376988605C114614502 @default.
- W4376988605 hasConceptScore W4376988605C118615104 @default.
- W4376988605 hasConceptScore W4376988605C121332964 @default.
- W4376988605 hasConceptScore W4376988605C147710293 @default.
- W4376988605 hasConceptScore W4376988605C158693339 @default.
- W4376988605 hasConceptScore W4376988605C164226766 @default.
- W4376988605 hasConceptScore W4376988605C173608175 @default.
- W4376988605 hasConceptScore W4376988605C202444582 @default.
- W4376988605 hasConceptScore W4376988605C33923547 @default.
- W4376988605 hasConceptScore W4376988605C41008148 @default.
- W4376988605 hasConceptScore W4376988605C49712288 @default.
- W4376988605 hasConceptScore W4376988605C62520636 @default.
- W4376988605 hasConceptScore W4376988605C68339613 @default.
- W4376988605 hasConceptScore W4376988605C74193536 @default.
- W4376988605 hasConceptScore W4376988605C75172450 @default.
- W4376988605 hasLocation W43769886051 @default.
- W4376988605 hasLocation W43769886052 @default.
- W4376988605 hasOpenAccess W4376988605 @default.
- W4376988605 hasPrimaryLocation W43769886051 @default.
- W4376988605 hasRelatedWork W1998329188 @default.
- W4376988605 hasRelatedWork W2008793362 @default.
- W4376988605 hasRelatedWork W2077155832 @default.
- W4376988605 hasRelatedWork W2088025173 @default.
- W4376988605 hasRelatedWork W2088272510 @default.
- W4376988605 hasRelatedWork W2323505138 @default.
- W4376988605 hasRelatedWork W2786001072 @default.
- W4376988605 hasRelatedWork W2955795271 @default.
- W4376988605 hasRelatedWork W4242837953 @default.
- W4376988605 hasRelatedWork W4300568036 @default.
- W4376988605 isParatext "false" @default.
- W4376988605 isRetracted "false" @default.
- W4376988605 workType "article" @default.
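
The abstract in this record cites O(n log n) complexity for applying a Toeplitz matrix. One standard way to reach that cost is to embed the Toeplitz matrix in a larger circulant matrix and multiply with FFTs. The block below is a minimal sketch of that general technique, not the authors' implementation: the function names (`fft`, `ifft`, `toeplitz_matvec`, `toeplitz_matvec_direct`) and the `t_pos`/`t_neg` kernel layout are assumptions made for illustration, and a plain radix-2 FFT stands in for an optimized library routine.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def ifft(a):
    """Inverse FFT: conjugate twiddles, then divide once by the length."""
    n = len(a)
    return [v / n for v in fft(a, invert=True)]


def toeplitz_matvec(t_pos, t_neg, x):
    """Compute y = T x with T[i][j] = t[i - j] in O(n log n).

    Hypothetical layout: t_pos[k] = t[k] and t_neg[k] = t[-k] for
    k = 0..n-1 (t_neg[0] is ignored). The kernel may be asymmetric.
    """
    n = len(x)
    m = 1
    while m < 2 * n:  # power of two, large enough to avoid wrap-around
        m *= 2
    # First column of the circulant matrix that contains T as its top-left block.
    c = [0.0] * m
    for k in range(n):
        c[k] = t_pos[k]
    for k in range(1, n):
        c[m - k] = t_neg[k]
    x_pad = list(x) + [0.0] * (m - n)
    # Circular convolution via pointwise product in the frequency domain.
    prod = [u * v for u, v in zip(fft(c), fft(x_pad))]
    return [v.real for v in ifft(prod)[:n]]


def toeplitz_matvec_direct(t_pos, t_neg, x):
    """Reference O(n^2) product, used only to check the fast version."""
    n = len(x)
    y = []
    for i in range(n):
        acc = 0.0
        for j in range(n):
            d = i - j
            acc += (t_pos[d] if d >= 0 else t_neg[-d]) * x[j]
        y.append(acc)
    return y
```

Calling `toeplitz_matvec(t_pos, t_neg, x)` and `toeplitz_matvec_direct(t_pos, t_neg, x)` on the same real-valued inputs should agree up to floating-point error; the direct version is there only as a correctness check.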