Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386065629> ?p ?o ?g. }
Showing items 1 to 75 of 75, with 100 items per page.
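For reference, a listing like the one below can be reproduced with a plain SPARQL request. The following Python sketch assumes SemOpenAlex exposes a standard SPARQL 1.1 endpoint at https://semopenalex.org/sparql; that endpoint URL is an assumption, and the query selects only ?p and ?o for brevity rather than the full quad pattern shown above.

```python
# Hypothetical sketch: fetch all (predicate, object) pairs for the work below.
# The endpoint URL is assumed; adjust it to the actual SemOpenAlex service.
import requests

ENDPOINT = "https://semopenalex.org/sparql"  # assumed SPARQL endpoint
QUERY = """
SELECT ?p ?o WHERE {
  <https://semopenalex.org/work/W4386065629> ?p ?o .
}
"""

resp = requests.get(
    ENDPOINT,
    params={"query": QUERY},
    headers={"Accept": "application/sparql-results+json"},  # SPARQL 1.1 protocol
    timeout=30,
)
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["p"]["value"], row["o"]["value"])
```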
- W4386065629 abstract "Post-training quantization (PTQ) is an effective compression method to reduce model size and computational cost. However, quantizing a model to a low bit-width, e.g., lower than 4, is difficult and often results in non-negligible performance degradation. To address this, we investigate the loss landscapes of quantized networks at various bit-widths. We show that a network with a more rugged loss surface is more easily trapped in bad local minima, which mostly occurs in low-bit quantization. A deeper analysis indicates that the rugged surface is caused by the injection of excessive quantization noise. We therefore detach from the loss a sharpness term that reflects the impact of quantization noise. To smooth the rugged loss surface, we propose to keep the sharpness term small and stable during optimization. Instead of directly optimizing the target-bit network, we design a self-adapted scheduler that shrinks the bit-width in the continuous domain from a high bit-width to the target one while limiting the increase of the sharpness term to a proper range. This can be viewed as iteratively adding small “instant” quantization noise and adjusting the network to eliminate its impact. Extensive experiments on classification and detection tasks demonstrate the effectiveness of the Bit-shrinking strategy in PTQ. On Vision Transformer models, our INT8 and INT6 models drop within 0.5% and 1.5% Top-1 accuracy, respectively. On traditional CNN networks, our INT4 quantized models drop within 1.3% and 3.5% Top-1 accuracy on ResNet18 and MobileNetV2 without fine-tuning, achieving state-of-the-art performance." @default.
- W4386065629 created "2023-08-23" @default.
- W4386065629 creator A5022098030 @default.
- W4386065629 creator A5024940897 @default.
- W4386065629 creator A5031951156 @default.
- W4386065629 creator A5046994319 @default.
- W4386065629 creator A5059358267 @default.
- W4386065629 creator A5072137495 @default.
- W4386065629 creator A5085955762 @default.
- W4386065629 date "2023-06-01" @default.
- W4386065629 modified "2023-10-16" @default.
- W4386065629 title "Bit-shrinking: Limiting Instantaneous Sharpness for Improving Post-training Quantization" @default.
- W4386065629 cites W2108598243 @default.
- W4386065629 cites W2194775991 @default.
- W4386065629 cites W2550403740 @default.
- W4386065629 cites W2798742790 @default.
- W4386065629 cites W2808868252 @default.
- W4386065629 cites W2963363373 @default.
- W4386065629 cites W2963480671 @default.
- W4386065629 cites W2963521187 @default.
- W4386065629 cites W2982479999 @default.
- W4386065629 cites W3034528892 @default.
- W4386065629 cites W3034719990 @default.
- W4386065629 cites W3138516171 @default.
- W4386065629 cites W3166792421 @default.
- W4386065629 cites W3174042481 @default.
- W4386065629 cites W3202442802 @default.
- W4386065629 doi "https://doi.org/10.1109/cvpr52729.2023.01554" @default.
- W4386065629 hasPublicationYear "2023" @default.
- W4386065629 type Work @default.
- W4386065629 citedByCount "0" @default.
- W4386065629 crossrefType "proceedings-article" @default.
- W4386065629 hasAuthorship W4386065629A5022098030 @default.
- W4386065629 hasAuthorship W4386065629A5024940897 @default.
- W4386065629 hasAuthorship W4386065629A5031951156 @default.
- W4386065629 hasAuthorship W4386065629A5046994319 @default.
- W4386065629 hasAuthorship W4386065629A5059358267 @default.
- W4386065629 hasAuthorship W4386065629A5072137495 @default.
- W4386065629 hasAuthorship W4386065629A5085955762 @default.
- W4386065629 hasConcept C11413529 @default.
- W4386065629 hasConcept C127413603 @default.
- W4386065629 hasConcept C134306372 @default.
- W4386065629 hasConcept C186633575 @default.
- W4386065629 hasConcept C188198153 @default.
- W4386065629 hasConcept C24326235 @default.
- W4386065629 hasConcept C28855332 @default.
- W4386065629 hasConcept C33923547 @default.
- W4386065629 hasConcept C41008148 @default.
- W4386065629 hasConcept C78519656 @default.
- W4386065629 hasConceptScore W4386065629C11413529 @default.
- W4386065629 hasConceptScore W4386065629C127413603 @default.
- W4386065629 hasConceptScore W4386065629C134306372 @default.
- W4386065629 hasConceptScore W4386065629C186633575 @default.
- W4386065629 hasConceptScore W4386065629C188198153 @default.
- W4386065629 hasConceptScore W4386065629C24326235 @default.
- W4386065629 hasConceptScore W4386065629C28855332 @default.
- W4386065629 hasConceptScore W4386065629C33923547 @default.
- W4386065629 hasConceptScore W4386065629C41008148 @default.
- W4386065629 hasConceptScore W4386065629C78519656 @default.
- W4386065629 hasLocation W43860656291 @default.
- W4386065629 hasOpenAccess W4386065629 @default.
- W4386065629 hasPrimaryLocation W43860656291 @default.
- W4386065629 hasRelatedWork W1567571245 @default.
- W4386065629 hasRelatedWork W1974104207 @default.
- W4386065629 hasRelatedWork W2061430950 @default.
- W4386065629 hasRelatedWork W2321123111 @default.
- W4386065629 hasRelatedWork W2386767533 @default.
- W4386065629 hasRelatedWork W2510353211 @default.
- W4386065629 hasRelatedWork W2540906477 @default.
- W4386065629 hasRelatedWork W4253988311 @default.
- W4386065629 hasRelatedWork W4306312216 @default.
- W4386065629 hasRelatedWork W4353055780 @default.
- W4386065629 isParatext "false" @default.
- W4386065629 isRetracted "false" @default.
- W4386065629 workType "article" @default.
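The abstract above outlines the Bit-shrinking procedure: gradually lower the bit-width while keeping a sharpness term (the loss increase induced by quantization noise) small and stable. The sketch below illustrates that control loop on a toy layer. The toy model, the sharpness proxy, the budget `tau`, and the step-halving rule are all illustrative assumptions, not the authors' implementation, and the per-step network adjustment described in the abstract is omitted.

```python
# Illustrative sketch of the bit-shrinking control loop (not the paper's code).
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))      # toy full-precision weight matrix
X = rng.normal(size=(128, 64))     # toy calibration inputs

def quantize(w, bits):
    """Uniform symmetric quantization; fractional bits give a continuous grid."""
    levels = 2.0 ** bits - 1
    step = 2.0 * np.abs(w).max() / levels
    return np.round(w / step) * step

def sharpness(bits):
    """Proxy for the sharpness term: output error injected by quantization."""
    return float(np.mean((X @ quantize(W, bits) - X @ W) ** 2))

# Shrink from a high bit-width toward the 4-bit target, taking a step only
# while the "instant" quantization noise keeps the proxy under the budget tau.
bits, target, step, tau = 8.0, 4.0, 0.25, 1e-2
while bits > target:
    if sharpness(bits - step) < tau:
        bits = max(bits - step, target)   # shrink: noise impact is acceptable
        # (the paper would now adjust the network to absorb this noise)
    else:
        step /= 2.0                       # too sharp: try a smaller shrink step
        if step < 1e-3:
            break                         # cannot shrink further under budget
print(f"reached bit-width {bits:.2f}, sharpness {sharpness(bits):.5f}")
```

Because this sketch skips the adjustment step, the loop stalls at the lowest bit-width whose quantization noise still fits the budget, well above the 4-bit target. That gap is the point of the method as the abstract describes it: adjusting the network after each small shrink is what keeps the sharpness term within range all the way down to the target bit-width.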