Matches in SemOpenAlex for { <https://semopenalex.org/work/W4330339900> ?p ?o ?g. }
Showing items 1 to 79 of 79 with 100 items per page.
- W4330339900 abstract "Recent studies on transfer learning have shown that selectively fine-tuning a subset of layers or customizing different learning rates for each layer can greatly improve robustness to out-of-distribution (OOD) data and retain generalization capability in the pre-trained models. However, most of these methods employ manually crafted heuristics or expensive hyper-parameter searches, which prevent them from scaling up to large datasets and neural networks. To solve this problem, we propose Trainable Projected Gradient Method (TPGM) to automatically learn the constraint imposed for each layer for a fine-grained fine-tuning regularization. This is motivated by formulating fine-tuning as a bi-level constrained optimization problem. Specifically, TPGM maintains a set of projection radii, i.e., distance constraints between the fine-tuned model and the pre-trained model, for each layer, and enforces them through weight projections. To learn the constraints, we propose a bi-level optimization to automatically learn the best set of projection radii in an end-to-end manner. Theoretically, we show that the bi-level optimization formulation could explain the regularization capability of TPGM. Empirically, with little hyper-parameter search cost, TPGM outperforms existing fine-tuning methods in OOD performance while matching the best in-distribution (ID) performance. For example, when fine-tuned on DomainNet-Real and ImageNet, compared to vanilla fine-tuning, TPGM shows 22% and 10% relative OOD improvement respectively on their sketch counterparts. Code is available at https://github.com/PotatoTian/TPGM." @default.
- W4330339900 created "2023-03-22" @default.
- W4330339900 creator A5024074850 @default.
- W4330339900 creator A5052001217 @default.
- W4330339900 creator A5057482975 @default.
- W4330339900 creator A5079738111 @default.
- W4330339900 creator A5080137972 @default.
- W4330339900 creator A5088892134 @default.
- W4330339900 date "2023-03-19" @default.
- W4330339900 modified "2023-10-18" @default.
- W4330339900 title "Trainable Projected Gradient Method for Robust Fine-tuning" @default.
- W4330339900 doi "https://doi.org/10.48550/arxiv.2303.10720" @default.
- W4330339900 hasPublicationYear "2023" @default.
- W4330339900 type Work @default.
- W4330339900 citedByCount "0" @default.
- W4330339900 crossrefType "posted-content" @default.
- W4330339900 hasAuthorship W4330339900A5024074850 @default.
- W4330339900 hasAuthorship W4330339900A5052001217 @default.
- W4330339900 hasAuthorship W4330339900A5057482975 @default.
- W4330339900 hasAuthorship W4330339900A5079738111 @default.
- W4330339900 hasAuthorship W4330339900A5080137972 @default.
- W4330339900 hasAuthorship W4330339900A5088892134 @default.
- W4330339900 hasBestOaLocation W43303399001 @default.
- W4330339900 hasConcept C104317684 @default.
- W4330339900 hasConcept C111919701 @default.
- W4330339900 hasConcept C11413529 @default.
- W4330339900 hasConcept C121332964 @default.
- W4330339900 hasConcept C126255220 @default.
- W4330339900 hasConcept C127705205 @default.
- W4330339900 hasConcept C154945302 @default.
- W4330339900 hasConcept C157524613 @default.
- W4330339900 hasConcept C177264268 @default.
- W4330339900 hasConcept C185592680 @default.
- W4330339900 hasConcept C199360897 @default.
- W4330339900 hasConcept C2524010 @default.
- W4330339900 hasConcept C2776036281 @default.
- W4330339900 hasConcept C2776135515 @default.
- W4330339900 hasConcept C33923547 @default.
- W4330339900 hasConcept C41008148 @default.
- W4330339900 hasConcept C55493867 @default.
- W4330339900 hasConcept C62520636 @default.
- W4330339900 hasConcept C63479239 @default.
- W4330339900 hasConcept C99844830 @default.
- W4330339900 hasConceptScore W4330339900C104317684 @default.
- W4330339900 hasConceptScore W4330339900C111919701 @default.
- W4330339900 hasConceptScore W4330339900C11413529 @default.
- W4330339900 hasConceptScore W4330339900C121332964 @default.
- W4330339900 hasConceptScore W4330339900C126255220 @default.
- W4330339900 hasConceptScore W4330339900C127705205 @default.
- W4330339900 hasConceptScore W4330339900C154945302 @default.
- W4330339900 hasConceptScore W4330339900C157524613 @default.
- W4330339900 hasConceptScore W4330339900C177264268 @default.
- W4330339900 hasConceptScore W4330339900C185592680 @default.
- W4330339900 hasConceptScore W4330339900C199360897 @default.
- W4330339900 hasConceptScore W4330339900C2524010 @default.
- W4330339900 hasConceptScore W4330339900C2776036281 @default.
- W4330339900 hasConceptScore W4330339900C2776135515 @default.
- W4330339900 hasConceptScore W4330339900C33923547 @default.
- W4330339900 hasConceptScore W4330339900C41008148 @default.
- W4330339900 hasConceptScore W4330339900C55493867 @default.
- W4330339900 hasConceptScore W4330339900C62520636 @default.
- W4330339900 hasConceptScore W4330339900C63479239 @default.
- W4330339900 hasConceptScore W4330339900C99844830 @default.
- W4330339900 hasLocation W43303399001 @default.
- W4330339900 hasOpenAccess W4330339900 @default.
- W4330339900 hasPrimaryLocation W43303399001 @default.
- W4330339900 hasRelatedWork W1492332033 @default.
- W4330339900 hasRelatedWork W1593899949 @default.
- W4330339900 hasRelatedWork W1978564766 @default.
- W4330339900 hasRelatedWork W2020825524 @default.
- W4330339900 hasRelatedWork W202590844 @default.
- W4330339900 hasRelatedWork W2032560733 @default.
- W4330339900 hasRelatedWork W2066702091 @default.
- W4330339900 hasRelatedWork W2403558388 @default.
- W4330339900 hasRelatedWork W2625860596 @default.
- W4330339900 hasRelatedWork W848451299 @default.
- W4330339900 isParatext "false" @default.
- W4330339900 isRetracted "false" @default.
- W4330339900 workType "article" @default.
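
The abstract above describes per-layer projection radii that bound the distance between fine-tuned and pre-trained weights and are enforced through weight projections. The following is a minimal sketch of that projection step only, not the authors' implementation. It assumes an L2 distance (the abstract does not name the norm), represents each layer as a flat list of floats, and uses hypothetical names (`project_to_ball`, `project_model`). It treats the radii as given; the bi-level optimization that learns them is not shown.

```python
import math


def project_to_ball(finetuned, pretrained, radius):
    """Project `finetuned` onto the L2 ball of `radius` centred at `pretrained`.

    Assumption: L2 distance; the norm is not stated in the abstract.
    """
    diff = [w - w0 for w, w0 in zip(finetuned, pretrained)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist <= radius:
        # Already within the constraint: leave the weights unchanged.
        return list(finetuned)
    # Otherwise shrink the offset so the result lies on the ball's surface.
    scale = radius / dist
    return [w0 + scale * d for w0, d in zip(pretrained, diff)]


def project_model(finetuned_layers, pretrained_layers, radii):
    """Apply the projection layer by layer, one radius per layer."""
    return {
        name: project_to_ball(weights, pretrained_layers[name], radii[name])
        for name, weights in finetuned_layers.items()
    }


if __name__ == "__main__":
    # Hypothetical two-layer model with hand-picked radii.
    pretrained = {"layer1": [0.0, 0.0], "layer2": [1.0, 1.0]}
    finetuned = {"layer1": [3.0, 4.0], "layer2": [1.1, 0.9]}
    radii = {"layer1": 1.0, "layer2": 0.5}
    print(project_model(finetuned, pretrained, radii))
```

A smaller radius keeps a layer closer to its pre-trained weights, while a large radius leaves the fine-tuned weights untouched, so the radii act as per-layer regularization strengths.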