Matches in SemOpenAlex for { <https://semopenalex.org/work/W4297645139> ?p ?o ?g. }
Showing items 1 to 71 of 71 with 100 items per page.
- W4297645139 abstract "Larger deep learning models usually lead to higher model quality with an ever-increasing GPU memory footprint. Although tensor checkpointing techniques have been proposed to enable training under a restricted GPU memory budget, the input tensor dynamics have been unexploited for optimizing performance while reducing GPU memory footprint. Specifically, due to the diverse datasets and subsequent data augmentation, the input tensor size per mini-batch is dynamic during the training process, leading to a changing GPU memory footprint. However, to leverage such input tensor dynamics in checkpointing, there are two challenges to be solved. First, the checkpointing plan needs to be determined during runtime due to the dynamics of input tensors. Second, the checkpointing plan needs to be applied on the fly without significantly deteriorating the performance. In this paper, we propose Mimose, an input-aware tensor checkpointing planner respecting the memory budget while enabling efficient model training on GPU. Mimose builds a lightweight but accurate prediction model of GPU memory usage online, without pre-analyzing the model. It generates a tensor checkpointing plan based on per-layer memory prediction and applies it to the training process on the fly. It also adopts a caching strategy to avoid having to regenerate the plan for repeated input sizes. Our experiments show that Mimose achieves superior training throughput compared to state-of-the-art memory planners under the same GPU memory budgets." @default.
- W4297645139 created "2022-09-30" @default.
- W4297645139 creator A5001560763 @default.
- W4297645139 creator A5018705589 @default.
- W4297645139 creator A5046015937 @default.
- W4297645139 creator A5046708261 @default.
- W4297645139 creator A5058313495 @default.
- W4297645139 creator A5060999547 @default.
- W4297645139 creator A5064154151 @default.
- W4297645139 creator A5074183877 @default.
- W4297645139 creator A5075958056 @default.
- W4297645139 creator A5079362609 @default.
- W4297645139 creator A5086694700 @default.
- W4297645139 date "2022-09-06" @default.
- W4297645139 modified "2023-10-18" @default.
- W4297645139 title "Mimose: An Input-Aware Checkpointing Planner for Efficient Training on GPU" @default.
- W4297645139 doi "https://doi.org/10.48550/arxiv.2209.02478" @default.
- W4297645139 hasPublicationYear "2022" @default.
- W4297645139 type Work @default.
- W4297645139 citedByCount "0" @default.
- W4297645139 crossrefType "posted-content" @default.
- W4297645139 hasAuthorship W4297645139A5001560763 @default.
- W4297645139 hasAuthorship W4297645139A5018705589 @default.
- W4297645139 hasAuthorship W4297645139A5046015937 @default.
- W4297645139 hasAuthorship W4297645139A5046708261 @default.
- W4297645139 hasAuthorship W4297645139A5058313495 @default.
- W4297645139 hasAuthorship W4297645139A5060999547 @default.
- W4297645139 hasAuthorship W4297645139A5064154151 @default.
- W4297645139 hasAuthorship W4297645139A5074183877 @default.
- W4297645139 hasAuthorship W4297645139A5075958056 @default.
- W4297645139 hasAuthorship W4297645139A5079362609 @default.
- W4297645139 hasAuthorship W4297645139A5086694700 @default.
- W4297645139 hasBestOaLocation W42976451391 @default.
- W4297645139 hasConcept C111919701 @default.
- W4297645139 hasConcept C113775141 @default.
- W4297645139 hasConcept C153083717 @default.
- W4297645139 hasConcept C154945302 @default.
- W4297645139 hasConcept C155281189 @default.
- W4297645139 hasConcept C173608175 @default.
- W4297645139 hasConcept C202444582 @default.
- W4297645139 hasConcept C2776999362 @default.
- W4297645139 hasConcept C33923547 @default.
- W4297645139 hasConcept C41008148 @default.
- W4297645139 hasConcept C74912251 @default.
- W4297645139 hasConceptScore W4297645139C111919701 @default.
- W4297645139 hasConceptScore W4297645139C113775141 @default.
- W4297645139 hasConceptScore W4297645139C153083717 @default.
- W4297645139 hasConceptScore W4297645139C154945302 @default.
- W4297645139 hasConceptScore W4297645139C155281189 @default.
- W4297645139 hasConceptScore W4297645139C173608175 @default.
- W4297645139 hasConceptScore W4297645139C202444582 @default.
- W4297645139 hasConceptScore W4297645139C2776999362 @default.
- W4297645139 hasConceptScore W4297645139C33923547 @default.
- W4297645139 hasConceptScore W4297645139C41008148 @default.
- W4297645139 hasConceptScore W4297645139C74912251 @default.
- W4297645139 hasLocation W42976451391 @default.
- W4297645139 hasOpenAccess W4297645139 @default.
- W4297645139 hasPrimaryLocation W42976451391 @default.
- W4297645139 hasRelatedWork W1491899005 @default.
- W4297645139 hasRelatedWork W1604898313 @default.
- W4297645139 hasRelatedWork W2021204413 @default.
- W4297645139 hasRelatedWork W2117014006 @default.
- W4297645139 hasRelatedWork W2372170743 @default.
- W4297645139 hasRelatedWork W2762832356 @default.
- W4297645139 hasRelatedWork W3113046600 @default.
- W4297645139 hasRelatedWork W4205613022 @default.
- W4297645139 hasRelatedWork W4233815414 @default.
- W4297645139 hasRelatedWork W4319917399 @default.
- W4297645139 isParatext "false" @default.
- W4297645139 isRetracted "false" @default.
- W4297645139 workType "article" @default.