
pytorch_training_optimization_using_tensordict_memory_mapping

Optimizing PyTorch training by wrapping a torch.utils.data.Dataset in a tensordict.TensorDict of MemoryMappedTensors that are memory-mapped, pinned, and loaded onto an Nvidia GPU, then feeding the wrapped TensorDict(Dataset) into torch.utils.data.DataLoader to boost model training speed.

Boost PyTorch Model Training Speed:

subplots_demo

To run the demo:

git clone https://github.com/OriYarden/pytorch_training_optimization_using_tensordict_memory_mapping
cd pytorch_training_optimization_using_tensordict_memory_mapping
python run_demo.py

Visualizing the tensordict_packages Wrapping:

(diagram of the tensordict_packages wrapping)
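For orientation, here is a minimal sketch of what the wrapping could amount to. It is not the actual tensordict_packages implementation; the key names ("x", "y"), the assumption that __getitem__ returns an (input, target) tensor pair, and the helper body are all illustrative:

import torch
from tensordict import TensorDict

def dataset_to_tensordict(ds, DEVICE):
    # Pre-allocate storage for the whole dataset, convert it to memory-mapped
    # tensors, fill it once, then move everything to the GPU in a single copy.
    n = len(ds)
    x0, y0 = ds[0]  # assumed: __getitem__ returns an (input, target) tensor pair
    td = TensorDict(
        {
            "x": torch.empty(n, *x0.shape, dtype=x0.dtype),
            "y": torch.empty(n, *y0.shape, dtype=y0.dtype),
        },
        batch_size=[n],
    ).memmap_()  # leaves become MemoryMappedTensors (file-backed)
    for i in range(n):
        x, y = ds[i]
        td["x"][i] = x
        td["y"][i] = y
    return td.pin_memory().to(DEVICE)  # pin once, then one host-to-device transfer

After this, indexing the returned TensorDict is pure GPU tensor slicing, which is what makes each subsequent epoch cheap.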

Visualizing PyTorch TensorDict Memory Mapped Tensors Speed Advantage:

(and what run_demo.py looks like in gifs)

PyTorch Model Training BASELINE - Control Condition

torch.utils.data.Dataset # Training 1 Epoch:

demo_dataloader

PyTorch Model Training TEST - Experimental Condition

tensordict.TensorDict.MemoryMappedTensor(torch.utils.data.Dataset) # Training 1 Epoch:

demo_td_dataloader

torch.utils.data.Dataset's POV:

lamborghini-race-car

The only changes you have to make in your code (aside from potentially a few other minor tweaks; see the comments in the code):

ds = Dataset() # <--- potentially requires minor changes in __getitem__ method
ds = dataset_to_tensordict( # <--- Wraps here, this must be added into your existing code (from tensordict_packages).
    ds=ds,
    DEVICE=DEVICE,
)
loader = DataLoader(ds, collate_fn=collate_fn) # <--- requires passing the Collate_Fn wrapper (from tensordict_packages) as collate_fn.
# That's it! Just two lines of code.
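For context, a rough sketch of how the wrapped TensorDict and the Collate_Fn wrapper could fit together; the class body, the key names, and the training-loop snippet below are assumptions for illustration, not the actual tensordict_packages code:

import torch

class Collate_Fn:  # name taken from the comment above; this body is only a guess
    def __call__(self, batch):
        # `batch` is a list of sub-TensorDicts sliced from the GPU-resident TensorDict,
        # so collating is just a stack -- no per-batch host-to-device copy.
        return torch.stack(batch)

collate_fn = Collate_Fn()  # the instance passed to DataLoader above
for batch in loader:
    x, y = batch["x"], batch["y"]  # key names assumed; tensors are already on DEVICE
    # ... forward / backward pass as usual ...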

Concluding Remarks:

The TensorDict Memory Mapping tools that I've provided in tensordict_packages boost PyTorch model training speed.

However, the initial tensordict_packages wrapping runtime is approximately equal to that of 1 training epoch with a plain torch.utils.data.Dataset:

demo_td_wrapper

So there may not be a scenario in which tensordict_packages benefits PyTorch model inference alone, since a single pass over the data costs about as much as the wrapping itself.
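To make the trade-off concrete, here is a back-of-the-envelope amortization sketch; the per-epoch time and speedup factor are made-up numbers, not measurements from this repo:

t_epoch = 60.0  # assumed seconds per epoch with a plain torch.utils.data.Dataset
speedup = 10.0  # assumed per-epoch speedup after wrapping
epochs = 20
plain = epochs * t_epoch                        # 1200 s
wrapped = t_epoch + epochs * t_epoch / speedup  # ~60 s wrapping (about 1 plain epoch) + 120 s training
# Break-even comes after roughly speedup / (speedup - 1) epochs (~1.1 here), so training
# amortizes the wrapping cost almost immediately, while a single inference pass would not.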

Still, PyTorch model training speed can be improved by orders of magnitude when using tensordict_packages. We should therefore make the most of the Nvidia GPU resources (i.e., memory) available, so that we can speed up PyTorch model training, reduce training cost, and shorten the gap between initially developing PyTorch models and having them in production.

And with the current AI boom, where LLMs and text-to-video PyTorch models require months of training, we can save time, resources, and Nvidia GPUs through tensordict_packages' ability to leverage TensorDict and MemoryMappedTensor with torch.utils.data.DataLoader.