Presentations

Parallelization and Scaling


Parallelization Strategies


Data Parallelism


Pipeline Parallelism


Model Sharding


Tensor Parallelism


Sequence Parallelism


Context Parallelism


Expert Parallelism


Frameworks


Native PyTorch


Megatron-LM


NeMo


Modalities


Another One (TODO)


Hugging Face Accelerate, PyTorch Lightning,


Also Existing


Measuring Performance

Metrics: