ML Performance Reading Group Session - Detailed Analysis & Overview

ML Performance Reading Group Session 19: Speculative Decoding
ML Performance Reading Group Session 24: Flash Attention 4
ML Performance Reading Group Session 1: GPU Architecture, CUDA, NCCL
ML Performance Reading Group Session 2: Flash Attention
ML Performance Reading Group Session 18: Kimi Delta Attention
ML Performance Reading Group Session 16: LMCache
ML Performance Reading Group Session 15: Megablocks
ML Performance Reading Group Session 5: Paged Attention
ML Performance Reading Group Session 17: MXFP8 Training for MoEs with TorchAO
ML Performance Reading Group Session 25: Prefill as a Service
ML Performance Reading Group Session 6: Zero Bubble Pipeline Parallelism
ML Performance Reading Group Session 11: Async Tensor Parallelism

ML Performance Reading Group Session 19: Speculative Decoding

ML Performance Reading Group Session 24: Flash Attention 4

ML Performance Reading Group Session 1: GPU Architecture, CUDA, NCCL

ML Performance Reading Group Session 2: Flash Attention

ML Performance Reading Group Session 18: Kimi Delta Attention

Presenter: Daniel Vega-Myhre, with part by wave_function
Paper: https://arxiv.org/pdf/2510.26692

ML Performance Reading Group Session 16: LMCache

Paper: LMCache (https://arxiv.org/pdf/2510.09665)
Presenter: A. Mahmood
Slides: ...

ML Performance Reading Group Session 15: Megablocks

Paper: Megablocks (https://arxiv.org/pdf/2211.15841)
Presenter: rdyro

ML Performance Reading Group Session 5: Paged Attention

ML Performance Reading Group Session 17: MXFP8 Training for MoEs with TorchAO

Presenter: Daniel Vega-Myhre
Code: https://github.com/pytorch/ao/tree/main/torchao/prototype/moe_training

ML Performance Reading Group Session 25: Prefill as a Service

Paper: https://www.alphaxiv.org/abs/2604.15039v1
Slides: ...

ML Performance Reading Group Session 6: Zero Bubble Pipeline Parallelism

ML Performance Reading Group Session 11: Async Tensor Parallelism

ML Performance Reading Group Session 8: Megatron-LM
