gau-nernst's blog
Thien Tran
A personal blog by Thien Tran

Use NVRTC to explore MMA instruction variants

Writing Speed-of-Light Flash Attention for 5090 in CUDA C++
