From Scratch · the recreations desk

Recreations

Built from scratch, in runnable code.

First-party paper rebuilds as minimal, annotated, runnable code. Narrative build logs with our own repo, scope disclosure, and whether our numbers matched.

RSS ↗1 checkpoint

CHECKPOINT 00052026-05-28scope: minimal

Recreating FlashAttention: A Tiled, IO-Aware Attention Kernel from Scratch

FlashAttention is exact attention restructured for the memory hierarchy, not an approximation. We implement the tiled forward and recompute backward in Triton, validate exactness against a reference, and separate what a tutorial actually reproduces from what needs CUTLASS-grade engineering.

flash-attention kernels attention gpu-memory