The complete logbook
Archive
Every checkpoint, in reverse chronological order. 20 filed and counting.
2026
- 2026-06-28 · Essays · What a 4B Model Can Actually Do: Field Notes from 155 Experiments · 0014
- 2026-06-27 · Essays · The Harness Is the Product: Why Agent Evals Are the Real Moat · 0013
- 2026-06-26 · Signals · vLLM and the new default shape of LLM serving
- 2026-06-24 · Signals · FlashAttention-3: async, low-precision, Hopper-native
- 2026-06-23 · Essays · The Economics of Thinking: Test-Time Compute as a Scaling Axis · 0012
- 2026-06-20 · Signals · Mamba and the selective-state-space line
- 2026-06-19 · Explainers · Reading a Model Release Like an Engineer: Weights, Licenses, System Cards, and Evals · 0011
- 2026-06-17 · Signals · SGLang and RadixAttention for prefix reuse
- 2026-06-15 · Reproductions · Reproducing the nanoGPT Speedrun: What Actually Moves the Loss Curve · 0010
- 2026-06-13 · Signals · DeepSeek-R1: RL-trained reasoning with open weights
- 2026-06-11 · Signals · The modded-nanogpt speedrun and the Muon optimizer
- 2026-06-11 · Libraries · TRL in Anger: SFT, DPO, and GRPO Without Rewriting Your Training Loop · 0009
- 2026-06-08 · Explainers · Post-Training Quantization in Practice: GPTQ, AWQ, and FP8 · 0008
- 2026-06-04 · Explainers · GRPO, Demystified: Group-Relative Policy Optimization for Reasoning Models · 0007
- 2026-06-01 · Explainers · Routing Is the Hard Part: A Practitioner's Guide to Mixture-of-Experts · 0006
- 2026-05-28 · Recreations · Recreating FlashAttention: A Tiled, IO-Aware Attention Kernel from Scratch · 0005
- 2026-05-24 · Explainers · RoPE and the Long-Context Stack: Rotation, Interpolation, and What Breaks at 128k · 0004
- 2026-05-20 · Libraries · vLLM, Explained: PagedAttention, Continuous Batching, and the Serving Stack · 0003
- 2026-05-15 · Explainers · Sharding the Model: FSDP, ZeRO, and Tensor/Pipeline Parallelism · 0002
- 2026-05-12 · Essays · How We Separate Signal From Noise: Frontier Checkpoint's Verification Rubric · 0001