What a 4B Model Can Actually Do: Field Notes from 155 Experiments
Across 155 small-model experiments centered on Qwen 3.5 4B, one pattern kept working: give the model something executable that it can check against the evidence it has, and it punches far above its benchmark weight. Here is the field guide: the levers that worked, how I know they're real, and the frontier they opened up.