Topic

Mixture of Experts

Sparse models that route each token to a small subset of expert feed-forward networks (FFNs), so only a fraction of the parameters is active per token.
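
The routing idea can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration: the names `top_k_route`, `moe_forward`, and `make_expert` are hypothetical, the experts are toy scaling functions rather than real FFNs, and renormalizing the gate weights over the selected experts is one common choice, not the only one.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(router_logits, k):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts.

    Weights are renormalized so they sum to 1 over the selected experts
    (an assumption; some designs use the raw softmax probabilities).
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def make_expert(scale):
    # Hypothetical stand-in for an expert FFN: it only scales its input.
    return lambda x: [scale * v for v in x]

def moe_forward(token, router_weights, experts, k=2):
    # One router logit per expert: dot product of a router row with the token.
    logits = [sum(w * v for w, v in zip(row, token)) for row in router_weights]
    out = [0.0] * len(token)
    # Only the k selected experts run; the rest are skipped (the sparsity).
    for idx, weight in top_k_route(logits, k):
        y = experts[idx](token)
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out

if __name__ == "__main__":
    experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
    router = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
    print(moe_forward([1.0, 2.0], router, experts, k=2))
```

With four experts and `k=2`, only two expert functions are evaluated per token, and their outputs are combined using the router's weights.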

1 checkpoint