BLOG

Problems with Chinchilla Approach 2

The Chinchilla paper's "Approach 2," which fits parabolas to IsoFLOP curves, has subtle biases that can compound. We show that these biases can cause non-trivial errors in compute-optimal allocation (around 6.5% of total compute for Llama 3, worth over $1M in GPU time), especially when IsoFLOP grids are not perfectly centered or symmetric. Our paper also proposes a reparameterization of "Approach 3" that makes direct parametric fitting simple and stable. It can even be run in 70 lines of JavaScript.
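To make the grid-centering issue concrete, here is a minimal sketch in Python. It is not the code from the paper. It assumes the standard Chinchilla loss form L(N, D) = E + A/N^alpha + B/D^beta with the usual C = 6ND approximation, and the constants (E, A, B, alpha, beta, the compute budget) are made-up illustrative values, not fitted ones. The function names are hypothetical. With alpha = beta, a symmetric grid centered on the true optimum should recover it, while the same grid shifted off-center should not.

```python
import math

def loss(n_params, compute, E, A, B, alpha, beta):
    """Chinchilla-style loss with tokens D = C / (6 N)."""
    tokens = compute / (6.0 * n_params)
    return E + A / n_params**alpha + B / tokens**beta

def true_optimum(compute, A, B, alpha, beta):
    """Closed-form N that minimizes the loss at fixed compute."""
    ratio = (alpha * A) / (beta * B)
    return ratio ** (1.0 / (alpha + beta)) * (compute / 6.0) ** (beta / (alpha + beta))

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def parabola_vertex(xs, ys):
    """Least-squares fit y ~ a*u^2 + b*u + c, u = x - mean(x); return vertex in x."""
    n = len(xs)
    mean = sum(xs) / n
    us = [x - mean for x in xs]  # centering keeps the normal equations well conditioned
    s1 = sum(us)
    s2 = sum(u**2 for u in us)
    s3 = sum(u**3 for u in us)
    s4 = sum(u**4 for u in us)
    t0 = sum(ys)
    t1 = sum(u * y for u, y in zip(us, ys))
    t2 = sum(u * u * y for u, y in zip(us, ys))
    # Normal equations M [a, b, c]^T = rhs, solved with Cramer's rule.
    M = [[s4, s3, s2], [s3, s2, s1], [s2, s1, n]]
    rhs = [t2, t1, t0]
    d = det3(M)
    Ma = [[rhs[i], M[i][1], M[i][2]] for i in range(3)]
    Mb = [[M[i][0], rhs[i], M[i][2]] for i in range(3)]
    a = det3(Ma) / d
    b = det3(Mb) / d
    return mean - b / (2.0 * a)

def approach2_estimate(compute, center, log_offsets, **params):
    """Fit a parabola to one IsoFLOP curve in ln(N) and return the vertex as N."""
    ns = [center * math.exp(o) for o in log_offsets]
    xs = [math.log(n) for n in ns]
    ys = [loss(n, compute, **params) for n in ns]
    return math.exp(parabola_vertex(xs, ys))

# Illustrative (assumed) constants; alpha == beta makes the curve symmetric in ln(N).
params = dict(E=1.7, A=400.0, B=400.0, alpha=0.3, beta=0.3)
compute = 1e21
offsets = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]  # grid spacing in ln(N)

n_opt = true_optimum(compute, params["A"], params["B"], params["alpha"], params["beta"])
centered = approach2_estimate(compute, n_opt, offsets, **params)
shifted = approach2_estimate(compute, n_opt * math.e, offsets, **params)

print("centered grid, estimate / truth:", centered / n_opt)
print("shifted grid,  estimate / truth:", shifted / n_opt)
```

The shift arises because an IsoFLOP curve under this loss is not exactly quadratic in ln(N), so the fitted vertex depends on where the sampled grid sits relative to the true minimum. How large the effect is for real models depends on the actual fitted constants and grids, which this toy setup does not attempt to reproduce.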

March 27, 2026 · Eric Czech

© 2026 Open Athena.
All rights reserved.
