This page collects my long-form notes on mechanistic interpretability, deep learning theory, optimization, and scaling laws. If you are new here, start with the latest posts below.
- Can We Derive Scaling Law From First Principles?
New research available. Click to read the full PDF.