scaling-law
an archive of posts with this tag
| Date | Post |
|---|---|
| Jun 16, 2026 | Why Double the Batch Size Midway Through LLM Pretraining? |
| May 24, 2026 | Listening Again to Yang Zhilin: Bet on Scaling, First Principles, and Long-Termism |
| Apr 14, 2026 | In the LLM Setting, How Does Gradient Noise Affect Training Dynamics? |
| Feb 01, 2026 | How to Align Data Scaling Curves Across Different Initialization Scales |
| Dec 30, 2025 | Can We Derive Scaling Law From First Principles? |