#artificial-intelligence

Articles tagged with #artificial-intelligence

RL in the Pre-train Space: Why Training on P(y) Beats Training on P(y|x)
A new paper shows that reinforcement learning directly on the marginal distribution unlocks reasoning capabilities that standard RLVR can never reach.
Apr 16, 2026 · 7 min read
The Three Walls Your AI Research Agent Keeps Hitting
A Meta paper reveals that the overfitting wall everyone accepted was actually evaluation noise, and that the real ceiling is much further out.
Apr 4, 2026 · 8 min read
Chain-of-Thought Was Supposed to Be Our Window Into AI Reasoning. Optimization Is Slamming It Shut.
Here's the deal we thought we had with chain-of-thought prompting: let the model show its work, and we can watch the reasoning unfold. If something goes wrong, we'd see it in the chain. CoT was our audit trail, our interpretability shortcut, our free...
Apr 2, 2026 · 6 min read
Tucker Attention: GQA, MLA, and MHA Were the Same Thing All Along
All major attention variants are special cases of one tensor decomposition, which achieves a 10x parameter reduction with zero performance loss.
Apr 2, 2026 · 7 min read
Your LLM Doesn't Know When It's Wrong. A Second One Might.
Cross-model disagreement is a training-free, label-free signal that catches confident errors your model's own uncertainty metrics will miss every time.
Mar 29, 2026 · 7 min read
The Compression Wars: Why Making AI Smaller Is Now Harder Than Making It Bigger
Google's TurboQuant, Apple's Gemini distillation, and a new knowledge transfer method converge on the same message: the race to make AI bigger is over.
Mar 28, 2026 · 8 min read

#artificial-intelligence - The Agent Stack