Demystifying Video Reasoning: Models Don't Think in Frames - They Think in Denoising Steps

Video generation models like Sora can solve mazes, manipulate objects, and answer math questions - all by generating video. But how do they reason? The intuitive answer: step by step, frame by frame, like a person drawing a solution on a whiteboard. That answer is wrong. The paper “Demystifying Video Reasoning” shows that reasoning in video diffusion models doesn’t unfold across frames. It unfolds across denoising steps - the iterative process that turns noise into a coherent video. The authors call this Chain-of-Steps (CoS), and it fundamentally changes how we understand what these models are doing. ...

March 17, 2026

SAGE: Your Reasoning Model Knows When to Stop Thinking — You Just Won't Let It

Reasoning models generate long chains of thought to arrive at answers. But what if over half of those “thoughts” are useless noise, and the model has known the answer for a while — it just doesn’t know it can stop? The paper “Does Your Reasoning Model Implicitly Know When to Stop Thinking?” finds that this is exactly the case, and proposes SAGE — a method that cuts token usage by 40-50% while maintaining or improving accuracy. ...

February 23, 2026

Attention as a Compass – Teaching Reasoning Models to Explore Smarter

Large Language Models (LLMs) are no longer just text generators — they are becoming reasoners, capable of solving mathematical problems, logical puzzles, and planning tasks step by step. A key challenge is how to improve the quality of this reasoning. Traditional Reinforcement Learning (RL) rewards only the final outcome, but in complex reasoning it makes more sense to evaluate each intermediate step. This is called process-supervised RL (PSRL). ...

October 1, 2025