HyDRA: Teaching Your Phone to Understand Images Without Breaking the Bank

Imagine teaching your phone to recognize photos of dishes and suggest recipes. The catch? Models capable of this are massive and require the computational power of a Google data center. HyDRA is a clever method that adapts such models for mobile devices, without bankrupting you and without melting the planet.

The Problem: An Elephant in Your Phone

Vision Language Models (VLMs) are AI models that understand both images and text simultaneously. You can show them a photo and ask “what do you see?” or “how do I fix this?” Sounds great, but there’s a catch. ...

December 27, 2025

Cost-Constrained LLM Cascades — Meet C3PO

Imagine you have an army of helpers — several different Large Language Models (LLMs), each capable of handling tasks from simple queries to complex reasoning. But each helper costs something: time, compute, or actual money if you’re using an API. So the question is: Can we orchestrate these models wisely — starting from the cheapest one that might do the job, escalating only when needed — without exceeding a cost budget? ...

November 14, 2025

SNOO – Old-School Nesterov Momentum in a New Jacket: Making Big Models Learn Faster

Imagine you’re training a massive language model — the kind that takes weeks to learn even the basics. Every training step costs time, electricity, and a small fortune. In such a world, even a tiny bump in efficiency feels like finding a way to get free coffee at work — small, but sweet. Enter SNOO – Step-K Nesterov Outer Optimizer, a clever idea that takes Nesterov momentum, a decades-old optimization trick, and applies it in a new place — outside the normal training loop. The result? Models that learn faster and more smoothly, without much extra computational cost. ...

October 20, 2025