<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Machine Learning on MLLog.dev</title><link>https://mllog.dev/en/categories/machine-learning/</link><description>Recent content in Machine Learning on MLLog.dev</description><image><title>MLLog.dev</title><url>https://mllog.dev/images/default_mllog.png</url><link>https://mllog.dev/images/default_mllog.png</link></image><generator>Hugo -- 0.147.9</generator><language>en</language><lastBuildDate>Tue, 27 Jan 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://mllog.dev/en/categories/machine-learning/index.xml" rel="self" type="application/rss+xml"/><item><title>To Grok Grokking: Why Neural Networks Sometimes Understand Late</title><link>https://mllog.dev/en/posts/grokking-provable-ridge-regression/</link><pubDate>Tue, 27 Jan 2026 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/grokking-provable-ridge-regression/</guid><description>&lt;p>In machine learning, we expect a model to either learn or overfit. What we don&amp;rsquo;t expect is for a model to overfit first and then — much later, with no changes — suddenly start generalizing well. This phenomenon is called &lt;strong>grokking&lt;/strong>, and it has puzzled researchers since its discovery. A new paper finally explains why it happens and proves it mathematically — in the simplest possible setting.&lt;/p>
&lt;h2 id="what-is-grokking">What is Grokking?&lt;/h2>
&lt;p>Grokking was first observed in 2022 on small algorithmic tasks (like modular arithmetic). The pattern is striking:&lt;/p></description></item><item><title>Tensor Networks: A Mathematical Bridge Between Neural and Symbolic AI</title><link>https://mllog.dev/en/posts/tensor-networks-neuro-symbolic-ai/</link><pubDate>Fri, 23 Jan 2026 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/tensor-networks-neuro-symbolic-ai/</guid><description>&lt;p>Neural networks excel at learning patterns from data. Symbolic AI excels at logical reasoning and interpretability. For decades, researchers have tried to combine them — with limited success. A new paper proposes an elegant mathematical framework that unifies both approaches: &lt;strong>tensor networks&lt;/strong>. The key insight? Both neural and symbolic computations can be expressed as tensor decompositions, and inference in both reduces to tensor contractions.&lt;/p>
&lt;h2 id="the-problem-two-worlds-that-dont-talk">The Problem: Two Worlds That Don&amp;rsquo;t Talk&lt;/h2>
&lt;p>Modern AI is split into two camps:&lt;/p></description></item><item><title>M²FMoE: When Experts Learn to Predict Floods</title><link>https://mllog.dev/en/posts/m2fmoe-extreme-adaptive-time-series-forecasting/</link><pubDate>Wed, 14 Jan 2026 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/m2fmoe-extreme-adaptive-time-series-forecasting/</guid><description>&lt;p>Time series forecasting is one of the most important applications of machine learning — from demand prediction and infrastructure monitoring to flood forecasting. The problem? Standard models optimize for &lt;strong>typical&lt;/strong> cases. Yet it&amp;rsquo;s precisely the &lt;strong>atypical&lt;/strong> ones — extreme events — that are often most important to predict. &lt;strong>M²FMoE&lt;/strong> is a model that learns to predict both.&lt;/p>
&lt;h2 id="the-problem-extreme-events-break-standard-models">The Problem: Extreme Events Break Standard Models&lt;/h2>
&lt;p>Time series forecasting has made remarkable progress. Transformers, frequency-domain methods, and hybrid architectures achieve impressive results on benchmarks. But there&amp;rsquo;s a catch.&lt;/p></description></item><item><title>BALLAST: When a Bandit Teaches Your Database How Long to Wait</title><link>https://mllog.dev/en/posts/ballast-contextual-bandits-raft-timeouts/</link><pubDate>Mon, 05 Jan 2026 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/ballast-contextual-bandits-raft-timeouts/</guid><description>&lt;p>Imagine you&amp;rsquo;re a team leader. You send a message and wait for a response. How long do you wait before assuming your colleague has &amp;ldquo;disappeared&amp;rdquo;? Too short — and you panic for no reason. Too long — and the whole project stalls. &lt;strong>BALLAST&lt;/strong> is a system that teaches databases to answer this question automatically, using machine learning techniques.&lt;/p>
&lt;h2 id="the-problem-rafts-achilles-heel">The Problem: Raft&amp;rsquo;s Achilles Heel&lt;/h2>
&lt;p>&lt;strong>Raft&lt;/strong> is a consensus protocol — the way distributed databases (like etcd, Consul, CockroachDB) agree on who&amp;rsquo;s the &amp;ldquo;leader&amp;rdquo; and which data is current. It works like this:&lt;/p></description></item><item><title>AI Co-Scientist: Teaching Models to Write Research Plans Better Than Humans</title><link>https://mllog.dev/en/posts/ai-co-scientist-rubric-rewards/</link><pubDate>Tue, 30 Dec 2025 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/ai-co-scientist-rubric-rewards/</guid><description>&lt;p>What if AI could not just answer questions, but actively &lt;strong>plan scientific research&lt;/strong>? Not generating text — creating coherent, novel experiment plans that experts rate as better than human-written ones. Sounds like science fiction? Researchers from Meta AI and partners just achieved this.&lt;/p>
&lt;h2 id="the-problem-how-do-you-grade-scientific-creativity">The Problem: How Do You Grade Scientific Creativity?&lt;/h2>
&lt;p>Training models for &amp;ldquo;closed&amp;rdquo; tasks (math, coding) is relatively straightforward — an answer is either correct or it isn&amp;rsquo;t. But how do you evaluate a &lt;strong>research plan&lt;/strong>?&lt;/p></description></item><item><title>HyDRA: Teaching Your Phone to Understand Images Without Breaking the Bank</title><link>https://mllog.dev/en/posts/hydra-dynamic-rank-adaptation-mobile-vlm/</link><pubDate>Sat, 27 Dec 2025 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/hydra-dynamic-rank-adaptation-mobile-vlm/</guid><description>&lt;p>Imagine teaching your phone to recognize photos of dishes and suggest recipes. The catch? Models capable of this are massive and require the computational power of a Google data center. &lt;strong>HyDRA&lt;/strong> is a clever method that adapts such models for mobile devices — without going bankrupt and without melting the planet.&lt;/p>
&lt;h2 id="the-problem-an-elephant-in-your-phone">The Problem: An Elephant in Your Phone&lt;/h2>
&lt;p>&lt;strong>Vision Language Models&lt;/strong> (VLMs) are AI models that understand both images and text simultaneously. You can show them a photo and ask &amp;ldquo;what do you see?&amp;rdquo; or &amp;ldquo;how do I fix this?&amp;rdquo;. Sounds great, but there&amp;rsquo;s a catch.&lt;/p></description></item></channel></rss>