<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Machine Learning on MLLog.dev</title><link>https://mllog.dev/en/categories/machine-learning/</link><description>Recent content in Machine Learning on MLLog.dev</description><image><title>MLLog.dev</title><url>https://mllog.dev/images/default_mllog.png</url><link>https://mllog.dev/images/default_mllog.png</link></image><generator>Hugo -- 0.147.9</generator><language>en</language><lastBuildDate>Tue, 27 Jan 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://mllog.dev/en/categories/machine-learning/index.xml" rel="self" type="application/rss+xml"/><item><title>To Grok Grokking: Why Neural Networks Sometimes Understand Late</title><link>https://mllog.dev/en/posts/grokking-provable-ridge-regression/</link><pubDate>Tue, 27 Jan 2026 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/grokking-provable-ridge-regression/</guid><description>&lt;p>In machine learning, we expect a model to either learn or overfit. What we don&amp;rsquo;t expect is for a model to overfit first and then — much later, with no changes — suddenly start generalizing well. This phenomenon is called &lt;strong>grokking&lt;/strong>, and it has puzzled researchers since its discovery. A new paper finally explains why it happens and proves it mathematically — in the simplest possible setting.&lt;/p>
&lt;h2 id="what-is-grokking">What is Grokking?&lt;/h2>
&lt;p>Grokking was first observed in 2022 on small algorithmic tasks (like modular arithmetic). The pattern is striking:&lt;/p></description></item><item><title>Tensor Networks: A Mathematical Bridge Between Neural and Symbolic AI</title><link>https://mllog.dev/en/posts/tensor-networks-neuro-symbolic-ai/</link><pubDate>Fri, 23 Jan 2026 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/tensor-networks-neuro-symbolic-ai/</guid><description>&lt;p>Neural networks excel at learning patterns from data. Symbolic AI excels at logical reasoning and interpretability. For decades, researchers have tried to combine them — with limited success. A new paper proposes an elegant mathematical framework that unifies both approaches: &lt;strong>tensor networks&lt;/strong>. The key insight? Both neural and symbolic computations can be expressed as tensor decompositions, and inference in both reduces to tensor contractions.&lt;/p>
&lt;h2 id="the-problem-two-worlds-that-dont-talk">The Problem: Two Worlds That Don&amp;rsquo;t Talk&lt;/h2>
&lt;p>Modern AI is split into two camps:&lt;/p></description></item><item><title>M²FMoE: When Experts Learn to Predict Floods</title><link>https://mllog.dev/en/posts/m2fmoe-extreme-adaptive-time-series-forecasting/</link><pubDate>Wed, 14 Jan 2026 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/m2fmoe-extreme-adaptive-time-series-forecasting/</guid><description>&lt;p>Time series forecasting is one of the most important applications of machine learning — from demand prediction and infrastructure monitoring to flood forecasting. The problem? Standard models optimize for &lt;strong>typical&lt;/strong> cases. Yet it&amp;rsquo;s precisely the &lt;strong>atypical&lt;/strong> ones — extreme events — that are often most important to predict. &lt;strong>M²FMoE&lt;/strong> is a model that learns to predict both.&lt;/p>
&lt;h2 id="the-problem-extreme-events-break-standard-models">The Problem: Extreme Events Break Standard Models&lt;/h2>
&lt;p>Time series forecasting has made remarkable progress. Transformers, frequency-domain methods, and hybrid architectures achieve impressive results on benchmarks. But there&amp;rsquo;s a catch.&lt;/p></description></item><item><title>BALLAST: When a Bandit Teaches Your Database How Long to Wait</title><link>https://mllog.dev/en/posts/ballast-contextual-bandits-raft-timeouts/</link><pubDate>Mon, 05 Jan 2026 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/ballast-contextual-bandits-raft-timeouts/</guid><description>&lt;p>Imagine you&amp;rsquo;re a team leader. You send a message and wait for a response. How long do you wait before assuming your colleague has &amp;ldquo;disappeared&amp;rdquo;? Too short — and you panic for no reason. Too long — and the whole project stalls. &lt;strong>BALLAST&lt;/strong> is a system that teaches databases to answer this question automatically, using machine learning techniques.&lt;/p>
&lt;h2 id="the-problem-rafts-achilles-heel">The Problem: Raft&amp;rsquo;s Achilles Heel&lt;/h2>
&lt;p>&lt;strong>Raft&lt;/strong> is a consensus protocol — the way distributed databases (like etcd, Consul, CockroachDB) agree on who&amp;rsquo;s the &amp;ldquo;leader&amp;rdquo; and which data is current. It works like this:&lt;/p></description></item><item><title>AI Co-Scientist: Teaching Models to Write Research Plans Better Than Humans</title><link>https://mllog.dev/en/posts/ai-co-scientist-rubric-rewards/</link><pubDate>Tue, 30 Dec 2025 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/ai-co-scientist-rubric-rewards/</guid><description>&lt;p>What if AI could not just answer questions, but actively &lt;strong>plan scientific research&lt;/strong>? Not generating text — creating coherent, novel experiment plans that experts rate as better than human-written ones. Sounds like science fiction? Researchers from Meta AI and partners just achieved this.&lt;/p>
&lt;h2 id="the-problem-how-do-you-grade-scientific-creativity">The Problem: How Do You Grade Scientific Creativity?&lt;/h2>
&lt;p>Training models for &amp;ldquo;closed&amp;rdquo; tasks (math, coding) is relatively straightforward — an answer is either correct or it isn&amp;rsquo;t. But how do you evaluate a &lt;strong>research plan&lt;/strong>?&lt;/p></description></item><item><title>HyDRA: Teaching Your Phone to Understand Images Without Breaking the Bank</title><link>https://mllog.dev/en/posts/hydra-dynamic-rank-adaptation-mobile-vlm/</link><pubDate>Sat, 27 Dec 2025 00:00:00 +0000</pubDate><guid>https://mllog.dev/en/posts/hydra-dynamic-rank-adaptation-mobile-vlm/</guid><description>&lt;p>Imagine teaching your phone to recognize photos of dishes and suggest recipes. The catch? Models capable of this are massive and require the computational power of a Google data center. &lt;strong>HyDRA&lt;/strong> is a clever method that adapts such models for mobile devices — without going bankrupt and without melting the planet.&lt;/p>
&lt;h2 id="the-problem-an-elephant-in-your-phone">The Problem: An Elephant in Your Phone&lt;/h2>
&lt;p>&lt;strong>Vision Language Models&lt;/strong> (VLMs) are AI models that understand both images and text simultaneously. You can show them a photo and ask &amp;ldquo;what do you see?&amp;rdquo; or &amp;ldquo;how do I fix this?&amp;rdquo;. Sounds great, but there&amp;rsquo;s a catch.&lt;/p></description></item></channel></rss>