RiemannLoRA: A Unified Riemannian Framework for Ambiguity-Free LoRA Optimization

In recent years, Low-Rank Adaptation (LoRA) has become a cornerstone technique for parameter-efficient fine-tuning of large language models (LLMs) and diffusion models. By injecting low-rank matrices into pre-trained weights, LoRA drastically reduces memory and compute requirements, enabling rapid experimentation and deployment. However, practitioners face two persistent challenges: (1) initialization ambiguity: different low-rank factor pairs $A, B$ can represent the same weight update $AB^\top$, leading to unstable or suboptimal starts; and (2) redundant parameterization: without a canonical representation, gradient updates can wander through equivalent parameter configurations. The RiemannLoRA framework, introduced by Bogachev et al., offers a unifying geometric viewpoint that removes these ambiguities and yields faster, more stable fine-tuning. ...
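The factor ambiguity mentioned above can be checked numerically. The sketch below is only an illustration, not the RiemannLoRA method itself: the matrix sizes and the invertible matrix `G` are arbitrary choices made for the example. It shows that replacing $(A, B)$ with $(AG, BG^{-\top})$ leaves the update $AB^\top$ unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 5, 2  # illustrative sizes, not taken from the paper

A = rng.standard_normal((m, r))
B = rng.standard_normal((n, r))

# Any invertible r x r matrix G yields a different factor pair
# with exactly the same product (this G is an arbitrary example).
G = np.array([[2.0, 1.0],
              [0.0, 0.5]])
A2 = A @ G
B2 = B @ np.linalg.inv(G).T

delta_w = A @ B.T     # original low-rank update
delta_w2 = A2 @ B2.T  # update from the transformed factors

same_update = np.allclose(delta_w, delta_w2)
different_factors = not np.allclose(A, A2)
print(same_update, different_factors)
```

Both factor pairs describe the same adapted weights, which is why an optimizer working directly on $A$ and $B$ has no canonical starting point.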

July 17, 2025

A Neural Network Model of Complementary Learning Systems: Pattern Separation and Completion for Continual Learning

Standard neural networks often suffer from catastrophic forgetting, where learning new tasks degrades performance on previously learned tasks. In contrast, the human brain integrates new and old memories through two complementary memory systems: the hippocampus and neocortex. Objectives: the authors aim to build a model that captures pattern separation (distinct encoding of similar experiences) and pattern completion (reconstructing full representations from partial inputs), to support continual learning without loss of previously acquired skills. ...

July 16, 2025

Target Polish: How to Polish Data and Reveal Its True Structure

Imagine you’re analyzing sensor data. Suddenly one sensor shows -999°C. That’s an outlier: a single data point that can completely ruin your analysis. 🧩 What is factorization? Here, matrix factorization means decomposing a data matrix $X$ into two non-negative factors, $X \approx WH$, where $W$ contains the “features” and $H$ shows how much of each is needed. 💡 The problem: classical methods like NMF are sensitive to noise and outliers. When data is messy, analysis breaks down. ...
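To make the factorization $X \approx WH$ concrete, here is a minimal sketch of classical NMF using the standard Lee-Seung multiplicative updates. This is the baseline the excerpt calls outlier-sensitive, not the Target Polish method; the function name `nmf`, the synthetic data, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Classical NMF via Lee-Seung multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps  # non-negative "features"
    H = rng.random((rank, m)) + eps  # non-negative activations
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def rel_error(X, W, H):
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Synthetic non-negative data with an exact rank-3 structure.
rng = np.random.default_rng(1)
X = rng.random((20, 3)) @ rng.random((3, 15))

W1, H1 = nmf(X, rank=3, n_iter=1)
W, H = nmf(X, rank=3, n_iter=200)
err_1 = rel_error(X, W1, H1)
err_200 = rel_error(X, W, H)
print(err_1, err_200)
```

Because the loss is a squared error, a single extreme entry such as -999 would dominate the fit (and would also violate the non-negativity assumption), which is the weakness the post goes on to discuss.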

July 15, 2025

Optimistic Exploration for Risk-Averse Constrained Reinforcement Learning

Reinforcement Learning (RL) has revolutionized how agents learn to act in complex environments. But what happens when an agent can’t afford to make mistakes—because a mistake means a car crash, system failure, or energy limit violation? In such cases, we turn to Constrained Reinforcement Learning (CRL), where agents aim to maximize reward while staying within safety or cost constraints. Unfortunately, current CRL methods often become… too cautious, leading to poor performance. ...

July 14, 2025

Not Just Bigger Models: Why AI Should See Better Instead of Just Scaling

In recent years, AI progress has been largely defined by size: bigger models, bigger datasets, bigger compute budgets. GPT-4, Claude, Gemini – each new model pushes the limits further. But is bigger always better? A group of researchers (Baek, Park, Ko, Oh, Gong, Kim) argue in their recent paper "AI Should Sense Better, Not Just Scale Bigger" (arXiv:2507.07820) that we’ve hit diminishing returns. Instead of growing endlessly, they propose a new focus: adaptive sensing. ...

July 13, 2025

HGMP: Revolutionizing Complex Graph Analysis with Prompt Learning

In an era dominated by language models and machine learning, structured data such as social networks, biological relationships, and business connections is rapidly growing in importance. This data is represented in the form of graphs, which are often not homogeneous: they contain nodes of different types (e.g., people, products, companies) and different types of edges (e.g., “purchased”, “recommended”, “works at”). Processing such heterogeneous graphs requires specialized methods. What are heterogeneous graphs? A heterogeneous graph is a structure in which: ...

July 12, 2025

Predicting and Generating Antibiotics Against Future Pathogens with ApexOracle

The accelerating crisis of antimicrobial resistance (AMR) demands new computational methods to stay ahead of evolving pathogens. ApexOracle is a unified ML platform designed both to predict the activity of candidate compounds against specific bacterial strains and to generate novel molecules de novo, proactively targeting future superbugs. Motivation and scope. Global impact: AMR contributes to nearly 5 million deaths annually. Traditional challenges: standard drug discovery pipelines are slow, resource-intensive, and reactive. ApexOracle goal: integrate genomic context and molecular design into one end-to-end framework. ApexOracle architecture (layman’s explanation). Imagine you have three sets of clues: the code of the bacteria (its genome), a simple description of its behaviors (like a basic fact sheet), and the building blocks of a potential drug (a molecular recipe). ApexOracle acts like a super-smart detective that reads all three clues at once. It combines them, figures out which molecules might work best, and even drafts entirely new molecular recipes that could stop the bacteria in their tracks. ...

July 11, 2025

HeLo – A New Path for Multimodal Emotion Recognition

Modern emotion-recognition systems increasingly leverage data from multiple sources—ranging from physiological signals (e.g., heart rate, skin conductance) to facial video. The goal is to capture the richness of human feelings, where multiple emotions often co-occur. Traditional approaches, however, focused on single-label classification (e.g., “happy” or “sad”). The paper “HeLo: Heterogeneous Multi-Modal Fusion with Label Correlation for Emotion Distribution Learning” introduces an entirely new paradigm: emotion distribution learning, where the model predicts the probability of each basic emotion being present. ...

July 10, 2025

Modern Methods in Associative Memory

Associative memory is the ability to store patterns and retrieve them when presented with partial or noisy inputs. Inspired by how the human brain recalls memories, associative memory models are recurrent neural networks that converge to stored patterns over time. The tutorial ‘Modern Methods in Associative Memory’ by Krotov et al. offers an accessible overview for newcomers and a rigorous mathematical treatment for experts, bridging classical ideas with cutting-edge developments in deep learning. ...

July 9, 2025

QuEst: Blending Data and Predictions for Robust Quantile Estimation

Imagine you track your morning commute times by recording 50 real-world trips with your GPS-enabled phone. You also run a traffic simulator to generate 5,000 possible commute scenarios. You want a reliable estimate of the 95th percentile of commute time, that is, the duration you won’t exceed on 95% of days. Using only your 50 recorded trips yields a wide confidence interval. Using only the simulator risks systematic biases: it might ignore sudden road closures or special events. ...
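The trade-off described above can be illustrated with a short sketch. It does not implement QuEst; it only compares percentile-bootstrap confidence intervals for the 95th percentile on 50 "real" trips and 5,000 "simulated" trips. All data here is synthetic: the lognormal distributions, the deliberate downward bias in the simulator, and the helper name `bootstrap_quantile_ci` are assumptions made for the example.

```python
import numpy as np

def bootstrap_quantile_ci(x, q=0.95, n_boot=500, alpha=0.10, seed=0):
    """Percentile-bootstrap confidence interval for the q-th quantile of x."""
    rng = np.random.default_rng(seed)
    # Resample with replacement: one row of indices per bootstrap replicate.
    idx = rng.integers(0, len(x), size=(n_boot, len(x)))
    stats = np.quantile(x[idx], q, axis=1)
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

rng = np.random.default_rng(42)
# Synthetic stand-ins for commute times in minutes (hypothetical values).
real_trips = rng.lognormal(mean=3.4, sigma=0.25, size=50)
# The simulator is made slightly optimistic on purpose to mimic systematic bias.
sim_trips = rng.lognormal(mean=3.3, sigma=0.25, size=5000)

real_lo, real_hi = bootstrap_quantile_ci(real_trips)
sim_lo, sim_hi = bootstrap_quantile_ci(sim_trips)
real_width = real_hi - real_lo
sim_width = sim_hi - sim_lo

print(f"real trips: [{real_lo:.1f}, {real_hi:.1f}] minutes")
print(f"simulated:  [{sim_lo:.1f}, {sim_hi:.1f}] minutes")
```

The interval from the 50 recorded trips is wide but centered on the true behavior, while the simulator's interval is narrow but can sit in the wrong place; that tension is the motivation for blending the two sources.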

July 8, 2025