Benchmarking on MLLog.dev

Recent content in Benchmarking on MLLog.dev
https://mllog.dev/en/tags/benchmarking/

ClawGUI: A Full-Stack Open-Source Pipeline for GUI Agents
https://mllog.dev/en/posts/2026-04-15-clawgui-unified-framework-gui-agents/
Wed, 15 Apr 2026 10:00:00 +0100

ClawGUI unifies online RL training, reproducible evaluation, and real-device deployment of GUI agents into one open-source pipeline, and shows a 2B model trained inside it can beat 72B untrained baselines on MobileWorld.