A new arXiv paper examines why multimodal large language models (MLLMs) struggle with spatial reasoning, attributing the problem to architectural flaws in how they fuse visual and linguistic data. It proposes injecting targeted reasoning mechanisms to improve reliability in applications such as robotics. This could advance agentic AI by 2025 and help address ethical concerns in education and infrastructure.