Microsoft’s New ‘Fara-7B’ AI Agent Rivals GPT-4o, Runs Locally on Your PC

Microsoft’s Fara-7B is a 7B-parameter computer-use agent that runs locally on PCs, rivals GPT-4o on web tasks, and adds safety checkpoints for risky actions.

Written by

Aminu Abdullahi

Nov 25, 2025

Microsoft has unveiled Fara-7B, a compact computer-use AI model designed to perform tasks the way a human would, using a mouse, a keyboard, and whatever is shown on the screen.

The model is the latest in Microsoft’s push toward on-device, agentic AI. According to the Microsoft Research Blog, “Computer Use Agent (CUA) models like Fara-7B leverage computer interfaces, such as a mouse and keyboard, to complete tasks on behalf of users.”

At just 7 billion parameters, the system is small enough to run directly on PCs, including Windows devices with built-in NPUs. Microsoft says local execution reduces latency and helps protect privacy because user data never leaves the device.

Rather than parsing page code or relying on accessibility metadata, Fara-7B navigates the web by visually interpreting screenshots. Microsoft says the model perceives a webpage visually and takes actions such as scrolling, typing, and clicking at directly predicted coordinates. This approach allows the agent to work even when websites are complex or their code is obfuscated.
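The screenshot-in, action-out cycle described above can be pictured with a minimal Python sketch. It is an illustration only, not Microsoft's implementation: the `Action` type, `run_agent`, and the scripted stand-in policy are hypothetical names, and a real agent would call the model where the stub sits.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    """One step the agent wants to take (hypothetical schema)."""
    kind: str                              # "click", "type", "scroll", or "stop"
    xy: Optional[Tuple[int, int]] = None   # predicted pixel coordinates for a click
    text: str = ""                         # text to enter for a "type" action


def run_agent(
    policy: Callable[[bytes, List[Action]], Action],
    take_screenshot: Callable[[], bytes],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe the screen, ask the policy for an action, execute it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()         # pixels only, no DOM or accessibility tree
        action = policy(screenshot, history)   # a real agent would query the model here
        history.append(action)
        if action.kind == "stop":
            break
        execute(action)
    return history


def scripted_policy(screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for the model: replays a fixed script, one action per step."""
    script = [
        Action("click", xy=(120, 48)),
        Action("type", text="wireless mouse"),
        Action("scroll"),
        Action("stop"),
    ]
    return script[len(history)]


executed: List[Action] = []
trace = run_agent(scripted_policy, lambda: b"fake-pixels", executed.append)
print([a.kind for a in trace])  # ['click', 'type', 'scroll', 'stop']
```

The loop never inspects the page structure. Everything the policy knows comes from the screenshot bytes and its own action history, which is the property the article attributes to Fara-7B.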

Yash Lara, senior PM lead at Microsoft Research, highlighted the privacy benefit, telling VentureBeat that on-device processing enables “pixel sovereignty,” adding that this method “helps organizations meet strict requirements in regulated sectors, including HIPAA and GLBA.”

Benchmark results that punch above its weight

Despite its small size, Fara-7B posted numbers that rival far larger systems. The model achieved 73.5% on the WebVoyager benchmark, outperforming GPT-4o (65.1%) when evaluated as a computer-use agent.

According to the company, the model achieves “state-of-the-art performance, … even outperforming native computer use agents like UI-TARS-1.5-7B, or much larger models like GPT-4o.” It also completes tasks in fewer steps than earlier 7B models, resulting in faster, more predictable automation.

One major challenge in building computer-use agents is gathering detailed data about how people complete tasks on a computer. To solve this, Microsoft relied heavily on synthetic training data.

Microsoft states that the team generated training data using an Orchestrator and a WebSurfer agent, producing 145,000 successful task trajectories. The company explains that the approach avoids manual labeling by relying on a “scalable synthetic data” pipeline built from real web pages and user-inspired tasks.
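As a rough picture of how such a pipeline can keep only successful trajectories without manual labeling, consider the sketch below. It is a simplification under stated assumptions: the `plan` callable stands in for the Orchestrator, the `act` callable stands in for the WebSurfer, and the `Trajectory` type and success rule are hypothetical, not taken from Microsoft's code.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Trajectory:
    """Recorded attempt at one task (hypothetical schema)."""
    task: str
    steps: List[str] = field(default_factory=list)
    success: bool = False


def generate_trajectory(
    task: str,
    plan: Callable[[str], List[str]],   # stand-in for the Orchestrator
    act: Callable[[str], bool],         # stand-in for the WebSurfer
) -> Trajectory:
    """Run each planned instruction; mark success only if every step works."""
    traj = Trajectory(task)
    for instruction in plan(task):
        ok = act(instruction)
        traj.steps.append(instruction)
        if not ok:
            return traj                 # failed attempt, success stays False
    traj.success = True
    return traj


def build_dataset(
    tasks: List[str],
    plan: Callable[[str], List[str]],
    act: Callable[[str], bool],
) -> List[Trajectory]:
    """Keep only the successful trajectories as training data."""
    attempts = (generate_trajectory(task, plan, act) for task in tasks)
    return [t for t in attempts if t.success]


# Toy stand-ins: the "browser" fails on anything mentioning concerts.
def toy_plan(task: str) -> List[str]:
    return [f"open site for {task}", f"search {task}", "read result"]


def toy_act(instruction: str) -> bool:
    return "concert" not in instruction


dataset = build_dataset(["book a table", "find concert tickets"], toy_plan, toy_act)
print(len(dataset), dataset[0].task)  # 1 book a table
```

In this toy version success is decided by the executor itself. A production pipeline would more plausibly verify outcomes separately, which the article does not detail.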

Microsoft also notes that Fara-7B is built on Qwen2.5-VL-7B because of its strong visual grounding and extended context capabilities.

Stopping at safety checkpoints

Because an AI agent that can operate a computer poses high risks, Microsoft included multiple layers of safeguards. One of the core design elements is what the team calls “Critical Points.”

In Microsoft’s words, a Critical Point “is any situation that requires the user’s personal data or consent before engaging in a transaction or irreversible action.” This means the agent must pause before actions such as entering personal information, sending messages, or confirming a purchase.
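A pause-before-acting rule of this kind can be sketched as a small gate in front of the executor. This is a hedged illustration, not how Fara-7B implements it: the action names in `CRITICAL_KINDS`, the `guarded_execute` helper, and the `ask_user` callback are all hypothetical, and the model may well recognize Critical Points itself rather than through a fixed list.

```python
from typing import Callable, List

# Hypothetical action kinds treated as Critical Points in this sketch.
CRITICAL_KINDS = {"enter_personal_data", "send_message", "confirm_purchase"}


def is_critical_point(action_kind: str) -> bool:
    """True when the action needs the user's data or consent first."""
    return action_kind in CRITICAL_KINDS


def guarded_execute(
    action_kind: str,
    execute: Callable[[str], None],
    ask_user: Callable[[str], bool],
) -> str:
    """Run the action, but pause at Critical Points unless the user approves."""
    if is_critical_point(action_kind) and not ask_user(action_kind):
        return "paused"
    execute(action_kind)
    return "executed"


performed: List[str] = []


def deny(kind: str) -> bool:
    return False


def approve(kind: str) -> bool:
    return True


results = [
    guarded_execute("scroll", performed.append, deny),               # harmless, runs
    guarded_execute("confirm_purchase", performed.append, deny),     # blocked
    guarded_execute("confirm_purchase", performed.append, approve),  # user consented
]
print(results)    # ['executed', 'paused', 'executed']
print(performed)  # ['scroll', 'confirm_purchase']
```

Routine actions such as scrolling pass straight through, while a purchase is carried out only after the user explicitly agrees.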

Available for developers and researchers

Microsoft released Fara-7B under an MIT license and made it available through Hugging Face and Microsoft Foundry. Developers can experiment with the model using Magentic-UI, the company’s environment for testing computer-use agents.

The company stresses that the project is still in early stages, and future work will involve enhancing reliability through reinforcement learning and sandboxed training environments.

Microsoft’s broader AI lifecycle vision, presented at Ignite 2025, outlines how models, agents, and developer tools are being tied together into a single platform strategy.

Aminu Abdullahi

Aminu Abdullahi is a B2C and B2B technology and finance writer with more than six years of experience covering enterprise IT, cybersecurity, cloud computing, artificial intelligence, fintech, business software, and emerging technologies. He has written for a wide range of technical and business audiences, from IT professionals and cybersecurity leaders to small business owners, executives, and technology buyers. His work has appeared in publications including TechRepublic, eWEEK, Channel Insider, Geekflare, Enterprise Networking Planet, eSecurity Planet, CIO Insight, and Webopedia.

With a background in computer science, Aminu specializes in translating complex technical subjects into clear, practical, and accessible content. His writing helps readers understand emerging technologies, evaluate business software, strengthen cybersecurity strategies, and make more informed decisions about technology investments. Across his work, Aminu focuses on the real-world impact of technology, connecting technical innovation with business value, operational efficiency, security, and long-term digital transformation.