Amazon’s new AI system controls a web browser with over 90% accuracy on some UI tasks and, in Amazon’s own tests, outperforms competing models from OpenAI and Anthropic on most of them.
The Nova Act SDK, recently released in Research Preview, represents a significant advance in AI-powered browser automation. Developed by former OpenAI researchers David Luan and Pieter Abbeel, the technology enables developers to create AI agents that can reliably interact with websites – from calendar scheduling to full e-commerce checkout.
Unlike traditional AI assistants, which are limited to text responses, Nova Act can independently perform actions in a web browser. The system combines AI-powered decision making with deterministic control over the browser, making sensitive operations such as password entry more secure.
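The split between model-driven steps and deterministic steps can be illustrated with a minimal Python sketch. It assumes an agent object that exposes an `act(prompt)` method for natural-language instructions and a `page.keyboard.type(text)` method for direct keystrokes. The `BrowserAgent` protocol and the `sign_in` helper are hypothetical names introduced here for illustration, not part of any confirmed API. The point is only that the password is typed straight into the browser and never appears in a prompt sent to the model.

```python
from typing import Callable, Protocol


class Keyboard(Protocol):
    def type(self, text: str) -> None: ...


class Page(Protocol):
    keyboard: Keyboard


class BrowserAgent(Protocol):
    """Hypothetical interface: natural-language steps plus a raw page handle."""

    page: Page

    def act(self, prompt: str) -> object: ...


def sign_in(agent: BrowserAgent, username: str, read_secret: Callable[[], str]) -> None:
    """Log in without ever placing the password in a model prompt."""
    # Model-driven step: the prompt contains the username but no secret.
    agent.act(f"enter username {username} and click on the password field")
    # Deterministic step: the secret goes directly to the focused field.
    agent.page.keyboard.type(read_secret())
    # Model-driven step: submit the form.
    agent.act("sign in")
```

With the Nova Act SDK installed, the agent would presumably be the object returned by the SDK's context manager and `read_secret` could be `getpass.getpass`. Both should be checked against the SDK's own documentation before use.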
Superior performance in benchmarks
In internal Amazon tests, Nova Act leads the competing models on two of the three tasks shown, while trailing slightly on general UI understanding:
| Task | Nova Act | Claude 3.7 | OpenAI CUA |
|---|---|---|---|
| Text element interaction | 93.9% | 90.0% | 88.3% |
| Icon interaction | 87.9% | 85.4% | 80.6% |
| General understanding of UI | 80.5% | 82.5% | 82.3% |
Particularly noteworthy is the high level of accuracy when interacting with complex UI elements such as date selections and drop-down menus, where previous AI models often had difficulties.
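To make the comparison in the table concrete, the short Python sketch below computes Nova Act's margin over the best competing score for each task. The figures are copied from the table. The `nova_margin` helper is a name introduced here for illustration only.

```python
# Scores in percent, copied from the benchmark table.
scores = {
    "Text element interaction": {"Nova Act": 93.9, "Claude 3.7": 90.0, "OpenAI CUA": 88.3},
    "Icon interaction": {"Nova Act": 87.9, "Claude 3.7": 85.4, "OpenAI CUA": 80.6},
    "General understanding of UI": {"Nova Act": 80.5, "Claude 3.7": 82.5, "OpenAI CUA": 82.3},
}


def nova_margin(row: dict) -> float:
    """Nova Act's score minus the best competing score, in percentage points."""
    best_rival = max(score for model, score in row.items() if model != "Nova Act")
    return round(row["Nova Act"] - best_rival, 1)


margins = {task: nova_margin(row) for task, row in scores.items()}

for task, margin in margins.items():
    print(f"{task}: {margin:+.1f} points")
```

The margins come out to +3.9 and +2.5 percentage points for text and icon interaction, and -2.0 for general UI understanding, where Claude 3.7 has the highest score.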
The strategic importance of Nova Act goes far beyond browser automation. The system is expected to serve as the core technology for the upcoming Alexa update, enabling Amazon’s voice assistant to navigate the internet autonomously. Nova Act is also part of the broader Amazon Nova ecosystem, which will be accessible to developers via Amazon Bedrock.
Summary:
- Nova Act enables AI-driven browser interactions, reaching up to 93.9% accuracy on UI element interaction in Amazon's internal tests
- The system outperforms competing models from OpenAI and Anthropic on two of the three internal benchmarks shown
- The technology was developed by former OpenAI researchers David Luan and Pieter Abbeel
- The planned integration with Alexa could give Amazon a competitive advantage in the AI assistant market
- Responsible AI practices such as input/output moderation and C2PA-compliant watermarking are built in
Source: Amazon AGI Labs