Alibaba Qwen3-Coder with 480 billion parameters: Open-source AI outperforms GPT-4.1

Alibaba presents Qwen3-Coder-480B-A35B-Instruct, an AI model aimed at autonomous software development. According to the reported benchmark figures, it outperforms the proprietary GPT-4.1 and comes close to Claude Sonnet 4 in key areas.

Released on July 22, 2025, the model uses a Mixture-of-Experts architecture with 480 billion parameters in total, but activates only 35 billion of them for each token it processes. This efficiency enables high-quality code generation with significantly reduced computing resources. Native support for 256,000 tokens of context, expandable to one million tokens through YaRN context extension, allows the analysis of complete code repositories in a single pass.
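To make the sparse-activation idea concrete, here is a minimal sketch in Python. The parameter and context figures are the ones quoted above. The `top_k_route` function and its example scores are hypothetical illustrations of generic top-k expert routing, not Qwen3-Coder's actual router.

```python
import math

# Figures quoted in the article.
TOTAL_PARAMS = 480e9
ACTIVE_PARAMS = 35e9
NATIVE_CONTEXT = 256_000
EXTENDED_CONTEXT = 1_000_000


def top_k_route(gate_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights.

    Generic top-k gating for illustration only (an assumption, not the
    model's documented routing scheme).
    """
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    peak = max(gate_logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


active_share = ACTIVE_PARAMS / TOTAL_PARAMS
context_scale = EXTENDED_CONTEXT / NATIVE_CONTEXT
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)  # hypothetical scores for 4 experts

print(f"active share of parameters: {active_share:.1%}")
print(f"context extension factor: {context_scale:.2f}x")
print("experts selected for this token:", [i for i, _ in routes])
```

With the article's numbers, about 7.3 percent of the parameters are active per token, and the extended context is roughly 3.9 times the native one. In the toy routing call, only two of the four experts do any work for the token, which is the source of the compute savings.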

The model was trained on 7.5 trillion tokens, with 70 percent of the data coming from code sources. Of particular note is the Agent RL framework, which used over 20,000 parallel environments to simulate realistic development scenarios. This methodology enables the model to process GitHub issues autonomously, including code modification, testing and documentation updates, without human intervention.
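The idea of collecting feedback from many environments at once can be sketched as follows. This is a toy illustration under stated assumptions: `run_episode` is a hypothetical stand-in that returns a random pass/fail reward, and a thread pool replaces whatever infrastructure the real framework uses. It does not describe Alibaba's implementation.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_episode(env_id):
    """Hypothetical environment: reward 1.0 if the simulated tests pass, else 0.0."""
    rng = random.Random(env_id)  # seeded per environment so runs are reproducible
    return 1.0 if rng.random() < 0.5 else 0.0


def collect_rewards(num_envs, max_workers=8):
    """Run all environments concurrently and return their rewards in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_episode, range(num_envs)))


rewards = collect_rewards(100)
mean_reward = sum(rewards) / len(rewards)
print(f"{len(rewards)} episodes, mean reward {mean_reward:.2f}")
```

In a reinforcement learning setup, rewards like these would feed a policy update. The sketch stops at collection, since the article gives no details about the training step.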

Benchmark dominance in critical areas

Qwen3-Coder achieves an accuracy of 61.8 percent on SWE-Bench Verified, significantly outperforming GPT-4.1 (38.8 percent) and coming close to Claude Sonnet 4 (67.0 percent). This benchmark tests the ability to solve real-world GitHub issues by analyzing code, implementing fixes and validating solutions. In Codeforces Elo ratings for algorithmic programming, the model sets new standards among open-source systems.
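The gaps implied by these SWE-Bench Verified scores can be worked out directly. The short sketch below uses only the three percentages quoted above; the `gap` helper is an illustrative name.

```python
# SWE-Bench Verified accuracy in percent, as quoted in the article.
scores = {
    "Qwen3-Coder": 61.8,
    "GPT-4.1": 38.8,
    "Claude Sonnet 4": 67.0,
}


def gap(model_a, model_b):
    """Difference in percentage points between two models, rounded to one decimal."""
    return round(scores[model_a] - scores[model_b], 1)


lead_over_gpt = gap("Qwen3-Coder", "GPT-4.1")
deficit_to_claude = gap("Claude Sonnet 4", "Qwen3-Coder")

print(f"Lead over GPT-4.1: {lead_over_gpt} points")
print(f"Deficit to Claude Sonnet 4: {deficit_to_claude} points")
```

The lead over GPT-4.1 comes to 23.0 percentage points, while the model trails Claude Sonnet 4 by 5.2 points.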

The AIME evaluation (Agent Integration and Multitask Evaluation) shows Qwen3-Coder’s superiority in tool-integrated workflows: It outperforms GPT-4.1 by 8.2 percentage points in tasks that combine web browsing, API usage and debugging. On the Aider Polyglot benchmark, it achieves 61.8 percent accuracy across multiple programming languages, only 1.3 percentage points below Claude Sonnet 4 despite a significantly lower number of parameters.

Practical application through agent-based workflows

The model goes beyond conventional code completion and executes autonomous development workflows. The Qwen Code command line interface, adapted from Gemini CLI, orchestrates development tools such as Git, Docker and test frameworks through natural language commands. Developers can formulate goals such as “refactor authentication module with OAuth 2.0 support”, whereupon the system coordinates tool execution and code implementation.

The model’s iterative refinement protocols analyze error logs, adjust implementations, and rerun tests until the functional specifications are met. This capability proves transformative for legacy system modernization, where it identifies technical debt and recommends refactoring strategies that improve maintainability without compromising functionality.
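The test, analyze, fix cycle described above can be sketched as a simple loop. Everything here is a hypothetical illustration: `refine`, `toy_run_tests` and `toy_propose_fix` are made-up names, and the "fix" step is a hard-coded string replacement standing in for a model call.

```python
def refine(code, run_tests, propose_fix, max_rounds=5):
    """Run tests, feed the error log to a fixer, and repeat until tests pass.

    Returns (code, attempts) on success or (None, max_rounds) if the budget runs out.
    """
    for attempt in range(1, max_rounds + 1):
        passed, log = run_tests(code)
        if passed:
            return code, attempt
        code = propose_fix(code, log)  # in a real agent, a model would read the log here
    return None, max_rounds


def toy_run_tests(source):
    """Toy test runner: execute the source and check add(2, 3) == 5."""
    namespace = {}
    exec(source, namespace)
    result = namespace["add"](2, 3)
    if result == 5:
        return True, ""
    return False, f"add(2, 3) returned {result}, expected 5"


def toy_propose_fix(source, log):
    """Toy fixer standing in for the model: swap the wrong operator."""
    return source.replace("a - b", "a + b")


buggy = "def add(a, b):\n    return a - b\n"
fixed, attempts = refine(buggy, toy_run_tests, toy_propose_fix)
print(f"tests passed after {attempts} attempt(s)")
```

The `max_rounds` budget is a design choice worth keeping in any such loop: without it, a fixer that never converges would rerun the tests indefinitely.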

Key facts about the update

Architecture: 480 billion parameter mixture-of-experts model with 35 billion active parameters per token
Context processing: Native 256K token support, expandable to 1 million tokens through YaRN context extension
Benchmark performance: 61.8% accuracy on SWE-Bench Verified, outperforms GPT-4.1 by 23 percentage points
Open source availability: Apache 2.0 license enables commercial use without restrictive fees
Tool integration: Qwen Code CLI orchestrates Git, Docker, test frameworks through natural language commands
Quantization: GGUF format enables 4-bit execution on consumer hardware while retaining 98.7% of the original accuracy
Multilingual support: Comprehensive support for Python, JavaScript, Java, C++, Go, Rust and other languages
Agentic capabilities: Autonomous GitHub issue resolution with code modification, testing and documentation
Training innovation: Agent RL framework with 20,000 parallel environments for realistic development scenarios
Community ecosystem: Active GitHub repositories with 119 merged pull requests and continuous development
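
As a rough guide to what the quantization entry means in practice, the sketch below estimates the memory needed for the weights alone at different precisions. It is a back-of-the-envelope calculation under simple assumptions: it uses the 480 billion parameter count from the list above, counts 1 GB as 10^9 bytes, and ignores the KV cache, activations and the extra scaling data that real 4-bit formats store, so actual files are somewhat larger.

```python
TOTAL_PARAMS = 480e9  # total parameter count quoted in the article


def weight_memory_gb(num_params, bits_per_param):
    """Weights-only memory estimate in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


fp16_gb = weight_memory_gb(TOTAL_PARAMS, 16)
q4_gb = weight_memory_gb(TOTAL_PARAMS, 4)

print(f"16-bit weights: {fp16_gb:.0f} GB")
print(f"4-bit weights:  {q4_gb:.0f} GB")
```

Under these assumptions, 4-bit quantization cuts the weight footprint from about 960 GB to about 240 GB, a fourfold reduction.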

Source: GitHub
