OpenAI’s new GPT Image 1 API marks a significant milestone in AI-powered image generation. It arrives after 130 million users created over 700 million images in just one week.
The new API, which is now available to all developers, powers the same feature that made a splash in ChatGPT and enables the seamless integration of image generation capabilities into various applications. Unlike previous models, GPT Image 1 is designed to be natively multimodal, meaning that it can process text and images in a unified architecture. This capability allows the model to accurately follow instructions, utilize real-world knowledge, and even correctly display text within images.
Well-known companies such as Adobe, Figma, Airtable and Wix have already partnered with OpenAI to integrate the new API into their products. At Adobe, the integration has led to an impressive 37% reduction in production costs for digital assets.
Technical details and pricing
The API offers developers extensive control over image generation with three main endpoints: Generations (creating new images from text), Edits (modifying existing images) and Variations (stylistic alternatives). The quality and resolution settings range from 512×512 pixels (low quality) to 1792×1024 pixels (high quality).
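To make the Generations endpoint more concrete, here is a minimal sketch that builds a request for one image without sending it. It uses only the Python standard library. The helper name `build_generation_request` is hypothetical, and the `size` and `quality` values are placeholders chosen for illustration; the accepted values should be checked against OpenAI's API reference.

```python
import json
import os
import urllib.request

# Generations endpoint of the OpenAI Images API.
API_URL = "https://api.openai.com/v1/images/generations"


def build_generation_request(prompt, size="1024x1024", quality="low", api_key=None):
    """Build (but do not send) an HTTP request for one generated image.

    Hypothetical helper for illustration. The size and quality defaults
    are placeholders, not values confirmed by the article.
    """
    payload = {
        "model": "gpt-image-1",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "n": 1,  # number of images to generate
    }
    # Fall back to the conventional environment variable if no key is passed.
    key = api_key or os.environ.get("OPENAI_API_KEY", "")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_generation_request("A watercolor fox reading a newspaper")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Sending it would be urllib.request.urlopen(req), with a valid key set.
```

Building the request separately from sending it keeps the example runnable without credentials and makes the payload easy to inspect or log before any cost is incurred.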
Pricing follows a token-based model:
- Text input: $5 per 1 million tokens
- Image input: $10 per 1 million tokens
- Image output: $40 per 1 million tokens
This results in costs of around $0.02 for a low-quality image, $0.04 for standard quality and $0.19 for high-quality images. This positions OpenAI between low-cost open source alternatives such as Stable Diffusion and more expensive options such as Google’s Imagen 3.
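The per-image figures follow from the token prices by simple arithmetic. The sketch below estimates the cost of a request from its token counts, using the three rates listed above. The function name `estimate_cost` is hypothetical, and the token counts in the example are back-calculated from the article's per-image prices (for instance, $0.19 at $40 per million output tokens corresponds to about 4,750 output tokens); they are not official figures and ignore any input tokens.

```python
# Prices in US dollars per 1 million tokens, as listed in the article.
PRICE_PER_MILLION = {
    "text_input": 5.00,
    "image_input": 10.00,
    "image_output": 40.00,
}


def estimate_cost(text_input_tokens=0, image_input_tokens=0, image_output_tokens=0):
    """Return the estimated cost in US dollars for the given token usage."""
    usage = {
        "text_input": text_input_tokens,
        "image_input": image_input_tokens,
        "image_output": image_output_tokens,
    }
    return sum(PRICE_PER_MILLION[kind] * count / 1_000_000
               for kind, count in usage.items())


if __name__ == "__main__":
    # Token counts implied by the article's per-image prices (assumption).
    for label, tokens in [("low", 500), ("high", 4750)]:
        cost = estimate_cost(image_output_tokens=tokens)
        print(f"{label}: {tokens} output tokens cost about ${cost:.2f}")
```

Because output tokens are priced eight times higher than text input tokens, the image output dominates the bill for a typical text-to-image request.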
Summary
- Native multimodality: GPT Image 1 combines language and image understanding in a unified architecture.
- High accuracy: 94.7% precision for complex instructions and 98.3% readability for text display.
- Flexible API: Three main endpoints for generation, editing and variations with extensive parameters.
- Competitive pricing: More cost-effective than DALL-E 3 and Imagen 3, with a transparent token-based pricing model.
- Comprehensive security measures: C2PA metadata for proof of origin and customizable moderation settings.
- Broad industry application: Already successfully integrated into creative tools, e-commerce, gaming and enterprise applications.
Source: OpenAI