OpenAI presents GPT Image 1 API: AI image generation becomes more accessible and powerful

OpenAI’s new GPT Image 1 API marks a significant milestone in AI-powered image generation. It arrives after 130 million users created over 700 million images in just one week.

The new API, which is now available to all developers, powers the same feature that made a splash in ChatGPT and enables the seamless integration of image generation capabilities into various applications. Unlike previous models, GPT Image 1 is designed to be natively multimodal, meaning that it can process text and images in a unified architecture. This capability allows the model to accurately follow instructions, utilize real-world knowledge, and even correctly display text within images.

Well-known companies such as Adobe, Figma, Airtable and Wix have already partnered with OpenAI to integrate the new API into their products. At Adobe, the integration has led to an impressive 37% reduction in production costs for digital assets.

Technical details and pricing

The API offers developers extensive control over image generation with three main endpoints: Generations (creating new images from text), Edits (modifying existing images) and Variations (stylistic alternatives). The quality and resolution settings range from 512×512 pixels (low quality) to 1792×1024 pixels (high quality).
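As a rough illustration of the Generations endpoint, the following minimal sketch builds an HTTP request with Python's standard library without sending it. The helper names `build_generation_request` and `send` are hypothetical, and the `size` and `quality` values are assumptions that should be checked against OpenAI's API reference before use.

```python
import json
import urllib.request

# Public REST endpoint for image generation.
API_URL = "https://api.openai.com/v1/images/generations"


def build_generation_request(prompt, api_key, size="1024x1024", quality="low"):
    """Build (but do not send) a POST request for the Generations endpoint."""
    payload = {
        "model": "gpt-image-1",
        "prompt": prompt,
        "size": size,        # assumed value; verify the supported sizes
        "quality": quality,  # assumed value; verify the supported quality levels
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the parsed JSON response.

    Requires network access and a valid API key, so it is not called here.
    """
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any billable call is made. In practice, calling `send(build_generation_request("a red bicycle", api_key))` with a real key would perform the generation.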

Pricing follows a token-based model:

  • Text input: $5 per 1 million tokens
  • Image input: $10 per 1 million tokens
  • Image output: $40 per 1 million tokens

This results in costs of around $0.02 for a low-quality image, $0.04 for a standard-quality image and $0.19 for a high-quality image. This positions OpenAI between low-cost open-source alternatives such as Stable Diffusion and more expensive options such as Google’s Imagen 3.
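The relationship between the token rates and the per-image prices can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the token counts it derives are back-calculated from the article's prices under the assumption that image output tokens dominate the cost. They are not official figures.

```python
# Rates quoted in the article, in US dollars per 1 million tokens.
RATES_PER_MILLION = {
    "text_input": 5.0,
    "image_input": 10.0,
    "image_output": 40.0,
}


def estimate_cost(text_input=0, image_input=0, image_output=0):
    """Return the estimated cost in dollars for the given token counts."""
    tokens = {
        "text_input": text_input,
        "image_input": image_input,
        "image_output": image_output,
    }
    return sum(
        count * RATES_PER_MILLION[kind] / 1_000_000
        for kind, count in tokens.items()
    )


def implied_output_tokens(price_per_image):
    """Back-calculate the output tokens implied by a per-image price.

    Assumption: the whole price comes from image output tokens.
    """
    return round(price_per_image * 1_000_000 / RATES_PER_MILLION["image_output"])
```

Under that assumption, a $0.02 image corresponds to about 500 output tokens and a $0.19 image to about 4,750. A short text prompt adds very little at $5 per million tokens.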

Summary

  • Native multimodality: GPT Image 1 combines text and image understanding in a unified architecture.
  • High accuracy: 94.7% precision for complex instructions and 98.3% legibility for text rendered within images.
  • Flexible API: Three main endpoints for generation, editing and variations with extensive parameters.
  • Competitive pricing: More cost-effective than DALL-E 3 and Imagen 3, with a transparent token-based pricing model.
  • Comprehensive security measures: C2PA metadata for proof of origin and customizable moderation settings.
  • Broad industry application: Already successfully integrated into creative tools, e-commerce, gaming and enterprise applications.

Source: OpenAI