Voice - Articles, News & Guides

OpenAI gpt-realtime: New capabilities & How to use

2026-03-18 / 2025-08-30 by Ralf Schukay

OpenAI's new real-time voice model gpt-realtime and its new real-time API deliver significantly improved voice quality, now scoring 82% in the benchmark. Natural voice capabilities enable practical applications in customer support, personal assistance and education.

RAG technology from ElevenLabs: AI assistants with precise knowledge databases

2026-03-18 / 2025-04-01 by Florian Schröder

ElevenLabs’ new RAG (Retrieval-Augmented Generation) feature enables AI assistants to deliver precise answers by efficiently retrieving information from extensive knowledge bases. This groundbreaking integration significantly improves the accuracy and relevance of virtual assistants.

Google Gemini gets powerful AI collaboration tools: Canvas and Audio Overview introduced

2026-03-18 / 2025-03-24 by Florian Schröder

Google is expanding its AI offering with powerful collaboration features that are revolutionizing the way people work together on documents and code.

OpenAI’s new audio APIs improve voice assistant development

2026-03-18 / 2025-03-21 by Florian Schröder

OpenAI is setting new standards for speech technology with its new audio APIs, enabling developers to create advanced voice assistants with more natural interactions.

Tutorial: Transcribing YouTube videos with Google AI Studio

2026-03-18 / 2025-03-20 by Ralf Schukay

With Google AI Studio, you can transcribe and summarize YouTube videos quickly and for free – thanks to the latest Gemini models from Google. Whether for SEO, content reuse or accessibility, we’ll show you the best prompts for accurate transcriptions and summaries.

Cartesia Sonic: Fast, realistic and flexible text-to-speech technology

2026-03-18 / 2025-03-12 by Florian Schröder

With Sonic, Cartesia introduces a new generation of text-to-speech (TTS) technology that combines high speed, realistic output and flexible adaptability. This innovation sets new standards in AI speech synthesis.

The challenges and opportunities of AI voice technology: overcoming the Uncanny Valley

2026-03-18 / 2025-03-03 by Florian Schröder

The development of artificial intelligence in voice technology has made enormous progress in recent years. However, it is precisely these advances that are creating new challenges – in particular the phenomenon of the Uncanny Valley, which often occurs with AI-generated voices. Although these voices sound impressively human, minimal irregularities such as unnatural pitches or rhythms can create an emotional distance and a sense of discomfort for users.

ElevenLabs enters the ASR market with innovative speech-to-text technology

2026-03-18 / 2025-03-03 by Florian Schröder

With the introduction of “Scribe”, ElevenLabs is expanding its portfolio and sending a clear signal to the market for automatic speech recognition (ASR). This innovative speech-to-text solution impresses with its high accuracy and advanced functions that exceed current standards in the ASR sector.

Amazon Alexa+: Context-aware voice assistance with generative AI

2026-03-18 / 2025-02-27 by Florian Schröder

Amazon has unveiled “Alexa+”, an advanced version of its popular voice assistant that uses generative AI to create more context-aware, interactive and personalized experiences. The update is not only a technological leap forward, but could also help establish AI-driven assistants more firmly in everyday life.

OpenAI Sora: New milestone for advanced voice and video AI

2026-03-18 / 2024-12-09 by Florian Schröder

With the release of Sora, OpenAI has reached a new milestone in the development of AI technologies. Its combination of advanced speech processing and text-to-video generation could permanently change the way companies and individuals use AI.