TaoAvatar AI: AR communication redefined by 90 FPS 3D avatars

Alibaba’s latest innovation, TaoAvatar, sets a new standard for photorealistic 3D avatars rendered in real time and aims to make AR communication practical for everyday use.

The technology combines 3D Gaussian Splatting (3DGS) with a teacher-student network approach to create fully controllable human avatars. These digital representations achieve high visual quality and also run at 90 frames per second on mobile hardware such as the Apple Vision Pro, a crucial factor for practical use in AR applications. The avatars follow a parametric SMPLX template with consistent topology, which allows precise control over poses, gestures and facial expressions.

Unlike previous technologies, TaoAvatar requires only multi-view camera sequences as input and achieves a PSNR (peak signal-to-noise ratio, a standard image quality metric) that is 2.4 dB higher than that of comparable systems. At the same time, the technology reduces memory requirements by 70% compared to NeRF-based approaches.
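
To put the 2.4 dB figure in perspective: PSNR is defined as 10 · log10(MAX² / MSE), so a fixed gain in dB corresponds to a fixed factor in mean squared error. The following minimal sketch shows that relationship. The function names are illustrative and not part of TaoAvatar.

```python
import math


def psnr(mse: float, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_factor_for_psnr_gain(gain_db: float) -> float:
    """Factor by which the MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    factor = mse_factor_for_psnr_gain(2.4)
    print(f"A 2.4 dB PSNR gain means the MSE is {factor:.2f}x smaller")
```

A gain of 2.4 dB therefore corresponds to a mean squared error roughly 1.74 times smaller, regardless of the absolute PSNR values of the systems being compared.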

Technical innovation on several levels

At the heart of the system is a hybrid representation model that combines SMPLX meshes with 3D Gaussian textures. This enables both precise geometric control and convincing dynamic appearances. Particularly noteworthy is the use of a teacher-student framework:

  1. The StyleUnet teacher network captures high-frequency details through position-based deformation maps
  2. The MLP student network is optimized for mobile devices and ensures 90 FPS at 2K resolution
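
The general idea behind a teacher-student setup can be illustrated with a toy example: a cheap model is trained to reproduce the outputs of an expensive one. The sketch below is not TaoAvatar's implementation. The teacher is a hypothetical stand-in function, the student is a three-weight linear model instead of an MLP, and every name is an assumption made for illustration.

```python
import math
import random


def teacher_offset(pose: float) -> float:
    """Hypothetical stand-in for an expensive teacher: a pose-dependent
    deformation value with a coarse and a fine (high-frequency) component."""
    return 0.5 * math.sin(pose) + 0.1 * math.sin(5.0 * pose)


def student_features(pose: float) -> list:
    """Cheap features the lightweight student is allowed to use."""
    return [1.0, math.sin(pose), math.cos(pose)]


def student_offset(weights: list, pose: float) -> float:
    """Student prediction: a weighted sum of the cheap features."""
    return sum(w * f for w, f in zip(weights, student_features(pose)))


def distill(steps: int = 2000, lr: float = 0.05, seed: int = 0) -> list:
    """Fit the student to the teacher's outputs with stochastic gradient descent."""
    rng = random.Random(seed)
    weights = [0.0, 0.0, 0.0]
    for _ in range(steps):
        pose = rng.uniform(-math.pi, math.pi)
        feats = student_features(pose)
        error = student_offset(weights, pose) - teacher_offset(pose)
        # Gradient step on 0.5 * error**2 with respect to each weight.
        weights = [w - lr * error * f for w, f in zip(weights, feats)]
    return weights


def mean_squared_error(weights: list, samples: int = 200) -> float:
    """Average squared gap between student and teacher over a pose grid."""
    poses = [-math.pi + 2.0 * math.pi * i / (samples - 1) for i in range(samples)]
    return sum((student_offset(weights, p) - teacher_offset(p)) ** 2 for p in poses) / samples


if __name__ == "__main__":
    trained = distill()
    print("MSE before distillation:", mean_squared_error([0.0, 0.0, 0.0]))
    print("MSE after distillation: ", mean_squared_error(trained))
```

In this toy setting the student recovers the coarse motion but cannot represent the fine component, which mirrors the trade-off any distilled mobile model has to manage between speed and detail.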

To develop the technology, the research team used the TalkBody4D dataset, which was recorded with 59 cameras at 20 FPS and 3K×4K resolution. The integration of Audio2BS technology also enables lip movements, facial expressions and gestures to be synchronized with spoken language.
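
Assuming that the "BS" in Audio2BS refers to blendshapes and that the module outputs one set of blendshape weights per frame (the article does not specify the format), the standard linear blendshape model shows how such weights drive a face mesh: each vertex is the neutral position plus a weighted sum of per-shape offsets. The mesh, the shape name jaw_open and the function name below are hypothetical.

```python
def apply_blendshapes(neutral, deltas, weights):
    """Linear blendshape model: vertex = neutral + sum(weight * delta).

    neutral: list of (x, y, z) vertex positions for the neutral face
    deltas:  dict mapping shape name -> list of (dx, dy, dz) per vertex
    weights: dict mapping shape name -> activation, usually in [0, 1]
    """
    result = [list(vertex) for vertex in neutral]
    for name, weight in weights.items():
        for index, delta in enumerate(deltas[name]):
            for axis in range(3):
                result[index][axis] += weight * delta[axis]
    return [tuple(vertex) for vertex in result]


if __name__ == "__main__":
    # Two-vertex toy "mesh" with one hypothetical blendshape.
    neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    deltas = {"jaw_open": [(0.0, -1.0, 0.0), (0.0, -0.5, 0.0)]}
    # One frame of weights, as an audio-driven model might emit them.
    frame_weights = {"jaw_open": 0.5}
    print(apply_blendshapes(neutral, deltas, frame_weights))
```

Because the model is linear, a sequence of per-frame weights is enough to animate the face, which keeps the runtime cost of audio-driven animation low.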

Areas of application and future prospects

The technology developed by Alibaba researchers opens up a wide range of possible applications:

  • Life-size AR shopping assistants for 3D product demonstrations
  • Holographic meetings with emotional expressiveness
  • AI customer service with natural body language

Despite this progress, challenges remain: extreme facial expressions are still difficult to model, and initial template creation is computationally expensive (approximately 8 hours per avatar). However, with the planned release of the code and dataset via Hugging Face, the technology should soon find wider application.

Summary:

  • TaoAvatar creates photorealistic 3D avatars with consistent topology
  • Real-time rendering at 90 FPS on mobile devices and AR headsets
  • Hybrid architecture combines 3D Gaussian splatting with parametric models
  • 70% memory savings compared to conventional methods
  • Applications in e-commerce, AR communication and AI assistance
  • Integration of audio-to-facial expression synchronization for natural interactions

Source: TaoAvatar