NVIDIA, an American multinational technology company, has unveiled NVIDIA Omniverse Avatar, a technology platform for generating interactive AI avatars.
The platform combines speech AI, computer vision, natural language understanding, recommendation engines, and simulation technologies. Avatars created on the platform are interactive characters with ray-traced 3D graphics that can see, speak, converse on a wide range of subjects, and understand naturally spoken intent.
“The dawn of intelligent virtual assistants has arrived,” said Jensen Huang, founder and CEO of NVIDIA. “Omniverse Avatar combines NVIDIA’s foundational graphics, simulation and AI technologies to make some of the most complex real-time applications ever created. The use cases of collaborative robots and virtual assistants are incredible and far-reaching.”
What is NVIDIA’s Omniverse?
Omniverse Avatar is part of NVIDIA Omniverse, a virtual world simulation and collaboration platform for 3D workflows currently in open beta with over 70,000 users. Creators, designers, researchers, and engineers can connect major design tools, assets, and projects to collaborate and iterate in a shared virtual space. Developers and software providers can also easily build and sell Extensions, Apps, Connectors, and Microservices on Omniverse’s modular platform to expand its functionality.
In his keynote address at NVIDIA GTC, Huang shared various examples of Omniverse Avatar: Project Tokkio for customer support, NVIDIA DRIVE Concierge for always-on, intelligent services in vehicles, and Project Maxine for video conferencing.
In the first demonstration of Project Tokkio, Huang showed colleagues engaging in a real-time conversation with an avatar crafted as a toy replica of himself, conversing on topics such as biology and climate science.
The release of Omniverse Avatar gives marketers a way to interact with customers in virtual worlds and simulation platforms such as NVIDIA Omniverse, where deployed avatars can deliver personalised customer service and improve customer satisfaction.
The key elements used by Omniverse Avatar
Omniverse Avatar uses elements from speech AI, computer vision, natural language understanding, recommendation engines, facial animation, and graphics delivered through the following technologies:
- Its speech recognition is based on NVIDIA Riva, a software development kit that recognises speech across multiple languages. Riva is also used to generate human-like speech responses using text-to-speech capabilities.
- Its natural language understanding is based on the Megatron 530B large language model, which can recognise, understand, and generate human language. Megatron 530B is a pretrained model that can, with little or no additional training, complete sentences, answer questions across a broad range of subjects, summarise long and complex stories, translate into other languages, and handle many domains it was not specifically trained for.
- Its recommendation engine is provided by NVIDIA Merlin, a framework that allows businesses to build deep learning recommender systems capable of handling large amounts of data to make smarter suggestions.
- Its perception capabilities are enabled by NVIDIA Metropolis, a computer vision framework for video analytics.
- Its avatar animation is powered by NVIDIA Video2Face and Audio2Face, 2D and 3D AI-driven facial animation and rendering technologies.
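Taken together, these components form a conversational loop: speech in, understanding, response, speech out, and facial animation driven by that speech. The sketch below illustrates that flow in Python with placeholder stand-ins; the class names, methods, and behaviours here are illustrative assumptions and do not reflect the actual NVIDIA Riva, Megatron, or Audio2Face APIs.

```python
from dataclasses import dataclass

# Illustrative stand-ins for the components described above. The real
# platform uses NVIDIA Riva (ASR/TTS), Megatron 530B (NLU), and
# Audio2Face (animation), whose actual APIs differ from these stubs.

class SpeechRecognizer:
    """Stand-in for speech-to-text (Riva in the real platform)."""
    def transcribe(self, audio: bytes) -> str:
        # Placeholder: a real system would decode an audio stream.
        return audio.decode("utf-8")

class LanguageModel:
    """Stand-in for natural language understanding (Megatron 530B)."""
    def respond(self, utterance: str) -> str:
        # Placeholder intent handling.
        if "hello" in utterance.lower():
            return "Hello! How can I help you today?"
        return "Could you rephrase that?"

class SpeechSynthesizer:
    """Stand-in for text-to-speech (Riva)."""
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")

class FaceAnimator:
    """Stand-in for audio-driven facial animation (Audio2Face)."""
    def animate(self, audio: bytes) -> int:
        # Placeholder: pretend one animation frame per 10 bytes of audio.
        return max(1, len(audio) // 10)

@dataclass
class AvatarPipeline:
    asr: SpeechRecognizer
    nlu: LanguageModel
    tts: SpeechSynthesizer
    animator: FaceAnimator

    def handle(self, audio_in: bytes) -> tuple[bytes, int]:
        text = self.asr.transcribe(audio_in)        # speech -> text
        reply = self.nlu.respond(text)              # text -> response
        audio_out = self.tts.synthesize(reply)      # response -> speech
        frames = self.animator.animate(audio_out)   # speech -> animation
        return audio_out, frames

pipeline = AvatarPipeline(SpeechRecognizer(), LanguageModel(),
                          SpeechSynthesizer(), FaceAnimator())
audio, frames = pipeline.handle(b"Hello avatar")
print(audio.decode())  # the spoken reply text
print(frames)          # number of animation frames produced
```

The point of the sketch is the wiring, not the stubs: each stage consumes the previous stage's output, which is why the platform can swap in different recommendation or perception modules without changing the overall loop.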