Synthetic voices have come a long way over the years. Gone are the days of artificial voices that sounded like robots from a 1960s science fiction movie. Modern AI assistants like Alexa and Siri sound far more realistic.
Synthesized speech still does not sound completely natural, however. Nvidia's text-to-speech research team has developed machine learning tools intended to improve voice synthesis in a variety of applications.
Nvidia has developed an artificial intelligence model called RAD-TTS. Developers can train the model on their own voices, and it uses the learned pacing and tone to convert text into natural-sounding speech. It can also convert one speaker's voice into another's. According to Nvidia, the RAD-TTS interface was inspired by the idea of the human voice as a musical instrument, and it gives users fine-grained, frame-level control over the pitch, duration, and energy of the synthesized voice.
You can see examples of the technology in Nvidia's "I AM AI" video series. For these demos, Nvidia's video producer recorded a reading of the script, and the model converted that recording into the narrator's voice. Once this baseline narration exists, it can be adjusted to emphasize specific words and to change the pacing to fit the video.
This technology has applications in many areas, including automated customer service, language translation, disability assistance, and even gaming. Any program that requires a natural human voice can benefit from RAD-TTS.
Many of the models were trained with tens of thousands of hours of audio data on Nvidia DGX systems. Developers can read the company's blog post for more details.
These tools are GPU-accelerated and, of course, optimized for use on computers with Nvidia graphics cards. However, they are open source and free to all interested developers, and they are available through Nvidia's NGC container and software hub.
RAD-TTS: Nvidia creates realistic AI voices that are more expressive