Artificial Intelligence (AI) is becoming more realistic as it begins to sound human

Jan, 2021 - By WMR

Artificial intelligence, since its very inception, has been one of the most exciting technologies in the world. Over the years, AI has become more sophisticated, efficient, and human-like. Advances in AI have led to its wide use in a variety of fields, including medicine, defense, e-commerce, supply chain, and customer service. Virtual assistants such as Apple's Siri, Google's Google Assistant, and Amazon's Alexa are some of the most popular examples of AI-enabled assistants. AI has improved translation, the Facebook news feed, Google search, and interactive games like chess and Go. One of the most evident uses of AI is the chatbot, a software application that conducts online conversations via text or text-to-speech to provide a human-like effect. Chatbot technology is also getting better, with voice-enabled chatbots gaining major traction among businesses.

Recent developments have shown that these voice chatbots now sound more like a human than a computer. Scientists in the field of AI and natural language processing believe this could be a major breakthrough, in which AI not only speaks human language but also sounds human. An AI that sounds like a person remains one of the most striking developments in the field, because the human voice has unique features: changes in pitch and the way a particular word is spoken define human speech. Previously, AI-enabled voice assistants could only reproduce the words that humans speak. Now they can also incorporate those changes in pitch and loudness, closely mimicking human speech. Research along these lines is being carried out at Northeastern University in Massachusetts, U.S., with Rupal Patel, founder and CEO of VocaliD, Inc., at the helm.

The group of researchers is studying speech prosody, which refers to the changes in loudness, pitch, and duration that are used to convey emotion and intention through voice. Research head Patel described how she became interested in prosody after discovering that it was the only key feature of vocal communication that seemed to aid people with speech disorders. Patel, who has a Ph.D. in speech-language pathology from the University of Toronto, observed that these patients were able to make expressive sounds even though they could not speak clearly. Following a similar line of thinking, Patel founded VocaliD in 2014 to build synthetic voices for non-speaking individuals. Since then, the company has grown considerably and has emerged as a commercial brand.

In 2019, VocaliD entered into a partnership with Voices.com to create their own voices for smart speakers and other voice applications. Recent advances in machine learning have made AI-enabled speech far more sophisticated, to the point where it can replicate awkward pauses and lip smacks. However, building the most realistic systems still requires many hours of training on a variety of voice samples. Researchers and technology experts at VocaliD are continuously implementing novel methods to make the process more efficient. It is only a matter of time before machines not only speak but also feel like us.