
“Hey, find me the most interesting books about the history of the U.S., and also book me an appointment with my doctor next Tuesday.” Imagine saying this to your digital assistant, which would convert your words into text and execute them within a few seconds. Think about the time you could save by communicating with smart machines by voice. How efficient would that be in an age of information overload?

Today we’re starting a series of articles about speech recognition technology, because it’s the future (well, by now it’s reality rather than the future) of communicating with smart machines. We’ll cover its benefits and the hardware and software principles behind it, and provide a brief history of the key figures who changed the way we look at speech recognition today.

Decades ago, people would’ve gone crazy over the idea of a computer that could recognize, understand, classify, and convert spoken words into actions. Remember Stanley Kubrick’s movie “2001: A Space Odyssey,” in which the intelligent computer HAL could understand and speak fluent language? It popularized the idea that humans could talk to machines the same way we talk to each other.

The current market offers a variety of devices with built-in speech recognition, available for both commercial and personal use. It’s worth mentioning that the technology is still far from complete: ideally, any machine would be able to understand and chat with humans on any topic, just as people talk to each other. However, scientific and technological advances over the past 30 years have moved us closer to the moment when we will consider machines great chat partners, family friends, good advisors, co-workers, and an important part of our society.

Many people still confuse the terms speech recognition and voice recognition. Speech recognition is the process of converting spoken words into digital data, while voice recognition is the process of identifying who is speaking.

Why do so many tech companies pay close attention to this technology, and even build entire products around it?

Well, speech was the first medium humans used to communicate with each other. If you look closely at the history of speech recognition (we’ll cover that in the next article), you’ll see that machines learned to understand language much the way humans did. In the beginning, machines could recognize only digits, then individual letters, words, phrases, and sentences, and only recently have they become able to understand complex requests.

Humans went through a similar process in their evolution; the difference is that they learned how to turn the sounds they produced into words. It’s still unclear when humans began to communicate with words, because speech, unlike writing, has left no historical traces. If you’re interested in the origins of speech, we recommend researching the different theories about it.

So, if we can teach a machine to correctly understand and classify our words, that will mean the machine is an intelligent system.

Using speech recognition in the devices an average person can hardly live without will save us an enormous amount of time. Consider that the average American takes in around 100,000 words per day. The longest novel ever written, “Cyrus the Great” by Georges de Scudery and Madeleine de Scudery, runs about 2,100,000 words, or 13,095 pages, so you could easily spend around two weeks finishing it. An average novel, by contrast, is around 60,000 to 80,000 words. And the amount of information is only going to grow over the coming years.
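The two-week figure can be sanity-checked with a few lines of Python. The 250-words-per-minute reading speed and ten hours of reading per day are our own assumptions for illustration, not figures from this article:

```python
# Rough reading-time estimate for a 2,100,000-word novel.
NOVEL_WORDS = 2_100_000
WORDS_PER_MINUTE = 250   # assumed average reading speed
HOURS_PER_DAY = 10       # assumed daily reading time

total_hours = NOVEL_WORDS / WORDS_PER_MINUTE / 60   # minutes -> hours
total_days = total_hours / HOURS_PER_DAY

print(f"{total_hours:.0f} hours, about {total_days:.0f} days")
# prints "140 hours, about 14 days"
```

Under those assumptions the book takes about 140 hours of reading, or roughly two weeks, which matches the estimate above.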

We’ve reached a point where we need some help from our “computer friends” to process all the data on the Internet. According to author Clem Chambers, computer technologies combined with AI will enable collective problem-solving on a larger scale and the creation of vast amounts of data.

Devices such as personal digital assistants powered by speech recognition technology are going to benefit our society by changing the way we work and communicate with each other.

Here are just some of the benefits of the technology:

- Hands-free: simply say a voice command and your device will execute it.
- Simplicity: there’s no need to type, just speak your command.
- Fast and automated.
- Easily integrated into other software systems.
- Multilingual: assistants understand different languages, making it easier to communicate with other people, systems, and data.
- Reliable and secure.

Speech recognition technology is crucial for MYLE, because language is the main medium through which a device communicates with its user. That’s why we work hard to develop and implement machine learning algorithms that recognize, understand, and categorize your words and turn them into data and actions.

If you want to learn how MYLE, Siri, Google Now, and other digital assistants and gadgets recognize and execute your voice commands, stay tuned and enjoy the upcoming articles. We hope to see you in the next post.
