In recent months, several IT companies have shown interest in speech recognition and translation technologies, which could add significant value to their products.
Now that Facebook has bought Wit.ai, we may soon be able to use speech recognition in Facebook as a quicker way to get text previews of voice clips.
In Skype, speech technologies are used to talk to people with whom we do not share a common language. The process works as follows. First, the spoken utterances are transcribed with a deep learning approach to speech recognition. Then, the recognized text is translated into another language using natural language processing techniques. Finally, the translated text is synthesized into audible speech. Reviews of this new feature have already been published, pointing out both its limitations (the accuracy of the translated words, knowing when to take the turn to break in and translate) and its achievements (some language barriers are starting to be broken down, and there are ways to correct the recognized text before synthesis).
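The three stages described above (recognition, translation, synthesis) can be pictured as a simple chain of functions. The following is only a minimal sketch of that structure, not Skype's actual implementation: every name in it (`SpeechTranslationPipeline`, `toy_recognize`, `toy_translate`, `toy_synthesize`, `TOY_DICTIONARY`) is a hypothetical placeholder, and the toy stages simply pass text around as bytes so that the example runs without any speech library.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions for illustration only).
Recognizer = Callable[[bytes], str]    # audio -> text in the source language
Translator = Callable[[str], str]      # source text -> target text
Synthesizer = Callable[[str], bytes]   # target text -> audio


@dataclass
class SpeechTranslationPipeline:
    """Chains the three stages: recognize, translate, synthesize."""
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer

    def run(self, audio: bytes) -> bytes:
        text = self.recognize(audio)        # 1. speech recognition
        translated = self.translate(text)   # 2. machine translation
        return self.synthesize(translated)  # 3. speech synthesis


# Toy stand-ins: a real system would plug in trained models here.
def toy_recognize(audio: bytes) -> str:
    # Pretend the "audio" is already UTF-8 text.
    return audio.decode("utf-8")


TOY_DICTIONARY = {"hello": "hallo", "world": "welt"}


def toy_translate(text: str) -> str:
    # Word-by-word lookup; unknown words are passed through unchanged.
    return " ".join(TOY_DICTIONARY.get(w, w) for w in text.lower().split())


def toy_synthesize(text: str) -> bytes:
    # Pretend that encoding the text produces "audio".
    return text.encode("utf-8")


pipeline = SpeechTranslationPipeline(toy_recognize, toy_translate, toy_synthesize)
```

Calling `pipeline.run(b"Hello world")` sends the input through all three toy stages in order. Keeping each stage as a separate callable also leaves room for a correction step between recognition and synthesis, which is one of the achievements the reviews mention.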
Similarly, a universal speech translator will be available on Android smartphones as described here and here.
In real-world scenarios these applications will face several difficulties (e.g. ambient noise, the large vocabulary used by the speaker, speaker accents or speaking speed). Even so, it will be interesting to see how state-of-the-art techniques like deep learning perform in these applications.
The videos below show some Skype Translator demos. The first video is an introduction, followed by a live demo between English and German and then a demo between English and Spanish. The last video is a Google translation demo.