This is an interesting TED Talk on a project that aims to help people who can’t speak by creating synthetic voices which are unique to each user, just like in the real world.
People with voice disorders may still be able to control prosody (e.g. speech rate, pitch, and dynamics) even though they cannot produce intelligible speech. Based on the source-filter model of speech, this project’s main idea is to combine the user’s prosody (source) with the voice identity (filter) of a donor. A concatenative text-to-speech (TTS) system is built from units recorded from the donor, and the TTS is controlled by the user’s prosody.
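To make the source-filter idea concrete, here is a minimal Python sketch. It is not the project's method (which uses a concatenative TTS built from donor recordings); it is a toy formant synthesizer that only illustrates how a source and a filter can be kept separate. The function names, the formant frequencies and bandwidths, and the pitch values are all assumptions chosen for illustration.

```python
import math

def impulse_train(f0_hz, n_samples, sample_rate):
    """Source: one unit pulse per pitch period (a crude glottal excitation)."""
    period = round(sample_rate / f0_hz)
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Filter: a two-pole resonator approximating a single formant."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, < 1
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2  # y[n] = x[n] + a1*y[n-1] + a2*y[n-2]
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize(f0_hz, formants, n_samples=800, sample_rate=16000):
    """Pass the pitch-driven source through a cascade of formant filters."""
    signal = impulse_train(f0_hz, n_samples, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal

# Hypothetical "donor" filter: illustrative formant (frequency, bandwidth) pairs in Hz.
DONOR_FORMANTS = [(700.0, 80.0), (1200.0, 90.0)]

# Two hypothetical "user" pitch values driving the same donor filter.
low_pitch = synthesize(120.0, DONOR_FORMANTS)
high_pitch = synthesize(200.0, DONOR_FORMANTS)
```

Changing `f0_hz` alters only the excitation, while `DONOR_FORMANTS` stays fixed. In this toy model, that mirrors the split described above: the pitch comes from the user, the resonances that shape the timbre come from the donor.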