[MUD-Dev2] [Technology] Lend your ears to science!

Mike Rozak Mike at mxac.com.au
Sat Jun 13 01:53:39 CEST 2009


Jeffrey Kesselman wrote:
> Well, I'm very interested in speech to speech.
>
> Which is to say the modification of speech to sound like it came from
> someone with very different vocal characteristics from the
> speaker.  The most advanced modeling in speech production seems to be
> in the TTS space, so I'm wondering now about
> the possibilities of speech to text to speech as an approach.

Basically, three options:

1) Ye olde formant shifting. This'll make your voice sound like a midget, 
giant, etc. Any change makes the voice significantly more difficult to 
understand. (I have implemented something a half step better than the norm, 
but it's still limited.) The player's accent will be unchanged.

2) Google for "hmm voice transformation". I know there's some work at CMU. 
The examples I heard did sound like one voice transformed into another, 
BUT they introduced a LOT of distortion. This technique might be able to 
partially change accents, from American English to Scottish, for example, 
but only partially. No one has really tried accent conversion (that I know 
about - which isn't much).

3) Run a dictation engine (with a language model specialized for gaming), 
and resynthesize on the other end. While doing dictation, note the prosody 
(pitch, timing, duration) and send those along with the text. This will be 
able to do American English to Scottish, BUT dictation will introduce 
often-comical misrecognition mistakes, especially when players yell at one 
another. HMM voice transformation won't have the misrecognition problems.
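
As a rough illustration of option 3, here is a minimal sketch of what might be 
sent alongside the recognized text. Everything in it is an assumption made for 
the example: the names (WordProsody, Utterance, encode, decode), the per-word 
granularity, the units, and the use of JSON. A real dictation engine and TTS 
engine would dictate their own formats.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class WordProsody:
    """One recognized word plus the prosody noted during dictation (hypothetical layout)."""
    word: str          # text as recognized by the dictation engine
    start_ms: int      # timing: offset from the start of the utterance
    duration_ms: int   # duration: how long the word was held
    pitch_hz: float    # pitch: average fundamental frequency over the word


@dataclass
class Utterance:
    speaker_id: str
    words: List[WordProsody]

    def text(self) -> str:
        # Plain text that the receiving TTS engine would resynthesize.
        return " ".join(w.word for w in self.words)


def encode(utt: Utterance) -> str:
    """Serialize the utterance for sending to the other end."""
    return json.dumps(asdict(utt))


def decode(payload: str) -> Utterance:
    """Rebuild the utterance on the receiving side."""
    raw = json.loads(payload)
    return Utterance(raw["speaker_id"], [WordProsody(**w) for w in raw["words"]])


if __name__ == "__main__":
    utt = Utterance("player42", [
        WordProsody("ice", 0, 220, 180.0),
        WordProsody("cream", 220, 380, 165.5),
    ])
    print(decode(encode(utt)).text())
```

The receiving side would hand text() to its synthesizer and use the per-word 
timing and pitch as targets for the new voice. Only text and prosody cross the 
wire, so the speaker's own accent never does.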

You can quickly prototype the basics (minus prosody) with standard dictation 
engines and text-to-speech. When I was at Microsoft, we did this for fun; 
interestingly, some of the transcription errors were covered up by 
text-to-speech, because dictation errors usually involved stuff like 
"ice cream" -> "I scream"... the two look different, but sound almost the 
same when spoken by TTS.
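
To make the "ice cream" / "I scream" point concrete, here is a small sketch 
that compares two transcriptions by phoneme sequence instead of by spelling. 
The tiny lexicon is hand-written for this example (ARPAbet-style symbols, 
stress marks omitted) and the function names are made up; a real TTS front end 
has its own lexicon and may still render word boundaries slightly differently.

```python
# Hand-written, illustrative pronunciations only (ARPAbet-style, no stress marks).
LEXICON = {
    "ice": ["AY", "S"],
    "cream": ["K", "R", "IY", "M"],
    "i": ["AY"],
    "scream": ["S", "K", "R", "IY", "M"],
    "dream": ["D", "R", "IY", "M"],
}


def phonemes(text):
    """Flatten a phrase into one phoneme sequence, ignoring word boundaries."""
    out = []
    for word in text.lower().split():
        out.extend(LEXICON[word])  # raises KeyError for words outside the toy lexicon
    return out


def sounds_alike(heard, said):
    """True when a misrecognition would be (nearly) inaudible after resynthesis."""
    return phonemes(heard) == phonemes(said)


if __name__ == "__main__":
    print(sounds_alike("I scream", "ice cream"))
    print(sounds_alike("ice dream", "ice cream"))
```

Errors of the first kind are wrong on the screen but mostly harmless once 
spoken; errors of the second kind are the ones listeners will actually notice.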




