[MUD-Dev2] [Technology] Lend your ears to science!

Jeffrey Kesselman jeffpk at gmail.com
Sat Jun 13 01:57:47 CEST 2009


Well, I'm very interested in speech-to-speech.

Which is to say, the modification of speech to sound like it came from
someone with very different vocal characteristics from the
speaker.  The most advanced modeling in speech production seems to be
in the TTS space, so I'm now wondering about
the possibilities of speech-to-text-to-speech as an approach.

On Mon, Jun 1, 2009 at 1:33 AM, Mike Rozak <Mike at mxac.com.au> wrote:
> "Jeffrey Kesselman" wrote:
>>
>> Are the technologies identified somewhere?
>> I'd be curious to know who is building the most realistic sounding
>> systems and if they are available for development use...
>
> In the test, the voices are anonymous... until the results are made public
> at the beginning of September.
>
> The companies that did best last year were USTC and IBM. Some of the larger
> companies, like AT&T (which also has a good voice), don't participate
> because the marketing consequences of not doing well on the test are worse
> than the marketing benefits of doing well. If you look through last year's
> papers (http://festvox.org/blizzard/blizzard2008.html) and look for the term
> "mean opinion score" (MOS), you can get an idea which ones did better.
>
> I talked with some of the researchers at last year's conference, and (to use
> a double negative) they didn't seem disinterested in supporting games, BUT
> (a) the current money in high-end TTS is coming from telephony, (b)
> consequently, their technology isn't quite right for games. They can tune
> their technology to be more game friendly (such as including lots of voices
> with personality, as opposed to a couple of very-good, happy-sounding,
> telephone-operator voices), but it's a chicken-and-the-egg issue.
>
> Basically, TTS engines aren't targeted at games because no games use TTS
> (except the one I'm working on). Games don't use TTS because (a) TTS engines
> aren't targeted at games, and (b) (a hyperbole) contemporary games are
> about killing things - which doesn't really require speech, just grunts.
> Games are about killing things because (a) that's what the current target
> audience who visits game stores wants, (b) gamepads are designed for
> shooting/killing, (c) 3D accelerators are designed for shooting/killing, (d)
> games about killing things make lots of money and therefore attract lots of
> development money, and (e) to create a game about talking (an alternative to
> killing) you need text-to-speech... and current text-to-speech isn't geared
> towards gameplay... so any sensible person decides that it's far
> easier/wiser to make a game about killing things than to explore new
> territory.
>
> Having said that, at least one of the TTS companies (forgot the name...
> they're working with Vivox) has quite a few voices.
>
> I can go into excruciating detail if you want.
>
>



-- 
~~ Microsoft help desk says: reply hazy, ask again later. ~~


