[MUD-Dev] [TECH] Voice in MO* - Phoneme Decomposition and Reconstruction

Wed May 22 11:08:51 CEST 2002

John Buehler Responds to Daniel:
> Daniel Harman writes:

[Snippage]

>> I must admit that although I own one, I've never actually used it
>> to talk to people. I have my pc in the lounge and don't want to
>> disturb others in my apartment any more than necessary. The other
>> aspect of having real speech in these games is that only one
>> person can talk at once. For someone who can type reasonably
>> fast, speech is likely more a handicap than an enabler.

> With real speech, multiple people can be talking at once, and
> their voices can overlap and be perfectly intelligible.  I'm
> ignoring technical issues relating to delivery of the speech.  A
> modern PC can trivially handle the output of multiple overlapping
> sound streams.  It wasn't clear from your statement where you
> believe multiple voices break down.

In an earlier reply to Raph, where he brought up Cybertown, I
mentioned that it was somewhat confusing to hear speech in a crowded
text-based chat room.  Now I'm doing a full reversal and saying it
is possible (albeit in a slightly different way).  :)

To support the notion of being able to distinguish voices in a
multi-person environment, I just played around with OnLive
Traveller, which uses real speech in a VRML world.  It's basically a
talker.

One of the things that allows voice to work in OnLive is that it
does lip synching, which helps you tell who is speaking.  It also
uses a few other tricks, such as 3D positional audio and distance
attenuation, so even in a crowded room it's quite easy to carry on a
conversation.  It helps further that the software favors sound
sources directly in front of your first-person field of view.
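
As a rough sketch of how distance attenuation and a frontal focus
could combine into one volume per speaker: I don't know how OnLive
actually does it, so the function name, the inverse-distance rolloff,
the cosine weighting, and the parameters ref_dist and min_focus are
all my own assumptions.

```python
import math

def source_gain(listener_pos, facing_deg, source_pos,
                ref_dist=1.0, min_focus=0.3):
    """Return a 0..1 gain for one voice source (hypothetical helper).

    Positions are (x, y) tuples; facing_deg is the direction the
    listener looks, in degrees, measured from the +x axis.
    """
    dx = source_pos[0] - listener_pos[0]
    dy = source_pos[1] - listener_pos[1]
    dist = math.hypot(dx, dy)

    # Inverse-distance rolloff, clamped so sources closer than
    # ref_dist are not boosted above full volume.
    attenuation = ref_dist / max(dist, ref_dist)
    if dist == 0.0:
        # Source is on top of the listener: it has no direction.
        return attenuation

    # Angle between where the listener faces and where the source is.
    offset = math.atan2(dy, dx) - math.radians(facing_deg)

    # cos(offset) is 1 straight ahead and -1 directly behind; map that
    # onto min_focus..1 so voices behind you are quieter, not silent.
    focus = min_focus + (1.0 - min_focus) * (1.0 + math.cos(offset)) / 2.0
    return attenuation * focus
```

With the listener at (0, 0) facing along +x, a speaker one unit ahead
plays at full volume, the same speaker one unit behind plays at
min_focus, and a speaker four units ahead plays at a quarter volume.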

So, I guess spatial sound cues are very important for this to be
pulled off correctly.  Granted, in a text MUD, this might be
problematic since there's no real spatial data associated with
characters.

As for people generally speaking one at a time, it's not required
when in a conversation, but it does occur out of courtesy much as it
does IRL.  Sometimes two people would start to talk at the same
time, but I could still easily make out what was being said.

> Speech input permits two control channels, versus the single
> channel of the keyboard.  With keyboard-only control, I have to
> slice up its use between verbal statements and character control.
> In times of intense character control, I don't say much.  Nobody
> does.  Just running along in the wilderness in a game can be
> dangerous if you choose to make a joke to somebody that takes more
> than a few seconds to complete.  You're running in a straight line
> all that while and you could easily run off a cliff or into a
> wandering monster.

One thing to note though is that although speech is more hands-free
than having to type text, the practicalities of microphones,
breathing, and background noise make automatic voice-detection
methods rather frowned upon.  They tend to transmit your asthma :)

So, at the very least - to be socially accepted (and to conserve
bandwidth) - you'll still need to hold down a key, much like a walkie
talkie.  That is, unless of course you're roleplaying Darth Vader.
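
To make the difference concrete, here is a minimal sketch of the two
gating approaches.  It is an illustration only: rms, frames_to_send,
and the threshold value are names and numbers I made up, and frames
are assumed to be non-empty lists of samples in the range -1..1.

```python
def rms(frame):
    """Root-mean-square level of one non-empty frame of samples."""
    return (sum(s * s for s in frame) / len(frame)) ** 0.5

def frames_to_send(frames, key_down, threshold=None):
    """Return the indices of the frames that would be transmitted.

    key_down[i] says whether the talk key was held during frame i.
    If threshold is given, level detection is used instead and the
    key state is ignored.
    """
    sent = []
    for i, frame in enumerate(frames):
        if threshold is not None:
            # Auto-detection: anything loud enough goes out,
            # including breathing and background noise.
            if rms(frame) >= threshold:
                sent.append(i)
        elif key_down[i]:
            # Push-to-talk: only frames recorded while the key is held.
            sent.append(i)
    return sent

silence = [0.0, 0.0, 0.0, 0.0]
breathing = [0.05, -0.05, 0.05, -0.05]
speech = [0.5, -0.5, 0.5, -0.5]
frames = [silence, breathing, speech]
key_down = [False, False, True]

auto_sent = frames_to_send(frames, key_down, threshold=0.02)
ptt_sent = frames_to_send(frames, key_down)
```

Here the level detector sends both the breathing frame and the speech
frame, while push-to-talk sends only the speech frame.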

TLC

_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
https://www.kanga.nu/lists/listinfo/mud-dev