Having settled on the "sound library" model, we needed to find interesting ways to manipulate the sounds in real time. Support for sound playback in VRML appears to be highly dependent on the platform you are running. Windows NT machines are relatively unpredictable in how they handle sounds and rarely play back multiple sounds simultaneously and cleanly, while SGI Indigos are relatively stable and happily play back multiple audio files without problem. That said, the following are the audio playback parameters that the VRML specification gives us control over, and how we chose to implement them:
Playback speed (relative pitch)
The user controls the playback speed/pitch of the sound file for each of the three voices by moving a fader up or down. For any given voice, the sound can be transposed up to one full octave above or one full octave below its original sampled fundamental pitch by moving the fader to either extreme.
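In VRML 97 terms this corresponds to the pitch field of the AudioClip node, where 1.0 is the original playback speed, 2.0 is one octave up and 0.5 is one octave down. The sketch below shows one way the fader-to-pitch mapping could be wired; the node names (PitchFader, VoiceClip), the sample URL, and the fader's output event are hypothetical, and an ECMAScript-capable browser is assumed for the Script node.

    DEF Voice1 Sound {
      source DEF VoiceClip AudioClip {
        url  "samples/voice1.wav"      # hypothetical sample file
        loop TRUE
      }
    }

    DEF PitchMapper Script {
      eventIn  SFFloat set_fader       # fader position in [0, 1]
      eventOut SFFloat pitch_changed
      url "javascript:
        function set_fader(value, timestamp) {
          // 0 -> 0.5 (octave down), 0.5 -> 1.0, 1 -> 2.0 (octave up)
          pitch_changed = Math.pow(2.0, 2.0 * value - 1.0);
        }"
    }

    ROUTE PitchFader.position_changed TO PitchMapper.set_fader
    ROUTE PitchMapper.pitch_changed   TO VoiceClip.set_pitch

The exponential mapping keeps the fader musically even: each equal step of the fader transposes by the same interval rather than the same number of Hertz.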
Intensity
The user has control over the intensity (volume) of a sound file for each of the three voices by moving a fader up or down. The playback intensity ranges from 0 (inaudible) to 1 (the original volume of the sampled sound file).
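In the scene this is the intensity field of the Sound node, which scales the amplitude of its AudioClip between 0 and 1, so a fader producing values in that range can drive it directly. A minimal sketch, again with hypothetical fader and file names:

    DEF VoiceSound Sound {
      intensity 1.0                    # 0 = inaudible, 1 = original sample volume
      source AudioClip { url "samples/voice1.wav" loop TRUE }
    }

    # the fader already produces values in [0, 1], so no Script node is needed
    ROUTE VolumeFader.position_changed TO VoiceSound.set_intensity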
Spatialization
We chose, at this point, to make all sounds emanate from stationary point sources, even though VRML allows sound sources to be moved in 3-D space. Each of the three voices has its own point of origin, located at a different place in the environment; think of it as immovable loudspeakers hanging on the walls inside the model.
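In practice this just means each voice's Sound node is given a fixed location and is never animated. A sketch of one such node follows; the position and range values are illustrative, not taken from the actual model.

    # one stationary "loudspeaker" per voice, placed on a wall of the model
    DEF Voice1Sound Sound {
      location   -5 2 0                # fixed point source (illustrative position)
      spatialize TRUE                  # browser attenuates/pans relative to the viewer
      minFront 2   maxFront 30        # full volume up close, fading with distance
      minBack  2   maxBack  30
      source AudioClip { url "samples/voice1.wav" loop TRUE }
    }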