Integrated Communication: The Reciprocal Voice-Caption Model
Call for Captioning
As visual communications, specifically video-based communications, become more
prevalent in our day-to-day lives, the adage "A picture is worth a thousand words" takes on
new meaning. One view is that as image and voice play larger roles in our daily
experience, text may become less important and potentially superfluous. However, it is
arguable, and in fact likely, that text will be an integral mode of communication for the
foreseeable future. Text can facilitate video communication in the form of Closed
Captioning. I propose that we should be working towards what I will call a reciprocal
voice-caption model. Such a system would transcribe the speech of multiple users into
text and, in the other direction, produce speech from text.
Reciprocal Voice-Caption Model: Definition
This system could be an integrated computer and hardware configuration or diverse tools
that are used in concert. The transcription function should be able to transcribe
multi-participant conversations, display the text in caption format for analog or digital
broadcast, and archive the text in an ASCII file that could be used for a variety of
purposes. This system should accommodate users in local and remote locations and
require little or no human mediation of the technology. The production function should
incorporate a "read back" feature that reads aloud transcribed excerpts from the
conversation. In addition, the system should be able to read aloud non-transcribed text
that users input.
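
To make the definition concrete, the transcription and read-back functions can be
sketched as a small data model. This is only an illustration under stated assumptions:
the names Utterance, Transcript, caption_lines, archive_ascii and read_back are
hypothetical, speech recognition and speech synthesis are left out entirely, and the
read-back method merely returns the text that a synthesizer would be asked to speak.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Utterance:
    """One transcribed turn in a conversation (hypothetical structure)."""
    speaker: str   # participant label, local or remote
    start: float   # seconds since the start of the conversation
    text: str      # words produced by the (assumed) transcription step


@dataclass
class Transcript:
    """Holds a multi-participant conversation in spoken order."""
    utterances: List[Utterance] = field(default_factory=list)

    def add(self, speaker: str, start: float, text: str) -> None:
        self.utterances.append(Utterance(speaker, start, text))

    def caption_lines(self) -> List[str]:
        # Caption format: speaker label in capitals, then the spoken text.
        return [f"{u.speaker.upper()}: {u.text}" for u in self.utterances]

    def archive_ascii(self) -> str:
        # Archive format: one timestamped line per utterance, restricted to
        # ASCII; any non-ASCII character is replaced with "?".
        body = "\n".join(
            f"[{u.start:07.1f}] {u.speaker}: {u.text}" for u in self.utterances
        )
        return body.encode("ascii", errors="replace").decode("ascii")

    def read_back(self, start: float, end: float) -> str:
        # Text of every utterance that began within [start, end]; a real
        # system would pass this string to a speech synthesizer.
        return " ".join(
            u.text for u in self.utterances if start <= u.start <= end
        )
```

In this sketch a single record of the conversation feeds all three outputs (captions,
archive, and read-back), which is one way to keep them consistent with each other.
Non-transcribed text that users type in could be sent to the same synthesizer directly,
without passing through the transcript.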
Reciprocal Voice-Caption Model: Rationale
Integrating text into audio-video interactions is worthwhile for a variety of reasons.
Captions or subtitles facilitate understanding of spoken words, even for native speakers
of a given language. This may be particularly helpful for environments where audio
quality is questionable. The caption text itself could be useful for users as a record of a
meeting, and could also provide the building blocks for future documents. This system
would additionally benefit hearing- and visually impaired members of a collaborative
community. Finally, captioning opens up new possibilities for browsing information. It is
conceivable that users could search captions for relevant keywords or even more easily
follow several conversations simultaneously.
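
As a rough illustration of keyword search over captions, the sketch below assumes
that each archived caption is stored as a (time in seconds, speaker, text) tuple. That
layout and the function name search_captions are assumptions made for this example,
not part of any existing captioning system.

```python
from typing import List, Tuple

# Assumed caption record: (seconds from start, speaker label, caption text).
Caption = Tuple[float, str, str]


def search_captions(captions: List[Caption], keyword: str) -> List[Caption]:
    """Return the captions whose text contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [c for c in captions if needle in c[2].casefold()]
```

Because each match keeps its timestamp and speaker, a user could jump to the
corresponding point in a recording, or scan the matches from several conversations
side by side.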