Text to Speech in the Living Room
"Insufficient data."
Those words, and the speech synthesis technology behind them, signaled perhaps the greatest advance of the starship Enterprise's onboard computer. Look past the chilliness of the phrasing: the digital device was talking, not delivering its output through a printout or on a display monitor.
Of course, it's been a lot of TV years since the Enterprise flew its maiden voyages. We've all heard talking computers since then, but the voice has always sounded machine-like and the applications remain on the gimmicky side.
Now, as the computer invades the home entertainment space in digital sound systems, audio-visual networks, smart TVs and set-top boxes, it's reasonable to ask whether there is a place for speech synthesis in the living room. The answer is decidedly yes, though current reality still falls somewhat short of the full promise of text-to-speech (TTS) applications.
This article will survey how TTS can serve your needs and some of the products that already use the technology.
What's That Song, Mr. DJ?
One of the most obvious uses for TTS in the living room (or in the car) would be to have your own virtual DJ, either back-announcing or previewing a set of music tracks.
cd3o's media hub, the c300, wirelessly transports MP3, WMA and WAV audio files from your PC to your stereo system. A key feature of the cd3o hub is CD-DJ, a synthesized voice that will happily read out as much of the metadata (song, artist and recording information, lyrics, etc.) attached to the music file as you'd like to hear.
You can cue CD-DJ on demand via the remote control (so if you can't remember who an artist is, you can get an immediate reminder) or set CD-DJ to automatically intro programmed sets of music (with the bonus that you won't ever hear a commercial or public radio pledge spot).
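To make the idea concrete, here is a minimal Python sketch of how a virtual DJ might turn a track's metadata into a sentence for a speech engine to read. It is not cd3o's implementation; the function name, the dictionary keys and the phrasing are all our own assumptions for illustration.

```python
def build_announcement(track, mode="back", include=("artist", "album")):
    """Compose one sentence a TTS engine could speak for a track.

    track   -- dict of metadata; the keys "title", "artist" and "album"
               are assumed here, not taken from any real product
    mode    -- "back" to back-announce, anything else to preview
    include -- which optional fields the listener wants to hear
    """
    lead = "That was" if mode == "back" else "Coming up,"
    sentence = f"{lead} {track.get('title', 'an untitled track')}"
    # Add optional details only when requested and actually present.
    if "artist" in include and track.get("artist"):
        sentence += f" by {track['artist']}"
    if "album" in include and track.get("album"):
        sentence += f", from the album {track['album']}"
    return sentence + "."
```

The `include` argument mirrors the idea of hearing only as much of the metadata as you like: pass `include=("artist",)` and the album is left out.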
Mike and Crystal
CD-DJ is understandable, but nobody would describe its DJ voice as soothing or lilting. Speech synthesis technology continues to improve, however, and digitized voices that come close to natural human speech are now possible. AT&T, for example, has introduced Natural Voices, which you can sample free at naturalvoices.att.com. Its voices, Crystal and Mike, still have a synthetic sound that will fool few people, but the technology is good. AT&T Natural Voices offers a TTS engine that plugs into many software or digital applications, and voices like Mike and Crystal make a real difference in other TTS applications.
Read Me My E-mail
A company called NextUp (www.nextup.com) markets TextAloud, a downloadable software program that converts any textual material, such as a Webpage, a Word document, an e-mail or an e-book, into an MP3 file. This opens the door to a wide variety of uses. Say, for example, that you're taking an online class and need to get this week's online lecture, which is normally a series of Webpages that you read. With TextAloud, you could easily capture the lecture text, convert it to an MP3, move the file to the living room through your network or media hub and listen to the lecture while you're playing with your child, folding the laundry or watering the plants.
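The "capture the lecture text" step is the part most easily pictured in code. The sketch below uses only Python's standard `html.parser` module to pull the readable text out of a Webpage so it could be handed to a text-to-speech program. It says nothing about how TextAloud works internally; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML page, skipping scripts and styles."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # greater than 0 while inside <script> or <style>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def page_to_text(html):
    """Return the readable text of an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The resulting string is what you would paste or pipe into a TTS program. A real page would need more care (navigation menus, ads, footers), which this sketch does not attempt.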
Or take inexpensive e-books. TextAloud can turn these into MP3 audio-books, perfect for travel with an iPod or other portable MP3 player. How often do you encounter a lengthy article online that you don't have time to read at the moment? With TextAloud, you could convert the entire article and then have it read to you from your stereo system. Or you could use TiVo's new Home Media Option in tandem with TextAloud to store and manage dozens of TTS MP3s. Imagine your TiVo reading exercise instructions, movie reviews or online newspaper articles.
'The Wire' Comes On In 5 Minutes
Because A/V networks and media hubs allow you to trigger digital media on a timer, you could record personal voice reminders to go off during the day. At 4pm, Julie's reminded to do her homework. At 7pm, the family's reminded to take the dog for a walk. And at 9:55pm, you and your spouse are reminded that a favorite HBO show will be starting in a few minutes.
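A timer like this boils down to a small schedule that is checked periodically. The Python sketch below shows one way to decide which reminders are due. The schedule, the wording of the messages and the function name are our own illustrative assumptions, and actually speaking the text is left to whatever TTS engine or media hub you use.

```python
from datetime import time

# Hypothetical household schedule: (trigger time, text to be spoken).
REMINDERS = [
    (time(16, 0), "Julie, it's time to do your homework."),
    (time(19, 0), "Time to take the dog for a walk."),
    (time(21, 55), "Your show starts in five minutes."),
]


def due_reminders(now, last_checked, reminders=REMINDERS):
    """Return messages whose trigger time falls after last_checked, up to and including now.

    Tracking the previous check time means a reminder fires once,
    even if the schedule is not polled at the exact minute.
    """
    return [msg for when, msg in reminders if last_checked < when <= now]
```

A device would call `due_reminders` every minute or so with the current time and the time of its previous check, then pass each returned message to the speech engine. This sketch does not handle a check window that crosses midnight.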
NextUp also distributes WeatherAloud, StocksAloud and NewsAloud for the PC. As their names imply, these programs aggregate textual data in order to verbalize the customized information you've requested (for example, today's closing prices on your stocks or local weather forecasts). As of now, these other NextUp programs do not have an export-to-MP3 feature, but that is likely to change.
TextAloud isn't the only program in its category. Shareware text-to-MP3 programs exist, and you can find them easily through a Google search. Even so, the commercial polish and ease of use TextAloud offers may well justify its modest cost.
The Future
We expect some variation of cd3o's CD-DJ to become a standard feature of home A/V networks. And while TextAloud does make it possible to have nearly any textual information read to us in our living room, it admittedly takes a couple of steps to get there.
In the very near future, however, we should see TTS integrated thoroughly into every audio and video device we use. Audio reminders and readings of customized Electronic Program Guides are likely to become standard features of TiVo-like devices. Digital media servers should provide audio instructions for set-up and use on demand. Your home network could read a customized round-up of real-time football scores at the half-hour. Your daughter's school essay could be read to you while you open and discard junk mail.
The voice we hear, of course, could be our own. But a little imagination suggests it could also be SpongeBob's or Johnny Depp's or Queen Latifah's. For the sight-impaired, the full-scale adoption of inexpensive TTS in the living room is long overdue. For the rest of us, TTS is going to be both useful and fun, and something that will soon seem indispensable.
