Pittsburgh (PA) – Carnegie Mellon University (CMU) is developing a real-time translation device that works with facial movements instead of audible speech. Speakers mouth their words silently while electrodes pick up the facial movements and the system translates them into another language. The translated words are then played through computer speakers or headphones in real time, so that to an onlooker the exchange looks like a normal conversation.
The device relies on sub-vocalization: tiny, almost imperceptible muscle movements in the face and neck. CMU's device is not exactly mechanical lip reading, the trick movie audiences saw when the HAL 9000 computer eavesdropped on crew conversations in 2001: A Space Odyssey. Currently the electrodes must be attached to the skin, but a future system could conceivably work entirely with cameras.
Traditional speech translation systems require speakers to talk and then wait while the computer translates their words. Such systems are in high demand, and the U.S. military is currently testing IBM's "Mastor" computer-based translation system, which works on audible speech. CMU's device, in contrast, translates as you talk.
CMU has built two devices, one that translates Chinese into English and another that translates English into Spanish and German. So far the vocabulary database is quite small, 100 to 200 words, and the machines are about 80% accurate, but the researchers eventually hope to achieve full translation.
One of the problems with computerized translation, and indeed with translation in general, is that literal word-for-word translations sound strange when a phrase is an idiom or colloquialism. For example, Americans describe a comparison of dissimilar things as "comparing apples and oranges," while Germans would say "apples and pears." In addition, colloquial phrases often vary even between regions of the same country.
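The idiom problem can be illustrated with a toy sketch (this is purely hypothetical and is not CMU's system): a translator that only substitutes individual words produces a literal rendering, while one that checks a phrase table first can map the whole idiom to its conventional equivalent in the target language. The dictionaries below are invented for illustration.

```python
# Toy illustration of literal vs. idiom-aware translation.
# Both dictionaries are hypothetical, not from any real system.

# Word-by-word English -> German dictionary.
WORDS = {"apples": "Äpfel", "and": "und", "oranges": "Orangen"}

# Phrase table mapping whole idioms to the target language's
# conventional equivalent ("apples and pears" in German).
PHRASES = {"apples and oranges": "Äpfel und Birnen"}

def translate_literal(text: str) -> str:
    """Substitute each word independently; idioms come out garbled."""
    return " ".join(WORDS.get(w, w) for w in text.lower().split())

def translate_idiom_aware(text: str) -> str:
    """Prefer a whole-phrase match; fall back to word-by-word."""
    key = text.lower()
    return PHRASES.get(key) or translate_literal(key)

print(translate_literal("apples and oranges"))      # literal: Äpfel und Orangen
print(translate_idiom_aware("apples and oranges"))  # idiomatic: Äpfel und Birnen
```

The literal output is understandable but unnatural to a German speaker, which is exactly the failure mode described above; regional variation compounds the problem, since the phrase table itself would differ by region.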