The Problem with Research on ‘Real-Time’ Speech-to-Text AI Translation

By: Ana Moirano

Despite advancements in AI speech translation, many so-called “real-time” systems may not be as real-time as they claim.

A new study finds that much of the research in simultaneous speech-to-text translation (SimulST) is based on unrealistic assumptions that do not reflect real-world conditions — potentially limiting the industry’s ability to deploy truly live, low-latency translation solutions.

In a paper dated December 24, 2024, Sara Papi of Fondazione Bruno Kessler and Peter Polák, Ondřej Bojar, and Dominik Macháček of Charles University reviewed 110 papers on SimulST. They found that the majority focus on translating pre-segmented speech, in which the input has been manually split into short utterances before translation, rather than continuous, unbounded speech streams.

The researchers argue that this “narrow focus” simplifies the problem by avoiding challenges such as latency, segmentation, and synchronization, ultimately hindering the development of systems that can work in real time without human intervention. 

“Despite its intended application to unbounded speech, most research has focused on human pre-segmented speech, simplifying the task and overlooking significant challenges,” the researchers said.


Source: Slator


