SlatorPod

#130 Yasmin Moslem on Using Large Language Models to Custom Train Machine Translation

September 02, 2022 Slator

Machine translation (MT) researcher Yasmin Moslem joins SlatorPod to talk about her research on Domain-Specific Text Generation for Machine Translation, a project she conducted with Rejwanul Haque, John D. Kelleher, and Andy Way at the ADAPT Centre in Dublin.

Yasmin shares her experience working as a translator, discovering translation productivity (CAT) tools, and experimenting with translation memory to improve MT. She breaks down the paper’s approach to domain-specific MT training using back-translation for data augmentation.

She discusses how some LSPs are already implementing this approach in real life, customizing it for different use cases. She explains why the authors used a combination of BLEU, COMET, and other quality evaluation frameworks, as well as human evaluation, to rate machine translation quality.

Yasmin concludes the podcast with her advice for those in the core industry looking to enter the machine translation space, from the spiral learning process to reading research papers.

First up, Florian and Esther discuss the language industry news of the week, including how a streaming platform used proprietary machine dubbing technology for its film offerings in the first quarter of 2022.

Over in London, TransPerfect acquired a virtual data room (VDR) tech company to proactively address the VDR market. In transcription news, VIQ Solutions’ shares dipped by 20% despite the company reporting strong half-year revenue growth of 45% year on year. Meanwhile, multilingual captioning provider Ai-Media turned EBITDA-profitable as a 2021 acquisition drove revenue growth.

Agenda and Intro
Machine Dubbing in Real Life
TransPerfect Addresses VDR Market
Ai-Media and VIQ Solutions Financial Results
Yasmin Moslem Joins the Pod
Academic and Professional Background
Domain-Specific Text Generation for MT
Medical and Financial Use Cases
Synthetic Text Generation
Application of Machine Translation Research
Real-Life Application by LSPs
Using BLEU as a Quality Measure
Breaking Into the Machine Translation Industry