SlatorPod
SlatorPod is the weekly language industry podcast where we discuss the most important news and trends in translation, localization, interpreting, and language AI. Brought to you by Slator.com.
#250 HeyGen CTO Rong Yan on AI Video Generation and the Language Challenge
Rong Yan, CTO of HeyGen, joins SlatorPod to recount the company’s transformation from a Metaverse-focused startup into a leader in the emerging field of AI video generation.
Rong describes HeyGen’s beginnings and the pivot to its current avatar model, which took annual recurring revenue (ARR) from zero to USD 1m within six months.
Rong attributes HeyGen’s success to its emphasis on three key elements: quality, consistency, and controllability. The company’s newest model, Avatar IV, enables full-body video generation with natural gestures, synchronized audio, and emotion that matches the speech.
While some of the platform’s growth has been viral, Rong believes sustained success comes from building something users truly value, with a focus on pushing video quality from 70% to 95%.
The platform extends beyond avatars, offering translation, voice cloning, and real-time interactivity. Its dynamic duration feature adjusts translated speech to fit original video timing, preserving realism. Rather than build everything from scratch, HeyGen integrates external models with its own orchestration and user data, optimizing output across languages and contexts.
Rong emphasizes that HeyGen’s long-term vision is not to serve entertainment or Hollywood, but to help everyday professionals who lack traditional video production skills, especially marketers and educators.
Looking ahead, Rong sees video agents (tools that generate complete videos from simple prompts) as the next frontier, one that will drive accessibility and transform storytelling through AI.