Tencent Hunyuan Releases State-of-the-Art Multilingual Translation Models: Hunyuan-MT-7B and Chimera-7B

Introduction

Tencent’s Hunyuan team has released two multilingual machine translation models: Hunyuan-MT-7B and Hunyuan-MT-Chimera-7B. The models were entered in the WMT2025 General Machine Translation shared task, where Hunyuan-MT-7B ranked first in 30 of 31 language pairs. That result suggests a 7-billion-parameter model can be competitive across a wide range of languages, including low-resource ones.

Model Overview

Hunyuan-MT-7B

Hunyuan-MT-7B has 7 billion parameters and supports mutual translation across 33 languages, including several Chinese ethnic minority languages such as Tibetan, Mongolian, Uyghur, and Kazakh. It is designed for both high-resource and low-resource translation tasks and achieves state-of-the-art results among models of similar size.
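
As a rough illustration of how a translation request to a model like this might be phrased, the sketch below builds a plain-text prompt for a given target language. The function name build_translation_prompt and the prompt wording are assumptions made for this example; the article does not specify a prompt format, so the model's own documentation should be checked before relying on it.

```python
# Hypothetical helper: the prompt wording is an assumption for illustration,
# not a template confirmed by the article.
def build_translation_prompt(text: str, target_language: str) -> str:
    """Wrap source text in a short translation instruction."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("text must not be empty")
    return (
        f"Translate the following segment into {target_language}, "
        "without additional explanation.\n\n"
        f"{cleaned}"
    )


# Build a prompt for one of the minority languages the article mentions.
prompt = build_translation_prompt("The clinic opens at nine.", "Tibetan")
print(prompt)
```

The resulting string would be passed to whatever inference stack hosts the model; loading and generation are left out here because they depend on that stack.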

Hunyuan-MT-Chimera-7B

Hunyuan-MT-Chimera-7B, in contrast, adds a weak-to-strong fusion step. At inference time it takes multiple candidate translations and combines them into a single refined output, using reinforcement learning and aggregation techniques. It is described as the first open-source translation model of its kind, and it improves translation quality beyond what single-system outputs achieve.
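
The data flow of such a fusion step can be sketched as follows: several candidate translations are deduplicated, numbered, and handed to a fusion model together with the source text. This is a minimal sketch under stated assumptions, not Tencent's implementation. The names build_fusion_input, fuse, and toy_fusion_model are hypothetical, the prompt wording is invented for illustration, and the toy model simply returns the first candidate so the example runs without model weights.

```python
from typing import Callable, List


def build_fusion_input(source: str, candidates: List[str], target_language: str) -> str:
    """Format the source text and numbered candidates into one fusion prompt."""
    # Drop empty candidates and exact duplicates while keeping order.
    unique = list(dict.fromkeys(c.strip() for c in candidates if c.strip()))
    if not unique:
        raise ValueError("need at least one non-empty candidate")
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(unique, start=1))
    return (
        f"Source text:\n{source}\n\n"
        f"Candidate {target_language} translations:\n{numbered}\n\n"
        "Write one refined translation that keeps the best parts of the candidates."
    )


def fuse(source: str, candidates: List[str], target_language: str,
         fusion_model: Callable[[str], str]) -> str:
    """Ask the fusion model (any callable from prompt to text) for one output."""
    return fusion_model(build_fusion_input(source, candidates, target_language)).strip()


def toy_fusion_model(prompt: str) -> str:
    """Stand-in for a real fusion model: returns candidate number 1 unchanged."""
    for line in prompt.splitlines():
        if line.startswith("1. "):
            return line[3:]
    return ""


candidates = ["你真要把我笑死了", "你要杀了我", "你真要把我笑死了"]
print(fuse("You are killing me", candidates, "Chinese", toy_fusion_model))
```

Passing the model as a callable keeps the sketch independent of any particular inference library; a real fusion model would replace toy_fusion_model.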

Training Framework

The training of these models follows a comprehensive five-stage framework tailored for translation tasks:

  • General Pre-training: Uses 1.3 trillion tokens covering 112 languages and dialects, with tagging systems used to keep the data diverse.
  • MT-Oriented Pre-training: Uses high-quality monolingual corpora from mC4 and OSCAR, filtered for relevance.
  • Supervised Fine-Tuning (SFT): Runs in two stages using around 3 million parallel pairs, selected through automated scoring and manual verification.
  • Reinforcement Learning (RL): Optimizes the model against reward functions that score translation quality.
  • Weak-to-Strong RL: Used for the Chimera model only; multiple candidate outputs are generated and aggregated based on their rewards.
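
The data selection in the SFT stage can be pictured as a score-based filter that passes the surviving pairs on to manual verification. The sketch below is an assumption-laden illustration: the ParallelPair type, the select_pairs function, the 0 to 1 score range, and the 0.8 threshold are all hypothetical, since the article does not say which scorer or cutoff was used.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ParallelPair:
    source: str
    target: str
    score: float  # assumed automated quality score in [0, 1]


def select_pairs(pairs: Iterable[ParallelPair], min_score: float = 0.8) -> List[ParallelPair]:
    """Keep non-empty pairs whose automated score meets the threshold."""
    kept = []
    for pair in pairs:
        if not pair.source.strip() or not pair.target.strip():
            continue  # discard pairs with a missing side
        if pair.score >= min_score:
            kept.append(pair)
    return kept


pool = [
    ParallelPair("uric acid kidney stones", "尿酸肾结石", 0.93),
    ParallelPair("Good morning", "", 0.99),   # missing target side
    ParallelPair("See you soon", "再见", 0.41),  # low automated score
]
for pair in select_pairs(pool):
    print(pair.source, "->", pair.target)
```

In this toy pool only the first pair survives; in a real pipeline the kept pairs would then go to human reviewers, as the article describes.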

Benchmark Results

Automatic Evaluation

On the WMT24pp benchmark, Hunyuan-MT-7B scored 0.8585 (XCOMET-XXL), ahead of larger models such as Gemini-2.5-Pro (0.8250) and Claude-Sonnet-4 (0.8120). On FLORES-200 it scored 0.8758, outperforming open-source baselines such as Qwen3-32B (0.7933).
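
To put those gaps in relative terms, the short calculation below converts the reported scores into percentage margins over each baseline. The scores are the ones quoted above; the relative_gain helper is a hypothetical name introduced for this example, and the margins are simple ratios of metric values, not a statement about statistical significance.

```python
def relative_gain(score: float, baseline: float) -> float:
    """Percentage by which score exceeds baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (score - baseline) / baseline * 100


# Scores as reported in the article.
wmt24pp = {"Gemini-2.5-Pro": 0.8250, "Claude-Sonnet-4": 0.8120}
flores200 = {"Qwen3-32B": 0.7933}

for name, baseline in wmt24pp.items():
    print(f"WMT24pp vs {name}: {relative_gain(0.8585, baseline):.2f}%")
for name, baseline in flores200.items():
    print(f"FLORES-200 vs {name}: {relative_gain(0.8758, baseline):.2f}%")
```

The margins come out at roughly 4.1% over Gemini-2.5-Pro and 5.7% over Claude-Sonnet-4 on WMT24pp, and about 10.4% over Qwen3-32B on FLORES-200.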

Comparative Results

Hunyuan-MT-7B outperformed Google Translate by 15–65% across various evaluation categories. It also surpassed specialized translation models such as Tower-Plus-9B and Seed-X-PPO-7B, the former despite having more parameters. Chimera-7B added a further 2.3% improvement on the FLORES-200 benchmark.

Human Evaluation

On a custom evaluation set covering diverse domains, Hunyuan-MT-7B achieved an average human-rated score of 3.189, close to the quality of larger proprietary models. This suggests it is suitable for practical, real-world use.

Case Studies

Several real-world case studies illustrate the capabilities of these models:

  • Cultural References: The model accurately translates “小红薯” as the platform “REDnote,” showcasing its understanding of context.
  • Idioms: It renders “You are killing me” as “你真要把我笑死了” (roughly, “you are making me die of laughter”), avoiding a literal mistranslation.
  • Medical Terms: The model precisely translates complex terms like “uric acid kidney stones.”
  • Minority Languages: It produces coherent translations for languages such as Kazakh and Tibetan, which are often underrepresented in translation technology.
  • Chimera Enhancements: The model excels in translating gaming jargon and sports terminology, demonstrating its versatility.

Conclusion

Tencent’s release of Hunyuan-MT-7B and Hunyuan-MT-Chimera-7B sets a new benchmark for open-source translation models. By integrating a meticulously designed training framework with a focus on low-resource and minority language translation, these models achieve quality that rivals or exceeds that of larger closed-source systems. This launch not only provides the AI research community with powerful tools for multilingual translation but also opens doors for further advancements in the field.

FAQ

1. What are the main features of Hunyuan-MT-7B?

Hunyuan-MT-7B supports translation across 33 languages, including minority languages, and is optimized for both high-resource and low-resource tasks.

2. How does Hunyuan-MT-Chimera-7B improve translation quality?

It combines multiple translation outputs using reinforcement learning and aggregation techniques, enhancing the overall translation quality.

3. What is the significance of the training framework used for these models?

The five-stage training framework ensures comprehensive learning from diverse data sources, improving the models’ performance across various languages and contexts.

4. How do these models compare to existing translation tools?

Hunyuan-MT-7B outperforms popular tools such as Google Translate by 15–65% across evaluation categories, indicating better accuracy and contextual understanding.

5. Can these models handle specialized terminology?

Yes, the models have shown proficiency in translating specialized terms in fields such as medicine and gaming, making them versatile for various applications.

Vladimir Dyachkov, Ph.D
Editor-in-Chief itinai.com
