# LFM2 English-Korean Translation: Efficient Bidirectional Translation with 1.2B Parameters
## What’s inside?
An efficient bidirectional translation system powered by Liquid AI’s LFM2 1.2B model, fine-tuned for Korean-English translation. This project demonstrates how domain-specific fine-tuning can achieve superior performance to models 3x larger, outperforming Google’s Gemma-3 4B and Alibaba’s Qwen3 4B on the Flores-200 benchmark.

Key features:

- Automatic language detection - intelligently detects the input language and translates in the appropriate direction
- High-quality translation - chrF++ 32.96 / BLEU 12.05 on the Flores-200 benchmark
- Efficient inference - Runs on modest hardware with merged adapters for speed
- Easy-to-use CLI - Simple command-line interface powered by Fire
## Quick Start
- Clone the repository
- Install dependencies
- Run the translation example (see the sketch below)
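The repository URL and entry-point script below are assumptions for illustration; substitute the actual ones:

```bash
# Hypothetical repository URL and script name - adjust to the real layout.
git clone https://github.com/gyung/lfm2-koen-translation.git
cd lfm2-koen-translation
pip install -r requirements.txt

# Run a translation (Fire exposes the Python function as a CLI)
python translate.py --text "Hello, world!"
```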
## Understanding the Architecture
The system uses a two-stage training approach:

- Supervised Fine-tuning (SFT): 100K high-quality Korean-English parallel sentence pairs establish the translation foundation
- Reinforcement Learning (RL): GRPO optimization with 10K additional samples refines translation quality (a training sketch follows this list)
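The exact reward design isn’t documented in this README, so the following is only a minimal sketch of GRPO training with a chrF++-based reward, assuming TRL’s `GRPOTrainer`, a `prompt`/`reference` dataset layout, and illustrative hyperparameters:

```python
from datasets import Dataset
from peft import LoraConfig
from sacrebleu import sentence_chrf
from trl import GRPOConfig, GRPOTrainer

# Toy dataset; the real run used 10K additional samples (column names are assumptions).
train_dataset = Dataset.from_dict({
    "prompt": ["Translate to Korean: The weather is nice today."],
    "reference": ["오늘은 날씨가 좋습니다."],
})

def chrf_reward(prompts, completions, reference, **kwargs):
    # TRL forwards extra dataset columns (here: "reference") to reward functions.
    # word_order=2 scores chrF++ rather than plain chrF; scale to [0, 1].
    return [sentence_chrf(c, [r], word_order=2).score / 100
            for c, r in zip(completions, reference)]

trainer = GRPOTrainer(
    model="gyung/lfm2-1.2b-koen-mt-v4-100k",      # start from the SFT checkpoint
    reward_funcs=chrf_reward,
    args=GRPOConfig(output_dir="grpo-out", num_generations=8),
    train_dataset=train_dataset,
    peft_config=LoraConfig(r=16, lora_alpha=32),  # yields a LoRA adapter, as released
)
trainer.train()
```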
### Model Components
- Base Model: `gyung/lfm2-1.2b-koen-mt-v4-100k` - SFT fine-tuned LFM2 1.2B
- Adapter: `gyung/lfm2-1.2b-koen-mt-v5-rl-10k-adapter` - LoRA adapter trained with GRPO
- Automatic Detection: Regular expression pattern matching for Korean text (Hangul syllables, Jamo); see the sketch below
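A minimal sketch of how these components fit together, assuming the standard transformers/peft loading path (the exact Unicode ranges and dtype are assumptions):

```python
import re

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "gyung/lfm2-1.2b-koen-mt-v4-100k"
ADAPTER_ID = "gyung/lfm2-1.2b-koen-mt-v5-rl-10k-adapter"

# Hangul syllables (U+AC00-U+D7A3) and Jamo blocks (U+1100-U+11FF, U+3130-U+318F);
# the exact ranges used by the project are an assumption.
HANGUL_RE = re.compile(r"[\uac00-\ud7a3\u1100-\u11ff\u3130-\u318f]")

def detect_direction(text: str) -> str:
    """Korean input -> translate to English; anything else -> Korean."""
    return "ko-en" if HANGUL_RE.search(text) else "en-ko"

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, ADAPTER_ID)
model = model.merge_and_unload()  # fold LoRA weights into the base for faster inference
```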
## CLI Usage

### Example Usage
Translate English to Korean:
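The entry-point name below is an assumption; with Fire, function parameters map directly to command-line flags:

```bash
python translate.py --text "The weather is nice today."
# Illustrative output: 오늘 날씨가 좋네요.
```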
## Performance Benchmarks

On the Flores-200 benchmark (1,012 samples):

| Model | Parameters | chrF++ | BLEU |
|---|---|---|---|
| LFM2-KoEn-v5-RL | 1.2B | 32.96 | 12.05 |
| Gemma-3-4B | 4B | 32.83 | 11.36 |
| Qwen3-4B | 4B | 25.62 | 7.46 |
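Both metrics can be reproduced with the sacrebleu library; the sentences below are illustrative placeholders for the 1,012 system outputs and their references:

```python
from sacrebleu.metrics import BLEU, CHRF

# Placeholders; in practice these are the Flores-200 model outputs and references.
hypotheses = ["오늘 날씨가 좋네요."]
references = [["오늘은 날씨가 좋습니다."]]  # one reference stream

chrf = CHRF(word_order=2)  # word_order=2 is what makes this chrF++ rather than chrF
bleu = BLEU()

print(chrf.corpus_score(hypotheses, references))
print(bleu.corpus_score(hypotheses, references))
```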
## Further Improvements
Next steps for enhanced performance and efficiency:

- Speed optimization with quantization techniques (GGUF, AWQ, GPTQ)
- llama.cpp integration for faster CPU inference
- Full parameter RL training with expanded compute resources
- Removal of length normalization (in the GRPO objective), based on recent Qwen team findings
- Extended dataset training with 200K SFT + 25K RL samples
## Performance Optimization
The current implementation uses adapter merging for faster inference. Future improvements include:

- Quantized model variants for resource-constrained environments
- Streaming inference for real-time translation (see the sketch below)
- Batch processing for large document translation
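Streaming is future work; one plausible route, assuming the transformers generation API and the merged model/tokenizer from the sketch above, is `TextIteratorStreamer`:

```python
from threading import Thread

from transformers import TextIteratorStreamer

def translate_streaming(model, tokenizer, prompt: str, max_new_tokens: int = 256):
    """Yield translated text chunk by chunk as the model generates it."""
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    thread = Thread(
        target=model.generate,
        kwargs={**inputs, "streamer": streamer, "max_new_tokens": max_new_tokens},
    )
    thread.start()
    for chunk in streamer:
        yield chunk  # print or forward to a UI as it arrives
    thread.join()
```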