I went down a rabbit hole this week researching local embedding models for Rhizome. The idea was simple: find a model that could power semantic search across both desktop and mobile, running entirely on-device. No cloud calls, no API keys—just pure Local-First embeddings.
The short answer? There’s no silver bullet right now. Nothing I found satisfactorily covers both mobile and desktop for Korean/English bilingual use. So I’m shelving this feature for now. But the research itself was worth doing, and I want to share what I learned.
The MTEB Benchmark: Finally, Apples-to-Apples
The MTEB (Massive Text Embedding Benchmark) has become the go-to for comparing embedding models. It tests across hundreds of languages and various downstream tasks—classification, clustering, retrieval, semantic similarity—giving you a much more honest picture than cherry-picked demo results. (Hugging Face Leaderboard)


The TL;DR
After digging through the benchmarks, the landscape boils down to three trade-offs:
- Want decent accuracy + multilingual quality? multilingual-E5-large consistently ranks well. It's trained with instruction-based prompts and handles Korean/English well. (GitHub)
- Need long-context retrieval + document search? BGE (BAAI General Embedding) is the favorite. It supports up to 8K tokens and is optimized for modern RAG pipelines. (Bizety)
- Running on a potato? MiniLM-L6 is still king for speed-per-quality in CPU-only environments. (Bizety)
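Whichever model wins, the search step downstream is the same: embed the query, then rank documents by cosine similarity. A minimal stdlib sketch of that ranking, with toy 3-dimensional vectors standing in for real model output (the function names and vectors here are illustrative, not from any of the libraries above):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def search(query_vec, doc_vecs, top_k=3):
    # Rank document embeddings by similarity to the query embedding.
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]

# Toy vectors; a real model would emit 384- or 1024-dim floats.
docs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.1]]
query = [1.0, 0.0, 0.0]
print(search(query, docs, top_k=2))  # doc 0 ranks first, then doc 2
```

For a few thousand notes, this brute-force scan is fast enough; an ANN index only becomes interesting at much larger scale.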

Model Breakdown
- multilingual-E5-large
  - 1024-dimensional vectors
  - Strong at general retrieval and semantic similarity, especially for multilingual use cases (GitHub)
- BGE (BAAI General Embedding)
  - Long-context, high-precision retrieval
  - Built for RAG systems; handles hybrid search and multilingual queries (Bizety)
- MiniLM-L6
  - Small dimensions, blazing-fast inference
  - Great for desktop KMS apps or any local system where latency matters more than peak accuracy (Bizety)
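One practical detail on the "instruction-based prompts" in E5: per the multilingual-E5 model card, every input must carry a role prefix ("query: " for searches, "passage: " for documents), and skipping it degrades retrieval quality. A sketch of that convention (`e5_format` is a hypothetical helper, not part of any library):

```python
# E5 models expect a role prefix on every input string.
# Skipping it is a common silent failure mode in E5-based pipelines.

def e5_format(text: str, role: str) -> str:
    """Prepend the E5 role prefix. role is 'query' or 'passage'."""
    if role not in ("query", "passage"):
        raise ValueError(f"unknown E5 role: {role}")
    return f"{role}: {text}"

print(e5_format("best note-taking workflow", "query"))
# The formatted strings would then go to the model's encode step,
# e.g. via sentence-transformers with "intfloat/multilingual-e5-large".
```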

So Why Not Just Ship It?
Here’s the problem: the models that are accurate enough (E5-large, BGE) are too heavy for mobile. And the ones that are light enough for mobile (MiniLM) aren’t great at Korean. For a bilingual KMS that’s supposed to run everywhere, that’s a dealbreaker.
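"Too heavy" is mostly about model weights, but even the vector index itself grows with embedding width. Some back-of-the-envelope math for a hypothetical 50k-note knowledge base, using raw float32 storage with no compression or ANN overhead (the note count is made up; the dimensions are the models' real output sizes):

```python
def index_size_mb(num_notes: int, dims: int, bytes_per_float: int = 4) -> float:
    # Raw float32 storage for a flat vector index.
    return num_notes * dims * bytes_per_float / (1024 * 1024)

# 1024 dims (multilingual-E5-large) vs 384 dims (MiniLM-L6):
for name, dims in [("multilingual-E5-large", 1024), ("MiniLM-L6", 384)]:
    print(f"{name}: {index_size_mb(50_000, dims):.1f} MB")
# multilingual-E5-large: 195.3 MB
# MiniLM-L6: 73.2 MB
```

And that's before loading the model itself, which for E5-large is measured in gigabytes, not megabytes. On a phone, that difference is the whole ballgame.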
MTEB and its extension MMTEB (testing 1,000+ languages) confirm what I suspected: there’s no single model that nails speed + accuracy + multilingual quality for on-device use.
For a Local-First Tauri app like Rhizome, the balance between these three factors is everything. I’ll keep an eye on the space—the moment a lightweight model can handle Korean/English well enough on a phone without melting the battery, I’m revisiting this. (modal.com)
—
Researched with GPT.
Note: I’m a solo developer based in Korea. To share my journey with a wider audience, I used AI to help translate my thoughts into English. If any phrasing feels a bit “too AI” or unnatural, please bear with me.