The open-source story in 2026 is no longer “interesting for research, not ready for production.” Multiple open-weight models now match or exceed last year’s closed-source frontier on real-world benchmarks. The question has shifted from “can open-source compete?” to “when do the cost savings and control of self-hosting justify the operational overhead?”
DeepSeek — open-source, rock-bottom pricing
DeepSeek changed the LLM economics conversation. V3.2 at $0.28/$0.42 per million tokens (input/output) is roughly one-tenth to one-twentieth of OpenAI’s pricing for comparable quality, with a Speciale variant that rivals Gemini 3 Pro on reasoning benchmarks.
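The pricing gap is easy to quantify. A minimal sketch using the V3.2 rates above; the workload numbers are arbitrary illustration, not a real traffic profile:

```python
# Back-of-envelope token cost calculator. The $0.28/$0.42 rates come from
# the text above; the request volume and token counts are made-up examples.

def monthly_cost(requests_per_day, in_tokens, out_tokens,
                 in_price_per_m, out_price_per_m, days=30):
    """Cost in dollars for a fixed daily request volume."""
    total_in = requests_per_day * in_tokens * days
    total_out = requests_per_day * out_tokens * days
    return (total_in * in_price_per_m + total_out * out_price_per_m) / 1_000_000

# 50k requests/day, ~1,500 input and ~500 output tokens each,
# at DeepSeek V3.2's $0.28 input / $0.42 output per million tokens:
cost = monthly_cost(50_000, 1_500, 500, 0.28, 0.42)
print(f"${cost:,.0f}/month")  # → $945/month
```

At a 10–20x price multiple, the same workload on a closed-source frontier API lands in the five-figure range per month, which is where the self-hosting math starts to get interesting.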
The architecture is clever: 685 billion total parameters with a Mixture of Experts (MoE) design, which activates only a small fraction of those parameters per request, cutting inference compute. Fully open-source under the MIT license. The caveat: Chinese data processing. For European companies with GDPR obligations, this means either self-hosting on EU infrastructure or accepting the sovereignty risk of sending data to China.
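The MoE idea can be sketched in a few lines: a small router scores all experts for each token, and only the top-k experts actually run. This is an illustrative toy with random weights and toy dimensions, not DeepSeek's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2  # toy sizes; production MoE models use far more experts

router = rng.normal(size=(d_model, n_experts))            # gating network
experts = rng.normal(size=(n_experts, d_model, d_model))  # one weight matrix per expert (simplified FFN)

def moe_forward(x):
    """Route token vector x to its top-k experts; the other experts stay idle."""
    logits = x @ router                                   # one score per expert
    top = np.argsort(logits)[-top_k:]                     # indices of the k best experts
    gate = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over the chosen experts only
    return sum(w * (x @ experts[i]) for i, w in zip(top, gate))

x = rng.normal(size=d_model)
y = moe_forward(x)
# Only 2 of the 8 expert matrices were multiplied: 1/4 of the FLOPs of a dense layer
# of the same total parameter count — the source of MoE's cost advantage.
```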
Choose this when
Maximum cost efficiency matters and data sovereignty permits. If you self-host on EU infrastructure, you get MIT-licensed frontier capability at a fraction of API costs.
Meta Llama 4 — the community standard
Llama 4 brought two genuinely useful innovations: Mixture of Experts architecture for efficient inference, and an industry-leading 10 million token context window on the Scout model. Llama 4 Maverick exceeds GPT-4o on coding, reasoning, and multilingual benchmarks, though it falls short of the current top tier (Gemini 2.5 Pro, Claude Sonnet 4).
The real advantage: community. Llama is the most deployed open-weight model family. More fine-tuned variants, more deployment guides, more infrastructure tooling than any competitor. Available across every major cloud platform. Scout runs on a single H100 GPU.
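A rough way to sanity-check the single-H100 claim: weight memory is approximately parameter count times bytes per parameter. The figures below (Scout at ~109B total parameters, 4-bit quantization, 80 GB of H100 memory) are assumptions for illustration, not official specs, and the estimate ignores KV cache and activations:

```python
def weight_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (ignores KV cache and activations)."""
    return params * bytes_per_param / 1e9

# Assumed figure: Llama 4 Scout at roughly 109B total parameters.
print(f"int4 : {weight_gb(109e9, 0.5):.1f} GB")  # ~54.5 GB — fits an 80 GB H100
print(f"fp16 : {weight_gb(109e9, 2.0):.1f} GB")  # ~218 GB — needs multiple GPUs
```

The MoE structure helps here too: total parameters determine memory, but only the active experts determine per-token compute.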
Choose this when
You need self-hosted deployments where community support and deployment tooling matter. The 10M-token context on Scout is unmatched for open-weight long-context work.
Mistral — the European option
Mistral is the only major LLM provider that is a European company with EU-based infrastructure as the default. For European businesses where data sovereignty is non-negotiable, this matters more than benchmarks. Mistral Medium 3 at $0.40/$2.00 per million tokens performs at roughly 90 percent of Claude Sonnet on benchmarks at significantly lower cost.
Mistral also offers enterprise fine-tuning, custom pre-training, and strong multilingual support (particularly French, German, Spanish, Italian). Their open-weight models (Apache 2.0) are self-hostable for maximum control. Devstral is positioned as the best open-source model for coding agents.
Choose this when
European data sovereignty is a requirement, not a preference. The only major provider where EU hosting is the default, not an add-on. Competitive quality at compelling prices.
Qwen and the Chinese open-source wave
Alibaba’s Qwen 3.5 and Zhipu’s GLM-4.7 lead a wave of Chinese open-source models that are genuinely world-class. Qwen 3.5 hits 88.4% on GPQA Diamond and 76.4% on SWE-bench Verified — frontier-tier results. The efficiency story is just as striking: Qwen3-4B rivals the performance of Qwen2.5-72B, meaning each new generation delivers comparable quality at a fraction of the previous generation’s parameter count.
Same caveat as DeepSeek: Chinese origin means you either self-host on your own infrastructure or accept data sovereignty risk. But for teams that can self-host, these models offer strong capability per dollar.
For European teams
If GDPR compliance shapes your AI decisions, your shortlist looks different:
- Mistral: EU company, EU servers by default, open-weight for self-hosting
- OpenAI API: EU data residency since Feb 2025 (new projects only)
- Claude API: EU data residency since Aug 2025 (API only — claude.ai remains US)
- Self-hosted open-weight: Mistral, Llama, or DeepSeek on your own EU infrastructure
DeepSeek and Qwen APIs process data in China. Their models are MIT/Apache licensed for self-hosting, which solves the sovereignty problem if you run them yourself.
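If you do self-host, most serving stacks (vLLM, TGI, and others) expose an OpenAI-compatible HTTP API, so moving from a closed API to your own EU-hosted endpoint is largely a base-URL change. A minimal sketch using only the standard library; the URL and model name are placeholders for whatever your own deployment serves:

```python
# Hypothetical sketch: querying a self-hosted open-weight model through an
# OpenAI-compatible chat-completions endpoint. "http://localhost:8000" and
# "mistral-small" are placeholders for your own server and model name.
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Construct a chat-completions request against a self-hosted endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("http://localhost:8000", "mistral-small", "Hello")
# urllib.request.urlopen(req) would send it; omitted here since no server is running.
```

Because the wire format matches the closed-source APIs, application code written against OpenAI-style clients typically needs no changes beyond pointing at the new base URL — which keeps the data inside your EU infrastructure.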