AI Models
52 models · 0 new in 60d
- Codestral 25.01 (Open)
Mistral · 256K tokens · self-host
Best for: Code completion, inline suggestions, editor integration
How: Supports FIM for inline completion. Integrate with any editor via LSP or Continue.dev.
Example: Deploy as your team's FIM-capable completion server behind an LSP proxy.
Benchmarks: HumanEval 91.0%
Tags: code completion · FIM (fill-in-the-middle) · 80+ languages
Hardware to self-host: VRAM 16GB (quantized) / 45GB (FP16) · GPU: RTX 4090 24GB or 1× A100 40GB · RAM: 32GB+ system RAM
22B dense. Fits on a single consumer GPU with quantization.
API: codestral.mistral.ai — dedicated code endpoint
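A FIM call sends the code before the cursor as the prompt and the code after it as the suffix, and the model fills the gap. A minimal sketch of building such a request for the codestral.mistral.ai endpoint is below; the endpoint path, field names, and `codestral-latest` model tag are assumptions based on Mistral's public API shape, not verified here.

```python
import json

# Sketch: build the JSON body for a fill-in-the-middle (FIM) completion
# request. The model completes the gap between `prompt` (code before the
# cursor) and `suffix` (code after it) -- the basis of inline completion.
API_URL = "https://codestral.mistral.ai/v1/fim/completions"  # assumed path

def build_fim_request(prefix: str, suffix: str, max_tokens: int = 64) -> dict:
    """Assemble the request body an editor plugin would POST."""
    return {
        "model": "codestral-latest",  # assumed model tag
        "prompt": prefix,             # code before the cursor
        "suffix": suffix,             # code after the cursor
        "max_tokens": max_tokens,
        "temperature": 0.0,           # deterministic output suits editors
    }

payload = build_fim_request(
    prefix="def fibonacci(n):\n    ",
    suffix="\n    return a\n",
)
body = json.dumps(payload)
```

An LSP proxy or a tool like Continue.dev would fire one of these requests on every cursor pause and splice the returned text between prefix and suffix.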
- Qwen 2.5 Coder 32B (Open)
Alibaba · 128K tokens · self-host
Best for: Private code completion, self-hosted Copilot replacement
How: `ollama run qwen2.5-coder:32b`, then plug it into Continue.dev or another Copilot alternative.
Example: Set up as your team's private code completion backend — zero data leaves your infra.
Benchmarks: HumanEval 92.7% · LiveCodeBench 48.5%
Tags: code completion · code generation · Apache 2.0
Hardware to self-host: VRAM 20GB (quantized) / 64GB (FP16) · GPU: RTX 4090 24GB or 1× A100 40GB · RAM: 32GB+ system RAM
32B dense. Fits on a single consumer GPU with 4-bit quantization.
API: Ollama, vLLM, or hosted on Together/Fireworks
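Once the model is pulled, Ollama exposes it over a local REST API (port 11434 by default), which is what completion backends talk to. A minimal sketch of constructing such a request follows; it assumes Ollama's `/api/generate` endpoint and its `model`/`prompt`/`stream` fields, and the network call itself is left commented out since it needs a running server.

```python
import json
import urllib.request

# Sketch: build a request against a locally hosted Qwen 2.5 Coder via
# Ollama's REST API. Nothing leaves the machine -- the point of a
# self-hosted completion backend.
OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def build_request(prompt: str) -> urllib.request.Request:
    body = json.dumps({
        "model": "qwen2.5-coder:32b",
        "prompt": prompt,
        "stream": False,                  # one JSON reply instead of a stream
        "options": {"temperature": 0.0},  # deterministic completions
    }).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = build_request("Write a Python function that reverses a string.")
# With Ollama running, the call would look like:
# resp = urllib.request.urlopen(req)
# text = json.loads(resp.read())["response"]
```

Continue.dev and similar tools wrap exactly this loop; pointing them at `localhost:11434` is usually all the configuration the editor side needs.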