AI Models

52 models · 0 new in 60d

  • Codestral 25.01 · Open

    Mistral · 256K tokens · self-host

    Best for: Code completion, inline suggestions, editor integration

    How: Supports FIM for inline completion. Integrate with any editor via LSP or Continue.dev.

    Example: Deploy as your team's FIM-capable completion server behind an LSP proxy.

    HumanEval 91.0%
    code completion · FIM (fill-in-middle) · 80+ languages
    Hardware to self-host
    VRAM: 16GB (quantized) / 45GB (FP16)
    GPU: RTX 4090 24GB or 1× A100 40GB
    RAM: 32GB+ system RAM

    22B dense. Fits on a single consumer GPU with quantization.

    API: codestral.mistral.ai — dedicated code endpoint
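The FIM integration described above can be sketched as a raw HTTP call. A minimal sketch, assuming Codestral's dedicated endpoint exposes a `/v1/fim/completions` route and a `codestral-latest` model id; verify both against Mistral's current API docs before relying on them.

```python
import json
import urllib.request

# Assumed FIM endpoint on the dedicated code host; check Mistral's docs.
CODESTRAL_FIM_URL = "https://codestral.mistral.ai/v1/fim/completions"

def build_fim_payload(prompt: str, suffix: str, max_tokens: int = 64) -> dict:
    """Fill-in-middle request: the model generates the code that belongs
    between `prompt` (text before the cursor) and `suffix` (text after)."""
    return {
        "model": "codestral-latest",  # assumed model id
        "prompt": prompt,
        "suffix": suffix,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # deterministic output suits editor completion
    }

# Complete a function body given the code on either side of the cursor.
payload = build_fim_payload(
    prompt="def is_even(n: int) -> bool:\n    return ",
    suffix="\n\nprint(is_even(4))",
)

req = urllib.request.Request(
    CODESTRAL_FIM_URL,
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer YOUR_API_KEY",  # placeholder key
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req)  # send only with a real API key
```

An LSP proxy or Continue.dev would build the same payload from the editor's cursor position, splitting the buffer into `prompt` and `suffix`.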

  • Qwen 2.5 Coder 32B · Open

    Alibaba · 128K tokens · self-host

    Best for: Private code completion, self-hosted Copilot replacement

    How: Run ollama run qwen2.5-coder:32b. Plug into Continue.dev or other Copilot alternatives.

    Example: Set up as your team's private code completion backend — zero data leaves your infra.

    HumanEval 92.7% · LiveCodeBench 48.5%
    code completion · code generation · Apache 2.0
    Hardware to self-host
    VRAM: 20GB (quantized) / 64GB (FP16)
    GPU: RTX 4090 24GB or 1× A100 40GB
    RAM: 32GB+ system RAM

    32B dense. Fits on a single consumer GPU with 4-bit quantization.

    API: Ollama, vLLM, or hosted on Together/Fireworks
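A private completion backend like the one described can be queried over Ollama's local HTTP API. A minimal sketch, assuming Ollama's default port 11434 and its `/api/generate` route; the field names below follow that API but should be checked against the current Ollama reference.

```python
import json
import urllib.request

# Default local Ollama endpoint; nothing leaves your infrastructure.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_completion_payload(prompt: str,
                             model: str = "qwen2.5-coder:32b") -> dict:
    """Build a non-streaming completion request for a local Ollama server."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a token stream
        "options": {"temperature": 0.0},
    }

payload = build_completion_payload(
    "# Python function that reverses a string\ndef reverse(s):\n"
)

req = urllib.request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Requires a running Ollama with the model pulled:
# completion = json.loads(urllib.request.urlopen(req).read())["response"]
```

Pointing Continue.dev (or a similar plugin) at this server gives every editor on the team the same private completion backend.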