AI Models

186 models · 0 new in 60d

Compare →

Live

Sort

Newest A→Z

Type

All Flagship Small Reasoning Code Embedding Image Video Vision Audio function-calling local AI multimodal

License

All Open Closed

Gemini 2.5 Flash
Google · 1M tokens · $0.15/M → $0.60/M
▾
Best for: High-volume processing, real-time apps, budget-conscious pipelines
How: Set thinking_budget to control reasoning cost. 0 = no thinking, 24576 = max.
Example: Summarize 1000 GitHub issues per hour for a triage dashboard at ~$1.
speedcostlong contextthinking budget control
API: Same SDK as Gemini Pro. model='gemini-2.5-flash-preview-05-20'
Claude Haiku 4.5
Anthropic · 200K tokens · $0.80/M → $4/M
▾
Best for: Pipelines, batch processing, structured data extraction, routing
How: Use for high-volume, low-complexity tasks: classification, extraction, summarization.
Example: Process 10K support tickets per hour to classify priority and extract entities.
HumanEval 88.5%
speedcoststructured outputclassification
API: api.anthropic.com — same SDK
GPT-4.1 mini
OpenAI · 1M tokens · $0.40/M → $1.60/M
▾
Best for: Embeddings preprocessing, log parsing, lightweight generation
How: Same API as GPT-4.1. Best for high-volume, simple tasks where cost matters.
Example: Parse 50K structured logs per hour and extract error patterns.
SWE-bench 28.8%HumanEval 92.5%
costspeedlong context
API: api.openai.com — same SDK
GPT-4.1 nano
OpenAI · 1M tokens · $0.10/M → $0.40/M
▾
Best for: Intent classification, entity extraction at massive scale
How: Use for routing, tagging, simple extraction where quality bar is lower.
Example: Route 1M incoming messages per day to the right service for $4 total.
ultra-cheapfastclassification
API: api.openai.com — same SDK
Moonshot v1 (8K/32K/128K)
Moonshot AI · 8K / 32K / 128K tokens · $0.14/M → $0.28/M
▾
Best for: Batch processing, structured extraction, JSON pipelines
How: Best for structured output tasks. Supports response_format: json_object. No reasoning overhead.
Example: Process RSS feeds into structured summaries for pennies per 1000 articles.
very cheapno hidden reasoningreliable JSON
API: api.moonshot.ai — OpenAI-compatible. model='moonshot-v1-8k'
Gemma 4 QAT
Google · 128K tokens · api
▾
Best for: Mobile and laptop applications requiring efficient AI models
How: Integrate Gemma 4 QAT models into your application for on-device AI processing
Example: Use Gemma 4 QAT for image recognition on smartphones with low latency and power consumption
Optimizing compression for mobile and laptop efficiency
Auto-discovered from news articles.
Phi-4-mini
Microsoft · api
▾
Best for: expanding on-device AI capabilities in Microsoft Edge
How: use Prompt and Writing Assistance APIs in Microsoft Edge
Example: integrated with Microsoft Edge for on-device AI tasks
on-device AInew models and APIs for the web
Auto-discovered from news articles.