AI Models
52 models · 0 new in 60d
- Gemini 2.5 Flash
Google · 1M-token context · $0.15/M input → $0.60/M output
Best for: High-volume processing, real-time apps, budget-conscious pipelines
How: Set thinking_budget to cap reasoning tokens and control cost: 0 disables thinking, 24576 is the maximum.
Example: Summarize 1000 GitHub issues per hour for a triage dashboard at ~$1.
speed · cost · long context · thinking budget control. API: Same SDK as Gemini Pro; model='gemini-2.5-flash-preview-05-20'
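The thinking_budget setting above maps to a field in the request's generation config. A minimal sketch of the REST request body, assuming the Gemini API's generationConfig.thinkingConfig field names; endpoint and key handling are omitted:

```python
import json

# Sketch of a Gemini generateContent request body that disables thinking
# entirely (thinkingBudget=0) to minimize cost. Field names assume the
# Gemini REST API's generationConfig shape; verify against the docs.
def build_request(prompt: str, thinking_budget: int = 0) -> dict:
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }

body = build_request("Summarize this GitHub issue in one sentence: ...")
print(json.dumps(body, indent=2))
```

Raising thinkingBudget toward 24576 trades cost for reasoning depth; for high-volume summarization like the triage example, 0 keeps the per-call price at the listed rates.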
- Claude Haiku 4.5
Anthropic · 200K-token context · $0.80/M input → $4/M output
Best for: Pipelines, batch processing, structured data extraction, routing
How: Use for high-volume, low-complexity tasks: classification, extraction, summarization.
Example: Process 10K support tickets per hour to classify priority and extract entities.
HumanEval 88.5% · speed · cost · structured output · classification. API: api.anthropic.com, same SDK
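For the ticket-triage example, each call is a Messages API request with a classification system prompt. A minimal payload sketch; the model id and max_tokens value are assumptions to check against the Anthropic docs:

```python
# Sketch of an Anthropic Messages API request body for ticket triage:
# classify priority and extract entities in one call.
def triage_request(ticket_text: str) -> dict:
    return {
        "model": "claude-haiku-4-5",  # assumed identifier; verify in docs
        "max_tokens": 256,
        "system": (
            "Classify the support ticket's priority (P0-P3) and list any "
            "product names mentioned. Reply as JSON with keys "
            "'priority' and 'entities'."
        ),
        "messages": [{"role": "user", "content": ticket_text}],
    }

req = triage_request("Checkout is down for all EU users since 09:00 UTC.")
```

At 10K tickets/hour the fixed system prompt dominates input tokens, so keeping it short matters more than with larger models.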
- GPT-4.1 mini
OpenAI · 1M-token context · $0.40/M input → $1.60/M output
Best for: Embeddings preprocessing, log parsing, lightweight generation
How: Same API as GPT-4.1. Best for high-volume, simple tasks where cost matters.
Example: Parse 50K structured logs per hour and extract error patterns.
SWE-bench 28.8% · HumanEval 92.5% · cost · speed · long context. API: api.openai.com, same SDK
- GPT-4.1 nano
OpenAI · 1M-token context · $0.10/M input → $0.40/M output
Best for: Intent classification, entity extraction at massive scale
How: Use for routing, tagging, and simple extraction where the quality bar is lower.
Example: Route 1M incoming messages per day to the right service for $4 total.
ultra-cheap · fast · classification. API: api.openai.com, same SDK
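The "$4 per 1M messages" figure checks out with back-of-envelope arithmetic at nano's listed rates, assuming roughly 30 input and 3 output tokens per routed message (the token counts are assumptions):

```python
# Back-of-envelope cost check for the routing example at GPT-4.1 nano's
# listed prices: $0.10/M input tokens, $0.40/M output tokens.
IN_RATE, OUT_RATE = 0.10, 0.40  # dollars per million tokens

def daily_cost(messages: int, in_tok: int = 30, out_tok: int = 3) -> float:
    """Estimated dollars per day; per-message token counts are assumed."""
    return (messages * in_tok * IN_RATE
            + messages * out_tok * OUT_RATE) / 1_000_000

print(round(daily_cost(1_000_000), 2))  # ≈ 4.2 dollars for 1M messages/day
```

Shorter routing prompts or single-token label outputs pull the estimate under the quoted $4.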
- Moonshot v1 (8K/32K/128K)
Moonshot AI · 8K / 32K / 128K context · $0.14/M → $0.28/M
Best for: Batch processing, structured extraction, JSON pipelines
How: Best for structured output tasks. Supports response_format: json_object. No reasoning overhead.
Example: Process RSS feeds into structured summaries for pennies per 1000 articles.
very cheap · no hidden reasoning · reliable JSON. API: api.moonshot.ai, OpenAI-compatible; model='moonshot-v1-8k'
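Since the endpoint is OpenAI-compatible, the JSON pipeline described above is an ordinary chat.completions payload with response_format set to json_object. A minimal sketch of the request body; with the openai SDK you would point base_url at api.moonshot.ai, which is omitted here:

```python
# Sketch of an OpenAI-compatible chat.completions payload using Moonshot's
# JSON mode (response_format: json_object) for structured summaries.
def summary_request(article: str) -> dict:
    return {
        "model": "moonshot-v1-8k",
        "response_format": {"type": "json_object"},
        "messages": [
            {
                "role": "system",
                "content": ("Summarize the article as JSON with keys "
                            "'title' and 'summary'."),
            },
            {"role": "user", "content": article},
        ],
    }

req = summary_request("Full text of an RSS article goes here...")
```

Note that json_object mode generally requires the word "JSON" to appear in the prompt, which the system message above satisfies.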