AI Models

52 models · 0 new in 60d

Compare →
  • Gemini 2.5 Flash

    Google · 1M tokens · $0.15/M → $0.60/M

    Best for: High-volume processing, real-time apps, budget-conscious pipelines

    How: Set thinking_budget to control reasoning cost. 0 = no thinking, 24576 = max.

    Example: Summarize 1000 GitHub issues per hour for a triage dashboard at ~$1.

    speedcostlong contextthinking budget control

    API: Same SDK as Gemini Pro. model='gemini-2.5-flash-preview-05-20'

  • Claude Haiku 4.5

    Anthropic · 200K tokens · $0.80/M → $4/M

    Best for: Pipelines, batch processing, structured data extraction, routing

    How: Use for high-volume, low-complexity tasks: classification, extraction, summarization.

    Example: Process 10K support tickets per hour to classify priority and extract entities.

    HumanEval 88.5%
    speedcoststructured outputclassification

    API: api.anthropic.com — same SDK

  • GPT-4.1 mini

    OpenAI · 1M tokens · $0.40/M → $1.60/M

    Best for: Embeddings preprocessing, log parsing, lightweight generation

    How: Same API as GPT-4.1. Best for high-volume, simple tasks where cost matters.

    Example: Parse 50K structured logs per hour and extract error patterns.

    SWE-bench 28.8%HumanEval 92.5%
    costspeedlong context

    API: api.openai.com — same SDK

  • GPT-4.1 nano

    OpenAI · 1M tokens · $0.10/M → $0.40/M

    Best for: Intent classification, entity extraction at massive scale

    How: Use for routing, tagging, simple extraction where quality bar is lower.

    Example: Route 1M incoming messages per day to the right service for $4 total.

    ultra-cheapfastclassification

    API: api.openai.com — same SDK

  • Moonshot v1 (8K/32K/128K)

    Moonshot AI · 8K / 32K / 128K tokens · $0.14/M → $0.28/M

    Best for: Batch processing, structured extraction, JSON pipelines

    How: Best for structured output tasks. Supports response_format: json_object. No reasoning overhead.

    Example: Process RSS feeds into structured summaries for pennies per 1000 articles.

    very cheapno hidden reasoningreliable JSON

    API: api.moonshot.ai — OpenAI-compatible. model='moonshot-v1-8k'