AI Models
186 models · 4 new in 60d
- ▾Claude Opus 4.7
Anthropic · 1M tokens · $5/M → $25/M
Best for: Most capable generally available model. Complex multi-step coding, long agentic workflows, 1M-token codebase reads.
How: client.messages.create(model='claude-opus-4-7', ...). Adaptive thinking is on by default — no separate extended-thinking mode needed.
Example: Use Claude Code CLI with --model claude-opus-4-7 to handle PR-sized refactors end-to-end in a single run.
SWE-bench step-change over Opus 4.6Context 1M (~555k words)agentic codingnew tokenizeradaptive thinking1M context128k max outputAPI: api.anthropic.com (model: claude-opus-4-7) · AWS Bedrock · GCP Vertex AI · Microsoft Foundry
Step-change improvement in agentic coding vs Opus 4.6. New tokenizer means 1M tokens ≈ 555k words (vs 750k for Sonnet 4.6).
- ▾Kimi K2.5
Moonshot AI · 256K tokens · $0.55/M → $2.19/M
Best for: Budget alternative to flagship models, Chinese language tasks
How: OpenAI SDK with base_url='https://api.moonshot.ai/v1'. WARNING: has implicit reasoning that eats max_tokens.
Example: Use moonshot-v1-8k instead for structured JSON tasks — kimi-k2.5 wastes tokens on hidden thinking.
reasoningmultimodalcheapAPI: api.moonshot.ai — OpenAI-compatible
Watch:hidden thinking burns tokenstemperature locked to 1 - ▾Claude Opus 4.6
Anthropic · 1M tokens · $15/M → $75/M
Best for: Complex multi-step coding, large codebase refactors, long-document analysis
How: Best via Claude Code CLI for coding tasks. For API: messages.create() with system prompt + tools.
Example: claude-code: point it at a repo, describe the feature, it reads/edits/tests autonomously.
SWE-bench 72.5%GPQA Diamond 74.9%HumanEval 95.4%reasoninglong contexttool useagentic workflowscode generationAPI: api.anthropic.com — SDK: pip install anthropic / npm i @anthropic-ai/sdk
- ▾Claude Sonnet 4.6
Anthropic · 200K tokens · $3/M → $15/M
Best for: Production API backends, real-time chat, moderate complexity coding
How: Drop-in replacement for Opus when you need faster/cheaper. Same API, just change model ID.
Example: Use as the default model in your API gateway — upgrade to Opus only for hard problems.
SWE-bench 65.2%HumanEval 93.8%speedcost-efficiencycodingtool useAPI: api.anthropic.com — same SDK as Opus
- ▾GPT-4.1
OpenAI · 1M tokens · $2/M → $8/M
Best for: General-purpose API integration, multimodal apps, coding assistance
How: client.chat.completions.create(model='gpt-4.1', messages=[...]). Supports vision, tools, JSON mode.
Example: Build a PR review bot that reads diffs + screenshots and posts comments.
SWE-bench 54.6%HumanEval 95.3%codinginstruction followinglong contextmultimodalAPI: api.openai.com — SDK: pip install openai / npm i openai
- ▾Gemini 2.5 Pro
Google · 1M tokens · $1.25/M → $10/M
Best for: Long-document analysis, multimodal tasks, apps needing search grounding
How: client.models.generate_content(model='gemini-2.5-pro', contents=[...]). Supports grounding with Google Search.
Example: Feed a 200-page architecture doc and ask it to find security issues.
SWE-bench 63.8%GPQA Diamond 67.2%multimodallong contextsearch groundingcode generationAPI: generativelanguage.googleapis.com — SDK: pip install google-genai
- ▾Grok 3
xAI · 128K tokens · $3/M → $15/M
Best for: Tasks needing real-time information, math-heavy problems
How: OpenAI SDK with base_url override. Also supports live search via tools.
Example: Monitor real-time tech news and generate summaries using live search.
GPQA Diamond 68.2%AIME 2024 93.3%reasoningreal-time datamathAPI: api.x.ai — OpenAI-compatible SDK. Set base_url='https://api.x.ai/v1'
- ▾ESM2
NVIDIA · 128K tokens · api
Best for: computational biology tasks
How: Fine-tune ESM2 using NVIDIA BioNeMo recipes
Example: Fine-tuning ESM2 with LoRA for specific protein tasks
protein language understandinggenomic sequencesAuto-discovered from news articles.
- ▾Ryzen AI Halo
AMD · N/A · api
Best for: petite PC development
How: work with either Microsoft Windows or Linux
Example: use in AI development platforms
Linux-friendlypowered by AMD Ryzen AI Max+Auto-discovered from news articles.
- ▾Claude Code
Anthropic · 128K tokens · api
Best for: use in infrastructure management tasks
How: connect AI to your infrastructure through the Model Context Protocol (MCP)
Example: AI assistants like GitHub Copilot, IBM Bob, Claude Code etc. to interact with Terraform through the Model Context Protocol (MCP)
interacts with Terraformsupports infrastructure managementAuto-discovered from news articles.
- ▾DiffusionGemma
NVIDIA · 128K tokens · api
Best for: real-time AI applications such as chat assistants, copilots, and agentic workflows
How: Run DiffusionGemma on NVIDIA for high-throughput text generation
Example: Developers can leverage DiffusionGemma for building real-time AI applications
Developer-ReadyHigh-ThroughputText GenerationAuto-discovered from news articles.
- ▾Claude Mythos 5New
Anthropic · 1M tokens · →
Best for: Available through Project Glasswing. Successor to Claude Mythos Preview.
How: client.messages.create({model: "claude-mythos-5", messages: [...]})
Example: Use via the Anthropic SDK with model='claude-mythos-5'.
1M tokens contextadaptive thinking128k tokens max outputAPI: api.anthropic.com — model: claude-mythos-5 · AWS Bedrock · GCP Vertex AI
Max output: 128k tokens. Adaptive thinking enabled by default.
- ▾Claude Fable 5New
Anthropic · 1M tokens · →
Best for: Anthropic's most capable widely released model, for the most demanding reasoning and long-horizon agentic work
How: client.messages.create({model: "claude-fable-5", messages: [...]})
Example: Use via the Anthropic SDK with model='claude-fable-5'.
1M tokens contextadaptive thinking128k tokens max outputagentic codingAPI: api.anthropic.com — model: claude-fable-5 · AWS Bedrock · GCP Vertex AI
Max output: 128k tokens. Adaptive thinking enabled by default.
- ▾Google Gemini modelsNew
Google · 128K tokens · api
Best for: AI applications
How: integrate with Apple's new AI architecture
Example: use in AI-powered applications
AI architectureinnovativeAuto-discovered from news articles.
- ▾Claude Opus 4.8New
Anthropic · 1M tokens · $5/M → $25/M
Best for: Anthropic's most capable Opus-tier model for complex reasoning and agentic coding
How: client.messages.create({model: "claude-opus-4-8", messages: [...]})
Example: Use via the Anthropic SDK with model='claude-opus-4-8'.
1M tokens contextadaptive thinking128k tokens max outputagentic codingAPI: api.anthropic.com — model: claude-opus-4-8 · AWS Bedrock · GCP Vertex AI
Max output: 128k tokens. Adaptive thinking enabled by default.
- ▾Mellum2
JetBrains · api
Best for: Advanced AI tasks
How: Integrate Mellum2 into your AI workflows
Example: Use Mellum2 for complex problem-solving and decision-making
12B Mixture-of-Experts ModelAuto-discovered from news articles.
- ▾Gemini 3.5
Google · 128K tokens · api
Best for: General AI applications
How: Integrate with Google I/O 2026
Example: Watch 9 videos showing the capabilities of Gemini 3.5
Advanced capabilitiesHigh performanceAuto-discovered from news articles.
- ▾Gemini Omni
Google · 128K tokens · api
Best for: General AI applications
How: Integrate with Google I/O 2026
Example: Watch 9 videos showing the capabilities of Gemini Omni
Advanced capabilitiesHigh performanceAuto-discovered from news articles.
- ▾NVIDIA Blackwell
NVIDIA · 128K tokens · api
Best for: financial trading landscape
How: Enables sophisticated analysis
Example: revolutionizing financial trading landscape
sophisticated analysisvast amounts of unstructured dataAuto-discovered from news articles.
- ▾ChatGPT
OpenAI · 128K tokens · api
Best for: conversational AI and content generation in Portuguese
How: Use ChatGPT API to integrate with applications
Example: Generate news articles in Portuguese
dialoguecontent creationinformation retrievalAuto-discovered from news articles.
- ▾NVIDIA Cloud Partner (NCP) reference architecture
NVIDIA · N/A · api
Best for: governments, enterprises, and telcos
How: N/A
Example: N/A
sovereign AI factoriesbased on NCP reference architectureAuto-discovered from news articles.
- ▾NVIDIA Vera Rubin Platform
NVIDIA · 128K tokens · api
Best for: Agentic inference workloads
How: Integrate with NVIDIA's platform for inference
Example: Use for non-deterministic trajectories in AI
Solving Agentic AI’s Scale-Up ProblemRuntime dynamics of inference workloadsAuto-discovered from news articles.
- ▾DeepSeek-V4-Flash
DeepSeek · api
Best for: enabling highly efficient operations
How: Build with DeepSeek V4 Using NVIDIA Blackwell and GPU-Accelerated Endpoints
Example: DeepSeek just launched its fourth generation of flagship models
highly efficientAuto-discovered from news articles.
- ▾DeepSeek-V4-Pro
DeepSeek · api
Best for: enabling highly efficient operations
How: Build with DeepSeek V4 Using NVIDIA Blackwell and GPU-Accelerated Endpoints
Example: DeepSeek just launched its fourth generation of flagship models
highly efficientAuto-discovered from news articles.
- ▾Google TPU 8th Generation
Google · N/A · api
Best for: powering AI applications
How: Deploy Google's 8th generation TPUs for your AI workloads
Example: Use the new TPUs for training and inference in AI applications
specialized chipsfuture of AIAuto-discovered from news articles.
- ▾Google TPUv8
Google · N/A · api
Best for: AI acceleration
How: deploy Google TPUv8 in your cloud environment
Example: use Google TPUv8 for AI model training and inference
specialized chipspower the future of AIAuto-discovered from news articles.
- ▾Google's 8th generation TPU
Google AI · N/A · api
Best for: AI acceleration
How: Deploy Google's 8th generation TPU for AI workloads.
Example: Use the TPU for training and inference of AI models.
specialized chipspower the future of AIAuto-discovered from news articles.
- ▾GPT-5.5
OpenAI · 128K tokens · api
Best for: coding, research, and data analysis
How: Integrate GPT-5.5 into your tools for advanced tasks.
Example: Use GPT-5.5 for coding assistance or data analysis.
fastermore capablecomplex tasksAuto-discovered from news articles.
- ▾Google's eighth generation TPU
Google · N/A · api
Best for: AI applications requiring high-performance computing
How: deploy on Google Cloud to leverage the new TPU capabilities
Example: use for training and inference of large AI models
powering the future of AItwo specialized chipsAuto-discovered from news articles.
- ▾OpenAI Privacy Filter
OpenAI · api
Best for: text privacy and compliance
How: Integrate into text processing workflows
Example: Automatically redact sensitive information from documents
detecting and redacting PIIstate-of-the-art accuracyAuto-discovered from news articles.
- ▾Google's TPU (eighth generation)
Google · api
Best for: AI acceleration
How: Deploy in Google Cloud for AI tasks
Example: Use for training and inference in AI applications
specialized chipspower the future of AIAuto-discovered from news articles.