API Pricing and Cost Comparison

Transparent pricing with flexible billing. Compare LLM prices and choose the best solution for you.

Most economical model: GPT-4o-mini, $0.15 / 1M tokens
Best performance: GPT-4o, 128K context
Longest context: Claude 3.5, 200K tokens
Free credits: $10 for new users

GPT-4 Series

Recommended

GPT-4o

gpt-4o
Input price: $5 / 1M tokens
Output price: $15 / 1M tokens
Context window: 128K
Vision understanding, Function Calling, JSON Mode, Strongest reasoning

GPT-4o-mini

gpt-4o-mini
Input price: $0.15 / 1M tokens
Output price: $0.6 / 1M tokens
Context window: 128K
Cost-effective, Fast response, Function Calling

GPT-4-Turbo

gpt-4-turbo
Input price: $10 / 1M tokens
Output price: $30 / 1M tokens
Context window: 128K
Vision understanding, High-quality output, High stability

Claude 3.5 Series

Recommended

Claude 3.5 Sonnet

claude-3-5-sonnet
Input price: $3 / 1M tokens
Output price: $15 / 1M tokens
Context window: 200K
Ultra-long context, Excellent coding ability, Safety alignment, Vision understanding

Claude 3.5 Haiku

claude-3-5-haiku
Input price: $0.25 / 1M tokens
Output price: $1.25 / 1M tokens
Context window: 200K
Ultra-fast response, Cost optimization, Long-text processing

GPT-3.5 Series

GPT-3.5 Turbo

gpt-3.5-turbo
Input price: $0.5 / 1M tokens
Output price: $1.5 / 1M tokens
Context window: 16K
Classic model, Stable and reliable, Fast response

GPT-3.5 Turbo 16K

gpt-3.5-turbo-16k
Input price: $3 / 1M tokens
Output price: $4 / 1M tokens
Context window: 16K
Extended context, Good compatibility

Other Models

Gemini Pro

gemini-pro
Input price: $0.5 / 1M tokens
Output price: $1.5 / 1M tokens
Context window: 32K
By Google, Multimodal, Strong reasoning

Mistral Large

mistral-large
Input price: $8 / 1M tokens
Output price: $24 / 1M tokens
Context window: 32K
Open-source spirit, Multilingual support, Code generation
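
Because input and output tokens are billed at different rates, the cost of a request depends on both counts. The sketch below shows one way to estimate it from the prices listed above. The `PRICES` table and the `estimate_cost` helper are illustrative names, not part of any official SDK, and only a few of the listed models are included.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the listings above.
# Hypothetical lookup table for illustration; extend it with other models as needed.
PRICES = {
    "gpt-4o": (Decimal("5"), Decimal("15")),
    "gpt-4o-mini": (Decimal("0.15"), Decimal("0.6")),
    "claude-3-5-sonnet": (Decimal("3"), Decimal("15")),
    "gpt-3.5-turbo": (Decimal("0.5"), Decimal("1.5")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for a single request."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT
```

For example, `estimate_cost("gpt-4o-mini", 2_000, 500)` works out to $0.0006, while the same request on `gpt-4o` comes to $0.0175. `Decimal` is used instead of `float` so that small per-request amounts add up without rounding drift.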

Cost optimization tips

  • Choose the right model: use GPT-4o-mini, GPT-3.5 Turbo, or Claude 3.5 Haiku for simple tasks; reserve GPT-4o for complex ones
  • Optimize prompts: shorter prompts reduce input tokens, directly lowering cost
  • Use caching: cache repeated queries to avoid duplicate API calls
  • Batch processing: combine multiple requests to reduce API calls
  • Set max_tokens: limit output length to avoid unnecessary tokens
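
The caching and max_tokens tips above can be combined in a thin wrapper around whatever client you use. This is a minimal sketch, not a definitive implementation: `CachedClient` is a hypothetical class, and `call_model` stands in for your own function that performs the real API request. The cache is held in memory and keyed on the model, prompt, and output limit, so it only helps with exact repeats.

```python
import hashlib
import json
from typing import Callable, Dict


class CachedClient:
    """Wraps a model-calling function and reuses answers for repeated queries."""

    def __init__(self, call_model: Callable[[str, str, int], str]):
        # call_model(model, prompt, max_tokens) is assumed to perform the real API call.
        self._call_model = call_model
        self._cache: Dict[str, str] = {}

    def complete(self, model: str, prompt: str, max_tokens: int = 256) -> str:
        # Hash the full request so identical requests map to the same cache entry.
        payload = json.dumps([model, prompt, max_tokens])
        key = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if key not in self._cache:
            # Only a cache miss reaches the API and incurs token charges.
            self._cache[key] = self._call_model(model, prompt, max_tokens)
        return self._cache[key]
```

Passing `max_tokens` on every call keeps the output length capped, and a second `complete` call with the same arguments returns the stored answer without another request. For responses that should change over time, an expiry policy would need to be added.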

Free tier details

New users receive $10 in free credits, usable for:

  • ~2M GPT-3.5 Turbo tokens
  • ~400K GPT-4o-mini tokens
  • ~250 FLUX Pro images
  • ~1000 minutes of speech-to-text

Free credits are valid for 30 days; unused portions expire. Top-ups never expire.