One API Key for 500+ Models: Get Started with OpenRouter Hosted Inference

Chris Harper

2 min read

Jun 28, 2026 · 04:03 UTC

AI
Tutorial
LLM
Developer Tools

TL;DR: Point your OpenAI SDK at https://openrouter.ai/api/v1 to reach 500+ models across 60+ providers with one key and one bill, plus opt-in routing to the cheapest provider.

What you'll be able to do after this:

  • Call Claude, GPT, Gemini, Llama, and Mistral through one OpenAI-compatible endpoint without per-provider setup
  • Route requests to the cheapest available provider by appending :floor to any model slug
  • Run inference on open-weight models like Llama 3.3 70B and DeepSeek R1 at zero cost through the rate-limited :free variants

Setup: 3 minutes

Get a free API key at openrouter.ai — no credit card required for the free tier.

pip install openai   # shell command: OpenRouter is OpenAI-SDK-compatible

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="<YOUR_OPENROUTER_KEY>",
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.6",   # any slug from openrouter.ai/models (versions are written with dots)
    messages=[{"role": "user", "content": "Explain PagedAttention in two sentences."}],
)
print(response.choices[0].message.content)

If you already use the OpenAI SDK, change base_url and api_key and swap the model slug; the rest of your chat-completions code stays the same. Streaming works the same way: pass stream=True.
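As a minimal sketch of the stream=True path, the helper below prints tokens as they arrive and returns the assembled text. The function name stream_reply is hypothetical, and client is assumed to be an OpenAI client configured as in the setup snippet.

```python
def stream_reply(client, model, prompt):
    """Stream a chat completion, echo it as it arrives, and return the full text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # the SDK yields incremental chunks instead of one response
    )
    parts = []
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            print(text, end="", flush=True)
            parts.append(text)
    print()
    return "".join(parts)
```

Usage would look like stream_reply(client, "meta-llama/llama-3.3-70b-instruct", "Explain PagedAttention in two sentences.").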

Cost routing: the :floor suffix

Append :floor to route requests to the cheapest provider currently serving that model:

model="anthropic/claude-haiku-4.5:floor"           # cheapest provider for Haiku 4.5
model="meta-llama/llama-3.3-70b-instruct:floor"    # cheapest Llama 3.3 70B provider

OpenRouter publishes live per-provider pricing, and with :floor it sends each request to the lowest-priced provider for that model, so you need no selection logic on your side.
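If slugs come from configuration, a small helper can attach the suffix consistently. This is a sketch: with_variant, floor_kwargs, and price_sorted_kwargs are hypothetical names. The last function assumes, based on a reading of the Provider Routing docs, that sorting providers by price through the request body is the explicit form of :floor; check that against the docs before relying on it.

```python
def with_variant(slug, variant):
    """Return the slug with a routing suffix, replacing any suffix already present."""
    base = slug.split(":", 1)[0]
    return f"{base}:{variant}"


def floor_kwargs(slug, prompt):
    """Keyword arguments for client.chat.completions.create() using the :floor suffix."""
    return {
        "model": with_variant(slug, "floor"),
        "messages": [{"role": "user", "content": prompt}],
    }


def price_sorted_kwargs(slug, prompt):
    """Assumed explicit equivalent: ask OpenRouter to sort providers by price."""
    return {
        "model": slug.split(":", 1)[0],
        "messages": [{"role": "user", "content": prompt}],
        # extra_body is the OpenAI SDK's hook for non-OpenAI request fields.
        "extra_body": {"provider": {"sort": "price"}},
    }
```

Either dictionary can be unpacked into the call, for example client.chat.completions.create(**floor_kwargs("meta-llama/llama-3.3-70b-instruct", "Hello")).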

Free-tier models (zero marginal cost)

meta-llama/llama-3.3-70b-instruct:free
deepseek/deepseek-r1:free
google/gemma-3-27b-it:free

These variants are rate-limited but run real inference, which makes them useful for development, batch classification, or routing simple queries away from paid models.
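One way to use the free variants without stalling on rate limits is to try a free slug first and fall through to a paid one. The sketch below assumes a client configured as in the setup snippet. complete_free_first and the FREE_FIRST ordering are illustrative choices, not an OpenRouter feature.

```python
FREE_FIRST = [
    "meta-llama/llama-3.3-70b-instruct:free",   # free variant, rate-limited
    "meta-llama/llama-3.3-70b-instruct:floor",  # paid fallback, cheapest provider
]


def complete_free_first(client, prompt, models=FREE_FIRST, retry_on=(Exception,)):
    """Try each model in order; return (model, text) from the first that succeeds."""
    last_error = None
    for model in models:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return model, response.choices[0].message.content
        except retry_on as err:
            # Remember the failure and move on to the next model in the list.
            last_error = err
    raise RuntimeError("all models failed") from last_error
```

By default any exception moves on to the next model. Passing retry_on=(openai.RateLimitError,) narrows that to 429 responses, so other errors surface immediately.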

Auto Router

openrouter/auto selects the model best suited to each request based on prompt shape and complexity. Useful when you want automatic tier-routing without hardcoding a specific model for every call path.
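A short sketch of calling the Auto Router, assuming a client configured as in the setup snippet. It also returns the response's model field, which is assumed here to report the model that actually served the request. ask_auto is a hypothetical helper name.

```python
def ask_auto(client, prompt):
    """Send a prompt through openrouter/auto; return (model_reported, text)."""
    response = client.chat.completions.create(
        model="openrouter/auto",
        messages=[{"role": "user", "content": prompt}],
    )
    # response.model is a standard field on chat completion responses.
    return response.model, response.choices[0].message.content
```

Logging the returned model name per call path is a simple way to see where the router is sending traffic before deciding whether to pin a specific slug.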

Built-in provider fallbacks

If your primary provider returns a 429 or 5xx, OpenRouter retries across providers automatically. You get multi-provider reliability without writing retry logic.
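The default needs no code, but the fallback behavior can reportedly be tuned per request. The sketch below builds a provider preferences object, assuming the order and allow_fallbacks field names from a reading of the Provider Routing docs. provider_prefs is a hypothetical helper and the provider name is a placeholder.

```python
def provider_prefs(order=None, allow_fallbacks=True):
    """Build an extra_body payload carrying assumed OpenRouter provider preferences."""
    prefs = {"allow_fallbacks": allow_fallbacks}
    if order:
        # Providers to try first, in order; names must match OpenRouter's listing.
        prefs["order"] = list(order)
    return {"provider": prefs}


# Example (placeholder provider name, not a recommendation):
# client.chat.completions.create(
#     model="meta-llama/llama-3.3-70b-instruct",
#     messages=[{"role": "user", "content": "Hello"}],
#     extra_body=provider_prefs(order=["<provider-name>"]),
# )
```

Setting allow_fallbacks=False would pin the request to the listed providers, trading the automatic fallback for predictability.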

Sources: OpenRouter Quickstart | Provider Routing docs | How OpenRouter routing works | Lowest-Cost Inference Guide