What Is inclusionAI?
inclusionAI is a new model provider focused on AI agents. Their thesis: agent workflows demand a different kind of model — one optimized not just for single-turn Q&A, but for multi-step planning, tool-calling chains, and long-running autonomous tasks. Their model family has three members, all accessible via OpenRouter:
- Ring-2.6-1T — Thinking/reasoning model, 1T total params, 63B active. This is the flagship.
- Ling-2.6-1T — Instant/instruct model, same scale but optimized for speed over depth.
- Ling-2.6-flash — 104B total, 7.4B active. Fast, efficient, for lightweight agent tasks.
Ring is free on OpenRouter. Ling and Ling-flash are paid. This article focuses on Ring.
Ring-2.6-1T Specs
| Spec | Value |
|---|---|
| Total parameters | 1,000B (1 trillion) |
| Active parameters | 63B (MoE, ~6% activated per token) |
| Architecture | Mixture-of-Experts with adaptive reasoning |
| Context window | 262,144 tokens |
| Max output | 65,536 tokens |
| Reasoning modes | High, XHigh (adaptive effort based on task complexity) |
| Modality | Text in, text out |
| Released | May 8, 2026 |
| API format | OpenAI-compatible (via OpenRouter) |
| Price | Free (OpenRouter free tier) |
| Rate limit | OpenRouter free tier limits (varies by demand) |
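Because the API is OpenAI-compatible, any OpenAI-compatible client pointed at OpenRouter's base URL can call Ring. Here's a minimal Python sketch using the free slug from the configs later in this article (swap in your own key):

```python
# Minimal chat completion against Ring via OpenRouter's OpenAI-compatible API.
# Assumes the `openai` Python package (v1.x); the model slug matches the one
# used in the config snippets below.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-YOUR_OPENROUTER_KEY",
)

response = client.chat.completions.create(
    model="inclusionai/ring-2.6-1t:free",
    messages=[{"role": "user", "content": "Explain MoE routing in two sentences."}],
)
print(response.choices[0].message.content)
```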
What Makes Ring Different?
Ring was built for agent workflows — not chat. That means it was optimized against benchmarks that measure tool use, multi-step planning, and autonomous task execution:
- PinchBench — Measures how well models handle real-world coding tasks with multiple files, dependencies, and iterative fixes.
- ClawEval — Evaluates long-horizon agent performance: can the model stay coherent across 50+ tool calls?
- TAU2-Bench — Tests tool-augmented understanding: retrieving data, calling APIs, and reasoning over structured outputs.
- GAIA2-search — Multi-step research and information synthesis benchmark.
The key innovation is adaptive reasoning. Instead of burning the same compute on every request, Ring dynamically allocates its reasoning budget based on task complexity. A simple "what's 2+2?" doesn't waste reasoning tokens; a complex debugging session across five files gets the full XHigh treatment.
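If you want to nudge that budget yourself, OpenRouter exposes a unified reasoning field on chat requests (effort levels like "high"). Whether Ring maps that field onto its High/XHigh modes, rather than deciding purely on its own, is an assumption on our part; treat this as a sketch:

```python
# Sketch: requesting higher reasoning effort via OpenRouter's `reasoning`
# request extension. Passed through `extra_body` because it is not part of
# the standard OpenAI schema. Whether Ring honors it (vs. choosing its own
# adaptive budget) is an assumption, not documented behavior.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-YOUR_OPENROUTER_KEY",
)

response = client.chat.completions.create(
    model="inclusionai/ring-2.6-1t:free",
    messages=[{"role": "user", "content": "Find the bug: def avg(xs): return sum(xs) / len(xs)"}],
    extra_body={"reasoning": {"effort": "high"}},  # assumed mapping to Ring's modes
)
print(response.choices[0].message.content)
```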
Ring vs. Other Free Thinking Models
| Model | Params (active) | Context | Free? | Best For |
|---|---|---|---|---|
| Ring-2.6-1T | 1T (63B) | 262K | Yes | Coding agents, multi-step tool use |
| DeepSeek V4 Pro | 1.6T (49B) | 1M | Yes | General reasoning, long context |
| MiniMax M2.7 | — | 260K | Yes | Multimodal, coding |
| GLM 5.1 | — | 200K | Yes | 8h+ autonomous work |
| DeepSeek R1 | 671B (37B) | 128K | Yes | Math, science reasoning |
Ring's sweet spot is coding agents. If you're using Claude Code, OpenClaw, OpenCode, or Hermes and need deep reasoning across multi-file edits with tool calls, Ring is purpose-built for that workflow. DeepSeek V4 Pro has a larger context window; GLM 5.1 is better for truly autonomous 8-hour sessions. But for agentic coding at speed — Ring delivers.
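To make that concrete, here's what a single tool-call round trip looks like in the OpenAI-compatible format these agents speak under the hood. The read_file tool and its one-line local implementation are hypothetical illustrations; the request and response shapes are the standard ones:

```python
# Sketch of one tool-calling round trip with Ring via OpenRouter.
# `read_file` is a made-up example tool; real coding agents register many.
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-YOUR_OPENROUTER_KEY",
)

tools = [{
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a file from the working directory",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

messages = [{"role": "user", "content": "Why does utils.py raise a KeyError on import?"}]
resp = client.chat.completions.create(
    model="inclusionai/ring-2.6-1t:free", messages=messages, tools=tools
)

msg = resp.choices[0].message
if msg.tool_calls:
    call = msg.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = open(args["path"]).read()  # run the (hypothetical) tool locally
    # Feed the tool result back so the model can continue reasoning.
    messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": result}]
    followup = client.chat.completions.create(
        model="inclusionai/ring-2.6-1t:free", messages=messages, tools=tools
    )
    print(followup.choices[0].message.content)
else:
    print(msg.content)
```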
How to Get an API Key
Ring-2.6-1T is available through OpenRouter. Getting a key takes under 2 minutes:
- Go to openrouter.ai/keys and sign up (email or GitHub).
- Click "Create Key". Copy it.
- No credit card required for free models.
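Before wiring the slug into a tool, you can sanity-check that it's live on OpenRouter's public model index (no key needed for this endpoint). A small sketch, assuming the :free variant shows up as its own entry in the listing:

```python
# Quick check that the free Ring slug appears in OpenRouter's model listing.
import requests

models = requests.get("https://openrouter.ai/api/v1/models", timeout=30).json()["data"]
slug = "inclusionai/ring-2.6-1t:free"
print("listed" if any(m["id"] == slug for m in models) else "not listed")
```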
You can also browse Ring on our model directory: inclusionAI: Ring-2.6-1T details →
Config Snippets
Ring uses the OpenRouter endpoint — OpenAI-compatible format. Here's the config for each tool:
Claude Code (cc)
# ~/.bashrc or ~/.zshrc
export ANTHROPIC_BASE_URL="https://openrouter.ai/api/v1"
export ANTHROPIC_AUTH_TOKEN="sk-or-v1-YOUR_OPENROUTER_KEY"
export ANTHROPIC_MODEL="inclusionai/ring-2.6-1t:free"
Cursor
Settings → Models → Add Model:
- Model name: inclusionai/ring-2.6-1t:free
- Base URL: https://openrouter.ai/api/v1
- API Key: Your OpenRouter key
Codex CLI
export OPENAI_BASE_URL="https://openrouter.ai/api/v1"
export OPENAI_API_KEY="sk-or-v1-YOUR_OPENROUTER_KEY"
export CODEX_DEFAULT_MODEL="inclusionai/ring-2.6-1t:free"
OpenCode / Hermes / OpenClaw
# All accept OpenAI-compatible configs
OPENAI_BASE_URL="https://openrouter.ai/api/v1"
OPENAI_API_KEY="sk-or-v1-YOUR_OPENROUTER_KEY"
# Set model to: inclusionai/ring-2.6-1t:free
For more tool-specific configs (Gemini CLI, Kilo Code, etc.), use our Config Generator — select your tool, pick OpenRouter, and copy the ready-to-use snippet.
Rate Limits & Practical Usage
OpenRouter's free tier doesn't publish fixed RPM numbers — limits vary based on overall demand and model load. In practice:
- Expect 20-30 RPM during off-peak hours for free models.
- Free models may queue during peak demand (paid models get priority routing).
- Ring is a reasoning model, so individual requests may take longer (10-30s for complex tasks). This means rate limits are less of a bottleneck than with instant models — you're not firing 30 requests per minute anyway.
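Still, if you're driving Ring from a script or an agent loop, a simple retry with backoff keeps occasional 429s from killing the run. A minimal sketch using the OpenAI SDK's RateLimitError:

```python
# Retry with exponential backoff for free-tier rate limits (HTTP 429).
import time
from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-YOUR_OPENROUTER_KEY",
)

def ask_ring(messages, retries=5):
    for attempt in range(retries):
        try:
            return client.chat.completions.create(
                model="inclusionai/ring-2.6-1t:free", messages=messages
            )
        except RateLimitError:
            time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ... between attempts
    raise RuntimeError("still rate-limited after retries")
```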
When to Use Ring (and When Not To)
Use Ring when:
- You're running coding agents that need sophisticated tool use and multi-step reasoning.
- You're debugging across multiple files and need the model to understand the full codebase context.
- You need strong performance on structured coding benchmarks (SWE-bench, etc.).
- You want cutting-edge agent architecture without paying for it.
Skip Ring when:
- You need fast, single-turn chat responses — use Ling-2.6-1T (paid) or Groq's Llama 3.3 70B (free, faster).
- You need a 1M context window — use Gemini 2.5 Flash.
- You need guaranteed throughput — the OpenRouter free tier has no SLA; paid usage is credit-based, pay-as-you-go.
View Ring-2.6-1T on our model directory →
Or browse all free models across 18 providers.