China's Free LLM Ecosystem: Baidu, Zhipu, SiliconFlow & ModelScope

Baidu Qianfan, Z.ai (Zhipu AI), SiliconFlow, and ModelScope collectively offer dozens of free models — many of which don't appear on US-centric directories. Here's every endpoint, how to get an API key, and ready-to-copy config snippets for each.

Why China's Free LLM Ecosystem Matters

Most free LLM directories focus on US providers (Google, Groq, NVIDIA). But China has a parallel free LLM ecosystem that's equally rich — and in some areas, further ahead. Baidu's ERNIE models rival GPT-4-class performance. Z.ai's GLM series holds its own against Claude for coding. SiliconFlow offers 1,000 RPM on the free tier (far beyond most US providers). ModelScope (Alibaba) gives free access to Qwen models.

For developers, this means more options, higher rate limits, and access to models optimized for Chinese + English bilingual tasks. If you're building apps that serve multilingual users — or just want the highest free-tier RPM available — China's ecosystem is worth knowing.

We track 14+ free models across these four providers.

1. Baidu Qianfan — ERNIE & CoBuddy

Endpoint: https://openrouter.ai/api/v1 (OpenRouter) · API Key: Get on OpenRouter → · OpenAI-compatible: Yes · No credit card: Yes (via OpenRouter)

Baidu's models are accessible via OpenRouter. Two free models stand out:

CoBuddy — Baidu's Coding Agent Model

CoBuddy is optimized for coding tasks and AI agent workflows. 131K context, up to 65K output, native tool calling and reasoning. Runs on fp8 quantization for fast inference. Free tier available.

# Claude Code / Cursor / Codex
export ANTHROPIC_BASE_URL="https://openrouter.ai/api/v1"
export ANTHROPIC_AUTH_TOKEN="sk-or-v1-YOUR_KEY"
export ANTHROPIC_MODEL="baidu/cobuddy:free"

Qianfan-OCR-Fast — Domain-Specific OCR

Qianfan-OCR-Fast is a multimodal model purpose-built for OCR. 65K context, handles images and text. Useful for document processing pipelines.

# OpenAI SDK
from openai import OpenAI
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-YOUR_KEY"
)
response = client.chat.completions.create(
    model="baidu/qianfan-ocr-fast:free",
    messages=[{"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/doc.png"}},
        {"type": "text", "text": "Extract all text from this document."}
    ]}]
)

Baidu's paid ERNIE 4.5 models (21B-424B params) are also on OpenRouter for when you outgrow the free tier — same endpoint, swap the model name. Browse all Baidu models: Baidu on freellm.net →

2. Z.ai (Zhipu AI) — GLM Series

Endpoint: https://api.z.ai/v1 · API Key: Get on Z.ai → · OpenAI-compatible: Yes · No credit card: Yes

Z.ai (Zhipu AI) offers the GLM model family with its own API endpoint. The free tier includes:

| Model | Context | Best For | Free? |
|---|---|---|---|
| GLM 5.1 | 200K | 8h+ autonomous coding, long-horizon agents | Yes |
| GLM 5 Turbo | 200K | Fast inference, OpenClaw agent workflows | Yes |
| GLM 4.7 Flash | 128K | Balanced performance/efficiency, agentic coding | Yes |
| GLM 5V Turbo | 128K | Vision-based coding, multimodal agents | Yes |
| GLM 4.6V | 128K | Visual understanding, screenshot-to-HTML | Yes |

GLM 5.1 — The 8-Hour Autonomous Coder

GLM 5.1's standout feature: it can work independently on a single task for more than 8 hours — autonomously planning, executing, and self-correcting — delivering complete, engineering-grade results. This makes it uniquely suited for OpenClaw, Hermes, and long-running agent sessions.

Config for All Z.ai GLM Models

# Claude Code
export ANTHROPIC_BASE_URL="https://api.z.ai/v1"
export ANTHROPIC_AUTH_TOKEN="YOUR_ZAI_API_KEY"
export ANTHROPIC_MODEL="glm-5.1"

# Codex / Cursor / OpenCode / OpenClaw / Hermes
export OPENAI_BASE_URL="https://api.z.ai/v1"
export OPENAI_API_KEY="YOUR_ZAI_API_KEY"
# Set model to: glm-5.1, glm-5-turbo, glm-4.7-flash, etc.

Get API Key: Go to open.bigmodel.cn, sign up (email or phone), create an API key. No credit card required.

3. SiliconFlow — 1,000 RPM Free Tier

Endpoint: https://api.siliconflow.cn/v1 · API Key: Get on SiliconFlow → · OpenAI-compatible: Yes · No credit card: Yes · Free tier rate limit: 1,000 RPM, 50K TPM

SiliconFlow's free tier is the most generous among all providers we track. 1,000 RPM is 25× the typical free tier limit. It hosts a wide range of models — DeepSeek, Qwen, GLM, BGE embeddings, and specialized models like DeepSeek-OCR.

Notable Free Models on SiliconFlow

  • DeepSeek R1 Distill Qwen 7B/8B — Lightweight reasoning models distilled from DeepSeek R1.
  • GLM-4-9B Chat — Compact Zhipu model, good for lightweight tasks.
  • DeepSeek-OCR — Purpose-built OCR model with 131K context.
  • BAAI/bge-large-zh-v1.5 — Chinese embedding model (free embeddings are rare).
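
Free embeddings pair naturally with a similarity search. A hedged sketch using the OpenAI SDK's embeddings endpoint against SiliconFlow — the cosine and embed helpers are our own:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def embed(texts, api_key):
    """Embed a list of strings with BAAI/bge-large-zh-v1.5 on SiliconFlow."""
    from openai import OpenAI  # imported lazily so cosine() stays dependency-free
    client = OpenAI(base_url="https://api.siliconflow.cn/v1", api_key=api_key)
    resp = client.embeddings.create(model="BAAI/bge-large-zh-v1.5", input=texts)
    return [d.embedding for d in resp.data]
```

Rank documents against a query by embedding both and sorting on cosine similarity — at 1,000 RPM you can afford to embed on the fly.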

Config

# Any OpenAI-compatible tool
export OPENAI_BASE_URL="https://api.siliconflow.cn/v1"
export OPENAI_API_KEY="YOUR_SILICONFLOW_KEY"
# Use model IDs like: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

How to Get an API Key

  1. Go to cloud.siliconflow.cn/account/ak
  2. Sign up (phone verification is required for Chinese users; email sign-up may work for international users).
  3. Navigate to Account → API Keys → Create.
  4. Copy the key (it starts with sk-).

Browse all SiliconFlow models: SiliconFlow on freellm.net →

4. ModelScope (Alibaba) — Qwen Models for Free

Endpoint: https://api-inference.modelscope.cn/v1 · API Key: Get on ModelScope → · OpenAI-compatible: Yes · No credit card: Yes · Free tier rate limit: 2,000 RPD total; ≤500 RPD per model

ModelScope is Alibaba's model hub. The free inference API gives access to Qwen models — Alibaba's flagship LLM family — plus community models.

Notable Free Models

  • Qwen3.5-35B-A3B — Alibaba's latest 35B MoE model with 3B active params.
  • Qwen3.5-27B — Dense model for balanced performance.
  • Qwen-Image — Text-to-image generation endpoint.

Config

export OPENAI_BASE_URL="https://api-inference.modelscope.cn/v1"
export OPENAI_API_KEY="YOUR_MODELSCOPE_TOKEN"
# Model IDs: Qwen/Qwen3.5-35B-A3B, Qwen/Qwen3.5-27B, etc.
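
With a hard cap of 2,000 requests/day total and 500/day per model, a small client-side counter lets you stop before the server starts returning 429s. A sketch — only the limits come from ModelScope's published free tier; the DailyBudget class is our own:

```python
import datetime

class DailyBudget:
    """Client-side guard for ModelScope's free tier:
    2,000 requests/day total, 500/day per model."""

    def __init__(self, total=2000, per_model=500):
        self.total, self.per_model = total, per_model
        self.day = datetime.date.today()
        self.counts = {}  # model id -> requests made today

    def allow(self, model: str) -> bool:
        """Return True (and count the request) if it fits today's budget."""
        today = datetime.date.today()
        if today != self.day:  # new day: reset all counters
            self.day, self.counts = today, {}
        if sum(self.counts.values()) >= self.total:
            return False  # total daily cap reached
        if self.counts.get(model, 0) >= self.per_model:
            return False  # per-model daily cap reached
        self.counts[model] = self.counts.get(model, 0) + 1
        return True
```

Call budget.allow("Qwen/Qwen3.5-35B-A3B") before each request; when it returns False, either switch models or fall back to another provider.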

How to Get an API Key

  1. Go to modelscope.cn/my/myaccesstoken
  2. Sign up (Alibaba Cloud account or phone).
  3. Create an access token. This is your API key.

Quick Comparison

| Provider | Free RPM | No Card | Best Model | Endpoint Type |
|---|---|---|---|---|
| Baidu (via OpenRouter) | Varies | Yes | CoBuddy (coding agent) | OpenRouter proxy |
| Z.ai (Zhipu AI) | Varies | Yes | GLM 5.1 (8h autonomy) | Direct API |
| SiliconFlow | 1,000 | Yes | DeepSeek R1 Distill Qwen | Direct API |
| ModelScope | — (2K RPD) | Yes | Qwen3.5-35B-A3B | Direct API |

Which Should You Choose?

  • For coding agents (Claude Code, OpenClaw): Z.ai's GLM 5.1 — built for long autonomous sessions. Use the direct API for lowest latency.
  • For highest rate limits: SiliconFlow — 1,000 RPM crushes every other free tier. Ideal if you're running batch jobs or have multiple users.
  • For Chinese + English bilingual tasks: Baidu ERNIE via OpenRouter — strong multilingual performance. Or ModelScope's Qwen models.
  • For the simplest setup: Baidu via OpenRouter — if you already have an OpenRouter key, you already have access. Same endpoint, just swap the model name.
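
Because every provider above speaks the same OpenAI-compatible protocol, you don't have to pick just one: chain them and fall back on rate limits or outages. A sketch — first_success and its caller convention are illustrative, not any provider's API:

```python
def first_success(callers, prompt):
    """Try providers in order until one returns a reply.

    Each entry is (name, call), where call(prompt) -> str and raises on
    failure (rate limit, outage). Order them per the guidance above,
    e.g. Z.ai first for coding agents, then SiliconFlow for throughput.
    """
    errors = []
    for name, call in callers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure means "try the next provider"
            errors.append((name, exc))
    raise RuntimeError(f"All providers failed: {errors}")
```

Each caller is just an OpenAI-SDK client bound to one of the base URLs in this article, so adding a provider is one line of config.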

Browse all 14+ free Chinese models → View on Model Directory

Or use the Config Generator to get ready-to-copy snippets for your tool + any of these providers.