How to Get a Free Z AI (Zhipu AI) API Key (2026)
2 free models available — no credit card required. Get your Z AI (Zhipu AI) API key → Test free models →
Z AI (Zhipu AI) FreeLLM Score
A usable option, though it may have noticeable restrictions or older models.
What is Z AI (Zhipu AI)?
Zhipu GLM-4.7-Flash — 200K context, no credit card, 1 concurrent request.
Z AI (Zhipu AI) is a leading Chinese AI lab offering GLM series models for free. The GLM-4.7-Flash model provides 200K context at no cost with no credit card required. Rate limited to 1 concurrent request — best for solo developers and prototyping rather than concurrent workloads.
- GLM-4.7-Flash: 200K context
- No credit card required
- Chinese + English bilingual
- OpenAI-compatible endpoint
API Compatibility: OpenAI SDK-compatible (Chat Completions)
How to Get a Z AI (Zhipu AI) API Key
- 1 Sign up at open.bigmodel.cn Phone verification (China). No credit card.
- 2 Go to API Keys in user center
- 3 Create an API key
- 4 Choose a model GLM-4.7-Flash for free tier with 200K context.
- 5 Configure OpenAI client Base URL: https://open.bigmodel.cn/api/paas/v4
All Free Z AI (Zhipu AI) Models — Context Windows & Rate Limits
| Model | Context | Max Output | Modality | Rate Limit | Released | Status |
|---|---|---|---|---|---|---|
| GLM-4.7-Flash | 200K | 128K | 1 concurrent request | Jan 19, 2026 | Online | |
| GLM-4.6V-Flash | 128K | 4K | 1 concurrent request | — | Online |
Z AI (Zhipu AI) Free Tier Limits & Pricing
Z AI (Zhipu AI) API Setup Tutorial & Tools
Z AI (Zhipu AI) is fully compatible with popular AI coding assistants like Cursor, Claude Code, and more. To see step-by-step API configuration instructions for your favorite tool, please visit our Global Configuration Guide →
Use Cases
What Z AI (Zhipu AI)'s free models are best for, based on aggregated model capabilities:
Limitations & Caveats
- 1 concurrent request only — unusable for multi-user apps
- China-hosted — high latency outside Asia
- Chinese phone number required for registration
Frequently Asked Questions
Is GLM-4.7-Flash competitive with other free models?
GLM-4.7-Flash offers 200K context, which is larger than most free tier models. Quality is comparable to Qwen3 and Llama 3 for Chinese-English bilingual tasks. It excels in Chinese language understanding.
Can I use Z AI (Zhipu) from outside China?
Technically yes, but latency will be high since servers are in China. The bigger issue is registration — it requires a Chinese phone number, which blocks most international users.
What does "1 concurrent request" mean in practice?
You can only have one API request in-flight at a time. If you send a second request while the first is still processing, it will queue or fail. This makes it only suitable for single-user, sequential workloads.