Z AI (Zhipu AI) logo How to Get a Free Z AI (Zhipu AI) API Key (2026)

2 free models available — no credit card required. Get your Z AI (Zhipu AI) API key → Test free models →

Z AI (Zhipu AI) FreeLLM Score

51
👍 Good Option — Notable for generous free limits

A usable option, though it may have noticeable restrictions or older models.

🎁
Generosity Free limits
65/100
🌍
Accessibility Signup ease
65/100
📚
Breadth Model variety
20/100
Reliability Uptime
55/100
🔌
Compatibility Tool support
50/100
🧠
Quality Benchmarks
50/100

How we score →

What is Z AI (Zhipu AI)?

Zhipu GLM-4.7-Flash — 200K context, no credit card, 1 concurrent request.

Z AI (Zhipu AI) is a leading Chinese AI lab offering GLM series models for free. The GLM-4.7-Flash model provides 200K context at no cost with no credit card required. Rate limited to 1 concurrent request — best for solo developers and prototyping rather than concurrent workloads.

  • GLM-4.7-Flash: 200K context
  • No credit card required
  • Chinese + English bilingual
  • OpenAI-compatible endpoint

API Compatibility: OpenAI SDK-compatible (Chat Completions)

How to Get a Z AI (Zhipu AI) API Key

  1. 1
    Sign up at open.bigmodel.cn Phone verification (China). No credit card.
  2. 2
    Go to API Keys in user center
  3. 3
    Create an API key
  4. 4
    Choose a model GLM-4.7-Flash for free tier with 200K context.
  5. 5
    Configure OpenAI client Base URL: https://open.bigmodel.cn/api/paas/v4

All Free Z AI (Zhipu AI) Models — Context Windows & Rate Limits

Model Context Max Output Modality Rate Limit Released Status
GLM-4.7-Flash 200K 128K text 1 concurrent request Jan 19, 2026 Online
GLM-4.6V-Flash 128K 4K text 1 concurrent request Online

Z AI (Zhipu AI) Free Tier Limits & Pricing

Credit Card Not required
Free Tier Permanently free
Context Range 128K – 200K
Total Models 2 free
Rate Limits 1 concurrent request
API Compatibility OpenAI SDK-compatible (Chat Completions)

Z AI (Zhipu AI) API Setup Tutorial & Tools

Z AI (Zhipu AI) is fully compatible with popular AI coding assistants like Cursor, Claude Code, and more. To see step-by-step API configuration instructions for your favorite tool, please visit our Global Configuration Guide →

Use Cases

What Z AI (Zhipu AI)'s free models are best for, based on aggregated model capabilities:

Chat 2 models

Limitations & Caveats

  • 1 concurrent request only — unusable for multi-user apps
  • China-hosted — high latency outside Asia
  • Chinese phone number required for registration

Frequently Asked Questions

Is GLM-4.7-Flash competitive with other free models?

GLM-4.7-Flash offers 200K context, which is larger than most free tier models. Quality is comparable to Qwen3 and Llama 3 for Chinese-English bilingual tasks. It excels in Chinese language understanding.

Can I use Z AI (Zhipu) from outside China?

Technically yes, but latency will be high since servers are in China. The bigger issue is registration — it requires a Chinese phone number, which blocks most international users.

What does "1 concurrent request" mean in practice?

You can only have one API request in-flight at a time. If you send a second request while the first is still processing, it will queue or fail. This makes it only suitable for single-user, sequential workloads.

See our FAQ for common questions about free LLM APIs