GLM-4.5-Flash — Free API


What is GLM-4.5-Flash?

GLM-4.5-Flash is a free text LLM from Z AI (Zhipu AI). It accepts up to 128,000 tokens of input and generates up to 8,000 tokens of output. The API is OpenAI-compatible and needs no credit card. The free tier is limited to one concurrent request, which suits chat applications that send one message at a time.
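
Because only one request may be in flight at a time, a multi-threaded client probably wants to serialize its calls. The following is a minimal sketch, not an official client feature: call_serialized and MAX_CONCURRENT_REQUESTS are hypothetical names introduced here, and the limit of 1 is taken from the description above.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: one in-flight request is allowed, as stated in the description above.
MAX_CONCURRENT_REQUESTS = 1

# A bounded semaphore acts as the single "request slot" shared by all threads.
_request_slot = threading.BoundedSemaphore(MAX_CONCURRENT_REQUESTS)


def call_serialized(send: Callable[[], T]) -> T:
    """Run `send` while holding the request slot, so calls never overlap.

    `send` is any zero-argument callable that performs one API request,
    for example a lambda wrapping client.chat.completions.create(...).
    """
    with _request_slot:
        return send()
```

In use, each thread would wrap its request, for example call_serialized(lambda: client.chat.completions.create(model="glm-4-5-flash", messages=msgs)). Other threads block until the slot is free instead of sending an overlapping request.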

Model ID: glm-4-5-flash
Base URL: https://open.bigmodel.cn/api/paas/v4
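
To show how these two values fit together without any SDK, here is a minimal standard-library sketch that builds (but does not send) a chat request. build_chat_request and MAX_OUTPUT_TOKENS are hypothetical names introduced for illustration. The max_tokens field is the standard OpenAI chat-completions parameter and is assumed, not confirmed by this page, to be honored by this endpoint. The 8,000 ceiling comes from the description above.

```python
import json
import urllib.request

BASE_URL = "https://open.bigmodel.cn/api/paas/v4"
MODEL_ID = "glm-4-5-flash"
MAX_OUTPUT_TOKENS = 8000  # output ceiling quoted in the model description


def build_chat_request(prompt: str, api_key: str, max_tokens: int = 1024) -> urllib.request.Request:
    """Build a POST request for the OpenAI-style /chat/completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        # Assumption: `max_tokens` is accepted; clamp it to the stated output limit.
        "max_tokens": min(max_tokens, MAX_OUTPUT_TOKENS),
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. The response body is expected to follow the OpenAI shape, with the reply under choices[0].message.content.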

GLM-4.5-Flash API Code Example

Replace YOUR_API_KEY with your own key and run. See the config generator for Claude Code, Cursor, and other tools.

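Python

from openai import OpenAI

# Point the OpenAI SDK at the Z AI endpoint instead of the default OpenAI one.
client = OpenAI(
    base_url="https://open.bigmodel.cn/api/paas/v4",
    api_key="YOUR_API_KEY",  # replace with your Z AI API key
)

# Send a single-turn chat request.
response = client.chat.completions.create(
    model="glm-4-5-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)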
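
JavaScript

// Top-level await requires an ES module (a .mjs file or "type": "module").
import OpenAI from "openai";

// Point the OpenAI SDK at the Z AI endpoint.
const openai = new OpenAI({
  baseURL: "https://open.bigmodel.cn/api/paas/v4",
  apiKey: "YOUR_API_KEY", // replace with your Z AI API key
});

// Send a single-turn chat request.
const completion = await openai.chat.completions.create({
  model: "glm-4-5-flash",
  messages: [{ role: "user", content: "Hello!" }],
});

// Print the text of the first returned choice.
console.log(completion.choices[0].message.content);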

cURL

curl https://open.bigmodel.cn/api/paas/v4/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "glm-4-5-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
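
The samples above hard-code the key for brevity. For anything beyond a quick test, reading it from the environment keeps it out of source control. This is a minimal sketch under stated assumptions: ZAI_API_KEY is a hypothetical variable name chosen here, not one the platform prescribes, and load_api_key is an illustrative helper.

```python
import os
from typing import Mapping

# Hypothetical variable name; use whatever name your shell or deployment exports.
API_KEY_ENV_VAR = "ZAI_API_KEY"


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, failing loudly if it is unset."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    # Also reject the placeholder from the samples so a copy-paste slip is caught early.
    if not key or key == "YOUR_API_KEY":
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} to your Z AI API key before running.")
    return key
```

The result can be passed wherever the samples use the placeholder, for example api_key=load_api_key() in the Python client constructor.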
