
Zhipu AI GLM (GLM-4.7 / GLM-4-Flash)

LLM

by Zhipu AI

Zhipu AI's GLM series offers strong Chinese-English bilingual capabilities. GLM-4-Flash is free, while GLM-4.7 provides frontier performance. The platform also offers free vision, reasoning, and image generation models.

Registration & API Key Steps

Step 1: Visit the Zhipu AI Open Platform and click "Register".

Step 2: Register with an email address or phone number (Chinese phone numbers supported).

Step 3: Complete personal or enterprise real-name verification.

Step 4: Navigate to the User Center and create an API Key.

Step 5: Free models (GLM-4-Flash, GLM-4V-Flash, etc.) are immediately available.

Step 6: Claim additional free tokens from the platform's promotion page.
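Once the key is created, a quick way to confirm it works is a minimal request to the OpenAI-compatible endpoint (the same base URL used in the code example further down). The `build_request` helper name is ours, not part of any SDK; this is a sketch, not an official client.

```python
import json
import urllib.request

# OpenAI-compatible chat endpoint (same base URL as in the example below).
API_URL = "https://open.bigmodel.cn/api/paas/v4/chat/completions"

def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completion request for the OpenAI-compatible endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Uncomment with a real key to verify it end to end:
# with urllib.request.urlopen(build_request("your-api-key", "glm-4-flash", "你好")) as r:
#     print(json.load(r)["choices"][0]["message"]["content"])
```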

Pricing

Tier | Price (per 1M tokens, input / output) | Features
GLM-4-Flash | Free | Free model. 128K context. Good for general tasks.
GLM-4-Air | ¥1 / ¥1 (~$0.14) | Cost-effective.
GLM-4.5 | $0.35 / $1.55 | Strong performance.
GLM-4.7 | $0.40 / $1.50 | Latest flagship. 203K context.
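The table's per-1M-token rates translate into call costs as follows; the `estimate_cost` helper is just a sketch over the prices above (GLM-4-Air's ¥1/¥1 is shown converted at roughly ¥7.1/USD, an approximation).

```python
# Per-1M-token (input, output) prices in USD, from the pricing table above.
# GLM-4-Air's ¥1/¥1 is approximated as $0.14/$0.14.
PRICES = {
    "glm-4-flash": (0.00, 0.00),
    "glm-4-air":   (0.14, 0.14),
    "glm-4.5":     (0.35, 1.55),
    "glm-4.7":     (0.40, 1.50),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one call at the listed per-1M-token rates."""
    inp, outp = PRICES[model]
    return input_tokens / 1_000_000 * inp + output_tokens / 1_000_000 * outp

# 100K tokens in, 20K out on the flagship:
print(f"{estimate_cost('glm-4.7', 100_000, 20_000):.3f}")  # → 0.070
```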

Application Tips

Tip 1

GLM-4-Flash is free with 128K context — one of the best free Chinese LLM APIs.

Tip 2

Six free models are available, spanning text, vision, reasoning, image generation, and video generation.

Tip 3

New users get 5M GLM-4 tokens free — apply for fine-tuning to get 5M additional training tokens.

Tip 4

The API is compatible with the OpenAI request format, so existing OpenAI SDK code integrates easily.

Tip 5

CogView-3-Flash offers free AI image generation.

Tip 6

GLM-4.7 at $0.40/1M input tokens is very competitive with international models.
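To get a feel for what the 5M-token welcome grant in Tip 3 covers, a rough back-of-the-envelope budget helps; the ~1,500 tokens per average chat round trip used below is our assumption, not a platform figure.

```python
FREE_TOKENS = 5_000_000  # new-user grant mentioned in Tip 3

def rounds_covered(tokens_per_round: int, grant: int = FREE_TOKENS) -> int:
    """How many average request/response rounds the free grant covers."""
    return grant // tokens_per_round

# Assuming ~1,500 tokens per round trip (prompt + completion):
print(rounds_covered(1_500))  # → 3333
```

Even at several thousand tokens per exchange, the grant covers well over a thousand calls before the paid tiers matter.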

China Access Solutions

Access Solution

Directly accessible in China. Zhipu AI is a Beijing-based company spun off from Tsinghua University. Supports Chinese phone, domestic payment.

Code Example

Python
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="your-api-key")

response = client.chat.completions.create(
    model="glm-4-flash",  # Free model
    messages=[
        {"role": "user", "content": "请推荐几本关于人工智能的书"}
    ]
)

print(response.choices[0].message.content)

# --- OpenAI-compatible format ---
# from openai import OpenAI
# client = OpenAI(api_key="xxx", base_url="https://open.bigmodel.cn/api/paas/v4")
# response = client.chat.completions.create(model="glm-4-flash", ...)
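For longer responses, the SDK also accepts `stream=True` (mirroring the OpenAI streaming interface, per the compatibility noted above). The `join_stream` helper below is ours and works on any iterable of OpenAI-style chunks; the streamed call itself is sketched in comments since it needs a real key.

```python
def join_stream(chunks) -> str:
    """Concatenate the delta content from an OpenAI-style chunk stream."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk's delta content may be None
            parts.append(delta)
    return "".join(parts)

# With the zhipuai client from above (network call, real key required):
# stream = client.chat.completions.create(
#     model="glm-4-flash",
#     messages=[{"role": "user", "content": "讲个笑话"}],  # "Tell a joke"
#     stream=True,
# )
# print(join_stream(stream))
```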

Rate Limits

Tier | Limits
Default | GLM-4-Flash (free): 2 concurrent requests; higher limits available on application. GLM-4.7: varies by tier.

Recommended Use Cases

Chinese-English bilingual tasks, knowledge Q&A, code generation, image understanding, AI image/video generation
Last Updated: 2026-02-10
