Zhipu AI GLM (GLM-4.7 / GLM-4-Flash)
LLM by Zhipu AI
Zhipu AI's GLM series offers strong Chinese-English bilingual capabilities. GLM-4-Flash is free, while GLM-4.7 provides frontier performance. The platform also offers free vision, reasoning, and image generation models.
Registration & API Key Steps
Step 1
Register with an email address or phone number (Chinese phone numbers supported).
Step 2
Complete personal or enterprise real-name verification.
Step 3
Navigate to the User Center and create an API Key.
Step 4
Free models (GLM-4-Flash, GLM-4V-Flash, etc.) are available immediately.
Step 5
Claim additional free tokens from the platform's promotion page.
Pricing
| Tier | Price | Features |
|---|---|---|
| GLM-4-Flash | Free | 128K context. Good for general tasks. |
| GLM-4-Air | ¥1 / ¥1 per 1M tokens | Input / Output (~$0.14). Cost-effective. |
| GLM-4.5 | $0.35 / $1.55 per 1M tokens | Input / Output. Strong performance. |
| GLM-4.7 | $0.40 / $1.50 per 1M tokens | Input / Output. Latest flagship. 203K context. |
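As a sanity check on the table above, a small helper can estimate per-request cost from the listed per-1M-token rates (the helper itself is illustrative; GLM-4-Air's ¥1/¥1 rate is shown converted at the ~$0.14 figure from the table):

```python
# Estimate request cost from the per-1M-token rates in the pricing table.
# Rates are USD per 1M tokens as (input, output).
PRICES_USD_PER_M = {
    "glm-4-flash": (0.0, 0.0),    # free tier
    "glm-4-air": (0.14, 0.14),    # ~$0.14, converted from ¥1 / ¥1
    "glm-4.5": (0.35, 1.55),
    "glm-4.7": (0.40, 1.50),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    in_rate, out_rate = PRICES_USD_PER_M[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# e.g. a 10K-input / 2K-output call on GLM-4.7:
print(f"${estimate_cost('glm-4.7', 10_000, 2_000):.4f}")  # $0.0070
```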
Application Tips
Tip 1
GLM-4-Flash is free with 128K context — one of the best free Chinese LLM APIs.
Tip 2
Six free models are available, covering text, vision, reasoning, image generation, and video generation.
Tip 3
New users get 5M GLM-4 tokens free — apply for fine-tuning to get 5M additional training tokens.
Tip 4
API is compatible with OpenAI format for easy integration.
Tip 5
CogView-3-Flash offers free AI image generation.
Tip 6
GLM-4.7 at $0.40/1M input tokens is very competitive with international models.
China Access Solutions
Access Solution
Directly accessible in China. Zhipu AI is a Beijing-based company spun off from Tsinghua University. Supports Chinese phone, domestic payment.
Code Example
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="your-api-key")
response = client.chat.completions.create(
    model="glm-4-flash",  # free model
    messages=[
        {"role": "user", "content": "请推荐几本关于人工智能的书"}  # "Recommend some books on AI"
    ]
)
print(response.choices[0].message.content)

# --- OpenAI-compatible format ---
# from openai import OpenAI
# client = OpenAI(api_key="xxx", base_url="https://open.bigmodel.cn/api/paas/v4")
# response = client.chat.completions.create(model="glm-4-flash", ...)
Rate Limits
| Tier | Limits |
|---|---|
| Default | GLM-4-Flash (free): 2 concurrent requests, can apply for higher limits. GLM-4.7: varies by tier. |
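The free tier's 2-concurrent-request cap can be enforced client-side with a semaphore, so extra threads wait instead of triggering rate-limit errors. The sketch below is a hypothetical pattern with a dummy function standing in for the real API call:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# The free GLM-4-Flash tier allows 2 concurrent requests; gate every API
# call through a semaphore so additional threads block until a slot frees.
MAX_CONCURRENT = 2
_slots = threading.Semaphore(MAX_CONCURRENT)

def call_api(prompt: str) -> str:
    """Placeholder for client.chat.completions.create(...)."""
    return f"reply to: {prompt}"

def rate_limited_call(prompt: str) -> str:
    with _slots:  # at most MAX_CONCURRENT callers inside at once
        return call_api(prompt)

# Eight worker threads, but never more than two in-flight "requests".
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(rate_limited_call, [f"q{i}" for i in range(6)]))
print(results)
```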
Related API Guides
OpenAI GPT-4o / GPT-4.1 / o3
OpenAI
OpenAI's flagship LLM family including GPT-4o for multimodal tasks, GPT-4.1 for long-context coding, and o3 for advanced reasoning. Industry-leading models with the largest developer ecosystem.
Anthropic Claude (Sonnet 4.5 / Opus 4.5)
Anthropic
Anthropic's Claude model family excels in nuanced reasoning, safety, and long-context tasks. Claude Sonnet 4.5 offers the best balance of cost and performance, while Opus 4.5 delivers frontier intelligence.
Google Gemini (2.5 Pro / 2.5 Flash)
Google
Google's Gemini models offer a generous free tier, 1M token context window, and strong multimodal capabilities. Gemini 2.5 Pro leads in reasoning, while Flash models provide cost-effective alternatives.
Meta Llama 4 (Scout / Maverick)
Meta
Meta's open-source Llama 4 models are free to use and available through multiple cloud providers. Llama 4 Scout and Maverick offer competitive performance at extremely low cost through partner APIs.