Zhipu AI GLM (GLM-4.7 / GLM-4-Flash)
LLM by Zhipu AI
Zhipu AI's GLM series offers strong Chinese-English bilingual capabilities. GLM-4-Flash is free, while GLM-4.7 provides frontier performance. The platform also offers free vision, reasoning, and image generation models.
Registration & API Key Steps
Step 1
Register with an email address or phone number (Chinese phone numbers supported).
Step 2
Complete personal or enterprise real-name verification.
Step 3
Navigate to the User Center and create an API Key.
Step 4
Free models (GLM-4-Flash, GLM-4V-Flash, etc.) are available immediately.
Step 5
Claim additional free tokens from the platform's promotion page.
Pricing
| Tier | Price | Features |
|---|---|---|
| GLM-4-Flash | Free | 128K context. Good for general tasks. |
| GLM-4-Air | ¥1 / ¥1 per 1M tokens | Input / Output (~$0.14). Cost-effective. |
| GLM-4.5 | $0.35 / $1.55 per 1M tokens | Input / Output. Strong performance. |
| GLM-4.7 | $0.40 / $1.50 per 1M tokens | Input / Output. Latest flagship. 203K context. |
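As a sanity check on the table above, a small helper can estimate per-request cost from the listed per-1M-token rates (the helper itself is illustrative; GLM-4-Air's ¥1/¥1 rate is shown converted at the ~$0.14 figure from the table):

```python
# Estimate request cost from the per-1M-token rates in the pricing table.
# Rates are USD per 1M tokens as (input, output).
PRICES_USD_PER_M = {
    "glm-4-flash": (0.0, 0.0),    # free tier
    "glm-4-air": (0.14, 0.14),    # ~$0.14, converted from ¥1 / ¥1
    "glm-4.5": (0.35, 1.55),
    "glm-4.7": (0.40, 1.50),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    in_rate, out_rate = PRICES_USD_PER_M[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# e.g. a 10K-input / 2K-output call on GLM-4.7:
print(f"${estimate_cost('glm-4.7', 10_000, 2_000):.4f}")  # $0.0070
```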
Application Tips
Tip 1
GLM-4-Flash is free with 128K context — one of the best free Chinese LLM APIs.
Tip 2
Six free models are available, covering text, vision, reasoning, image generation, and video generation.
Tip 3
New users get 5M GLM-4 tokens free — apply for fine-tuning to get 5M additional training tokens.
Tip 4
API is compatible with OpenAI format for easy integration.
Tip 5
CogView-3-Flash offers free AI image generation.
Tip 6
GLM-4.7 at $0.40/1M input tokens is very competitive with international models.
China Access Solutions
Access Solution
Directly accessible in China. Zhipu AI is a Beijing-based company spun off from Tsinghua University. Supports Chinese phone, domestic payment.
Code Example
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="your-api-key")
response = client.chat.completions.create(
    model="glm-4-flash",  # free model
    messages=[
        {"role": "user", "content": "请推荐几本关于人工智能的书"}  # "Recommend some books on AI"
    ]
)
print(response.choices[0].message.content)

# --- OpenAI-compatible format ---
# from openai import OpenAI
# client = OpenAI(api_key="xxx", base_url="https://open.bigmodel.cn/api/paas/v4")
# response = client.chat.completions.create(model="glm-4-flash", ...)
Rate Limits
| Tier | Limits |
|---|---|
| Default | GLM-4-Flash (free): 2 concurrent requests, can apply for higher limits. GLM-4.7: varies by tier. |
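The free tier's 2-concurrent-request cap can be enforced client-side with a semaphore, so extra threads wait instead of triggering rate-limit errors. The sketch below is a hypothetical pattern with a dummy function standing in for the real API call:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# The free GLM-4-Flash tier allows 2 concurrent requests; gate every API
# call through a semaphore so additional threads block until a slot frees.
MAX_CONCURRENT = 2
_slots = threading.Semaphore(MAX_CONCURRENT)

def call_api(prompt: str) -> str:
    """Placeholder for client.chat.completions.create(...)."""
    return f"reply to: {prompt}"

def rate_limited_call(prompt: str) -> str:
    with _slots:  # at most MAX_CONCURRENT callers inside at once
        return call_api(prompt)

# Eight worker threads, but never more than two in-flight "requests".
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(rate_limited_call, [f"q{i}" for i in range(6)]))
print(results)
```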
Related API Guides
OpenAI GPT-4o / GPT-4.1 / o3
OpenAI
OpenAI's flagship LLM family including GPT-4o for multimodal tasks, GPT-4.1 for long-context coding, and o3 for advanced reasoning. Industry-leading models with the largest developer ecosystem.
Anthropic Claude (Sonnet 4.5 / Opus 4.5)
Anthropic
Anthropic's Claude model family excels in nuanced reasoning, safety, and long-context tasks. Claude Sonnet 4.5 offers the best balance of cost and performance, while Opus 4.5 delivers frontier intelligence.
Google Gemini (2.5 Pro / 2.5 Flash)
Google
Google's Gemini models offer a generous free tier, 1M token context window, and strong multimodal capabilities. Gemini 2.5 Pro leads in reasoning, while Flash models provide cost-effective alternatives.
Meta Llama 4 (Scout / Maverick)
Meta
Meta's open-source Llama 4 models are free to use and available through multiple cloud providers. Llama 4 Scout and Maverick offer competitive performance at extremely low cost through partner APIs.