Moonshot AI Kimi (K2 / K2.5)

LLM

by Moonshot AI

Moonshot AI's Kimi models are known for long-context support (up to 256K tokens) and an OpenAI-compatible API. Kimi K2 is an open-source model with one trillion parameters, optimized for agent tasks.

Official API Documentation

API Endpoint

https://api.moonshot.cn/v1 (platform console: https://platform.moonshot.cn/)

Registration & API Key Steps

Step 1

Visit Moonshot AI Platform and sign up.

Step 2

Step 3

New users automatically receive ¥15 in free credits.

Step 4

Navigate to "API Key Management" in the left sidebar.

Step 5

Click "New" to create an API key.

Step 6

The API is OpenAI-compatible: point your OpenAI client's base URL at https://api.moonshot.cn/v1 and pass your Moonshot API key.
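To make "OpenAI-compatible" concrete, here is a minimal sketch of the HTTP request behind a chat completion, built with the Python standard library only. The base URL, route, and headers mirror the cURL example in this guide; the helper name `build_chat_request` is hypothetical, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Base URL as used in this guide's code example.
BASE_URL = "https://api.moonshot.cn/v1"


def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST request to the chat completions route."""
    payload = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",  # same bearer scheme as the OpenAI API
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "sk-xxx" is a placeholder key, as elsewhere in this guide.
req = build_chat_request("sk-xxx", "moonshot-v1-8k", [{"role": "user", "content": "Hello"}])
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; in practice the OpenAI SDK with a changed base URL does the same work with less code.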

Pricing

Model	Price per 1M tokens (input / output)	Notes
Moonshot-v1-8K	¥12 / ¥12 (~$1.65)	8K context.
Moonshot-v1-32K	¥24 / ¥24 (~$3.30)	32K context.
Moonshot-v1-128K	¥60 / ¥60 (~$8.25)	128K context.
Kimi K2	Not listed here (verify on the official pricing page)	1T-parameter open-source model. Available via the API and partners.
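The table's prices translate directly into a per-request cost estimate. The sketch below assumes the lowercase model IDs follow the `moonshot-v1-8k` pattern used in this guide's code example, and that the listed prices are still current; `estimate_cost_cny` is a hypothetical helper name.

```python
# Prices in CNY per 1M tokens, copied from the table above
# (input and output are priced identically for these models).
PRICE_PER_MILLION_CNY = {
    "moonshot-v1-8k": 12.0,
    "moonshot-v1-32k": 24.0,
    "moonshot-v1-128k": 60.0,
}


def estimate_cost_cny(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in CNY from its token counts."""
    rate = PRICE_PER_MILLION_CNY[model]
    return (input_tokens + output_tokens) * rate / 1_000_000


# Example: a 20,000-token document plus a 1,000-token answer on the 32K model.
cost = estimate_cost_cny("moonshot-v1-32k", 20_000, 1_000)
print(f"Estimated cost: ¥{cost:.3f}")

# Rough number of such requests covered by the ¥15 new-user credit.
print("Requests covered by ¥15:", int(15 / cost))
```

Actual billing depends on the token counts the API reports, so treat the result as an estimate only.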

Application Tips

Tip 1

The API is OpenAI-compatible: set base_url to https://api.moonshot.cn/v1 and use a Moonshot model name.

Tip 2

Kimi excels at long-document processing, with a context window of up to 256K tokens.

Tip 3

¥15 free credits for new users — no credit card needed.

Tip 4

Kimi K2 (open-source, 1T parameters) is optimized for agent tasks.

Tip 5

File content extraction is supported through a dedicated API endpoint.

Tip 6

Kimi models are also available through the Alibaba Cloud Bailian platform.
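For long documents, one practical approach is to pick the cheapest Moonshot-v1 tier whose context window still fits the prompt plus the expected reply. This is a sketch under stated assumptions: the exact window sizes (8,192 / 32,768 / 131,072 tokens) are assumed readings of "8K / 32K / 128K", the lowercase model IDs follow the pattern in this guide's code example, and `pick_model` is a hypothetical helper.

```python
# (model ID, assumed context window in tokens), ordered from cheapest to most expensive.
CONTEXT_WINDOWS = [
    ("moonshot-v1-8k", 8_192),
    ("moonshot-v1-32k", 32_768),
    ("moonshot-v1-128k", 131_072),
]


def pick_model(prompt_tokens: int, max_output_tokens: int) -> str:
    """Return the smallest tier whose window holds the prompt plus the reply."""
    needed = prompt_tokens + max_output_tokens
    for model, window in CONTEXT_WINDOWS:
        if needed <= window:
            return model
    raise ValueError(f"{needed} tokens exceeds the largest assumed window")


print(pick_model(6_000, 1_000))
print(pick_model(8_000, 1_000))
print(pick_model(100_000, 4_000))
```

The token counts must come from your own estimate or tokenizer; the function only compares them against the assumed window sizes.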

China Access Solutions

Access Solution

Directly accessible in China. Moonshot AI is a Beijing-based company. Supports Chinese phone registration and domestic payment.

Code Example

Python

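from openai import OpenAI

# Point the standard OpenAI client at Moonshot's endpoint.
client = OpenAI(
    api_key="sk-xxx",  # placeholder: replace with your own Moonshot API key
    base_url="https://api.moonshot.cn/v1",
)

response = client.chat.completions.create(
    model="moonshot-v1-8k",
    messages=[
        # System prompt: "You are Kimi, an AI assistant provided by Moonshot AI."
        {"role": "system", "content": "你是Kimi，由月之暗面提供的人工智能助手。"},
        # User prompt: "Please help me analyze the core points of this article."
        {"role": "user", "content": "请帮我分析一下这篇文章的核心观点"},
    ],
    temperature=0.3,
)

# Print the assistant's reply text.
print(response.choices[0].message.content)

# --- cURL example (the user message means "Hello") ---
# curl https://api.moonshot.cn/v1/chat/completions \
#   -H "Authorization: Bearer sk-xxx" \
#   -H "Content-Type: application/json" \
#   -d '{"model":"moonshot-v1-8k","messages":[{"role":"user","content":"你好"}]}'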

Rate Limits

Tier	Limits
Free	3 RPM (requests per minute), 32K TPM (tokens per minute)
Paid	Higher limits, based on account level
All tiers	Concurrency of 1-5, depending on tier
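With a limit as low as 3 requests per minute on the free tier, a client-side throttle helps avoid rejected calls. The following is a minimal sliding-window sketch, not part of any Moonshot SDK: `RequestThrottle` is a hypothetical class, and the clock and sleep functions are injectable so the demo can run against a simulated clock.

```python
import time
from collections import deque


class RequestThrottle:
    """Sliding-window limiter: at most max_requests calls per period seconds."""

    def __init__(self, max_requests=3, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests inside the current window

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self._clock()
        # Drop timestamps that have fallen out of the window.
        while self._sent and now - self._sent[0] >= self.period:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.period - (now - self._sent[0])

    def acquire(self):
        """Block until a request slot is free, then record the request."""
        while (delay := self.wait_time()) > 0:
            self._sleep(delay)
        self._sent.append(self._clock())


# Demo with a simulated clock so the example finishes instantly.
fake_now = [0.0]
waits = []


def fake_sleep(seconds):
    waits.append(seconds)
    fake_now[0] += seconds


throttle = RequestThrottle(clock=lambda: fake_now[0], sleep=fake_sleep)
for _ in range(4):
    throttle.acquire()  # in real code, make the API call right after this
print(waits)  # only the fourth call had to wait for the window to reopen
```

In real use, construct `RequestThrottle()` with the defaults and call `acquire()` before each request. This handles only the RPM limit; the TPM and concurrency limits would need separate tracking.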

Recommended Use Cases

Long document analysis, agent development, Chinese content creation, file content extraction, research assistance

Last Updated: 2026-02-10

Related API Guides

OpenAI GPT-4o / GPT-4.1 / o3

OpenAI

OpenAI's flagship LLM family including GPT-4o for multimodal tasks, GPT-4.1 for long-context coding, and o3 for advanced reasoning. Industry-leading models with the largest developer ecosystem.

Anthropic Claude (Sonnet 4.5 / Opus 4.5)

Anthropic

Anthropic's Claude model family excels in nuanced reasoning, safety, and long-context tasks. Claude Sonnet 4.5 offers the best balance of cost and performance, while Opus 4.5 delivers frontier intelligence.

Google Gemini (2.5 Pro / 2.5 Flash)

Google

Google's Gemini models offer a generous free tier, 1M token context window, and strong multimodal capabilities. Gemini 2.5 Pro leads in reasoning, while Flash models provide cost-effective alternatives.
