Moonshot AI Kimi (K2 / K2.5)
LLM by Moonshot AI
Moonshot AI's Kimi models are known for their long-context capabilities (256K tokens) and an OpenAI-compatible API. Kimi K2 is a trillion-parameter open-source model optimized for agent tasks.
Registration & API Key Steps
Step 2
Register with your phone number (Chinese phone numbers are supported).
Step 3
New users automatically receive ¥15 in free credits.
Step 4
Navigate to "API Key Management" in the left sidebar.
Step 5
Click "New" to create an API key.
Step 6
The API is OpenAI-compatible — just change the base URL.
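Since the API is OpenAI-compatible, client setup comes down to two settings: the key created in the steps above and the base URL. The following is a minimal sketch, assuming the key is stored in an environment variable named `MOONSHOT_API_KEY` (a hypothetical name chosen for this example) and using the base URL from this guide's code example. The helper name `moonshot_client_kwargs` is also illustrative.

```python
import os

# Base URL taken from this guide's code example.
MOONSHOT_BASE_URL = "https://api.moonshot.cn/v1"


def moonshot_client_kwargs(env_var: str = "MOONSHOT_API_KEY") -> dict:
    """Build keyword arguments for an OpenAI-compatible client.

    The environment variable name is an assumption made for this sketch;
    use whatever name your deployment already relies on.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set {env_var} to the key created in the console")
    return {"api_key": api_key, "base_url": MOONSHOT_BASE_URL}
```

The returned dictionary can be passed straight to the SDK, for example `OpenAI(**moonshot_client_kwargs())`, which keeps the key out of source control.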
Pricing
| Tier | Price (input / output, per 1M tokens) | Features |
|---|---|---|
| Moonshot-v1-8K | ¥12 / ¥12 (~$1.65) | 8K context. |
| Moonshot-v1-32K | ¥24 / ¥24 (~$3.30) | 32K context. |
| Moonshot-v1-128K | ¥60 / ¥60 (~$8.25) | 128K context. |
| Kimi K2 | Competitive pricing (please verify) | 1T-parameter model. Open-source. Available via the API and partners. |
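To turn the table into a per-request figure, the arithmetic is simply total tokens times the per-million rate. The sketch below assumes the lowercase model identifiers `moonshot-v1-32k` and `moonshot-v1-128k` follow the same pattern as the `moonshot-v1-8k` identifier used in this guide's code example. The function name is illustrative, and prices should be checked against the current price list.

```python
# Prices in CNY per 1M tokens, copied from the table above
# (input and output are billed at the same rate for these tiers).
PRICE_PER_MILLION_CNY = {
    "moonshot-v1-8k": 12.0,
    "moonshot-v1-32k": 24.0,   # identifier assumed from the tier name
    "moonshot-v1-128k": 60.0,  # identifier assumed from the tier name
}


def estimate_cost_cny(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in CNY for one request."""
    rate = PRICE_PER_MILLION_CNY[model]
    return (input_tokens + output_tokens) * rate / 1_000_000


if __name__ == "__main__":
    # 6,000 prompt tokens plus 1,000 completion tokens on the 8K tier.
    print(f"{estimate_cost_cny('moonshot-v1-8k', 6_000, 1_000):.3f} CNY")
```

At these rates, the ¥15 signup credit covers roughly 1.25 million tokens on the 8K tier (15 / 12 × 1M).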
Application Tips
Tip 1
The API is fully OpenAI-compatible: change base_url to https://api.moonshot.cn/v1 and you're set.
Tip 2
Kimi excels at long-document processing with its 256K context window.
Tip 3
New users receive ¥15 in free credits; no credit card is needed.
Tip 4
Kimi K2 (open-source, 1T parameters) offers agent-level capabilities.
Tip 5
Supports file content extraction via a dedicated API endpoint.
Tip 6
Also available through the Alibaba Bailian platform.
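The file extraction mentioned in Tip 5 can be sketched with the file methods of the OpenAI Python SDK. This sketch rests on assumptions this guide does not state: the `purpose="file-extract"` value and the pattern of passing the extracted text back as a system message are recalled from Moonshot's documentation, so verify both against the current API reference. The helper names `extract_file_text` and `build_messages` are hypothetical.

```python
from pathlib import Path


def extract_file_text(client, path: str) -> str:
    """Upload a file and return its extracted text.

    `client` is expected to be an OpenAI-compatible client pointed at the
    Moonshot base URL. The purpose value "file-extract" is an assumption
    to verify against the current API reference.
    """
    uploaded = client.files.create(file=Path(path), purpose="file-extract")
    return client.files.content(file_id=uploaded.id).text


def build_messages(file_text: str, question: str) -> list[dict]:
    """Place the extracted text in a system message ahead of the question."""
    return [
        {"role": "system", "content": file_text},
        {"role": "user", "content": question},
    ]
```

The resulting message list can then be sent through `client.chat.completions.create`, with a model whose context window is large enough for the extracted text.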
China Access Solutions
Access Solution
Directly accessible in China. Moonshot AI is a Beijing-based company. Supports Chinese phone registration and domestic payment.
Code Example
from openai import OpenAI

# Create a client that points at Moonshot's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="sk-xxx",  # replace with your own API key
    base_url="https://api.moonshot.cn/v1",
)

# Send a chat completion request with a system prompt and a user message.
response = client.chat.completions.create(
    model="moonshot-v1-8k",
    messages=[
        {"role": "system", "content": "You are Kimi, an AI assistant provided by Moonshot AI."},
        {"role": "user", "content": "Please help me analyze the core points of this article."},
    ],
    temperature=0.3,
)
print(response.choices[0].message.content)

# --- cURL example ---
# curl https://api.moonshot.cn/v1/chat/completions \
#   -H "Authorization: Bearer sk-xxx" \
#   -H "Content-Type: application/json" \
#   -d '{"model":"moonshot-v1-8k","messages":[{"role":"user","content":"Hello"}]}'
Rate Limits
| Tier | Limits |
|---|---|
| Default | Free tier: 3 RPM, 32K TPM. Paid: higher limits based on account level. Concurrency: 1-5 depending on tier. |
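With a limit as low as 3 RPM, it is usually easier to pace requests on the client than to recover from rejected calls. The following is a minimal sketch of even pacing, assuming the free-tier figure from the table above. The class name `RequestPacer` is illustrative. It handles only the requests-per-minute limit, not the tokens-per-minute limit or concurrency.

```python
import time


class RequestPacer:
    """Space requests at least 60 / rpm seconds apart."""

    def __init__(self, rpm: int = 3, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm
        self._clock = clock          # injectable for testing
        self._sleep = sleep          # injectable for testing
        self._next_allowed = None    # no request sent yet

    def wait(self) -> float:
        """Block until the next request may start; return seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay
```

Calling `pacer.wait()` immediately before each API request keeps successive calls 20 seconds apart at 3 RPM. A paid account would pass its own `rpm` value.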