How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf gabriellarson/Kimi-Dev-72B-GGUF:
# Run inference directly in the terminal:
llama-cli -hf gabriellarson/Kimi-Dev-72B-GGUF:
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf gabriellarson/Kimi-Dev-72B-GGUF:
# Run inference directly in the terminal:
llama-cli -hf gabriellarson/Kimi-Dev-72B-GGUF:
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf gabriellarson/Kimi-Dev-72B-GGUF:
# Run inference directly in the terminal:
./llama-cli -hf gabriellarson/Kimi-Dev-72B-GGUF:
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf gabriellarson/Kimi-Dev-72B-GGUF:
# Run inference directly in the terminal:
./build/bin/llama-cli -hf gabriellarson/Kimi-Dev-72B-GGUF:
Use Docker
docker model run hf.co/gabriellarson/Kimi-Dev-72B-GGUF:
Quick Links


We introduce Kimi-Dev-72B, our new open-source coding LLM for software engineering tasks. Kimi-Dev-72B achieves a new state-of-the-art on SWE-bench Verified among open-source models.

  • Kimi-Dev-72B achieves 60.4% performance on SWE-bench Verified. It surpasses the runner-up, setting a new state-of-the-art result among open-source models.

  • Kimi-Dev-72B is optimized via large-scale reinforcement learning. It autonomously patches real repositories in Docker and gains rewards only when the entire test suite passes. This ensures correct and robust solutions, aligning with real-world development standards.

  • Kimi-Dev-72B is available for download and deployment on Hugging Face and GitHub. We welcome developers and researchers to explore its capabilities and contribute to development.

Kimi Logo

Performance of Open-source Models on SWE-bench Verified.

Citation

@misc{kimi_dev_72b_2025,
  title        = {Introducing Kimi-Dev: A Strong and Open-source Coding LLM for Issue Resolution},
  author       = {{Kimi-Dev Team}},
  year         = {2025},
  month        = {June},
  url          = {\url{https://www.moonshot.cn/Kimi-Dev}}
}
Downloads last month
649
GGUF
Model size
73B params
Architecture
qwen2
Hardware compatibility
Log In to add your hardware

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for gabriellarson/Kimi-Dev-72B-GGUF

Base model

Qwen/Qwen2.5-72B
Quantized
(24)
this model