
Kimi K2: What’s all the fuss and what’s it like to use?

A new large language model, Kimi K2, has been making waves across the technology industry. Developed by Chinese artificial intelligence company Moonshot AI, which is backed by digital behemoth Alibaba, the model is being hailed as another DeepSeek moment. Like DeepSeek, Kimi K2 is open-weight, which means its trained parameters are freely available to download and adapt; also like DeepSeek, it has demonstrated impressive performance against established models.


How does Kimi K2 work?


Kimi K2 uses a mixture-of-experts (MoE) architecture. The model is composed of many specialized subnetworks ("experts"), and a lightweight routing network activates only a small subset of them for each token. Because just a fraction of the total parameters do any work at a given step, the approach is efficient in terms of both speed and computation.
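To make the idea concrete, here's a minimal, illustrative sketch of top-k expert routing in Python. This is a toy with random weights, not Moonshot AI's implementation; the dimensions, the tanh experts and the routing details are arbitrary choices for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

D, N_EXPERTS, TOP_K = 16, 8, 2   # hidden size, number of experts, experts used per token

# Each "expert" is a tiny feed-forward layer; the gate is a linear scorer.
experts = [rng.standard_normal((D, D)) * 0.1 for _ in range(N_EXPERTS)]
gate_w = rng.standard_normal((D, N_EXPERTS)) * 0.1

def moe_forward(x):
    """Route a single token vector through its top-k experts."""
    scores = x @ gate_w                       # one routing score per expert
    top = np.argsort(scores)[-TOP_K:]         # indices of the k highest-scoring experts
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                  # softmax over the selected experts only
    # Only k of the N experts run for this token; the rest cost nothing this step.
    return sum(w * np.tanh(x @ experts[i]) for w, i in zip(weights, top))

print(moe_forward(rng.standard_normal(D)).shape)   # -> (16,)
```

The key property is in the last line of the function: however many experts exist in total, each token only ever pays for the few the router selects.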


Although it has a trillion total parameters, only 32 billion are active for any given token, which helps keep it relatively inexpensive to use. While Claude Opus 4 costs $15 per million input tokens and $75 per million output tokens, Kimi costs a fraction of that: $0.15 per million input tokens and $2.50 per million output tokens.
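To put those prices in perspective, here's a quick back-of-the-envelope comparison using the figures quoted above (the one-million-in, one-million-out workload is hypothetical):

```python
# Cost of a hypothetical job, using the per-million-token prices quoted above (USD).
def job_cost(price_in, price_out, m_tokens_in=1.0, m_tokens_out=1.0):
    return price_in * m_tokens_in + price_out * m_tokens_out

opus = job_cost(15.00, 75.00)   # Claude Opus 4 -> $90.00
kimi = job_cost(0.15, 2.50)     # Kimi K2       -> $2.65
print(f"Opus 4: ${opus:.2f}  Kimi K2: ${kimi:.2f}  (~{opus / kimi:.0f}x cheaper)")
```

On that workload the gap is roughly 34x, which goes a long way to explaining the interest from developers.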


Kimi is also notably described as an agentic LLM. Moonshot AI says in its product copy that it has been “meticulously optimized for agentic tasks”. This sets it apart from the reasoning approach commonly used in other established models, where the model is constructed to follow a sophisticated step-by-step approach to problem solving. By contrast, the agentic approach emphasized by Moonshot is intended to allow the model to ‘learn’ from external experiences, a point made by researchers David Silver and Richard Sutton in their paper The Era of Experience (which the Moonshot AI team specifically cite).
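As a rough illustration of the difference, an agentic loop interleaves model calls with real tool executions and feeds the observed results back into the conversation. The sketch below is a generic pattern, not Moonshot's implementation; `call_model` and the tools are hypothetical stand-ins.

```python
# Generic agent loop (illustrative): the model acts, observes external results
# and incorporates them, rather than reasoning purely within its own context.
def run_agent(task, tools, call_model, max_steps=5):
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(history)                    # model picks a tool or answers
        history.append({"role": "assistant", "content": str(reply)})
        if reply.get("tool") in tools:                 # act: execute the chosen tool
            result = tools[reply["tool"]](reply.get("args", {}))
            history.append({"role": "tool", "content": str(result)})   # observe
        else:
            return reply.get("content", "")            # final answer; stop acting
    return "step limit reached"
```

The "experience" in this framing is the stream of tool results the model observes, rather than chains of thought it generates internally.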


Is Kimi K2 really another DeepSeek?


The release of Kimi K2 has been described by some as another DeepSeek moment. It's not hard to see why: once again, we have an open model built by a Chinese company apparently outperforming established players. Even so, it hasn't had the same cultural and economic impact as DeepSeek's release earlier this year.


Perhaps, though, that in itself is important: the fact that high-performance open-weight models from China no longer surprise us is a signal that the AI market is shifting.


That’s not to say the likes of OpenAI and Anthropic won’t continue to dominate, but the technical innovations we’ve seen from both DeepSeek and Kimi represent the diversity of the AI field at the moment and will undoubtedly encourage more innovation and experimentation.


Kimi K2’s impressive coding performance


Coding is one of the areas in which Kimi K2 has shown significant promise. According to Moonshot, the model outperformed other established models on multiple benchmarks. Of course, the proof of just how effective it is for coding will come from real-world use, but sentiment so far has been positive among many of the software developers who have tried it.


The fact that it’s possible to integrate Kimi K2 with Claude Code — Anthropic’s agentic coding tool — means we’re likely to hear much more about Kimi K2’s performance in the weeks to come.
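In practice, that integration typically works by pointing Claude Code at Moonshot's Anthropic-compatible API endpoint via environment variables. Here's a minimal sketch, assuming Moonshot's published base URL and a Moonshot key stored in MOONSHOT_API_KEY; check Moonshot's current documentation, as the exact URL and the auth variable Claude Code expects (ANTHROPIC_API_KEY here; some guides use ANTHROPIC_AUTH_TOKEN) may differ.

```python
# Launch Claude Code against Moonshot's Anthropic-compatible endpoint.
# The base URL below follows Moonshot's published guidance at the time of
# writing; verify it (and the auth variable) against current docs before use.
import os
import subprocess

env = dict(
    os.environ,
    ANTHROPIC_BASE_URL="https://api.moonshot.ai/anthropic",  # assumed endpoint
    ANTHROPIC_API_KEY=os.environ["MOONSHOT_API_KEY"],        # your Moonshot key
)
subprocess.run(["claude"], env=env)  # start Claude Code with Kimi K2 behind it
```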

To get a first-hand perspective on Kimi K2, Thoughtworks software engineer Zhenjia Zhou kindly answered a few questions. Since the model launched, they've been experimenting with it in personal projects.

Richard Gall: Why did you start using Kimi K2? What drew you to it?


Zhenjia Zhou: I used Kimi K2 the day it launched! I've found Claude Sonnet 4 too expensive, especially for personal projects. So I found a way to use Kimi K2 with Claude Code, mainly for backend Python code.


RG: Are there any notable differences to other models?


ZZ: When I'm using Cursor I typically use OpenAI o1. Compared with o1, Kimi is more intelligent when it comes to tool calling. For instance, I like to use the Sequential Thinking MCP server, which o1 doesn't really like to call. Most of the time it only calls it if I specifically ask it to in my prompt; in other words, I have to write 'Please use sequential thinking to solve this problem.' Claude Sonnet 3.7 has similar issues.


RG: What do you like about it?


ZZ: It's cheap and open source! Claude Sonnet 4 is quite expensive. For example, one task with Sonnet 4 can cost me between $10 and $20. But with Kimi K2, I can do about ten similar tasks for just 50 RMB (around $7 USD). And as it's open source, I could potentially deploy it myself, which would make it even cheaper.


This means I can be much more efficient and productive. I can work on tasks in parallel — if those ten tasks I’m working on don’t conflict with each other, I can simply open up ten separate instances of Claude Code and use Kimi K2 to work on each task. 
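That workflow can be scripted. Here's a hypothetical sketch: the repository paths and prompts are made up, and it assumes the Claude Code CLI is installed as `claude` with its non-interactive -p flag, configured to use Kimi K2 as described earlier.

```python
# Hypothetical: run non-conflicting tasks in parallel, one Claude Code
# process per repository, each driven by Kimi K2.
import os
import subprocess

tasks = {
    "~/work/api-service": "add input validation to the /users endpoint",
    "~/work/docs-site": "update the quickstart guide for v2",
}
procs = [
    subprocess.Popen(["claude", "-p", prompt], cwd=os.path.expanduser(repo))
    for repo, prompt in tasks.items()
]
for p in procs:
    p.wait()   # block until every parallel task has finished
```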


I imagined working like that when Claude Code was first released, but if you’re using Claude Sonnet 4 you’d be spending quite a lot of money.


RG: Are there any challenges or things you don’t like about using Kimi K2? 


ZZ: I've found Kimi K2 quite slow; it currently takes longer to generate a response than Sonnet 4. I also think its context window is pretty small compared to Sonnet 4's.


RG: When do you think you might use it over other, more established models?


ZZ: Based on using it so far I don’t really think Claude Code is the best tool for Kimi K2 — even if it’s cheaper to use Kimi. Claude Code is, after all, designed for Claude Sonnet 4 — when I'm using Kimi K2 with Claude Code, it’s like there’s a different soul in Claude Code’s body!


That said, if Kimi K2 eventually gets a better interface than Claude Code then maybe I’d start using it over Claude.


RG: I’ve seen this being called another DeepSeek moment — is it really? 


ZZ: I think it shows that open source language models can play an important part in the AI landscape — not just in terms of cost but also in terms of performance.


RG: What do you like about open models?


ZZ: I think there are two appealing things about open models. One is that companies that really care about privacy can deploy the model themselves. Another is that openness means more providers. Currently, for example, Claude Sonnet 4 is only available through AWS and Anthropic's own platform, which means they control the API price.


With open-source models, there will inevitably be more providers in the marketplace, and perhaps even price wars, which could result in cheaper API access.


Thanks to Zhenjia for taking the time to talk. It's certainly very early days when it comes to Kimi K2's adoption — we'll be watching it closely and maybe even running our own experiments with it in the months to come.

Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.
