A new large language model, Kimi K2, has been making waves across the technology industry. It was developed by Chinese artificial intelligence company Moonshot AI, which is backed by digital behemoth Alibaba, and its release is being hailed as another DeepSeek moment. Like DeepSeek, Kimi K2 is open-weight, which means its trained parameters are freely available to be downloaded and adapted; also like DeepSeek, it has demonstrated impressive performance against established models.

How does Kimi K2 work?

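Kimi K2 uses a mixture-of-experts model architecture. This means it consists of a system of separate subnetworks ("experts") that each specialize in distinct parts of a given problem, with only a small subset of them activated for any given input. The benefit of such an approach is efficiency in terms of both speed and computation, because most of the model sits idle at each step.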
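To make the idea concrete, here is a toy sketch of mixture-of-experts routing in plain Python. It is not Moonshot's implementation: the expert count, the `top_k` value, the random router weights and the trivial "experts" (which just scale their input) are all assumptions chosen for illustration. What it does show is the core mechanism: a router scores every expert, only the highest-scoring few actually run, and their outputs are blended.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical stand-in for an expert subnetwork: scales its input."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, router_weights, experts, top_k=2):
    """Route one input vector through only the top_k highest-scoring experts."""
    # The router gives each expert a score (here, a simple dot product).
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    # Keep only the top_k experts; the others are never evaluated.
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    gates = softmax([scores[i] for i in top])
    # Blend the chosen experts' outputs using the gate weights.
    out = [0.0] * len(x)
    for gate, idx in zip(gates, top):
        y = experts[idx](x)
        out = [o + gate * v for o, v in zip(out, y)]
    return out, top


random.seed(0)
NUM_EXPERTS, DIM = 8, 4  # illustrative sizes, not Kimi K2's real configuration
experts = [make_expert(i + 1) for i in range(NUM_EXPERTS)]
router_weights = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_forward(token, router_weights, experts, top_k=2)
print(f"ran {len(chosen)} of {NUM_EXPERTS} experts: {chosen}")
```

In this sketch only two of the eight experts do any work for the input, which is where the savings in computation come from.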

Despite having 32 billion active parameters (and a trillion total parameters), Kimi K2 is relatively inexpensive to use. While Claude Opus 4 costs $15 per million input tokens and $75 per million output tokens, Kimi K2 costs a fraction of that at $0.15 per million input tokens and $2.50 per million output tokens.
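The gap is easier to appreciate with a worked example. The short script below uses only the prices and parameter counts quoted above; the workload (2 million input tokens and 500,000 output tokens) is a hypothetical figure picked for illustration, and the function name `request_cost` is our own.

```python
# Prices in US dollars per million tokens, as quoted in this article:
# (input price, output price)
PRICES = {
    "Claude Opus 4": (15.00, 75.00),
    "Kimi K2": (0.15, 2.50),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the cost in dollars for a given number of input and output tokens."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload, chosen only for illustration.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 500_000

opus = request_cost("Claude Opus 4", INPUT_TOKENS, OUTPUT_TOKENS)
kimi = request_cost("Kimi K2", INPUT_TOKENS, OUTPUT_TOKENS)

print(f"Claude Opus 4: ${opus:.2f}")      # $67.50
print(f"Kimi K2:       ${kimi:.2f}")      # $1.55
print(f"Ratio:         {opus / kimi:.1f}x")  # 43.5x

# Share of the model that is active per token: 32 billion of 1 trillion.
ACTIVE_FRACTION = 32e9 / 1e12
print(f"Active parameters: {ACTIVE_FRACTION:.1%}")  # 3.2%
```

On this workload the same job costs roughly 43 times less on Kimi K2, and only about 3% of the model's parameters are in use for any given token.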

Kimi K2 is also notably described as an agentic LLM. Moonshot AI says in its product copy that it has been “meticulously optimized for agentic tasks”. This sets it apart from the reasoning approach commonly used in other established models, where the model is constructed in such a way as to follow a sophisticated step-by-step approach to problem solving. By contrast, the agentic approach emphasized by Moonshot is intended to allow the model to ‘learn’ from external experiences, a point made by researchers David Silver and Richard Sutton in their paper The Era of Experience (which the Moonshot AI team specifically cites).

Is Kimi K2 really another DeepSeek?

The release of Kimi K2 has been described by some as another DeepSeek moment. It’s not hard to see why: once again we have an open model built by a Chinese company apparently outperforming established players. While this is true, Kimi K2 hasn’t had the same cultural and economic impact as DeepSeek’s release earlier this year.

Perhaps, though, that in itself is important. The fact that high-performance open-weight models coming out of China no longer surprise us is a signal that the AI market is shifting.

That’s not to say the likes of OpenAI and Anthropic won’t continue to dominate, but the technical innovations we’ve seen from both DeepSeek and Kimi K2 reflect the diversity of the AI field at the moment and will undoubtedly encourage more innovation and experimentation.

Kimi K2’s impressive coding performance

One area in which Kimi K2 has shown significant promise is coding. According to Moonshot, the model outperformed other established models on multiple benchmarks. Of course, the proof of just how effective it is for coding will be in real-world use, but sentiment so far has been positive among many software developers who have used it.

The fact that it’s possible to integrate Kimi K2 with Claude Code — Anthropic’s agentic coding tool — means we’re likely to hear much more about Kimi K2’s performance in the weeks to come.