- Of late, talks, debates, and op-eds have increasingly centred on generative artificial intelligence (AI) and how profoundly the technology could transform the world within a few years. AI has not only changed how technology can be put to optimal use for modern-day needs but has also handed governments and leaderships a critical tool for pursuing their envisaged economic development. Little wonder that governments worldwide are spending billions of dollars and according top priority to the infrastructure that lets AI flourish and extend its reach into governance.

- However, as things stand, only a few countries, led by the USA, control the levers of the AI ecosystem, though the race to own AI infrastructure is picking up pace as more countries jump into the fray. Note that top-notch AI models cost billions to build but almost nothing to steal, so AI firms should be on their guard. Estimates vary, but Anthropic has reportedly spent over $8bn training its Claude model. Understandably, it does not want rivals cloning Claude for free, but that is what three Chinese firms, including last year's newsmaker DeepSeek, allegedly tried to do. Anthropic says they used Claude as a teacher for their student models. Two weeks back, Anthropic's bigger rival OpenAI had accused DeepSeek of improperly distilling its model.

- Indeed, DeepSeek’s market-shaking debut last year was clouded by similar allegations. But how is this possible when OpenAI and Anthropic run closed-source models? Chinese firms cannot simply copy them like MP3s. This is where distillation comes in, a decade-old idea that was rejected when first presented at a conference. It works like this: a rookie AI poses millions of questions to a leading model such as ChatGPT, seeking not just the final answers but also the reasoning steps used to arrive at them. This exposes the larger model’s thinking, which the new model imitates to deliver fairly good answers most of the time, using a fraction of the hardware and energy. How convenient. At a time when the top AI models are all American, China and others will use distillation as a shortcut.
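The teacher-student mechanics described above can be sketched in a few lines. This is a minimal, hypothetical illustration of classic soft-target distillation (all logits here are invented for the example, and none of this reflects what any named firm actually did): the student is trained to match the teacher's full probability distribution over answers, not just its single top answer.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened output.

    A temperature above 1 exposes the teacher's full ranking over
    answers (its "dark knowledge"), not merely its top pick -- that
    ranking is what the student learns to imitate.
    """
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

# Invented logits: a student that tracks the teacher's distribution is
# penalised far less than one that merely gets a plausible answer.
teacher = [4.0, 1.0, 0.2]
close_student = [3.8, 1.1, 0.1]
far_student = [0.1, 4.0, 1.0]
assert distillation_loss(teacher, close_student) < distillation_loss(teacher, far_student)
```

In practice a querying rival sees only the teacher's text output, not its logits, so the reasoning steps mentioned above stand in for the probability distribution; the training principle of imitating the teacher's behaviour is the same.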

- For one, consider the cost savings. DeepSeek last year claimed it had spent only $5.6mn to build its ChatGPT rival. Maybe it did, but around the same time, Berkeley researchers recreated OpenAI’s reasoning model in 19 hours for all of $450, and a Stanford and University of Washington team distilled its own reasoning model from Google’s Gemini in 26 minutes for $50. Gemini, of course, has also been a target of model distillers. Distillation is not always a bad thing; AI firms regularly distill their own models for speed and efficiency. But it is unfair when rivals use distillation to catch up, and it can also be dangerous: recall that Claude was reportedly used in America’s Venezuela operation to extract Maduro. AI defences should be built forthwith.






