Posted: 13 minute read

DeepSeek-R1 AI: A Primer

This is getting ridiculous. I finish an article about a significant breakthrough in AI, and only days later another arrives. So, what is the latest event that has upended the world of AI (and, by implication, digital marketing)? DeepSeek: China’s answer to ChatGPT, Grok, Claude and others. 

What is DeepSeek-R1? 

So, what is DeepSeek? In short, it’s China’s answer to OpenAI, xAI, and Anthropic. 

Formally named Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd., DeepSeek was founded as recently as July 2023 and remained under the radar, relatively unheard of outside of China, until January 2025, when the company launched its DeepSeek-R1 model. 

And, what a model it is!

DeepSeek-R1 can:

  • Solve complex problems. 

  • Perform logical inference. 

  • Make real-time decisions.

In short, DeepSeek-R1 is the real deal - able to hold its own against the latest models from OpenAI and others.

Where it particularly stands out, however, is that it is a reasoning-based model. Try it, and you’ll find that it won’t just spit out an answer à la ChatGPT 3.5. Instead, it will engage in logical inference, going ‘back and forth’ to figure out the best way to provide a comprehensive, accurate answer. 

Not only does it do that, but DeepSeek-R1 actually shows its working in real-time so you can see how it is making decisions and which sources of information it is basing those decisions on. 

I appreciate that things are getting a little wordy here, so let me show you an example of DeepSeek-R1 in action:

See how it thinks in real time?

Now, DeepSeek-R1 isn’t the first AI model capable of reasoning. OpenAI launched its own reasoning model, o1, in late 2024. 

So, if DeepSeek-R1 isn’t the first reasoning model out there, why the big fuss? Why did its appearance cause tech share prices to crash? 

Let’s take a look…

Note: want to give DeepSeek-R1 a go yourself? Then you can try it for free within the AI-driven search engine Perplexity. Just click the drop-down menu in the answer box and select ‘Reasoning with R1’.

DeepSeek: challenging American hegemony

Until DeepSeek’s whale-like visage surfaced in early 2025, the centre of global AI development was America - in particular, Silicon Valley. 

All the major AI companies were based in California - from nascent upstarts like Anthropic to the established giants like Google. 

America ruled the AI roost. Until January 2025. And then China flipped the metaphorical table on them.

So, what makes DeepSeek-R1 so special? Why the furore? 

Because, despite being as capable as ChatGPT, Gemini, Claude, and Grok, DeepSeek-R1: 

  • Has been vastly cheaper to train. DeepSeek claims it was able to develop R1 for only $6 million. Compare this to GPT-4, which cost OpenAI a frankly staggering $100 million to train. Bear in mind, however, that the $6 million figure has been widely questioned and is the subject of considerable scepticism!

  • Has been made freely available by DeepSeek to anyone - although not strictly open-source. In reality, R1 is open weight: you can run and build on the model yourself, but you don’t have access to the underlying training data. This contrasts starkly with the likes of OpenAI, which maintain strict proprietary control of their AI models.

  • Is computationally lightweight, meaning that once it is in the inference stage (i.e. everyday use) it is significantly cheaper to run than the likes of GPT-4 and Gemini 2.0 Flash.

As I’ve written previously, until the appearance of DeepSeek-R1, AI was incredibly energy intensive - so much so that AI leaders like OpenAI are on track to lose $5 billion a year.

Although it's early days, it looks like DeepSeek-R1 is going to slash these associated energy costs. Praful Saklani, the CEO of Pramata, an AI-software company, believes that R1 “may also cost 90% less” than current US models whilst being “95% as good as state-of-the-art”.

TL;DR: DeepSeek-R1 is seemingly cheaper to develop, cheaper to run, and is open weight meaning you can use it to your own commercial ends. That’s why Silicon Valley is so worried. 

The fallout

We’re only two months into 2025 and DeepSeek-R1 has given the incumbent AI giants a proverbial kicking. 

The impact has been brutal, with NVIDIA (the manufacturer of the world's best AI chipsets) suffering the largest single-day loss of market value in US stock market history, shedding $600 billion.

Ironically, R1’s success may have occurred precisely because the US Government had restricted China’s access to the latest NVIDIA chipsets. This resulted in DeepSeek having to ‘make do with less’ and train its model to be more efficient using older chipsets. 

It is reported that DeepSeek had to resort to training R1 on a few thousand H800 chips - older and less powerful versions of NVIDIA’s H100 GPUs (which cost $40k apiece!). 

The story of DeepSeek-R1 will surely come to be a case study in the concept of ‘innovation by deprivation’. 

Experts within experts: a peek under the hood of DeepSeek-R1

How has DeepSeek managed to achieve the seemingly impossible with its R1 model? 

Prior to this, the working assumption amongst AI companies was that further progress in AI would require ever larger corpora of training data and ever greater volumes of compute (i.e. more and bigger data centres). 

From what’s been uncovered so far (kudos to Hugging Face for rapidly figuring out the fundamental framework of the R1 model) it appears that R1 is able to run so much more efficiently and cheaply because it uses a Mixture of Experts architecture. 

Without going into too much technical detail, this architecture uses myriad smaller models (called ‘experts’) which are only active when needed. This allows the model to run more efficiently, calling upon specialist models as and when required (as opposed to running one giant model that tries to do it all, as per GPT-4). 

For some additional context, researchers estimate that although R1 contains 671 billion parameters across its total ‘experts’, only 37 billion are activated in any single ‘forward pass’ from an initial input. This dramatically reduces the amount of compute required to process an input.
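To make the idea concrete, here’s a minimal toy sketch in Python of how top-k expert routing works. Every name and number here (eight experts, two active, the router itself) is purely illustrative - it is not DeepSeek’s actual architecture - but it shows the core trick: a router scores the experts for each input, and only the best few do any work.

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS = 8      # total experts in the layer (illustrative)
TOP_K = 2          # experts activated per input ('forward pass')
D = 16             # hidden dimension

# Each 'expert' here is just a small feed-forward transform (one matrix).
experts = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(N_EXPERTS)]
router = rng.standard_normal((D, N_EXPERTS)) / np.sqrt(D)

def moe_forward(x):
    """Route input x to its top-k experts and mix their outputs."""
    scores = x @ router                    # one score per expert
    top = np.argsort(scores)[-TOP_K:]      # indices of the k highest-scoring experts
    # Softmax over the selected scores only
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()
    # Only TOP_K of N_EXPERTS experts do any work for this input
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top)), top

x = rng.standard_normal(D)
out, active = moe_forward(x)
print(f"active experts: {sorted(active.tolist())} of {N_EXPERTS}")
print(f"fraction of expert compute used: {TOP_K / N_EXPERTS:.0%}")
```

In this toy setup only a quarter of the experts run per input - the same principle by which R1 activates roughly 37 billion of its 671 billion parameters at a time.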

As I pointed out in my article on domain-specific LLMs, this approach seems to be the way forward for AI development.

Much like humans, it makes more sense to embrace Ricardian specialisation via highly tailored domain-specific models that can work together, rather than one omniscient ‘all-knowing’ model that requires vast resources to run. 

R1 also appears to stand out from other AI models in its novel use of reinforcement learning in conjunction with supervised fine-tuning. This allows R1 to learn how to verify its own answers and follow chain-of-thought reasoning - breaking down complex problems into smaller steps - much like a human would.
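A rough sense of how this reinforcement learning can work: the model is scored with simple, rule-based rewards - one for wrapping its reasoning in a recognisable chain-of-thought format, one for getting a verifiably correct final answer. The sketch below is a heavily simplified illustration of that idea; the tag names, scoring values, and answer-checking logic are my own invented stand-ins, not DeepSeek’s actual reward functions.

```python
import re

def format_reward(completion: str) -> float:
    """Reward the chain-of-thought format: reasoning inside <think>...</think> tags."""
    return 1.0 if re.search(r"<think>.*?</think>", completion, re.DOTALL) else 0.0

def accuracy_reward(completion: str, expected: str) -> float:
    """Reward a verifiably correct final answer (e.g. a maths problem)."""
    # Whatever follows the reasoning block is treated as the model's answer
    answer = re.sub(r"<think>.*?</think>", "", completion, flags=re.DOTALL).strip()
    return 1.0 if answer == expected else 0.0

def total_reward(completion: str, expected: str) -> float:
    return format_reward(completion) + accuracy_reward(completion, expected)

good = "<think>2 + 2: pair them up, that's 4.</think>4"
bad = "The answer is 5"
print(total_reward(good, "4"))  # rewarded for both format and correctness
print(total_reward(bad, "4"))   # rewarded for neither
```

Because rewards like these can be computed automatically, the model can practise on huge numbers of problems and learn to produce step-by-step reasoning that leads to answers it can verify.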

Refreshingly, DeepSeek explains exactly how this reinforcement learning works in a detailed 22-page paper. This level of transparency is something we’ve not seen yet from the other major AI players.  

What’s next? 

Just last week DeepSeek announced in a social media post that it will be making the underlying code behind R1 even more accessible, releasing five open source repositories.

Expect this to rapidly accelerate AI development. 

“Why pay expensive fees for proprietary AI models when there is an open source alternative that’s just as good?”

It’s that question that’s likely to be occupying the minds of Silicon Valley executives over the coming months. 

This is an especially pertinent question given the forthcoming capex investments from the big tech majors this year. Consider the following: 

  • Alphabet (Google’s parent company) has said it will be laying out $75 billion in capital expenditures in 2025. 

  • Microsoft is spending approximately $80 billion on AI-linked capital projects this year.

  • Meta Platforms is expecting to shell out $65 billion on capital projects. 

These are huge numbers, and it’ll be interesting to see how the tech majors recoup their investments given the collapsing cost of running AI models. If I were to hazard a guess, they will easily remake their money thanks to Jevons paradox: as a technology becomes more efficient, aggregate demand for it increases. 

Putting these speculations aside, it’s clear that we’re now entering a hugely exciting era for AI development. The emergence of open source/weight models like DeepSeek-R1 is especially exciting - effectively democratising AI and taking it out of the hands of incumbent giants like OpenAI. 

And, it’s already happening. 

Databricks - a data business that helps companies customise AI models - has revealed that of their 12,000 customers, 1,000 are already using DeepSeek-R1 to power their AI models. 

Things are about to get really exciting - and the likes of OpenAI are, ironically, going to have to become more open.

Deep reasoning AI and e-commerce

I appreciate that this article has gone into considerable detail about a fairly niche subject area of AI. 

You’re perhaps sitting there wondering: what does this mean for my e-commerce business? (And quite rightly!) 

The answer very much comes down to the second-order impacts of DeepSeek-R1 - namely, that it’s going to become considerably cheaper to deploy AI into a wide variety of online settings. 

It’s going to become easier to build highly intelligent chatbots that know your products intimately - allowing visitors to really understand which products are best for them. This is likely to be very impactful in the world of B2B, where products, services, and solutions can be complex and where purchase journeys can involve a considerable amount of time and research.

Reasoning AI models like R1 are also going to play a huge role behind the scenes, transforming inventory management, ERP systems, the management of large swathes of data and more. 

It's still early days, but we're already exploring the potential applications of reasoning AI models in e-commerce here at Velstar.

Velstar: thinking about the future

We’re thinking about the future of AI and how it will impact the Internet and e-commerce websites. 

Can your current agency say the same? 

Don’t get left behind - speak to the Velstar team today about how we can use AI to boost your brand. 

Written by:

Matt Donnelly
Senior Content Manager
Matt has been a content writer for over 10 years and is a passionate advocate of the written word. He specialises in creating technical content on a variety of scientific topics, and enjoys communicating complex ideas with clarity. Matt was previously a staff writer for an engineering magazine and keenly follows contemporary environmental issues.