DeepSeek, a Chinese AI startup, has unveiled its latest large language model, DeepSeek R1, positioning the model as a potential competitor to leading Western AI models. The model demonstrates strong capabilities in coding, mathematics, and reasoning tasks, performing at levels comparable to GPT-4 on certain benchmarks. DeepSeek R1 was trained on a dataset of 2 trillion tokens using an approach that combines supervised fine-tuning with direct preference optimization. It is particularly strong at coding, achieving a 94.4% pass rate on the HumanEval coding benchmark and surpassing many existing models. Other notable features include its ability to handle complex mathematical problems, support longer context windows, and generate more precise responses than earlier versions. The company has made the model’s base version openly available to researchers and developers, though the chat version remains proprietary. The release marks a significant step forward for China’s AI sector, which has been working to reduce its dependence on Western technology amid growing geopolitical tensions. It also suggests that the global AI landscape is becoming increasingly competitive, with Chinese companies making substantial progress toward models that can rival those from established Western tech giants.
Source: https://www.businessinsider.com/what-is-deepseek-r1-china-ai-2025-1