DeepSeek is a big hit

A single name has swept the technology world: DeepSeek. Like a newly risen star, it has quickly made waves around the globe. Within a short time, the DeepSeek app reached first place on the App Store free charts in both the United States and China, becoming the first AI assistant app to overtake OpenAI's ChatGPT there. Its rise also jolted the US stock market and wiped a large amount off the market value of industry giants such as Nvidia.

DeepSeek (Chinese name: 深度求索) is an artificial intelligence company focused on developing large language models and related technologies. It was formally founded in July 2023. Although the company is young, it has already made a strong mark in the large-model field.

Since its founding, DeepSeek has moved quickly, with a steady run of major releases in the large-model field. On October 28, 2023, it launched its first open-source code model, DeepSeek-Coder, which supports code generation, debugging, and data analysis tasks across multiple programming languages. The model is fully open source and free for commercial use, giving developers a powerful tool and putting DeepSeek on the map in the open-source community.

Just a month later, on November 29, DeepSeek released its general-purpose large model DeepSeek-LLM in 7B and 67B sizes, each with base and chat versions; the 67-billion-parameter model was reported to approach GPT-4 in performance. The release marked DeepSeek's first success with general-purpose large models and made the industry sit up and take notice of the young company. On December 15, 2023, DeepSeek followed up with the 3D generation model DreamCraft3D, extending its reach into multimodal work and showing that it could innovate in a different field.

In 2024, DeepSeek picked up the pace further. On January 11, it released DeepSeek-MoE, a mixture-of-experts (MoE) model with roughly 16 billion total parameters. An MoE architecture lowers inference cost by activating only a fraction of the parameters for each token, so the release combined a technical advance with effective cost control. On February 5, DeepSeek released DeepSeekMath, which scored 51.7% on the competition-level MATH benchmark, close to the level of Gemini-Ultra and GPT-4, demonstrating its strength in mathematics.

In May 2024, DeepSeek open-sourced its second-generation mixture-of-experts model, DeepSeek-V2, with 236 billion total parameters. Its API was priced at just 1 yuan per million input tokens and 2 yuan per million output tokens, roughly one percent of the price of GPT-4 Turbo. This set off a price war among China's large-model providers and attracted a large number of users with its cost-effectiveness. On June 17, DeepSeek released DeepSeek-Coder-V2, which achieved performance comparable to GPT-4 Turbo on code-specific tasks, once again demonstrating its technical strength in code.
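
To make the DeepSeek-V2 pricing above concrete, here is a minimal sketch that estimates a bill in yuan. It assumes the two quoted prices are the input rate (1 yuan per million tokens) and the output rate (2 yuan per million tokens); the function name `estimate_cost_yuan` and the token counts in the example are hypothetical and are not part of any DeepSeek API.

```python
from decimal import Decimal

# Prices quoted in this article for DeepSeek-V2 (assumed: input rate, then output rate).
INPUT_YUAN_PER_MILLION = Decimal("1")
OUTPUT_YUAN_PER_MILLION = Decimal("2")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_yuan(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in yuan for a given number of input and output tokens."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_YUAN_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_YUAN_PER_MILLION
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Hypothetical workload: 3 million input tokens and 500,000 output tokens.
    print(estimate_cost_yuan(3_000_000, 500_000))  # 3 yuan input + 1 yuan output
```

Under these assumptions, a workload of 3 million input tokens and 500,000 output tokens comes to 4 yuan.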

On August 16, 2024, DeepSeek released DeepSeek-Prover-V1.5 together with DeepSeek-Prover-V1, consolidating its position in automated mathematical proof. On September 5, DeepSeek announced that it had merged DeepSeek-Coder-V2 and DeepSeek-V2-Chat into a new upgraded model, DeepSeek-V2.5, combining the strengths of both to offer users a more capable service.

On November 20, 2024, DeepSeek released its first reasoning model, DeepSeek-R1-Lite, laying the foundation for its later reasoning models. On December 13, it released DeepSeek-VL2, a mixture-of-experts vision-language model for advanced multimodal understanding, extending its capabilities in the multimodal field. On December 26, DeepSeek open-sourced DeepSeek-V3, with 671 billion total parameters and a training cost of only 5.576 million US dollars. Its performance surpassed the open-source models Qwen2.5-72B and Llama 3.1-405B, and its combination of low cost and high performance shocked the industry once again.

On January 20, 2025, DeepSeek open-sourced a new generation of reasoning model, DeepSeek-R1, with performance comparable to the full release of OpenAI's o1. Just seven days later, on January 27, the DeepSeek assistant app overtook ChatGPT on Apple's App Store download chart in the United States and topped the list of free applications. On the same day, DeepSeek released the multimodal large model Janus-Pro, and the company became the focus of global attention.

From its founding to the release of this series of major models, DeepSeek has risen rapidly in roughly a year and a half. With its technical strength and capacity for innovation, it has become a force that cannot be ignored in the global AI field.

In language understanding and generation, DeepSeek has a distinctive advantage. Compared with the GPT series, DeepSeek performs better in Chinese, and the text it generates is more in line with Chinese idiom and logic. When writing an article about traditional Chinese culture, for example, DeepSeek can accurately grasp the relevant cultural connotations and historical background, use a rich vocabulary and appropriate expressions, and produce a substantive, logically coherent article, whereas the GPT series may show some semantic drift or an incomplete understanding of the cultural background. In multi-turn conversations, DeepSeek also maintains a high degree of coherence and responds sensibly in light of what was said earlier, making the dialogue more natural and fluent.

DeepSeek is also consistent in reasoning and logic. Faced with complex mathematical problems and logical reasoning tasks, it can analyze the problem quickly and apply sound reasoning to reach an accurate answer. When working through a complex mathematical proof, DeepSeek can explain its reasoning clearly and give rigorous proof steps. GPT-4, while strong at reasoning tasks, occasionally suffers from "hallucination", that is, generating inaccurate or fictional content.

DeepSeek's training and inference costs are relatively low, which is a major advantage in market competition. According to reported figures, OpenAI's training cost for GPT-4 was as high as $78 million and may have reached $100 million, while DeepSeek's large model was trained for less than $6 million, only about 5% to 10% of the cost of models with comparable performance. For inference, DeepSeek has charged as little as 1 yuan per million tokens, while GPT-4 Turbo is priced considerably higher. This cost advantage makes DeepSeek more competitive and lets it attract cost-sensitive enterprises and users.
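
The training-cost comparison above can be checked with simple arithmetic. The sketch below uses only figures quoted in this article (5.576 million US dollars for DeepSeek-V3, and $78 million to $100 million for GPT-4). These are reported estimates rather than audited numbers, and the function name `cost_ratio_percent` is hypothetical.

```python
def cost_ratio_percent(cost_usd: float, baseline_usd: float) -> float:
    """Return cost_usd expressed as a percentage of baseline_usd."""
    if baseline_usd <= 0:
        raise ValueError("baseline_usd must be positive")
    return 100.0 * cost_usd / baseline_usd


# Figures as quoted in this article (reported estimates).
DEEPSEEK_V3_TRAINING_USD = 5.576e6
GPT4_TRAINING_LOW_USD = 78e6
GPT4_TRAINING_HIGH_USD = 100e6

if __name__ == "__main__":
    # Dividing by the higher GPT-4 estimate gives the smaller percentage.
    low = cost_ratio_percent(DEEPSEEK_V3_TRAINING_USD, GPT4_TRAINING_HIGH_USD)
    high = cost_ratio_percent(DEEPSEEK_V3_TRAINING_USD, GPT4_TRAINING_LOW_USD)
    print(f"{low:.1f}% to {high:.1f}%")
```

With these inputs the ratio works out to about 5.6% to 7.1%, which falls inside the 5% to 10% range cited above.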

For enterprises, using DeepSeek can reduce the cost of developing and operating AI applications and improve returns. Some small and medium-sized enterprises choose DeepSeek models when building intelligent customer service systems, which meets their business needs while saving a great deal of money. Low costs also help AI technology spread, so that more people can enjoy the convenience it brings.

DeepSeek has adopted an open-source strategy, releasing its models and code under the MIT License, which has helped drive technical progress and community collaboration. Through open source, DeepSeek has attracted a large number of developers and researchers to its projects, forming an active open-source community.

Developers can customize and optimize DeepSeek's models to suit their needs and apply them to different domains and scenarios. In code generation, DeepSeek supports 338 programming languages, and developers can build more efficient code generation tools on its open-source models. Researchers can also carry out further research and innovation on this open-source foundation, driving the continued development of AI technology.

DeepSeek's open-source strategy encourages the sharing of artificial intelligence technology and cooperation around the world, and advances technical progress across the industry. As large-model technologies such as DeepSeek continue to develop and find applications, artificial intelligence will spread to more fields and take deeper root in them, providing a strong driving force for the digital transformation and innovative development of many industries.
