The DeepSeek Effect: AI ecosystem on fire
Chinese companies are hosting and deploying DS models, offering cyber protection, and pursuing collaboration, giving a huge boost to the entire Chinese AI sector
This is the latest in a series of posts about the impact of Chinese AI startup darling DeepSeek, which has morphed into what I have dubbed the DeepSeek Effect (#DSE). The DSE is now a global phenomenon. In China in particular, the capability of the small but plucky firm’s V3 and R1 models, and expectations of improved performance, have pushed other AI firms to release new and better models. At the same time, the global attention generated by DeepSeek has galvanized the entire AI sector in China both to support the firm and its models and to unite around an effort to develop a fully indigenous AI stack—a herculean task given US technology controls and US-China technology competition broadly speaking. Game on… Incidentally, there is still no sign of those 50,000 H100s that some have alleged DeepSeek used to train V3 and R1. It is looking more and more like a disinformation campaign by some well-paid cheerleaders for US export controls.1
How deep is the DeepSeek Effect? The good, the better, and the max
Let’s start by tackling the shallow and work our way to the deeper impacts of DeepSeek on the Chinese AI ecosystem, before looking outside China. First, it is important to note again that DeepSeek is not, and has never been, attempting to become an OpenAI, Anthropic, or Meta. CEO Liang Wenfeng has kept a low profile, focused on the technology, and is seeking to develop something resembling artificial general intelligence (AGI). He has kept the company small and recruited an elite group of young engineers, giving them leeway to focus on the technology and not on business development. He has deep pockets, and does not need, or likely want, to bring in other investors—Alibaba has denied reports that it was investing $1 billion in the firm. There has been talk about a valuation of DeepSeek, but this seems very premature and may not be a priority for the firm; Liang is likely considering some type of investment round this year given the costs of scaling compute, but this will be done carefully. He is willing to have his company work with technology leaders such as Huawei, but is not looking to significantly expand the firm’s size or footprint. DeepSeek has begun to hire staff to focus on safety and security issues related to the firm’s models, which have not done well in independent tests of guardrails, but it retains a flat management structure and relaxed working conditions compared to the typical Chinese tech startup. Hence, DeepSeek is in some ways a unique player in the China AI ecosystem. This is important to understand as we dive deeper into the DeepSeek Effect.
In addition, the context of the DeepSeek Effect is important, as China’s AI ecosystem and its drivers are very different from those of Silicon Valley and other LLM development and deployment ecosystems outside China. While the recent high-level meeting (see below) saw President Xi Jinping praise DeepSeek, in general, cutting edge research in AI is viewed in terms of companies’ ability to bring commercially valuable deployments to market quickly to boost economic growth and leverage gains for strategic benefit. Chinese AI firms have struggled to come up with ways to monetize AI model deployments, and the emergence of DeepSeek and the growing conversion of China’s AI ecosystem into an open source/weight sector mean that the application side is critical: quickly leveraging an advanced model that is deployed across multiple platforms and drives down costs is now seen as the goal. This accounts for the rapid deployment of DeepSeek’s R1 model across the commercial space in China, which was not done via a top-down directive but organically, as companies recognized the benefits of leveraging the firm’s model. This was the “psyop,” if there was one here: the timing of the release of R1 came as Chinese companies experimenting with deployment of AI models for business operations were primed for the release of such a capable open source/weight model that dramatically lowered deployment costs.
The expanding impact of DeepSeek in China breaks down as follows:
The short-term, superficial, but still important:
Boosting the price of Chinese AI and tech stocks and the interest of foreign investors. Over the past week, Chinese tech stocks have suddenly become more “investable”. The stocks of most major tech players in China listed on Western, Hong Kong, or mainland Chinese exchanges saw a significant boost in the wake of the spotlight on DeepSeek and domestic innovation across the AI stack. It is not clear how this plays out over the medium term. The impact as of this week was something like a $1.3 trillion increase in Chinese tech stocks, on top of the $1.2 trillion drop in US tech stocks on 27 January caused by concern over the DeepSeek model releases.
Forcing the government to show greater interest in the sector and highlight support going forward. President Xi Jinping invited DeepSeek, Huawei, Tencent, Xiaomi, Alibaba, and other tech entrepreneurs, including Jack Ma, along with the other Hangzhou “Six Little Dragons,” to a meeting this week in the runup to the National People’s Congress and other important meetings in May. Clearly Xi wants to show support for a company and city which have produced such important tech leaders. The impact has been dramatic already, with a growing number of local governments using DeepSeek models for public-facing online applications. Message received.
“Deeply study and master the use of AI models such as DeepSeek, and make full use of AI to support decision-making, analysis and problem-solving”—Zhengzhou Party secretary An Wei
The Hangzhou Six Little Dragons include DeepSeek and humanoid robot leader Unitree, along with Deep Robotics, video game studio Game Science, the firm behind last year’s hit title Black Myth: Wukong, brain-machine interface innovator BrainCo, and 3D interior design software developer Manycore.
The medium-term but still unclear effects…inference, inference, inference
Smartphone deployment. DeepSeek models are being widely deployed on popular Chinese social media apps. Tencent has been testing the integration of DeepSeek’s R1 model with the search function within the WeChat app, putting the model essentially on more than a billion smartphones all over China.
Brokerage firm deployment. In quick succession, as many as 20 Chinese brokers and fund managers, including Sinolink Securities, CICC Wealth Management, and China Universal Asset Management, have begun to integrate DeepSeek models into their business operations, impacting research, risk management, investment decisions, and client interactions.
Electric vehicle deployment. Over the past several weeks, at least a dozen automakers, from EV leader BYD to Stellantis-backed start-up Leapmotor, along with Geely, Great Wall Motor, Chery Automobile, and SAIC Motor have announced plans to produce EVs that will use DeepSeek AI models for some platforms. These deployments will not be for ADAS platforms for autonomous driving but for in-cabin systems including entertainment.2
Cloud services deployments. These are growing rapidly, with major hyperscalers such as Alibaba, Baidu, Tencent, China Mobile, China Telecom, and other cloud providers all offering access to DeepSeek’s API in the cloud. This of course is also happening outside of China, despite efforts by some countries to ban the DeepSeek app. Thus far, there have been no serious efforts to try and restrict access to the API if it is being hosted on local cloud providers. Alibaba in particular has been bullish on hosting DeepSeek models, with Alibaba Cloud and the firm’s AI unit providing access to six new DeepSeek models through the firm’s large language model (LLM) service platform Bailian.
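Most of these cloud deployments expose DeepSeek models through OpenAI-compatible chat-completions endpoints. As a rough illustration, the sketch below builds the kind of JSON request body such an endpoint typically accepts; the endpoint URL and model identifier are assumptions for illustration, and the exact names vary by provider, so check your host’s documentation before use.

```python
import json

# Hypothetical OpenAI-compatible endpoint for a cloud-hosted DeepSeek model.
# Both the URL and the model name below are illustrative assumptions.
BASE_URL = "https://example-cloud-provider.com/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "deepseek-r1") -> str:
    """Return the JSON body for a single-turn chat completion request."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.7,
        "stream": False,  # set True for token-by-token streaming, if supported
    }
    return json.dumps(payload)

# The body would be POSTed to BASE_URL with the provider's API key attached.
body = build_chat_request("Summarize today's market news in one sentence.")
print(body)
```

Because the request shape mirrors the OpenAI API, applications already written against that interface can often switch to a hosted DeepSeek model by changing only the base URL and model name.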
Telecom service provider deployments. In early February, the Ministry of Industry and Information Technology (MIIT) announced that the big three telecommunications carriers—China Mobile, China Telecom, and China Unicom—have “fully accessed the DeepSeek open source large model to achieve application in multiple scenarios and multiple products, and provide exclusive computing power solutions and supporting environments for the popular DeepSeek-R1 model to help development of the performance of domestic large models.” Giant China Mobile is already offering access to V3 and R1 to business customers, allowing them to deploy APIs and create new AI agent applications on its platform.
Broader integration with different online platforms. Not to be seen as left out of the DeepSeek Effect, other Chinese AI companies are announcing the use of DeepSeek models to augment or improve their products. AI and robotaxi major Baidu has indicated it is combining features of DeepSeek with its Ernie Bot model for search. Tencent has also integrated DeepSeek alongside the firm’s Hunyuan model in WeChat, QQ Browser, its AI assistant, QQ Music, doc assistant, ima copilot, and other applications. It is not clear yet whether ByteDance and Alibaba will seek to integrate DeepSeek into their platforms, or whether these AI leaders will continue to use their own models, Doubao and Qwen, as their primary models and integrate DeepSeek for other apps. For now, Chinese users appear to favor DeepSeek over other models when offered both on the same platform.
The long-term impact remains uncertain, but will affect development of China AI Stack
Collaboration with Huawei. This may in the end be the most important outcome of the DeepSeek Effect, but it is also the most complex and uncertain. The goal of collaboration could be nothing less than developing a full-fledged alternative to the Nvidia/CUDA/TensorFlow/PyTorch AI stack, but success here is not guaranteed, though the incentives will only become greater. Huawei now occupies a unique position athwart both the semiconductor and AI industries in China, putting it in a position to drive development across the AI stack. As I have noted, this effort includes areas critical to the future access of Chinese AI firms to cutting-edge hardware. Huawei is set to release the Ascend 910C soon, and claims that it can achieve 60 percent of the performance of an H100. The number of Ascend 910C chips that Huawei will be able to produce (from both SMIC and, via a third party, TSMC) remains unclear, but likely numbers in the low millions. For more see here.
Huawei has also been quick to integrate DeepSeek into its inference solutions all across the firm’s hardware and software AI stack. DeepSeek has now been optimized for inference to run on Huawei’s FusionCube A3000 series, a distributed computing solution designed for AI workloads, which is offered in three tiers: Ultra, Pro, and Lite. All versions integrate with Huawei's DCS software stack for model deployment and management, though storage and networking configurations vary by tier. The platform provides comprehensive support for DeepSeek model deployments with varied inference speeds based on model size.3 Other major players, such as SiliconFlow, are working with Huawei’s specialized Ascend (Shengteng) AI cloud services and are also deeply involved in offering DeepSeek models on their platforms. SiliconFlow’s website interestingly claims that it is “accelerating AGI to benefit humanity.”
Rough hardware equivalences (these comparisons are problematic as they do not account for system effects, internetworking, etc., and they will become increasingly difficult in the age of Nvidia Blackwell systems):
Nvidia A100 → Huawei Ascend 910B
Nvidia H100 → Huawei Ascend 910C
Nvidia B200 → Huawei Ascend 9XX?
Attracting AI talent to stay in China or return. Any assessment of the impact here is preliminary, but this could be the most important lasting impact of the DeepSeek Effect. DeepSeek has recruited young software and hardware engineering talent directly from universities in China, with only a few of the DeepSeek staff having spent time abroad working for big US AI players. This reflects a broader, longer-term trend in which the most promising students in STEM fields and AI appear to be choosing to stay or return to China and work for leading technology firms, spurred by a number of factors, including the hostile environment in the US created by such government programs as the China Initiative. Hence it is not only export controls that are contributing to a process creating new incentives for innovation, but also US immigration policies and the increasing sense among top Chinese STEM students and researchers that the opportunities in China may be more attractive and offer long-term benefits. However, DeepSeek appears to be somewhat unique here in offering much higher salaries, and in focusing on small numbers of very capable engineers without much experience in the commercial world and on those who have scored highly in competitions. CEO Liang also has deliberately adopted a work culture that does not require employees to work long hours, unlike at most Chinese tech companies. “We are working on the most difficult problems, so we are attractive to them,” he has said.
Collaboration with cybersecurity firms. DeepSeek has developed a close relationship with leading Chinese cybersecurity firm Qihoo 360, which is using DeepSeek’s models for its QAX model to address issues such as threat assessment, security operations, penetration testing and vulnerability management, as well as identity and access management. In late January, Qihoo 360’s founder Zhou Hongyi, noting the rise of DeepSeek, said that, “We should have confidence that China will eventually win the AI war with the US.” Qihoo announced in late January that it would provide security services to DeepSeek without charge, and may have been involved in helping DeepSeek deal with a large DDoS attack in late January.
Impact of open source model cycle on overall Chinese capabilities along AI stack
This is another factor in the DeepSeek Effect which will likely have a critical impact on the Chinese AI sector, and it will be the most difficult for the Trump administration to tackle from a technology control perspective. As I have noted, the open source community embrace of DeepSeek has been rapid and positive. Mistral CEO Arthur Mensch, at the Paris AI Action Summit in early February, cited DeepSeek’s innovations on top of Mistral’s in areas such as Mixture of Experts (MoE) approaches, and its giving back to the community.
US government officials contemplating what to do about DeepSeek will have their hands full just attempting to figure out how the open source community actually works and how fast feedback loops work within it. For example, the Awesome DeepSeek Integration repository on GitHub presents a comprehensive catalog of how to integrate the DeepSeek LLM into various technical platforms—from cloud to edge—with a strong emphasis on regional compliance, enterprise requirements, and community-driven enhancements. It serves as a one-stop reference for developers, architects, and organizations looking to customize and deploy DeepSeek across different infrastructures and geographies. From this we can highlight some key takeaways about the spread of DeepSeek models within the open source/weight community:
Diverse platform support: DeepSeek can be deployed on major public clouds, on-premise HPC clusters, container platforms like Kubernetes/OpenShift, and even IoT/edge devices.
Regional compliance: The list provides region-specific guidelines (North America, EU, APAC, etc.), reflecting data sovereignty and regulatory nuances.
Tooling & DevOps: There is a strong focus on containerization, automated deployment pipelines, monitoring, and logging.
Vertical specialization: Resources exist for enterprises, regulated industries, and domain-specific fine-tuning use cases.
Community-driven: Many integrations are community contributions, with open collaboration encouraged.
In addition, Nvidia has also moved to take advantage of DeepSeek’s R1 model, using it experimentally to generate GPU kernels. In short, Nvidia has taken this step with DeepSeek to simplify, automate, and streamline the path from high-level algorithms to highly optimized GPU code, reinforcing GPUs as the backbone for next-generation AI and HPC applications. R1 introduces an automated approach to generating high-performance GPU kernels, significantly reducing the manual effort typically required to optimize code for deep learning and HPC workloads. By integrating compiler-based techniques and inference-time scaling, DeepSeek R1 can dynamically tailor kernel behavior to the runtime environment, ensuring efficient use of GPU resources across varying problem sizes or hardware configurations.
This development is notable because it accelerates the adoption of GPU computing in fields where domain experts might not have deep GPU programming knowledge. With DeepSeek automating key optimization steps, researchers and engineers can spend more time on core problem-solving and less on the complexities of low-level kernel tuning. Ultimately, DeepSeek R1 helps streamline the path from high-level algorithm design to optimized GPU execution, boosting productivity and performance in HPC and AI pipelines.
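The inference-time scaling workflow described above amounts to a generate-verify-refine loop: the model proposes a candidate kernel, a checker compiles and tests it against a reference, and failures are fed back as context for the next attempt. The sketch below illustrates the shape of that loop only; the function names are hypothetical, and the model call and verifier are replaced with deterministic stubs so the control flow can run standalone.

```python
# Toy sketch of a generate-verify-refine loop for LLM kernel generation.
# All names are hypothetical; the model and verifier are stubbed out.

def propose_kernel(spec: str, feedback: str, attempt: int) -> str:
    """Stand-in for an R1 call; here it 'succeeds' on the third try."""
    if attempt < 2:
        return "// incorrect candidate kernel"
    return "// correct candidate kernel"

def verify(candidate: str) -> bool:
    """Stand-in for compiling the kernel and checking outputs vs. a reference."""
    return candidate == "// correct candidate kernel"

def generate_verified_kernel(spec: str, max_attempts: int = 5) -> tuple[str, int]:
    """Spend more inference-time compute until a candidate passes verification."""
    feedback = ""
    for attempt in range(max_attempts):
        candidate = propose_kernel(spec, feedback, attempt)
        if verify(candidate):
            return candidate, attempt + 1
        # Failed candidates become context for the next generation round.
        feedback = f"attempt {attempt} failed verification"
    raise RuntimeError("no correct kernel within the attempt budget")

kernel, tries = generate_verified_kernel("elementwise add over float32 arrays")
print(f"accepted after {tries} attempt(s)")
```

The key design point is that correctness is enforced by the external verifier, not by trusting the model: only candidates that pass compilation and numerical checks are ever accepted.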
US approach: a mixture of export controls? As the Global South cheers?
Given the rapid spread of DeepSeek both within China and much more broadly globally, any US response to the phenomenon will be difficult and complex. There does not seem to be any playbook for this. While some countries are trying to ban downloads of the DeepSeek app from app stores, most recently Korea, this is only the tip of the iceberg of the spread of the DeepSeek models.
US officials at the Commerce Department and associated think tanks such as RAND are likely attempting to determine how to respond via measures similar to other governments’, such as banning the app from app stores, but it remains unclear what other measures can prevent the widespread use of the API across so many platforms in the US and globally within the open source community.
Finally, it is possible, but not certain, that the DeepSeek effect on the Chinese AI and IT ecosystem will turn out to have some similarity to the impact of innovative Chinese companies in EVs and clean energy: meaning that while Chinese firms are not necessarily at the forefront of the most advanced cutting-edge systems, they are able to build on top of other innovators and become dominant players via cost-effective, scalable deployment that takes advantage of other systemic advantages Chinese firms may enjoy. Thus it is likely that the spread of DeepSeek’s models and innovation strategy, and teaming with Huawei and others in the Chinese IT sector, represents a significant step towards the ability of China to export an “AI Stack in a box” to the Global South along the Belt and Road.
I will explore some of the themes developed here in more detail in subsequent notes. The author would like to thank Jinhua Yip, an intern at DGA ASG and graduate student at SAIS, for sharing some of his thoughts on the significance of the DeepSeek Effect.
Footnotes
On this issue, the best analysis so far is from Glenn Luk here. In extensive discussions with Glenn on this, he notes that in analyzing the economics and performance of DeepSeek’s parent hedge fund High-Flyer Capital Management, he concluded that its peak performance fees were in 2020-21, which would have driven profits that CEO Liang Wenfeng could then reinvest in DeepSeek. But the fund was flat from 2022-24, in part due to the notable crackdown on these types of funds. Glenn believes (and I agree, because during this critical time, when US export controls were kicking in, HFCM could only really rely on management fees) that this would have been nowhere close to enough to fund 50,000 H100s, and that the integration of such a cluster would have been a very heavy lift, one that DeepSeek itself would not have been capable of. It seems likely that HFCM could have funded a smaller cluster of 5-10K GPUs, which appears to be the case, based on DeepSeek noting access to a 10K cluster of A100s. Glenn notes that “….this would be consistent with its v1-v3 and r1 papers, the claim of 2,048 H800s for final training run, and DeepSeek’s obsessive-compulsive behavior to optimize H800s suggests they were compute-starved, not compute-rich.” Indeed, this is the most plausible read of the documentation and available understanding of the firm, its motives, and its capabilities.
While it’s unlikely that large language models (LLMs) such as those from OpenAI (e.g., GPT-4) or newer models like DeepSeek will directly control mission-critical ADAS or self-driving functions in electric vehicles, there are several other areas where they could play a significant role. Here are a few examples:
In-Cabin Virtual Assistants/Natural Language Understanding
LLMs can handle more complex, nuanced questions and commands than traditional rule-based assistants. Instead of needing to memorize specific voice commands, drivers (or passengers) could speak to the vehicle the way they would a human.
Intelligent Infotainment and Content Generation
With a broader range of capabilities, LLM-based systems can do more than just play music or tune into radio stations:
Real-time Content Generation: Generate personalized driving itineraries, turning your next road trip into a narrated experience about local history or points of interest along the route.
Live Summaries: Offer succinct news summaries or audiobook narration on request—effectively a “podcast on demand” of content you are most interested in.
Conversational Media Search: Rather than fiddling with complex menus, users can simply say, “Play me something uplifting,” and the system can pull from streaming services based on emotional sentiment or personal preferences.
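The conversational media search flow above can be sketched in a few lines: the assistant asks an LLM to turn a free-form utterance into a structured query that a streaming backend can execute. Every name here is hypothetical, and the in-cabin model call is replaced with a canned stub so the plumbing can run on its own.

```python
import json

# Hypothetical prompt instructing the model to emit machine-readable output.
SYSTEM_PROMPT = (
    "Convert the user's request into JSON with keys "
    "'mood', 'media_type', and 'max_results'."
)

def call_llm_stub(system: str, user: str) -> str:
    """Stand-in for an in-cabin model call; returns a canned structured reply."""
    return json.dumps({"mood": "uplifting", "media_type": "music", "max_results": 10})

def media_query(utterance: str) -> dict:
    """Map a free-form utterance to a structured query for a streaming backend."""
    raw = call_llm_stub(SYSTEM_PROMPT, utterance)
    query = json.loads(raw)
    # Guard against malformed model output before handing it to the backend.
    assert {"mood", "media_type", "max_results"} <= query.keys()
    return query

print(media_query("Play me something uplifting"))
```

The point of the structure is the boundary: the LLM handles the nuance of natural language, while the media backend only ever sees a validated, fixed-schema query.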
Huawei’s AI software stack components:
DCS ModelEngine
This is Huawei's core model serving and deployment platform
Handles model lifecycle management, including deployment, scaling, and monitoring
Integrates with Huawei's CANN (Compute Architecture for Neural Networks) for hardware acceleration
Provides APIs for model serving and inference
DCS eContainer
Container orchestration layer customized for AI workloads
Built on top of Kubernetes with AI-specific optimizations
Manages resource allocation and scheduling for AI tasks
Handles GPU/NPU resource pooling and allocation
DCS FusionCompute
Virtualization platform for compute resources
Manages underlying hardware resources (Atlas AI accelerators, GPUs, CPUs)
Provides unified resource management across different hardware types
Optimizes hardware utilization for AI workloads
DCS DME (Distributed Model Engine)
Handles distributed training and inference
Manages model parallelism and data parallelism
Optimizes communication between distributed components
Integrates with MindSpore (Huawei's deep learning framework)
This stack integrates with Huawei’s broader AI ecosystem:
Sits above CANN (Compute Architecture for Neural Networks)
Works with Ascend hardware (their AI processors)
Supports MindSpore and other frameworks
Integrates with ModelArts (Huawei's AI development platform)
The key difference from NVIDIA’s approach is that Huawei has created a more vertically integrated stack, with tighter coupling between hardware and software layers, while NVIDIA’s CUDA provides a more general-purpose computing platform with separate tools for AI-specific functionality.