Reports of DeepSeek "deception" deeply flawed
House Select Committee and "open source" consultancy reports full of rookie mistakes, betray lack of understanding of China and innovation
A recent House Select Committee on China report on DeepSeek called “DeepSeek Unmasked” reflects serious misunderstandings of the company and its technology. Interestingly, another report addressed below by a consulting firm also attempts to “unmask” DeepSeek’s “deception.” Because both of these reports could be used as justification for placing DeepSeek on the Commerce Department’s Entity List or taking other action against the firm, they require some real analysis. Rather than genuine investigations, as the reports purport to be, they are really just efforts to portray the company as linked to the Chinese government, and both reflect significant misconceptions of China, the firm, and technological innovation.
The House Committee “findings” are as follows:
“DeepSeek funnels Americans’ data to the PRC through backend infrastructure connected to a U.S. government-designated Chinese military company.
DeepSeek covertly manipulates the results it presents to align with CCP propaganda, as required by Chinese law.
It is highly likely that DeepSeek used unlawful model distillation techniques to create its model, stealing from leading U.S. AI models.
DeepSeek’s AI model appears to be powered by advanced chips provided by American semiconductor giant Nvidia and reportedly utilizes tens of thousands of chips that are currently restricted from export to the PRC.”
First, DeepSeek is not “funneling Americans’ data to the PRC.” The smartphone app operates like most AI chatbot apps, and of course uses some infrastructure in China to do inference. This is how models operate. Alleged, previously reported links to China Mobile, which runs data centers as well as a huge 5G mobile network in China, are unsurprising, as this is how queries to AI chatbots would be channeled to GPUs in China for inference. The “data of Americans” here is basically just the queries, which do not contain any personal data. The amount of personal data the DeepSeek smartphone app requires is minimal, much less than other comparable mobile apps. In addition, DeepSeek’s models hosted on US servers and integrated into apps such as Perplexity are all run locally, and send no data back to China.
Second, there is no “covert manipulation” of results. All AI models developed by Chinese firms are subject to censorship requirements, as monitored by the Cyberspace Administration of China. This is widely known and hardly “covert.” Interestingly, when the model is run locally on US servers, the results are not censored.
Third, while there are allegations of distillation around DeepSeek’s use of the output of more advanced US models, there has been no confirmation of this, and it is unclear how important a role distillation played in the training of DeepSeek’s V3 and R1 models. The innovations detailed in DeepSeek’s publicly available papers appear to be much more important in explaining the capabilities of the models.
Fourth, as I have laid out in detail, the advanced GPUs that DeepSeek has used to train its models were all acquired during periods when their export to China was not constrained by US export controls. Reports, repeated by a small number of industry players, that DeepSeek has had access to a cluster of 50,000 H100 GPUs have proven to be unfounded, based on discussions with numerous industry sources and people familiar with the availability of advanced GPUs in China. In addition, the entire thrust of DeepSeek’s open sourcing of its models and research papers detailing its approach to training and optimization of its models makes clear that the optimizations were specifically designed to overcome the limitations of the hardware it had available, including a cluster of A100s and a smaller number of H800 GPUs, obtained before export restrictions were extended to the H800.
In any case, it is clear that DeepSeek will continue to use its existing access to Nvidia hardware, but will also seek to develop future models using domestic sources of AI hardware from Huawei, including the firm’s 910C processor, which is included in the CloudMatrix 384 cluster now being marketed domestically. DeepSeek is likely to take advantage of further improvements in Huawei’s hardware and cloud offerings, including the rumored Ascend 910D and 920 processors.
Report on alleged Chinese military and government funding for DeepSeek research is quite a stretch
Around the same time as the “Unmasked” report, another report on DeepSeek by Exiger, a consulting firm, has been added to the mix, and is typical of some recent “research” reporting on Chinese companies. It comes replete with major rookie mistakes in attributing relationships and calling out “deception” by DeepSeek. The report’s title, DeepSeek’s Deception: How the Chinese Military and Government Funded DeepSeek’s AI Research, is completely inaccurate, as DeepSeek’s funding for R&D has come exclusively via investment from its parent, High Flyer Capital, a fact that is well documented.
The “deception” argument may be the most absurd of the claims, as the entire report is based on openly available research papers citing authors of the type common within the AI community broadly, and DeepSeek is even more open about its research and researchers than other AI companies. Asserting that anything is deceptive here strains credulity. And of course, one of the report’s authors previously worked for the Select Committee on China, without clearly disclosing this on the report. Hence, if there is any deception here, it is in the presentation of this report as a well-researched and objective analysis of DeepSeek. It is anything but.
The premise of the report itself is misleading. There is no evidence that “PLA funding” has contributed in any way to the success of DeepSeek. The firm is wholly funded by its founder Liang Wenfeng and his hedge fund High Flyer Capital, and most of the important research the firm does is internal and published openly, including in five major papers recently made available on GitHub during Open Source Week. See my post on this here. In the report, Exiger provides no evidence of links to any PLA funding for research projects that benefited DeepSeek. The links Exiger cites are almost certainly inaccurate or misunderstood. And given DeepSeek’s heavy internal focus on R&D and access to a cluster of A100 and H800 Nvidia GPUs over the past 2-3 years, it is not clear what, if any, benefit DeepSeek researchers would derive from alleged “research sponsored by the PLA.” Having studied China’s S&T system and defense industry for over 35 years, and having done deep dives on DeepSeek and its origins, I find the idea that researchers at a firm like DeepSeek would need or derive any benefit from “PLA sponsored research” highly implausible.
The Exiger report also makes some fundamental rookie mistakes, for example in attributing something nefarious to an apparently very small number of DeepSeek researchers having graduated from China’s leading STEM universities. The so-called Seven National Defense Universities or Seven Sons (国防七子, listed at the end of this post) are cited in the report as evidence of a military association, based on undocumented assertions that a handful of DeepSeek researchers graduated from one of the seven. In fact, the vast majority of DeepSeek personnel are from other prestigious Chinese universities such as Zhejiang University, but the idea that graduation from one of the Seven Sons itself carries major significance or proof of defense ties is typical of much “open source” research. It is a superficial attempt to link Chinese organizations without including analysis of the actual connections or of how the Chinese education system or other institutions function. Here, any basic understanding of Chinese higher education reveals that these schools are now among the leading STEM schools in China, akin to Harvard and MIT. The military connection, which was once important (in the 1960s), is now much less significant in terms of graduates and school choice. Reflecting this, the schools now fall under the Ministry of Industry and Information Technology and are not part of the military academic ecosystem.
Importantly, in a much more rigorous and less tendentious report from Hoover in April on DeepSeek’s talent base, which of course is fully and openly attributed in the five major research papers DeepSeek has published since 2023, Emerson Johnston and Amy Zegart do not even mention the Seven Sons.
They find that the major affiliations of DeepSeek researchers center around the Chinese Academy of Sciences (CAS) and other prestigious Chinese universities.
Johnston and Zegart also find that a small but important number of DeepSeek researchers have affiliations with US universities, but overall conclude that the makeup of DeepSeek’s research team, in terms of international experience, suggests that China’s current AI talent pipeline is much less reliant on US experience than previous generations of researchers were. Hence it should not be surprising that DeepSeek researchers’ affiliations are largely with institutions such as the Chinese Academy of Sciences (CAS) and prestigious universities, with only a small number having any affiliation with the Seven Sons, and none actually attributed to the group in this detailed report.
Other academics who have studied the education of DeepSeek’s researchers also highlight that their backgrounds are primarily from the most well-known STEM universities:
“Many DeepSeek team members have worked on national-level AI initiatives – such as Tsinghua’s Air Lab and Peking University’s Wang Xuan Institute – where they combined cutting-edge academic research with practical industry experience. This smooth transition from lab work to product development has been central to DeepSeek’s rapid progress.”—Marina Zhang, associate professor at the University of Technology Sydney’s Australia-China Relations Institute.
The Exiger report also goes down the China “government talent recruitment program” rabbit hole much favored by “open source” intelligence consultancies fishing for alleged government affiliations, often displaying misunderstanding of what talent programs constitute and what their function is in China. Chinese government talent recruitment programs, such as the Thousand Talents Program (千人计划), Changjiang Scholars Program (长江学者) and Hundred Talents Program (百人计划), are primarily designed to attract overseas-trained Chinese nationals and foreign experts in strategic STEM fields. Participation in these programs usually indicates at least some temporary formal relationship or cooperation with Chinese state or government institutions, but many times it is simply a preferred way to gain prestige and funding.
Given that only a small fraction of DeepSeek researchers have overseas training and that the Hoover paper suggests that the goal of researchers affiliated with DeepSeek was always to return to China, Exiger’s citing of affiliations with talent programs appears at best irrelevant to understanding potential avenues of government support for DeepSeek.
These types of “open source intelligence” reports have previously served as precursors to US government actions against Chinese firms, for example domestic foundry leader SMIC, which was placed on the Entity List in December 2020 after the release of such a report in August 2020. The “evidence” of SMIC’s links to the Chinese military, as in the Exiger report, was based on tenuous associations with organizations affiliated with China’s defense industry. But the justification in the Commerce Entity List action was:
“We will not allow advanced U.S. technology to help build the military of an increasingly belligerent adversary. Between SMIC’s relationships of concern with the military industrial complex, China’s aggressive application of military civil fusion mandates and state-directed subsidies, SMIC perfectly illustrates the risks of China’s leverage of U.S. technology to support its military modernization.”
When added to the Defense Department’s 1260H list as a “Chinese military company”, SMIC issued a denial, noting that “The Company strongly opposes the decision of United States Department of Defense, which reflects a fundamental misunderstanding by the United States Department of Defense regarding the end-uses of the Company’s business and technology.”
Analysis of SMIC’s customer base shows the firm works overwhelmingly with civilian firms, including foreign companies. The standard link to “military-civil fusion” used in US export controls has never been clearly defined, and alleged military associations for the 1260H list have been successfully challenged by Chinese firms, including smart device maker Xiaomi, suggesting that the evidentiary base of these kinds of justifications can be very weak.
The military-civil fusion justification is a convenient catch-all that “open source” researchers prefer to fall back on when there is a lack of any clear evidence of association with China’s military or military-industrial complex. Typically, as is the case with SMIC, there is never an assessment of how much actual or inferred association with entities that may be part of government or military ecosystems constitutes grounds for labelling the entire firm as contributing to specific military aims. By any measure, the business models of SMIC and DeepSeek, for example, have only a highly tangential association, if any, with actual military end uses.
The new “reports” on DeepSeek clearly fall into the category of weak justifications for US government actions, though that has not prevented measures from being taken against firms previously. Any serious understanding of DeepSeek in fact points to just the opposite conclusion, that DeepSeek’s success surprised bureaucrats in Beijing precisely because the firm was backed from the beginning by a hedge fund and driven by the vision of CEO Liang Wenfeng. It was not the product of any industrial policy initiative or government-funded support, despite the best efforts of “open source” researchers to “prove” otherwise.
The Seven Sons of National Defense are:
Beihang University in Haidian, Beijing
Beijing Institute of Technology in Haidian, Beijing
Harbin Engineering University in Harbin, Heilongjiang
Harbin Institute of Technology in Harbin, Heilongjiang
Nanjing University of Aeronautics and Astronautics in Nanjing, Jiangsu
Nanjing University of Science and Technology in Nanjing, Jiangsu
Northwestern Polytechnical University in Xi'an, Shaanxi
Thanks, Paul, for your in-depth, well-researched writings on this important and misunderstood topic.
There are a couple of issues that bother me about Deep Seek:
- Deep Seek scrapes the web to provide answers to users. How, then, can Deep Seek scrape the web when access to the web is restricted in China?
- If the answer to the above is that Deep Seek does have access to the web, can Chinese users use Deep Seek to get around web access restrictions?
- With regard to queries: does Deep Seek translate the query into foreign languages and translate the search results to incorporate them into answers to queries?
- With regard to queries in foreign languages: what are the linguistic capabilities of Deep Seek?
- how are they developed?