Assessing China's support for open weight models, and the real meaning of DeepSeek's latest release
Open source/weight AI is not the same as open source software—and models are far from “free.” DeepSeek V3.1 reflects the #DeepSeekEffect, not the “real DeepSeek Moment”
Much breathless prose has been written about China and open source or, more precisely, open weight AI models. Chinese companies now field four of the top five or so such open models, and some commentators claim that because these models are free, China will lead on AI and the US government should support more open source/weight models. Now is the time to step back and assess what “open source/weight” models are, how they differ from open source software, and what they mean for China and US-China competition.
Open source/weight models are not really open source, nor are they “free”
The “open sourcing” of advanced AI models—which in practice means making model weights public—is fast becoming the norm in the industry. Even so, leading US and Chinese AI labs are likely to continue keeping their most advanced models proprietary, open sourcing previous model versions to ensure some uptake by developers as open source/model weight competition rises.
A few key clarifications: open source/weight models are not “open source” in the same way as deterministic software. What is actually released is the model weights. AI models are probabilistic platforms, very different from deterministic software, which has long been open sourced, starting with Linux and other alternative operating systems and later applications such as OpenOffice. The term “open source” is also applied to hardware approaches such as the RISC-V architecture. The goals of open sourcing vary across these technologies. For open source deterministic software, the argument is that the broader open source community can examine the actual source code, identify vulnerabilities, and come up with new ways to improve the original code. This is not the case with AI models, where typically only the weights are released publicly. There are a range of ways to “open source” AI models, covered by different licensing arrangements, but developers cannot examine or modify the training data and source code behind the weights.
“For AI systems to align with typical open-source software, they must uphold the freedom to use, study, modify and share their underlying models. Although many AI models that use the ‘open source’ tag are free to use and share, the inability to access the training data and source code severely restricts deeper study and modification.”—Stefano Maffulli, Executive Director of the Open Source Initiative
Last year, there was a major backlash from the open source software community against the use of “open source” with respect to AI models. A central point of contention was the practice of “openwashing,” where companies label AI models as open source without fully meeting well-established criteria for open sourcing software. For traditional open source software, users have the freedom to use, study, modify, and share the code. The Open Source Initiative (OSI), a leading advocate, found that popular models from Meta, Microsoft, and Mistral AI do not align with their principles and has attempted to more clearly define what open source AI means. This definition remains controversial, including the issue of whether training data is part of the “source” of the AI model.
Secondly, open source/weight models are not “free.” For any developer using an open source/weight model, there will be multiple costs: optimizing for a particular application, training on a proprietary data set, and running inference using the modified model/platform—far from free. The real question is whether the business models around open source software and open source/weight AI models are similar. Here the answer is yes and no. Yes, as I have noted, once DeepSeek was released, 01.AI CEO Kai-fu Lee determined that he could no longer compete on model development and turned to supporting open source/weight deployments of DeepSeek’s model(s). Companies wishing to deploy a particular model require support in cleaning up data, training the model on proprietary data, and optimizing inference applications built on the model. This was a viable business model for Red Hat with Linux, and should be a viable business model for 01.AI and the many other companies in China providing support for open source/weight models. In short, open weight AI models are neither entirely open nor free.
Why are Chinese companies adopting elements of the open source approach for AI models?
Over the past year, Chinese companies and organizations have gone from using Western open source models, through the beginnings of the current shift toward open source/weight models, to dominating global releases of non-proprietary models. As Kendra Schaefer and I wrote last summer:
A small number of Chinese AI companies and organizations, including Alibaba and the Beijing Academy of Artificial Intelligence, have open sourced some models, while others are considering it. But there is not a clear counterpart to a large company like Meta open sourcing an advanced model such as Llama-2. Those pushing open sourcing of more advanced LLMs believe that it will spur innovation and accelerate research by fostering a collaborative environment, lead to broader application of LLMs, provide for community engagement and feedback, and provide educational benefits. Companies that do not favor open sourcing of powerful LLMs are concerned that the models could be leveraged by malicious actors and that loss of control could result in quality and reliability issues. Companies like leader OpenAI believe in keeping models proprietary to protect intellectual property and commercial interests.
The #DeepSeekEffect, of course, has been an unusual and potent catalyst in the decision of major Chinese AI developers to move to an open source/weight model approach. I have addressed this initially in terms of impact within China here and outside China here.
Fast forward to this summer and 1) large numbers of Chinese companies and organizations have open sourced most or all of their models, 2) Alibaba, Tencent, and Baidu have open sourced models, and 3) DeepSeek has grabbed headlines in the space by not only releasing open source/weight models but coupling releases with detailed research papers laying out how key innovations were implemented in the models. This process will continue for DeepSeek with further model releases, despite claims from some quarters that DeepSeek will pull back, or is under some type of “control” by the Chinese government. Such claims mirror Western concerns that as models become more capable, companies and governments will need to intervene in model releases. We are not there yet. Meanwhile, some companies still believe that releasing weights for powerful models could allow malicious actors to leverage advanced AI in areas such as cyber operations, or to design weapons of mass destruction.
This debate about open versus proprietary models, and the safety of each, continues apace; but China’s position, both within companies and the AI safety community, has now pulled firmly towards seeing open release of models as better and more “safe.” At the World AI Conference (WAIC), which I attended last month, this debate was front and center. Visiting xAI safety lead Dan Hendrycks made waves by calling for a ban on all open source/weight models, arguing that releasing highly capable models was tantamount to handing every citizen a biological weapons capability. At the closed door meeting at WAIC that I attended, some Chinese participants argued that the US government should force US AI labs to release open source/weight models. Since then, OpenAI and xAI have released open sourced versions of older models, but have not pledged to change their business models, which are predicated on charging for API access. Clearly we are far from an emerging consensus on this issue.
What reasons are Chinese companies giving for releasing model weights?
Circumventing hardware sanctions: enables innovation despite import restrictions
Developer ecosystem growth: accelerates global collaboration and improving models
Cost competitiveness: undercuts Western models with efficiency and affordability
Challenging Western AI dominance: aims to shift global model licensing and access norms
National strategic alignment: supports China's self-reliance and soft international power
Brand & recruitment benefit: builds ecosystem, reputation, and attracts talent
What licenses actually allow under “open source/weight” models differs from lab to lab. (See this table.1) “No improving other models” clauses: Meta and Tencent explicitly prohibit using their models or outputs to improve other models, with exceptions for their own model families. Gemma goes further by defining Model Derivatives to include models trained on Gemma outputs. Qwen allows the use of outputs to train other models but requires attribution (“Built with Qwen”). The Apache/MIT family (OpenAI gpt-oss, Microsoft, Baidu) imposes no such use-based limits.
Where is Beijing, CAC, MIIT, CAICT, NDRC on the open source/weight issue?
Chinese government documents related to AI released over the past six months or so all now contain a nod to supporting open source/weight models. The new Opinions of the State Council on Deepening the Implementation of the “Artificial Intelligence Plus” Action released in late August puts the government’s view of open source/weight models this way:
“[We should] promote the prosperity of the open source ecosystem. Support the development of AI open source communities, promote the convergence and openness of models, tools, and datasets, and cultivate high-quality open source projects. Establish and improve evaluation and incentive mechanisms for AI open source contributions, and encourage universities to include open source contributions in student credit certification and faculty achievement recognition. Support enterprises, universities, and research institutions in exploring new models for inclusive and efficient open source applications. Accelerate the construction of a globally oriented, open open source technology system and community ecosystem, and develop open source projects and development tools with international influence.”
So, central authorities like open source/weight models. There is no hint here of the concerns raised by Hendrycks and others around the dangers of releasing very capable models that anyone can use. Significantly, the Opinions contain no mention of AGI or ASI, and traditionally it has been Chinese companies, such as DeepSeek, which have emphasized this issue, with the particular objective of achieving AGI in mind. Some commentators have suggested Beijing is deliberately not stating a goal of AGI/ASI, but this seems more like mirror imaging rather than a clear-eyed assessment of Beijing’s actual day-to-day goals for AI deployment, including open source/weight models, as expressed in the Opinions. If anything can be asserted about the Chinese government and its relation to AGI/ASI, it is likely that it is the inverse of the US government in terms of focus and priority. No, senior leaders in Beijing are not lying awake at night worrying about who gets to AGI/ASI first. They have other fish to fry.
The other fish include open source/weight models. Over the past six months, here is a ministry-by-ministry read-out of what has actually been said about open source/weight models. The short take: Beijing has leaned into “open-source LLMs as productivity infrastructure,” while keeping CAC’s service-level filing regime front and center. No new rule targeted specifically at “weight releases” has so far surfaced; the emphasis is on how services built on models (open or closed) comply.
MIIT(工业和信息化部)
Has signaled support for open-sourcing models as industrial drivers. At the State Council mid-year briefing, MIIT highlighted that “China’s AI models that have been cultivated and open-sourced are accelerating application” across electronics, materials, and consumer goods.
Infrastructure to make (open) model training practical. MIIT’s June “Compute Interconnection Action Plan” aims at multi-cluster mixed training (万卡/十万卡混训) and flexible “card-hours/machine-hours” services—meant to lower training/serving frictions for big models, open or closed.
Explicit pro-open-source rhetoric. MIIT leadership at the OpenAtom events in July framed open source as “core infrastructure” and even cited DeepSeek’s “open-source-first” as a model for industrial innovation.
CAC(国家网信办)
Focuses on filing (备案) as the compliance linchpin for any generative AI service offered to the public—including apps that call a model via API. CAC’s Q1 bulletin re-states that such apps must register and publicly display the model name + filing number. (This applies regardless of whether the underlying model weights are open.)
Scaled approvals & open-source framing. By mid-June, CAC reported 433 models had completed filings; CAC’s own write-ups described China’s models as a new paradigm for global AI development that is “open-source, low-cost, high-efficiency.”
Policy tone favors “open co-building.” CAC’s May commentary—“Vigorously advance China’s LLMs”—explicitly says “open-source co-construction + full-chain collaboration” are twin drivers of the ecosystem, and notes that “many large-scale domestic AI models, through open source development, have attracted global developers to participate in technological iteration, breaking the Western technology monopoly and increasing accessibility.”
NDRC(国家发展改革委)
Has framed open-source LLMs as “inclusive AI.” NDRC’s July casebook on international AI cooperation highlights this, with the formulation, “Open-source large models drive AI technology toward inclusivity and broad accessibility.” (开源大模型,推动AI技术走向普惠)
Macro briefings echo the open-source storyline. Late July press materials reference “open-source models…continuously emerging” as part of China’s innovation picture.
CAICT(中国信息通信研究院, under MIIT)
Has been programmatically promoting open-source LLM adoption. CAICT launched “开源大模型+” typical-use case selections in July, explicitly encouraging enterprise integration of open-source models (e.g., DeepSeek, Qwen).
Standards work with a safety slant. The MIIT AI Standardization Committee—with materials hosted by CAICT—lists new work items on “big-model deployment security requirements” and explainability specs—i.e., governance of use rather than licensing of weights. By codifying maturity and governance standards for open-source models, MIIT is signaling strong state support for “开源大模型” (open large models) as both innovation enablers and as tools for ensuring the universal accessibility (普惠) of AI technology. The document frames open source/weight AI models as a key pillar of China’s national AI standardization strategy for 2025, with planned standards around maturity evaluation, openness levels, community governance, and ecosystem integration.
Ecosystem measurement. CAICT’s 2025 “Big-Model Cloud Value Matrix” assesses LLM cloud providers and signals institutional support for industrial deployment (including for open model families). Here, CAICT specifically calls out collaboration between Baidu Smart Cloud and China Merchants Bank on computing power using the Kunlun Core P800.2 The goal is to build a stable and efficient computing infrastructure for the bank, fully supporting the stable operation of various open source large models.
MOST(科学技术部)
Using central messaging: push “AI+” and scene pilots; local measures encourage open-source collaboration. MOST’s provincial/municipal documents over spring–summer backed open-source collaboration and model pilots (for example, Wuhan/East Lake “open-source ecosystem” support and rewards for model filings).
Senior-level commentary (not a formal rule): In July, Li Meng, a former MOST vice-minister, publicly called open source an important trend—while urging top-level risk controls on privacy/IP/misuse for open models. MOST has been involved in international outreach on AI governance, leading Chinese delegations to AI safety conferences in Seoul last year and Paris this year.
All of this said, the issue of open source/weight models breaks down along several geopolitical, technological, economic, and AI governance lines. As noted up front, leading model developers do not think the distinction between closed and open models is meaningful in terms of adoption: neither is “free,” and deployment comes down to capability and suitability for a particular task. At the geopolitical level, the challenge becomes how, or whether, to regulate the use of open source/weight models across borders, as the perception is that leading open source/weight models could become dominant in some applications, with consequences for uptake, diffusion, and economic impact. This is a complex issue, as many applications will use multiple models, both open and closed; Tesla’s recent decision to use DeepSeek and Bytedance models is a good example. At the AI governance level, there is at least the perception that open source/weight models are safer, but this is not necessarily the case: the analogy with open source software is imperfect and, as Hendrycks stresses, releasing the weights for more and more capable models presents clear risks around their use by malicious actors. Over the next six months, with releases of new models from DeepSeek, Alibaba, Moonshot, Minimax, and other Chinese model developers, alongside new open sourced versions of models from OpenAI and xAI, the debate will continue to dominate discussions at the geopolitical and AI governance levels in particular.
The real meaning of DeepSeek’s reference to the FP8 standard
Much attention in the debate around open source/weight models has recently focused on DeepSeek’s role and the firm’s place in China’s AI hardware/software ecosystem. DeepSeek released its V3.1 model, aimed at improving performance for agentic AI applications, but also included a cryptic sentence that sent Chinese AI hardware stocks like Cambricon soaring on the STAR market. Some called this the real “DeepSeek Moment.” What is going on here?
It is really just more of the #DeepSeekEffect that I have called out: continued focus on model innovation, cooperation internally on hardware/software ecosystem development, and consumer, enterprise, and government AI application deployments.
While DeepSeek’s new model release appears to be primarily directed at optimization for agentic AI applications, the single line reference to UE8M0 FP8 support has generated considerable debate and impacted the stocks of Chinese AI hardware players such as Cambricon, which supports the standard. Kevin Xu has called it the “real DeepSeek Moment.” Although I agree with Kevin that this is important, I do not believe it is a real gamechanger, as the whole sector has been moving in this direction, and it is not surprising that DeepSeek is well aware of this and moving to support the approach and align with hardware vendors—who are also already considering building in support for FP8.
FP8 is already becoming a mainstream approach, and UE8M0 is not a new data format but the scale encoding used for MXFP8 (microscaling FP8). Hence, MXFP8 with UE8M0 scales is becoming the practical default for high-efficiency training, because it reduces compute time and energy use. In including the nod to UE8M0, DeepSeek was likely signaling that domestic hardware producers, including Huawei, will be supporting this approach in the future, based on the firm’s deep understanding of AI hardware in general, of industry trends in particular, and of discussions it is likely having with all major domestic hardware vendors. The current Huawei Ascend 910C that is part of the CloudMatrix 384 super cluster I saw at WAIC does not support native FP8 calculations, making it difficult for DeepSeek to use Huawei hardware for training. We heard this from a number of key AI infrastructure players at WAIC. Smaller players such as Cambricon, Moore Threads, Hygon, MetaX, and Biren will almost certainly support the new approach. Here, UE8M0 is best understood as an emerging consensus format in China for training and inference, seen as a bridge to domestic silicon that may not yet support native FP8.
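To make the mechanics concrete, here is a minimal Python sketch of how microscaling works: each small block of values shares one power-of-two scale, stored as a UE8M0 byte (8 exponent bits, no sign or mantissa, bias 127), while the elements themselves are kept within FP8 (E4M3) range. The function names and the simplified rounding are my own illustration, not DeepSeek’s or any vendor’s implementation.

```python
import math

E4M3_MAX = 448.0  # largest finite E4M3 value (1.75 * 2^8)

def ue8m0_encode(scale: float) -> int:
    """Encode a power-of-two scale as a UE8M0 byte:
    8 exponent bits, 0 mantissa bits, bias 127, no sign bit."""
    e = int(round(math.log2(scale)))
    if not -127 <= e <= 127:
        raise ValueError("scale outside UE8M0 range")
    return e + 127  # biased exponent fits in one byte

def ue8m0_decode(b: int) -> float:
    """Recover the power-of-two scale from a UE8M0 byte."""
    return 2.0 ** (b - 127)

def mx_block_quantize(block):
    """Toy MXFP8-style step for one block (assumes nonzero values):
    pick a shared power-of-two scale from the block maximum (8 is
    the E4M3 exponent max), then divide elements by it, clamping to
    E4M3 range. Rounding to actual E4M3 codes is omitted here."""
    amax = max(abs(x) for x in block)
    shared_exp = math.floor(math.log2(amax)) - 8
    scale = 2.0 ** shared_exp
    scaled = [max(-E4M3_MAX, min(E4M3_MAX, x / scale)) for x in block]
    return ue8m0_encode(scale), scaled

code, elems = mx_block_quantize([1.0, 2.0, 4.0])
print(code, elems)         # shared scale byte, scaled elements
print(ue8m0_decode(code))  # recovered power-of-two scale
```

The design point is that a per-block power-of-two scale costs one byte and one shift in hardware, which is why an exponent-only format is attractive for chips that lack richer FP8 support.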
Critically, Huawei is likely adding native support for FP8 as it redesigns its Ascend processor to be a true general purpose GPU (GPGPU). Huawei’s current Ascend 910C, like earlier Ascend-series processors, does not natively support FP8. This has been a noted limitation for some AI training workloads and has led companies like DeepSeek to fall back to Nvidia GPUs (with FP8) for model training: the A100s that DeepSeek has do not support FP8, but its Hopper-generation H800s do.3 Announced as a successor, the Ascend 920 (and the 920C variant) is built on a 6 nm process with high-performance metrics: over 900 TFLOPS and 4 TB/s memory bandwidth using HBM3. There is presently no direct confirmation that this series includes native FP8 support, as the disclosed specs focus on throughput and memory rather than precision formats, but it is highly likely that any new Huawei Ascend processor will include FP8 support. This is what DeepSeek was referring to in the V3.1 release. The Ascend 910D is entering testing and sampling, leveraging advanced packaging and multiple die stacking. Its goal is to rival Nvidia’s H100. Public disclosures do not yet state whether it supports FP8 natively; but if it is to rival the H100, it would likely need such support.
In addition, there is an outside possibility that DeepSeek is referring to a format merely akin to UE8M0. Huawei may be exploring alternate FP8-like formats, notably HiFloat8 (HiF8), a floating-point 8-bit format developed by HiSilicon, though that currently remains a research concept. Industry commentary aligns with the idea that Huawei is thinking hard about FP8: the format is part of deep vertical integration, showing Huawei’s ambition to optimize at the hardware-software stack level for maximum efficiency. Tests indicate HiF8 outperforms well-tuned FP8 baselines in some scenarios, offering incremental but valuable gains in large-scale training contexts.
The Information also reported this week that DeepSeek was using some Huawei GPUs for training some models, while still using Nvidia GPUs for its more advanced models. This is hardly surprising: all Chinese AI developers are looking at domestic alternatives for some applications and platforms, while sticking with Nvidia for training their most advanced models. This is just a transition; the real inflection point will come when Huawei Ascends and other domestic GPUs all support FP8. That point is coming sooner than people think.
Indeed, reporting this week notes that Alibaba has a new AI ASIC under development. Alibaba has been working on such chips for some time, and the latest design—likely manufactured by SMIC—probably supports FP8. It may even be among the processors DeepSeek was hinting at. I wrote extensively about Alibaba and its Hanguang AI ASICs here, touching on how all the major AI developers are pursuing their own AI-optimized ASICs. I will have more on this as it develops.
Finally, more speculation about the state of domestic semiconductor manufacturing hit the media in late August, with FT reporting that Huawei’s indigenous production of advanced node semiconductors could kick off by the end of the year, with the goal of tripling output of advanced node chips, including Ascend AI GPGPUs, next year. Much of this has been known or rumored for some time—but a lot of murkiness remains around what is and isn’t real.
In an upcoming deep dive on China’s domestic semiconductor industry, slated to run in a major geopolitical journal in October, I will tackle this issue, which is complex and involves extrapolating from a lot of data while cutting through the hype. Suffice to say that a lot of effort is going into expanding the capacity of certain firms to produce AI hardware that will support the next generation of Chinese AI models.
OpenAI gpt-oss (Apache-2.0) — “Available under the flexible Apache 2.0 license.” OpenAI
Meta Llama 3 license (MAU>700M; no using outputs to improve other LLMs). Hugging Face
Google Gemma terms (derivative definition includes output-trained models; pass-through restrictions; no rights in outputs). Google AI for Developers
Microsoft Phi (MIT on HF). Hugging Face
NVIDIA Open Model License (commercially usable; derivatives allowed; no output ownership). NVIDIA
Alibaba Qwen license (MAU>100M gate; attribution if you train on Qwen outputs). Hugging Face
Tencent Hunyuan community license (territory exclusions; MAU>100M gate; no using outputs to improve other models). Hugging Face
Baidu ERNIE 4.5 (Apache-2.0). Hugging Face
Zhipu GLM licenses (registration for commercial; some variants MIT). Hugging Face+1
DeepSeek R1 MIT; DeepSeek-V3 separate model license. GitHub, Hugging Face
For more on the P800 and Baidu’s cooperation with Samsung to manufacture this AI semiconductor, see “Small Yard, High Fence” Becoming “Small Gain, High Cost.”
H800 is essentially a modified, export-compliant variant of H100, tuned for the China market under U.S. export controls (reduced interconnect bandwidth, slightly altered perf/watt envelope).
Architecturally, it is still Hopper generation, so it retains the H100’s FP8 capabilities — meaning native E4M3 and E5M2 FP8 support via the Transformer Engine.
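For the curious, the two Hopper FP8 element formats can be decoded in a few lines. This toy Python decoder (my own illustration, with NaN and infinity handling simplified away) shows the trade-off: E4M3 keeps more mantissa precision but tops out at 448, while E5M2 sacrifices a mantissa bit for a much wider exponent range.

```python
def fp8_decode(byte: int, fmt: str = "e4m3") -> float:
    """Decode one FP8 byte in the two Transformer Engine formats:
    E4M3 (4 exponent / 3 mantissa bits, bias 7) and
    E5M2 (5 exponent / 2 mantissa bits, bias 15).
    NaN/infinity special encodings are ignored for simplicity."""
    exp_bits, man_bits, bias = (4, 3, 7) if fmt == "e4m3" else (5, 2, 15)
    sign = -1.0 if byte >> 7 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    if exp == 0:  # subnormal: no implicit leading 1
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

print(fp8_decode(0x38))          # E4M3 encoding of 1.0
print(fp8_decode(0x7E))          # E4M3 largest finite value, 448
print(fp8_decode(0x3C, "e5m2"))  # E5M2 encoding of 1.0
```

Either format fits a weight or activation in a single byte, which is the whole efficiency argument for FP8 training.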