Goldman Sachs: What Does DeepSeek V4 Mean for China's AI?

wallstreetcn · Apr 26 16:35

Goldman Sachs believes that DeepSeek V4 significantly reduces the cost of long-context inference through architectural innovations such as hybrid attention. Its core significance lies in supporting the deployment of complex agent applications at a lower cost. V4 explicitly bets on Huawei's Ascend 950, and API pricing is expected to drop notably once the chip reaches mass production in the second half of 2026. Competition among domestic AI models is intensifying, with coding and multimodal capabilities becoming key differentiators. Goldman Sachs keeps cloud computing and data centers as its top picks.

Goldman Sachs believes that the core significance of DeepSeek V4 lies in supporting more complex agent applications at a lower cost, thereby opening up new space for the scaling of AI applications.

According to TradingView Wind, on April 24, the Goldman Sachs team led by Ronald Keung published a research report stating that the newly open-sourced V4 model is a continuation of DeepSeek's efficiency-first and open-source approach.

On the technical side, V4 sharply cuts the cost of long context windows through architectural upgrades and explicitly bets on Huawei's domestic chips. On the market side, the release intensifies competition among AI models in China, with coding capability, task completion rate, and multimodality becoming the key dividing lines for pricing power.

Goldman Sachs maintains its recommendation on the cloud computing and data center sectors, arguing that continued improvements in compute cost efficiency will accelerate the adoption of AI applications. Two drivers, the growth of enterprise AI agents and the spread of consumer AI assistants, are expected to support sustained improvement in cloud service pricing power.

V4 architecture upgrade supports longer context with less memory.

DeepSeek V4 was released in two versions: Pro and Flash.

The Pro version is the flagship, with 1.6 trillion parameters (49 billion active), while the Flash version is lighter, with 284 billion parameters (13 billion active). Both models support an ultra-long context window of 1 million tokens, matching top U.S. state-of-the-art (SOTA) models while requiring significantly less memory and KV cache.

According to the Goldman Sachs report, at a 1-million-token context, the floating-point operations (FLOPs) V4 Pro requires per token during inference are only 27% of those for DeepSeek V3.2, and its KV cache usage is just 10%. V4 Flash is even more aggressive, reducing FLOPs to 10% and KV cache to 7% of the V3.2 levels.
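
As a rough check on these figures, the quoted ratios can be converted into reduction factors and active-parameter shares. The sketch below uses only the numbers cited in this article; the dictionary layout and the function name `summarize` are illustrative assumptions, not anything taken from the report.

```python
# Figures quoted in the article. Parameters are in billions; the ratios are
# relative to DeepSeek V3.2 at a 1-million-token context.
MODELS = {
    "V4 Pro": {"total_b": 1600, "active_b": 49, "flops_ratio": 0.27, "kv_ratio": 0.10},
    "V4 Flash": {"total_b": 284, "active_b": 13, "flops_ratio": 0.10, "kv_ratio": 0.07},
}


def summarize(spec):
    """Derive the active-parameter share and reduction factors from quoted figures."""
    return {
        "active_share_pct": round(100 * spec["active_b"] / spec["total_b"], 1),
        "flops_reduction_x": round(1 / spec["flops_ratio"], 1),
        "kv_reduction_x": round(1 / spec["kv_ratio"], 1),
    }


SUMMARY = {name: summarize(spec) for name, spec in MODELS.items()}

for name, row in SUMMARY.items():
    print(name, row)
```

On these figures, V4 Pro activates about 3% of its parameters per token and V4 Flash about 4.6%, while the KV cache shrinks by roughly 10x and 14x respectively.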

This efficiency leap was achieved through three key architectural innovations:

In terms of hybrid attention, V4 combines Compressed Sparse Attention (CSA) with Highly Compressed Attention (HCA). CSA first compresses the KV cache along the sequence dimension and then performs sparse attention, while HCA compresses more aggressively but keeps attention dense. Together, they significantly reduce the working memory required for long inputs.
In terms of training stability, V4 introduces the mHC mechanism to stabilize the flow of information across the network's many layers.
In terms of optimization, Muon is adopted as the main training optimizer (with some modules retaining AdamW) to accommodate a network architecture more complex than V3's and to improve training convergence.
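
The compress-then-attend-sparsely idea attributed to CSA can be illustrated with a toy, single-query sketch. Everything specific here is an assumption made for illustration: mean pooling as the compression step, top-k selection by score as the sparsity rule, the block size, and all function names. The article does not describe DeepSeek's actual compression or selection method, so this is not V4's implementation.

```python
import math


def mean_pool(vectors, block):
    """Compress a sequence by averaging each run of `block` consecutive vectors."""
    pooled = []
    for start in range(0, len(vectors), block):
        chunk = vectors[start:start + block]
        dim = len(chunk[0])
        pooled.append([sum(v[d] for v in chunk) / len(chunk) for d in range(dim)])
    return pooled


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def compressed_sparse_attention(query, keys, values, block=4, top_k=2):
    """Toy attention: pool K/V along the sequence, then attend to top_k entries only."""
    ck = mean_pool(keys, block)    # compressed keys (hypothetical pooling choice)
    cv = mean_pool(values, block)  # compressed values, same length as ck
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, k) * scale for k in ck]
    # Sparse step: keep only the top_k highest-scoring compressed entries.
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    peak = max(scores[i] for i in keep)
    weights = {i: math.exp(scores[i] - peak) for i in keep}  # stable softmax
    total = sum(weights.values())
    out = [sum(weights[i] / total * cv[i][d] for i in keep) for d in range(len(cv[0]))]
    return out, len(ck), sorted(keep)


# 16 tokens of dimension 2; only tokens 8..11 have keys aligned with the query.
KEYS = [[1.0, 0.0]] * 8 + [[0.0, 1.0]] * 4 + [[1.0, 0.0]] * 4
VALUES = [[float(i), 0.0] for i in range(16)]
QUERY = [0.0, 1.0]

OUT, CACHED, KEPT = compressed_sparse_attention(QUERY, KEYS, VALUES, block=4, top_k=1)
print(OUT, CACHED, KEPT)  # [9.5, 0.0] 4 [2]
```

In this toy, 16 cached entries shrink to 4, and the query reads only one of them. Raising `block` while setting `top_k` to the number of compressed entries gives the other pattern the report describes for HCA: heavier compression with dense attention over what remains.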

Goldman Sachs pointed out that these efficiency gains matter most for long-running tasks, a typical case being long-horizon agent tasks that must process large amounts of context.

It is worth noting that DeepSeek remains focused on foundational text models, while internet giants such as Alibaba and ByteDance, along with independent model developers such as MiniMax, are leaning towards multimodal or omni-modal approaches, indicating a clear divergence in paths towards AGI.

Domestically produced chips are accelerating their deployment, with Huawei Ascend 950 paving the way for price reduction.

Another key signal from this V4 release is that DeepSeek explicitly incorporates the mass production of Huawei Ascend 950 super nodes into its commercial roadmap.

DeepSeek anticipates that with the large-scale supply of Huawei Ascend 950 super nodes by the second half of 2026, the API pricing for the V4 Pro version will see a significant decrease.

The Goldman Sachs report says this statement carries two implications.

First, DeepSeek’s cost competitiveness will be further strengthened, creating conditions for broader application deployment.
Second, against the backdrop of tightening chip supplies, the trend of China's leading AI models migrating to domestic computing power has been explicitly endorsed by top players.

According to Goldman Sachs data, the pricing of V4 Pro on mainstream API platforms has already become competitive. This advantage is expected to further expand in the second half of 2026 as domestic computing power supply increases.

Domestic AI model competition enters a phase of differentiation.

The open-source release of DeepSeek V4 quickly triggered a new wave of rapid follow-up releases among China's AI model developers.

According to Goldman Sachs' summary, recent launches include Kimi K2.6, Alibaba Qwen3.6-Max, Tencent Hy3 Preview Edition, and Xiaomi V2.5, while MiniMax M3/Hailuo is expected to be released in May.

In Goldman Sachs' view, the factors that will determine pricing power among models come down to two dimensions:

Coding and task completion success rate, where Zhipu's GLM model ranks at the top in coding capability.
Multimodal capabilities, where ByteDance, Alibaba, and MiniMax have invested most heavily.

The research report points out that the strengths and weaknesses of two types of players are clear:

Independent AI players like MiniMax, characterized by high organizational efficiency and short decision-making chains, can still achieve a 40% gross margin even with extremely low basic text API pricing, according to Goldman Sachs' forecast data.
Internet giants such as ByteDance, Tencent, and Alibaba, with ample cash flow from their core businesses, are better placed to invest in AI infrastructure and cloud. To retain talent, however, they need separate incentive plans for their AI teams; ByteDance's Doubao team, for example, already has its own incentive program.

Notably, citing news reports, the Goldman Sachs report mentioned that Tencent and Alibaba are in talks to invest in DeepSeek at a valuation exceeding 20 billion US dollars. Meanwhile, the latest market valuations of Zhipu and MiniMax are approximately 53 billion US dollars and 31 billion US dollars, respectively. This potential deal reflects the logic of major players competing for scarce top-tier AI capabilities.

The primary investment thesis remains unchanged: cloud computing and data centers.

Goldman Sachs maintains its view that cloud computing and data centers are the preferred sub-sectors within China's internet industry, based on the following rationale:

The continuous growth in demand for AI tokens will drive an increase in cloud service procurement volumes.
The expansion of enterprise clients and AI agents is enhancing the pricing power for cloud services and AI tokens.
The ongoing penetration of consumer-grade AI assistants is contributing incremental demand.

In the enterprise (B2B) cloud market, Alibaba leads with the largest external AI cloud revenue; in the consumer (B2C) market, ByteDance currently operates the AI chatbot platform with the highest daily token consumption. The DAU of China's AIGC applications has maintained robust growth overall, with a month-over-month growth rate of 36% as of March 2026.

Regarding key recommended stocks, Goldman Sachs continues to emphasize four core holdings (GDS Holdings, 21Vianet, Alibaba, and Kingsoft Cloud) as the primary allocation for capturing the dividends of China's AI infrastructure expansion.

Additionally, the second tier includes e-commerce and mobility sectors, the third tier comprises AI model-related stocks, and the fourth tier covers gaming and entertainment sectors.

Editor/Lee

The translation is provided by third-party software.
