
AI’s Cost Dilemma: How Infrastructure Economics Will Reshape the Next Phase of the Market


Original Author: Anastasia Matveeva 

Compiled by: Gonka.ai


AI is expanding at a staggering pace, but its underlying economic logic is far more fragile than it appears on the surface. When three cloud giants control two-thirds of the world’s computing power, when training costs approach $1 billion, and when inference bills catch startups off guard—the true cost of this computing arms race is quietly reshaping the entire AI industry’s value distribution.

This article is not about who will build the most advanced model. It explores a more fundamental question: Is the current economic model of AI infrastructure truly sustainable at scale? And how will the transformation of computing power allocation mechanisms reshape the entire market’s value distribution?

1. The Hidden Cost of Intelligence

Training a cutting-edge large model often requires tens or even hundreds of millions of dollars. Anthropic has publicly stated that training Claude 3.5 Sonnet cost “tens of millions of dollars,” and its CEO Dario Amodei previously estimated that the training cost of the next-generation model could approach $1 billion. According to industry media reports, the training cost of GPT-4 may have already exceeded $100 million.

However, training costs are just the tip of the iceberg. The real structural pressure comes from inference costs—the fees incurred each time a model is called. According to OpenAI’s public API pricing, inference is billed per million tokens. For high-usage applications, this means that even before scaling, daily inference costs could already amount to thousands of dollars.
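To make the order of magnitude concrete, here is a minimal back-of-the-envelope sketch. The per-million-token prices and traffic figures below are illustrative assumptions, not any provider’s actual list prices.

```python
# Back-of-the-envelope inference cost estimate. All figures below are
# illustrative assumptions, not any provider's actual list prices.

PRICE_PER_M_INPUT_TOKENS = 2.50    # USD per 1M input tokens (assumed)
PRICE_PER_M_OUTPUT_TOKENS = 10.00  # USD per 1M output tokens (assumed)

requests_per_day = 500_000         # hypothetical mid-sized consumer app
input_tokens_per_request = 1_200   # prompt plus retrieved context
output_tokens_per_request = 400    # model response

daily_input_m = requests_per_day * input_tokens_per_request / 1e6
daily_output_m = requests_per_day * output_tokens_per_request / 1e6

daily_cost = (daily_input_m * PRICE_PER_M_INPUT_TOKENS
              + daily_output_m * PRICE_PER_M_OUTPUT_TOKENS)

print(f"Estimated cost: ${daily_cost:,.0f}/day, ~${daily_cost * 30:,.0f}/month")
# With these assumptions: roughly $3,500/day, i.e. over $100,000/month,
# before the application scales any further.
```

Under these assumed figures, a single mid-sized application is already paying thousands of dollars per day for inference, and the bill grows linearly with traffic.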

AI is often described as software. But its economic essence is increasingly resembling capital-intensive infrastructure—with both high upfront investment and continuous operational expenses.

This shift in economic structure is quietly altering the competitive landscape of the entire AI industry. The players who can afford the computing power are the giants that have already built large-scale infrastructure, while the startups trying to survive at the margins are slowly being ground down by inference bills.

2. Capital Intensity and Market Concentration

According to Holori’s 2026 Cloud Market Analysis, AWS currently holds about 33% of the global cloud market share, Microsoft Azure about 22%, and Google Cloud about 11%. Together, these three control about two-thirds of the global cloud infrastructure share, and the vast majority of the world’s AI workloads run on their infrastructure.

The practical implication of this concentration is: when OpenAI’s API goes down, thousands of products are affected simultaneously; when a major cloud service provider experiences an outage, services across industries and regions are disrupted.

Concentration is not easing, and infrastructure spending continues to expand. NVIDIA, for example, has seen annualized revenue from its data center business exceed $80 billion, a sign of sustained, strong demand for high-performance GPUs.

More noteworthy is an implicit structural inequality. According to SEC filings and market reports, leading labs like OpenAI and Anthropic secure GPU resources at near-cost prices as low as $1.30–$1.90 per hour through multi-billion-dollar “equity-for-compute” agreements. Meanwhile, small and medium-sized companies lacking strategic partnerships with NVIDIA, Microsoft, or Amazon are forced to buy at retail prices exceeding $14 per hour—a premium of over 600%.
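A quick calculation puts that spread in perspective. The hourly rates are the figures cited above; the assumption of continuous 24/7 utilization is added here purely for illustration.

```python
# Rough comparison of the GPU pricing spread described above. The hourly
# rates are the figures cited in the text; continuous 24/7 utilization is
# an added assumption for illustration only.

negotiated_rate = 1.90   # USD per GPU-hour (upper end of the cited range)
retail_rate = 14.00      # USD per GPU-hour (cited retail price)

premium = (retail_rate - negotiated_rate) / negotiated_rate
annual_gap_per_gpu = (retail_rate - negotiated_rate) * 24 * 365

print(f"Premium over the negotiated rate: {premium:.0%}")               # ~637%
print(f"Extra cost per GPU per year:      ${annual_gap_per_gpu:,.0f}")  # ~$106,000
# Across a cluster of just 1,000 GPUs, the retail buyer pays on the order
# of $100 million more per year for the same hardware.
```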

This pricing chasm is driven by NVIDIA’s recent strategic investments totaling $40 billion in leading labs. Access to AI infrastructure is increasingly determined by capital-intensive procurement agreements, not open market competition.

In the early adoption phase, this concentration can appear “efficient.” But at scale, it brings pricing risk, supply bottlenecks, and infrastructure dependency—a triple vulnerability.

3. The Overlooked Energy Dimension

The cost issue of AI infrastructure has another often-overlooked dimension: energy.

According to data from the International Energy Agency (IEA), data centers currently account for about 1–1.5% of global electricity consumption, and AI-driven demand growth could significantly increase this proportion in the coming years.

This means compute economics is not just a financial issue, but also an infrastructure and energy challenge. As AI workloads continue to expand, the geopolitical significance of power supply will become increasingly prominent—the country that can provide the most stable computing power at the lowest energy cost will hold a structural advantage in the industrial competition of the AI era.

When Jensen Huang announced at GTC26 that NVIDIA’s order visibility had surpassed $1 trillion, he was describing not just the commercial success of one company, but the grand process of civilization converting electricity, land, and scarce minerals into intelligent computing power.

4. Rethinking Infrastructure Mechanisms

While centralized data centers continue to expand, another type of exploration is quietly emerging—attempting to fundamentally redefine how computing resources are coordinated.

Decentralized Inference: A Structural Alternative

"(《世界人权宣言》) Gonka protocol is a representative practice in this direction. It is a decentralized network designed specifically for AI inference, with the core design goal of minimizing network synchronization and consensus overhead, and directing as much computing resources as possible to real AI workloads.

At the governance level, Gonka adopts the principle of “one compute unit, one vote”—governance weight is determined by verifiable compute contribution, not capital shareholding. At the technical level, the protocol uses short-cycle performance measurement intervals (called Sprints), requiring participants to demonstrate real GPU computing power in real-time through a Transformer-based Proof-of-Work (PoW) mechanism.

The significance of this design is that nearly 100% of the network’s computing power is directed towards AI inference workloads themselves, rather than being consumed on infrastructure overhead like maintaining consensus and coordinating communication.
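As a purely conceptual illustration of the “one compute unit, one vote” idea, the sketch below derives governance weights from verified per-sprint contributions. The data structures and function names are assumptions and do not reflect Gonka’s actual implementation.

```python
# Conceptual sketch of compute-weighted governance over sprint intervals.
# This is NOT Gonka's actual implementation; the data layout and the way
# verified work is aggregated here are illustrative assumptions only.

from collections import defaultdict

def governance_weights(sprint_results):
    """sprint_results: iterable of (participant_id, verified_compute_units),
    where verified_compute_units is the amount of work the participant
    proved during the sprint (e.g. via a proof-of-work check)."""
    totals = defaultdict(float)
    for participant, units in sprint_results:
        totals[participant] += units

    total = sum(totals.values())
    # "One compute unit, one vote": weight is proportional to verified
    # contribution, independent of how much capital a participant holds.
    return {p: units / total for p, units in totals.items()}

# Example sprint with three participants of different verified contributions.
print(governance_weights([("node_a", 400.0), ("node_b", 250.0), ("node_c", 350.0)]))
# {'node_a': 0.4, 'node_b': 0.25, 'node_c': 0.35}
```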

The Economic Logic of Distributed Computing Power

From an economic perspective, the value proposition of decentralized compute networks has three layers.

The first is the cost layer. The pricing structure of centralized cloud service providers inherently includes massive fixed-asset depreciation, data center operating costs, and shareholder profit expectations. Decentralized networks can significantly compress these costs by monetizing idle GPU resources. Taking Gonka as an example, inference currently served through its USD-denominated billing gateway GonkaGate is priced at approximately $0.0009 per million tokens, while centralized providers such as Together AI price comparable models (e.g., DeepSeek-R1) at around $1.50 per million tokens—a difference of more than a thousand times.

The second is the supply elasticity layer. The compute supply of centralized service providers is rigid, with expansion cycles measured in months or even quarters. Participants in decentralized networks can elastically join or exit with demand fluctuations, theoretically responding more quickly to demand peaks—just as Amazon Web Services was born from holiday traffic peak demands, the peaks and valleys of AI inference similarly require elastic infrastructure to handle.

The third is the sovereignty layer. This dimension is particularly prominent from the perspective of sovereign nations. When a country’s public services deeply depend on an external cloud service provider, compute dependency becomes a strategic vulnerability. Decentralized networks offer a possibility: local data centers can join the global distributed network as nodes, ensuring data sovereignty while obtaining sustainable commercial returns by providing computing power to the global market.

5. A Moment of Value Distribution Restructuring

Returning to the core question at the beginning of the article: Is the current economic model of AI infrastructure sustainable at scale?

The answer is: For the top players, sustainable; for everyone else, increasingly unsustainable.

AWS, Azure, and Google Cloud have built moats through decades of capital accumulation, and their scale advantages are almost unshakable in the short term. But this structural advantage also means that pricing power, data access, and infrastructure dependency are highly concentrated in the hands of a few private entities.

Historically, every major monopoly in technological infrastructure has ultimately spawned alternative distributed architectures—the internet itself was a rebellion against telecom monopolies, BitTorrent disrupted centralized content distribution, and Bitcoin challenged centralized currency issuance.

The decentralization of AI infrastructure may not be an ideological choice, but an economic inevitability—when the cost of centralization becomes high enough to drive large-scale user migration, the demand for alternatives will truly explode. Jensen Huang used the analogy “every financial crisis pushes more people towards Bitcoin” to describe this logic, which also applies to the compute market.

The emergence of DeepSeek has already proven one thing: in a world where open-source models’ capabilities approach closed-source frontiers, inference cost will become the core variable determining the speed of AI application scaling. Whoever can provide the lowest-cost, highest-availability inference computing power holds the entry ticket to this competition.

Conclusion: The Infrastructure War Has Just Begun

The next phase of AI competition will not be decided on model capability leaderboards, but in the economic game of infrastructure.

Centralized compute giants hold capital and scale advantages, but also bear the burden of fixed cost structures and pricing pressure. Decentralized networks are entering the market with extremely low marginal costs, but need to prove they can reach real commercial thresholds in stability, ease of use, and ecosystem scale.

The two paths will coexist long-term and pressure each other. The tension between centralization and decentralization will be one of the most important structural themes to track in the AI industry over the next five years.

This infrastructure war has just begun.

About the Author

Anastasia Matveeva is a Senior Product Manager and Researcher at Product Science, and also a co-founder of the Gonka protocol. Her research focuses on machine learning infrastructure, large language model inference, and distributed computing systems.

She holds a Ph.D. in Mathematics from the Universitat Politècnica de Catalunya (UPC Barcelona), where she also served as a researcher and lecturer. Since joining Product Science in 2021, she has led the development of a suite of AI engineering tools, now adopted by over a hundred engineers and used in several Fortune 500 companies.

About Gonka.ai

Gonka is a decentralized network designed to provide efficient AI computing power, with the goal of maximizing the utilization of global GPU resources for meaningful AI workloads. By eliminating centralized gatekeepers, Gonka provides developers and researchers with permissionless access to compute resources, while rewarding all participants through its native token, GNK.

Gonka is incubated by the US AI developer Product Science Inc. The company was founded by the Liberman siblings, Web 2 industry veterans and former core product directors at Snap Inc., and raised $18 million in 2023 and a further $51 million in 2025. Investors include OpenAI backer Coatue Management, Solana backer Slow Ventures, Bitfury, K5, and Insight and Benchmark partners, among others. Early contributors to the project include notable leaders in the Web 2-Web 3 space such as 6 blocks, Hard Yaka, and Gcore.

Website | Github | X | Discord | Telegram | Whitepaper | Tokenomics | User Guide
