
Research Report on AI Agent Economic Infrastructure (Part 2)

Analysis · Published 9 hours ago · Wyatt

Chapter 5 OpenClaw: Specialized Research on the Application Ecosystem

5.1 Project Background and Explosive Growth

In November 2025, Austrian developer Peter Steinberger posted a weekend project to GitHub. Four months later, in March 2026, this project surpassed React to become the software project with the most Stars in GitHub history—over 250,000 Stars, a number it took React 13 years to reach.

Amid the major trend of AI products evolving from passive tools to active Agents, the change OpenClaw made is that AI no longer waits for users to find it; instead, it proactively helps users on the platforms they already use. It resides on the user’s computer, simultaneously connecting to over 20 channels including WhatsApp, Telegram, Slack, Discord, Signal, iMessage, and Lark, and operating email, calendars, browsers, file systems, and code editors via the MCP protocol. Andrej Karpathy coined a term for such systems, “Claws”: locally hosted AI Agents that run in the background, capable of autonomous decision-making and task execution. The term quickly became the common parlance for locally hosted AI Agents in Silicon Valley.

Every major model release headlines Agent capabilities because Agents are the demand multiplier justifying AI infrastructure investment: one chat query consumes a few hundred tokens, while one Agent run with tool calls and multi-step reasoning consumes tens to hundreds of thousands of tokens.

Although the founder banned cryptocurrency discussions on Discord, the Crypto community spontaneously built a complete set of on-chain economic infrastructure on top of OpenClaw: token launches, identity registration, payment protocols, social networks, reputation systems, and more. OpenClaw’s explosive growth allowed us, for the first time, to observe the interaction between Agents and on-chain infrastructure in a real, large-scale scenario, and it gave the Crypto community a host with a real user base to attach economic activities to.

5.2 Technical Architecture Analysis

First Layer: Messaging Channels — The Identity Problem

OpenClaw connects to 20+ platforms simultaneously. From the Agent’s internal perspective, it knows it’s the same entity, with unified memory, unified configuration, and a unified SOUL.md. But from an external perspective, how do others know that the Agent on Telegram and the one on Discord are the same? Each platform has its own user ID system, which is not interoperable and cannot view behavioral records across platforms. This is precisely the core problem ERC-8004 attempts to solve.
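A minimal in-memory model shows what such a shared registry would do: bind each platform handle to a single agent identity so that anyone can verify the Telegram bot and the Discord bot are the same entity. This is an illustrative sketch only; the class and method names are assumptions, not the actual ERC-8004 contract interface.

```python
# Toy cross-platform identity registry. Names and structure are
# illustrative assumptions, not the real ERC-8004 interface.

class IdentityRegistry:
    def __init__(self):
        self._agents = {}    # agent_id -> metadata
        self._handles = {}   # (platform, handle) -> agent_id

    def register(self, agent_id, metadata):
        self._agents[agent_id] = metadata

    def bind_handle(self, agent_id, platform, handle):
        # Each platform handle resolves to exactly one agent identity.
        self._handles[(platform, handle)] = agent_id

    def resolve(self, platform, handle):
        return self._handles.get((platform, handle))

registry = IdentityRegistry()
registry.register("agent-42", {"owner": "alice"})
registry.bind_handle("agent-42", "telegram", "@alice_bot")
registry.bind_handle("agent-42", "discord", "alice_bot#0042")

# The same entity is now verifiable across platforms:
assert registry.resolve("telegram", "@alice_bot") == \
       registry.resolve("discord", "alice_bot#0042")
```

The point of putting this mapping on-chain rather than in any one platform’s database is that no single platform has to be trusted as the arbiter of identity.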

Second Layer: Gateway — The Security Problem

The Gateway is OpenClaw’s brain and dispatch center: routing user messages to the correct Agent, loading that Agent’s conversation history and available Skills, and delineating permission boundaries before the Agent starts thinking (whitelist mechanism: when a message arrives at the Gateway, the system dynamically generates a tool whitelist based on information such as the message source channel, user ID, group ID, etc. Only tools on the whitelist are injected into the Agent’s context. The Agent simply cannot see tools outside the whitelist, so it cannot call them).
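The whitelist mechanism described above can be sketched as follows. The rule structure, channel names, and tool names here are all hypothetical; the essential property is that tools outside the whitelist are never injected into the Agent’s context, so the model cannot call them.

```python
# Sketch of a Gateway-style dynamic tool whitelist.
# Rules, channels, and tool names are hypothetical.

RULES = [
    # (predicate over message metadata, tools granted)
    (lambda m: m["channel"] == "slack" and m["group"] == "eng",
     {"read_calendar", "search_code"}),
    (lambda m: m["channel"] == "whatsapp",
     {"read_calendar"}),
]

def build_whitelist(message):
    """Only tools on this whitelist are injected into the Agent's
    context; everything else is invisible to the model."""
    allowed = set()
    for predicate, tools in RULES:
        if predicate(message):
            allowed |= tools
    return allowed

msg = {"channel": "whatsapp", "group": None, "user": "u1"}
print(build_whitelist(msg))   # -> {'read_calendar'}
```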

This design is security-first by construction. However, permission control relies entirely on the Gateway, which makes it a single point of failure: if the Gateway is compromised or misconfigured, the Agent could gain permissions it shouldn’t have.

Third Layer: Agent Core (ReAct Loop) — The Predictability Problem

The Agent’s operational logic is the ReAct (Reasoning + Acting) loop: receive input → think (call LLM) → decide action → call tool → get result → think again → loop. Engineering optimizations made by OpenClaw include: high-frequency message scheduling (four strategies: Steer/Collect/Followup/Interrupt), LLM dual-layer fault tolerance (authentication rotation + model degradation), and an optional thinking tier mechanism (6 levels).
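The core loop can be sketched as follows, with `llm` as a toy stand-in for a real model call; the real message scheduler, fault tolerance, and thinking tiers are omitted.

```python
# Skeleton of the ReAct (Reasoning + Acting) loop described above.
# llm() is a toy stand-in for a real model call.

def llm(history):
    # Toy policy: call one tool, then answer once it has an observation.
    if any(step[0] == "observation" for step in history):
        return ("final", "done")
    return ("tool", ("lookup", "flight prices"))

TOOLS = {"lookup": lambda query: f"results for {query}"}

def react_loop(user_input, max_steps=5):
    history = [("input", user_input)]
    for _ in range(max_steps):
        kind, payload = llm(history)                   # think
        if kind == "final":                            # decide: answer
            return payload
        tool_name, arg = payload                       # decide: act
        observation = TOOLS[tool_name](arg)            # call tool
        history.append(("observation", observation))   # feed result back
    return "step budget exhausted"

print(react_loop("book a flight"))  # -> done
```

Note the `max_steps` cap: because the loop is driven by a non-deterministic model, a hard step budget is the simplest guard against runaway iteration.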

But LLMs are probabilistic by nature; their output is non-deterministic. Agents are non-deterministic executors performing irreversible actions in non-deterministic environments.

First is constraint loss due to context compression: security constraints are themselves part of the context, so when the context is lossily compressed, they may be discarded. Second is prompt injection: someone deliberately embeds hidden instructions in content the Agent will process, causing the Agent to treat that content as user commands and execute them. The common root of both is that Agent behavioral boundaries are defined in natural language, and natural language is ambiguous, manipulable, and subject to lossy compression.

An example is when Meta’s Superintelligence Lab alignment lead, Summer Yu, asked an Agent to “suggest some emails that could be deleted,” but the Agent directly deleted hundreds of emails (context window overflow triggered compression, and the key constraint “suggest” was lost).

In such cases, what we need is not better prompt engineering but structural security mechanisms: auditable operation logs, programmable permission boundaries, and an economic system for accountability and compensation when things go wrong. These are precisely what smart contracts and on-chain infrastructure excel at.

Fourth Layer: Memory System — The Persistence and Portability Problem

OpenClaw implements two types of memory: daily working memory (YYYY-MM-DD.md files) and long-term essential memory (MEMORY.md, deduplicated, categorized, and refined key preferences). Retrieval uses a hybrid mode of vector search + BM25.
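One common way to combine the two retrieval signals is a weighted mix of the dense similarity score and a normalized BM25 score. This is a generic sketch of hybrid scoring, not OpenClaw’s actual ranking formula; the scores and weights are toy values.

```python
# Generic hybrid retrieval score: dense (vector) similarity mixed with
# a squashed BM25 score. Illustrative only.

def hybrid_score(vec_score, bm25_score, alpha=0.5):
    # BM25 is unbounded; squash it into [0, 1) before mixing.
    bm25_norm = bm25_score / (1.0 + bm25_score)
    return alpha * vec_score + (1 - alpha) * bm25_norm

# (vector similarity, raw BM25) per candidate memory chunk
candidates = {
    "MEMORY.md#prefs": (0.82, 4.1),
    "2026-03-01.md": (0.40, 7.3),
}
ranked = sorted(candidates,
                key=lambda k: hybrid_score(*candidates[k]),
                reverse=True)
print(ranked[0])  # -> MEMORY.md#prefs
```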

Sessions are reset by default at 4 AM daily. The context window is constantly compressed and summarized. When the context approaches the token limit, OpenClaw’s approach is to trigger session compression, using an LLM to summarize previous conversations into a shorter version. Before compression, a Memory Flush is executed, giving the Agent one chance to write key information into persistent memory. This essentially bets that the Agent itself knows what information is crucial. A non-deterministic system judging what is key information is itself non-deterministic.
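The compaction flow described above can be sketched like this. The structure is an assumption based on the description; `summarize` and `flush_to_memory` stand in for real LLM calls, and the token budget is a toy number.

```python
# Sketch of compaction with a Memory Flush: when the context nears the
# token budget, persist key facts first, then replace old turns with a
# summary. summarize()/flush_to_memory() stand in for real LLM calls.

TOKEN_LIMIT = 100

def tokens(messages):
    return sum(len(m.split()) for m in messages)

def summarize(messages):
    return f"[summary of {len(messages)} earlier messages]"

def flush_to_memory(messages, memory_file):
    # One chance to persist key facts before they are compressed away.
    memory_file.append(f"flushed {len(messages)} messages")

def maybe_compress(messages, memory_file, threshold=0.8):
    if tokens(messages) < threshold * TOKEN_LIMIT:
        return messages
    old, recent = messages[:-2], messages[-2:]
    flush_to_memory(old, memory_file)    # Memory Flush first
    return [summarize(old)] + recent     # then lossy compression

memory = []
history = ["msg " + "word " * 20 for _ in range(6)]  # over budget
history = maybe_compress(history, memory)
print(len(history), memory)  # -> 3 ['flushed 4 messages']
```

The sketch makes the bet explicit: whatever `flush_to_memory` fails to capture is gone after compression, which is exactly the non-determinism the text describes.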

All OpenClaw memory exists on the local file system; it’s gone if you change computers. There’s no shared memory mechanism when collaborating with other Agents. An Agent’s knowledge and experience are locked to the machine it runs on. Sub-Agent collaboration is limited to within the same OpenClaw instance. Once cross-instance, cross-organization Agent collaboration is involved, the system is powerless. Feedback from developers on GitHub: decision records are in chat history but not persisted as artifacts, handovers are ambiguous, and knowledge transfer is incomplete.

5.3 Structural Issues in the Agent Economy

Non-Flowing Context: The Root of All Problems

  • Spatial Lock-in: An Agent’s memory and knowledge reside on the machine running it; gone with a computer change.
  • Trust Isolation: Agent A claims “the user stated preference X last week,” but Agent B has no way to verify its truth.
  • No Discovery: Want to find an “Agent skilled in DeFi analysis”? No standardized discovery mechanism.
  • Unpriced Value: The domain knowledge and user preferences accumulated by an Agent clearly have economic value, but currently, there’s no way to price or trade them.
  • Default Temporariness: Context can be compressed, summarized, or lost at session reset at any time.

For context to truly flow, it needs to simultaneously possess five attributes: able to cross trust boundaries, have economic properties, be discoverable without a gatekeeper, retain decision traces, and adapt to consumer needs. Currently, no single protocol provides all five. MCP solves “how AI models call tools.” A2A solves “how Agents talk to Agents.” x402 solves “how Agents pay.” But “how Agents autonomously discover, evaluate, and use context data in untrusted environments” remains unanswered.

The Coordination Paradox

An Agent only needs sufficient context to reason. But cross-organization coordination requires all historical context.

An Agent thinking “should I book this flight?” needs only the streamlined information from the current session. But when it needs to coordinate with a supply chain Agent, a finance Agent, and a calendar Agent (potentially on different platforms, operated by different organizations): what context do they share? How is it verified? Who owns it?

Gartner predicts that by 2027, over 40% of Agentic AI projects will be canceled due to escalating costs, unclear business value, or insufficient risk control. But 70% of developers report the core issue is integration problems with existing systems. The fundamental reason is that Agents are non-deterministic executors, while enterprises demand deterministic outcomes. An uncertain executor collaborating with uncertain partners in an uncertain environment, without a verifiable trust layer, cannot produce reliable output.

Currently, the demand for cross-platform Agent collaboration is minimal. Users just want an AI that can help them get things done; they don’t care if it can collaborate with other Agents. The coordination paradox is a real technical problem, but whether it evolves into a large-scale commercial problem depends on whether Agent usage evolves from personal tools to multi-Agent collaborative networks.

Combining the above analysis yields an architectural concept:

The bottom layer is where Agents perform reasoning—ephemeral, token-bound. OpenClaw, Claude Code, and Cursor operate here. This layer needs fast response and focuses on the current task.

The upper layer is where coordination occurs—persistent, verifiable, economically priced. Cross-organizational knowledge accumulates here, provenance chains are maintained here, reputation operates here.

The two layers have different needs: Agents need simplicity; organizations need historical records. Agents need speed; audit trails need permanence. Agents operate probabilistically; enterprises need deterministic results. Most current architectures attempt to merge the two layers, which cannot succeed.

So, can we add a modular add-on, horizontally deployable without permission, applicable to all Agent systems—with credible neutrality, persistence, and verifiability? This component provides a controlled interface between the upper and lower layers, allowing context to flow down when needed and commitments to flow up when made. Before execution, parse and inject the relevant context subgraph from a decentralized knowledge graph; after execution, submit the operation as a verifiable transaction on-chain, with provenance and reputation updates. The core assumption of this layer is also that context liquidity has value: if most Agent users don’t need cross-platform collaboration (e.g., one person uses only one OpenClaw for everything), then the middle layer has no real demand.

A middle layer focusing only on context portability will likely fail. But if it focuses on use cases with clear economic incentives, such as verifiability of economic activities and portability of reputation in multi-party, mutually distrustful scenarios, the probability of success is much higher. IronClaw is also an attempt moving towards an abstract middle layer—separating the execution environment and credential management into a verifiable security layer. But it remains a solution within the Near ecosystem, lacking cross-platform universality.

Crypto’s Real Entry Point

Most Agent economy needs can actually be solved with Web2 solutions. Crypto’s irreplaceability in the Agent economy exists only in one scenario: when you need cross-organization, cross-platform, permissionless interoperability, and there is no pre-established trust relationship between participants. For example: Agent A (running on OpenClaw, owned by User A) needs to hire Agent B (running on Claude Code, owned by User B) to complete a task. They share no common platform, no common account system, no pre-existing business relationship. In this scenario, on-chain identity (8004), on-chain payment (x402), and on-chain reputation are indeed more suitable than any centralized solution—because no single centralized platform can cover all Agent frameworks.

Furthermore, an Agent being able to pay doesn’t mean it should pay. Fortune 500 companies have lost $400 million to Agents paying repeatedly in retry loops. Once Agents gain autonomous payment capability, the most valuable infrastructure is whatever helps an Agent decide whether it should make a payment.

Currently, Crypto is “nice to have” for the Agent economy, unless cross-platform economic interactions between Agents reach sufficient scale. But when enough Agents are no longer tied to a specific human’s bank account (when Agents themselves become independent economic entities rather than human tools), traditional financial rails can no longer cover them. At that point, stablecoins are the best (and arguably the only) way for their large-scale fund transactions. Three possible triggers for it becoming a must-have:

  1. Agents start hiring other Agents at scale: e.g., different vendor Agent systems need to interoperate in enterprise IT environments (similar to today’s enterprise API integration, but more complex).
  2. Agents start 24/7 cross-border transactions: An Agent-orchestrated workflow might simultaneously call a US LLM endpoint, a European data provider, and a Southeast Asian compute cluster. It shouldn’t need three different payment rails. Stablecoins are global and 24/7. This advantage is more pronounced for Agents’ always-on, cross-timezone scenarios than for humans.
  3. Micropayments reach a frequency traditional rails cannot handle: Currently, microtransactions made by Agents on-chain (API calls, data queries, compute resources) average only $0.09 per transaction, while Stripe’s fees alone are $0.35 + 2.5%, roughly four times the transaction value itself. When an Agent needs to call APIs tens of thousands of times, traditional payment processors cannot underwrite this type of merchant risk, and the fee structure becomes a real bottleneck.
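The fee arithmetic behind point 3, using the figures quoted above:

```python
# Stripe's standard fee (flat $0.35 + 2.5%) applied to the quoted
# average Agent microtransaction of $0.09.
avg_tx = 0.09
stripe_fee = 0.35 + 0.025 * avg_tx
ratio = stripe_fee / avg_tx
print(f"fee ${stripe_fee:.3f} is {ratio:.1f}x the ${avg_tx} transaction")
# -> fee $0.352 is 3.9x the $0.09 transaction
```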

Security Threats and the Necessity of On-Chain Infrastructure

The “Siri Paradox” is the key framework for understanding the entire Agent track: Siri is safe because it is crippled; OpenClaw is useful because it is dangerous. For AI to truly get things done (handle emails, book flights, deploy code), it must have broad system permissions, and broad permissions naturally mean a larger attack surface.

The most famous positive case on OpenClaw is: a user asked an Agent to book a restaurant, but OpenTable had no availability. The Agent didn’t give up; it found AI voice software, downloaded and installed it, called the restaurant, and successfully made a reservation. This kind of autonomous problem-solving ability is what people dream of. But the same autonomy means that if judgment fails, consequences spread at machine speed.

Some call Steinberger joining OpenAI the “iPhone moment for AI Agents.” But before that, there must be a phase where security infrastructure matures; otherwise, mass adoption means mass loss. If Chopping Block’s predicted “AI-generated $100M+ hacks” actually occur, there are two possible directions: either public panic causes a regression in Agent adoption (similar to Ethereum’s low after the 2016 DAO incident), or it spawns real Agent security infrastructure (similar to the explosion of the smart contract auditing industry after the same incident). We lean toward the latter, because the demand for Agents is real:

  • Malicious Agent Identification >> 8004 Reputation System. If every Agent has an on-chain identity and public reputation record, malicious behavior leaves an immutable record. Other Agents can check on-chain reputation before trusting. Of course, the reputation system needs to be mature enough—not simple scoring, but a multi-dimensional, time-weighted, anti-sybil trust model.
  • Malicious Skills Audit >> Validation Registry. If the code audit results for Skills are recorded in 8004’s Validation Registry—audited by independent validators (staked services, zkML validators, TEE oracles)—the effectiveness of typosquatting is greatly reduced. Just check the on-chain verification status before installing a Skill.
  • Credential Leakage >> x402’s “Payment as Authorization.” x402 eliminates the API Key management problem. Agents don’t need to store long-term credentials—they directly pay for temporary access rights each time a service is needed. Combined with EIP-712 signature binding (binding service usage rights to the payment address), even if a token is leaked, it cannot be used by others.
  • Behavioral Loss of Control >> On-Chain Audit Logs + Programmable Permissions. Whether it’s an external attacker injecting instructions (prompt injection) or the system itself losing constraints during compression (context loss), the result is the Agent performing actions beyond expectations. Smart contracts can define an Agent’s behavioral boundaries—e.g., “single transaction not exceeding X amount,” “delete operations require multi-signature confirmation.” On-chain operation logs are immutable, allowing traceability when problems occur. This is far more reliable than adding “please seek consent first” in the prompt, because prompt-level constraints can be lost during compression, but smart contract-level constraints cannot.
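A toy model makes the last bullet concrete: contract-level behavioral boundaries hold regardless of what the prompt says, because they sit outside the compressible context. The thresholds and action schema below are illustrative, not any particular standard.

```python
# Toy model of contract-enforced behavioral boundaries:
# a hard per-transaction cap and a multi-signature requirement for
# destructive operations. Thresholds are illustrative.

TX_CAP = 50.0            # "single transaction not exceeding X amount"
REQUIRED_SIGNERS = 2     # "delete operations require multi-sig"

def authorize(action):
    if action["type"] == "pay":
        # The cap holds no matter what the prompt says; unlike a
        # natural-language constraint, it cannot be compressed away.
        return action["amount"] <= TX_CAP
    if action["type"] == "delete":
        return len(set(action.get("signatures", []))) >= REQUIRED_SIGNERS
    return False  # default deny for unknown action types

assert authorize({"type": "pay", "amount": 49.0})
assert not authorize({"type": "pay", "amount": 500.0})
assert not authorize({"type": "delete", "signatures": ["alice"]})
assert authorize({"type": "delete", "signatures": ["alice", "bob"]})
```

Note the default-deny fallthrough: an injected instruction inventing a new action type is rejected rather than silently allowed.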

Of course, on-chain infrastructure can only mitigate the consequences of security problems, not prevent them. Smart contracts can limit “single transaction not exceeding X amount,” but what if an Agent, after being injected, continuously does bad things within the limit? 10,000 malicious transactions at $0.09 each is still $900. Truly solving security requires a two-pronged approach at both the Agent runtime layer (TEE/sandbox) and the on-chain layer (permissions/audit). Doing only the on-chain layer is insufficient.

Chapter 6 Comprehensive Industry Analysis

Traditional technology moats (engineering capability, team size, execution efficiency) are being homogenized by AI tools. Anyone with an idea can implement a product prototype in an extremely short time using OpenClaw or Claude Code. This means:

  • The window of opportunity for small teams is shorter than ever (large teams using the same tools will catch up faster).
  • The value of first-mover advantage at the idea level is higher than ever because your Agent can iterate faster than any competitor.
  • The scarcest resource is not technical ability but the judgment to identify the right problems.

The Real Competition is Not Within Crypto

Many compare which L1/L2 does Agents better—Base vs Solana vs Ethereum vs Near. But the real competition is between Crypto solutions and Web2 solutions.

For example, Sapiom raised $15.75M, pursuing a Web2 route for Agent service access management. In an extreme scenario, if Sapiom’s solution is good enough—Agents get access to all Web2 services through it without needing to touch on-chain
