Source: Cyber
Cyber, EigenLayer, Sentient, and 14 other blockchain and artificial intelligence projects announced today the establishment of the Crypto AI Benchmark Alliance (CAIBA). This open-source, community-driven alliance will focus on establishing transparent and trustworthy evaluation standards for AI models and agents in the crypto industry.
The founding members—Alchemy, Cyber, Dune, EigenLayer, Goldsky, IOSG, LazAI, Magic Newton, Metis, MyShell, OpenGradient, RootData, Sentient, and Thirdweb—will collaborate to contribute datasets, tools, and expertise to build the evaluation framework together. Each benchmark will include tasks, reference answers, and scoring scripts, and will be released on platforms such as GitHub and Hugging Face under open licenses (where permitted).
As AI applications in crypto continue to expand, covering everything from trading strategies to research assistants, traditional AI benchmarks have struggled to reflect the industry's unique needs. CAIBA aims to close this gap by introducing evaluations designed specifically for crypto use cases.
"Transparent and rigorous testing is crucial," stated Ryan Li, Co-founder of Cyber. "Models must not only answer questions correctly but also execute reliably, giving users more confidence in their decisions."
The alliance's first deliverable, a Benchmark for Crypto AI Agents (CAIA), is now live. It evaluates AI capabilities along three main dimensions:
· Knowledge: Accurately answering questions about protocols, tokens, etc.
· Planning: Devising multi-step task plans
· Action: Completing operations using block explorers and APIs
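As a rough illustration of how tasks, reference answers, and a scoring script might fit together across these three dimensions, here is a minimal Python sketch. It is not CAIA's actual format or scoring method: the `Task` fields, the exact-match rule, and the sample tasks are all assumptions made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task record; CAIA's real schema may differ."""
    task_id: str
    dimension: str  # "knowledge", "planning", or "action"
    prompt: str
    reference: str  # reference answer used for scoring


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def score(tasks, predictions):
    """Return exact-match accuracy per dimension.

    A missing prediction counts as a wrong answer.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for task in tasks:
        total[task.dimension] += 1
        predicted = predictions.get(task.task_id, "")
        if normalize(predicted) == normalize(task.reference):
            correct[task.dimension] += 1
    return {dim: correct[dim] / total[dim] for dim in total}


# Toy tasks invented for illustration only.
TASKS = [
    Task("k1", "knowledge", "What is the ticker of Ethereum's native token?", "ETH"),
    Task("k2", "knowledge", "Which consensus mechanism does Ethereum use today?", "proof of stake"),
    Task("p1", "planning", "Outline the steps for a token swap.", "approve then swap"),
    Task("a1", "action", "Look up a placeholder value with an explorer.", "0"),
]

# Hypothetical model outputs keyed by task id; "a1" is left unanswered.
PREDICTIONS = {
    "k1": "eth",
    "k2": "proof of work",
    "p1": "Approve  then swap",
}

if __name__ == "__main__":
    for dimension, accuracy in score(TASKS, PREDICTIONS).items():
        print(f"{dimension}: {accuracy:.2f}")
```

Exact matching is the simplest possible rule and is used here only to keep the sketch short. Planning and action tasks in a real benchmark would likely need richer checks, such as validating each step of a plan or the resulting on-chain state.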
CAIA covers scenarios such as tokenomics, on-chain analysis, project research, and transaction processes. It evaluates both general-purpose large models, such as GPT-4o, Claude 4, Gemini 2.5, and DeepSeek-R1, and various crypto-native models.
By testing models on real-world tasks, CAIBA has established a unified, reproducible measurement standard for crypto AI, helping the industry build more trustworthy intelligent applications. The alliance is developing additional benchmarks and welcomes new members. Developers, researchers, and protocol teams can submit models for evaluation or propose entirely new tasks.
The Crypto AI Benchmark Alliance is a community-governed open alliance dedicated to developing AI benchmarking standards tailored to the crypto industry. Through open datasets, reproducible tasks, and public leaderboards, CAIBA gives developers, researchers, and protocols the tools to measure and improve AI systems in blockchain applications. For more information, please visit caiba.ai.
This article is contributed content and does not represent the views of BlockBeats.