This article is for educational and informational purposes only and does not constitute any investment advice. Please cite the source if you wish to republish, and contact the IOSG team for authorization and reproduction guidelines. Mentions of projects in this article are not endorsements or investment recommendations.
Thanks to the feedback from Zhenyang@Upshot, Fran@Giza, Ashely@Neuronets, Matt@Valence, and Dylan@Pond.
This research aims to explore which areas of artificial intelligence are most important for developers, and which might be the next emerging opportunities in the Web3 and AI fields.
Before sharing new research insights, we are excited to announce our participation in RedPill’s $5 million Series A funding round and look forward to growing together with RedPill!
TL;DR
As the integration of Web3 and AI becomes a focal point in the cryptocurrency world, the construction of AI infrastructure is booming. However, there are still relatively few actual applications utilizing AI or built for AI, and issues of homogenization in AI infrastructure are gradually emerging. Our recent involvement in RedPill’s Series A funding round has led to some deeper insights.
- The main tools for building AI DApps include decentralized OpenAI access, GPU networks, inference networks, and agent networks.
- GPU networks are even more popular than during the "Bitcoin mining era" because the AI market is larger and growing rapidly and steadily; AI supports millions of applications daily; AI requires a variety of GPU models and server locations; the technology is more mature than before; and the customer base is broader.
- Inference networks and agent networks share similar infrastructure but focus on different aspects. Inference networks are mainly for experienced developers deploying their own models, and running non-LLM models does not necessarily require GPUs. Agent networks focus more on LLMs: developers do not need to bring their own models, but instead concentrate on prompt engineering and wiring different agents together. Agent networks always require high-performance GPUs.
- AI infrastructure projects hold tremendous promise and are continually introducing new features. Most native crypto projects are still in the testnet phase, with issues around stability, complexity, and limited functionality, and they need more time to prove their security and privacy.
- If AI DApps become a major trend, there are still many undeveloped areas: monitoring, RAG-related infrastructure, Web3-native models, decentralized agents with built-in crypto-native APIs and data, and evaluation networks.
- Vertical integration is a significant trend, with infrastructure projects aiming to provide one-stop services that simplify the work of AI DApp developers.
- The future will likely be hybrid, with some inference performed on the front end and some on-chain, balancing cost against verifiability.
Source: IOSG
Introduction
The integration of Web3 and AI is one of the most talked-about topics in the current cryptocurrency field. Talented developers are building AI infrastructure for the crypto world, aiming to bring intelligence to smart contracts. Building AI DApps is an extremely complex task: developers must manage data, models, computational power, operations, deployment, and integration with blockchain. In response to these needs, Web3 founders have developed several preliminary solutions, such as GPU networks, community data annotation, community-trained models, verifiable AI inference and training, and agent stores.

However, despite the flourishing infrastructure, there are relatively few applications actually utilizing AI or built for AI. Developers searching for AI DApp development tutorials find a scarcity of resources related to crypto-native AI infrastructure; most tutorials only cover how to call the OpenAI API on the front end.
Source: IOSG Ventures
Current applications have not fully leveraged the decentralized and verifiable features of blockchain, but this situation is expected to change soon. Most AI infrastructure focused on the crypto field has now launched testnets and plans to go live within the next six months.

This research provides a detailed overview of the major tools available in the AI infrastructure of the cryptocurrency space. Let's get ready for the GPT-3.5 moment of the crypto world!
1. RedPill: Providing Decentralized Access to OpenAI

RedPill, which we mentioned earlier, is a good place to start. OpenAI offers several world-class models, such as GPT-4-vision, GPT-4-turbo, and GPT-4o, which are ideal for building advanced AI DApps. Developers can integrate these models into their DApps by calling the OpenAI API through oracles or front-end interfaces.
RedPill consolidates different developers’ OpenAI API access into a single interface, providing global users with fast, cost-effective, and verifiable AI services, thereby democratizing access to top-tier AI models. RedPill’s routing algorithm directs developers’ requests to individual contributors. API requests are executed through its distribution network, circumventing potential limitations from OpenAI and addressing common issues faced by crypto developers, such as:
- TPM (tokens per minute) limits: new accounts have low token quotas, which may not meet the needs of popular, AI-dependent DApps.
- Access restrictions: some models impose access limits on new accounts or on certain countries.
By using the same request code but changing the hostname, developers can access OpenAI models at a lower cost, with high scalability and no restrictions.
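For instance, with the official OpenAI Python SDK, the switch can amount to pointing the client at a different base URL. Below is a minimal sketch; the hostname and key are placeholders, not RedPill's actual endpoint, so consult its documentation for the real values.

```python
# pip install openai
from openai import OpenAI

# Placeholder endpoint and key -- substitute the values from RedPill's docs.
client = OpenAI(
    base_url="https://api.redpill.example/v1",
    api_key="YOUR_REDPILL_KEY",
)

# The request body is identical to a direct OpenAI call;
# only the hostname (and the key) change.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain verifiable inference in one paragraph."}],
)
print(response.choices[0].message.content)
```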
2. GPU Network

In addition to using OpenAI's API, many developers choose to host models themselves. They can leverage decentralized GPU networks, such as io.net, Aethir, and Akash, to build GPU clusters and deploy a variety of powerful internal or open-source models.
These decentralized GPU networks provide flexible configurations, more server location options, and lower costs by utilizing the computational power of individuals or small data centers. This allows developers to conduct AI-related experiments within a limited budget. However, due to their decentralized nature, such GPU networks may still face limitations in functionality, availability, and data privacy.
In recent months, the demand for GPUs has surged, surpassing the previous Bitcoin mining boom. The reasons for this phenomenon include:
- Increased target audience: GPU networks now serve AI developers, who are numerous, more loyal, and unaffected by cryptocurrency price fluctuations.
- Versatility: decentralized GPUs offer a wider range of models and specifications than mining-specific hardware. Large models demand more VRAM, while smaller tasks can run on cheaper, better-suited GPUs. Decentralized GPUs can also sit closer to end-users, reducing latency.
- Technological maturity: GPU networks rely on advanced technologies such as high-speed blockchains like Solana for settlement, Docker virtualization, and Ray computing clusters.
- Investment returns: the AI market is expanding, with many opportunities for new applications and models. The expected return on H100 GPUs is 60-70%, whereas Bitcoin mining is more complex, with limited returns and a winner-takes-all dynamic.
- Industry adoption: Bitcoin mining companies such as Iris Energy, Core Scientific, and Bitdeer are beginning to support GPU networks for AI services and are actively purchasing AI-specific GPUs like the H100.
Recommendation: For Web2 developers who are less concerned with SLAs, io.net offers a simple, cost-effective experience and is a high-value option.
3. Inference Network

This is the core of crypto-native AI infrastructure, and it will support billions of AI inference operations in the future. Many AI Layer 1 or Layer 2 solutions provide developers with the ability to call AI inference natively on-chain. Market leaders include Ritual, Valence, and Fetch.ai.
These networks differ in the following aspects:
- Performance (latency and computation time)
- Supported models
- Verifiability
- Price (on-chain consumption costs, inference costs)
- Developer experience
3.1 Objectives

In an ideal scenario, developers should be able to easily access custom AI inference services from anywhere, with minimal barriers to integration.
Inference networks provide all the foundational support developers need, including on-demand proof generation and verification, inference computation, relay and validation of inference data, Web2 and Web3 interfaces, one-click model deployment, system monitoring, cross-chain operations, synchronous integration, and scheduled execution.
Source: IOSG Ventures
With these features, developers can seamlessly integrate inference services into their existing smart contracts. For example, a DeFi trading bot might use machine learning models to identify optimal trading opportunities for specific trading pairs and execute the corresponding strategies on the underlying trading platform.
In an ideal scenario, all infrastructure is cloud-hosted. Developers simply upload their trading strategy models in a universal format like Torch, and the inference network will store and provide the models for Web2 and Web3 queries.
Once the model deployment steps are completed, developers can directly invoke model inference via Web3 APIs or smart contracts. The inference network will continuously execute these trading strategies and provide the results back to the underlying smart contract. If the developer manages a large community fund, they will also need to validate the inference results. Once the inference results are received, the smart contract will execute trades based on those results.
Source: IOSG Ventures
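To make this ideal flow concrete, the sketch below exports a toy strategy model to TorchScript, the kind of portable artifact an inference network could store and serve. The export uses the real PyTorch API; the commented-out client, endpoint, and method names are hypothetical stand-ins for whatever SDK a given network actually exposes.

```python
import torch
import torch.nn as nn

# A toy trading-signal model: 8 market features in, long/flat/short scores out.
class SignalNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))

    def forward(self, x):
        return self.net(x)

# Export to TorchScript -- a self-contained format an inference network can host.
model = SignalNet().eval()
scripted = torch.jit.trace(model, torch.randn(1, 8))
scripted.save("signal_net.pt")

# Hypothetical deployment client -- names and endpoint are placeholders:
# client = InferenceClient("https://inference.example")   # placeholder SDK
# model_id = client.upload("signal_net.pt")               # one-click deployment
# result = client.infer(model_id, features)               # callable from Web2 or Web3
```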
3.1.1 Asynchronous and synchronous
Theoretically, asynchronous execution of inference operations can offer better performance; however, this approach might be less convenient for developers in terms of development experience.
When using an asynchronous method, developers need to first submit tasks to the inference network’s smart contract. Once the inference task is completed, the smart contract will return the results. In this programming model, the logic is divided into two parts: inference calling and inference result processing.
Source: IOSG Ventures
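The two-part split is easier to see in a runnable toy. In this sketch the inference network is simulated with a background thread, but the shape is the same as the on-chain version: the request is fired in one place, and the result is handled by a separate piece of logic whenever it arrives.

```python
import queue
import threading
import time

results = queue.Queue()

def inference_network(task_id: int, features: float) -> None:
    """Stand-in for the network: finishes the task later, out of band."""
    time.sleep(1.0)                       # simulated inference latency
    results.put((task_id, features * 2))  # dummy "inference result"

# Part 1: inference calling -- submit the task and return immediately.
def submit(task_id: int, features: float) -> None:
    threading.Thread(target=inference_network, args=(task_id, features)).start()

# Part 2: result processing -- a separate handler, decoupled from the caller.
def on_result(task_id: int, output: float) -> None:
    print(f"task {task_id}: acting on inference output {output}")

submit(1, 21.0)
# The caller's control flow has already moved on; the result arrives later.
task_id, output = results.get()
on_result(task_id, output)
```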
If developers have nested inference calls and extensive control logic, the situation can become even more complex.
Source: IOSG Ventures
The asynchronous programming model makes it difficult to integrate with existing smart contracts, requiring developers to write substantial additional code, handle errors, and manage dependencies.
Conversely, synchronous programming is more intuitive for developers but introduces issues related to response time and blockchain design. For instance, if input data includes rapidly changing information such as block time or prices, the data may become outdated by the time inference is complete. This could lead to situations where smart contract execution might need to be rolled back. For example, making a trade based on outdated price information could have significant consequences.
Source: IOSG Ventures
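One common mitigation is to timestamp inputs and refuse to act on results computed from stale data, mirroring an on-chain revert. A minimal sketch, where the freshness budget is an arbitrary assumption:

```python
import time

MAX_AGE_SECONDS = 15.0  # arbitrary freshness budget; tune per asset volatility

def execute_if_fresh(decision: str, quote_price: float, quote_timestamp: float) -> None:
    age = time.time() - quote_timestamp
    if age > MAX_AGE_SECONDS:
        # Off-chain mirror of an on-chain revert: never trade on outdated inputs.
        raise RuntimeError(f"quote is {age:.1f}s old; aborting trade")
    print(f"executing {decision} at {quote_price}")
```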
Most AI infrastructures use asynchronous processing, but Valence is working on addressing these issues.
3.2 Reality

In practice, many new inference networks are still in the testing phase, such as the Ritual network. According to their public documentation, these networks currently offer limited functionality, with features like verification and proof yet to be rolled out. They do not provide cloud infrastructure to support on-chain AI computation but instead offer a framework for self-hosting AI computation and transmitting the results to the blockchain.
This is an architecture for running AIGC NFTs. The diffusion model generates NFTs and uploads them to Arweave. The inference network then uses this Arweave address to mint the NFT on-chain.
Source: IOSG Ventures
This process is quite complex, requiring developers to deploy and maintain most of the infrastructure themselves, such as Ritual nodes with custom service logic, Stable Diffusion nodes, and NFT smart contracts.
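A rough sketch of the pipeline described above follows. The image-generation step uses the real diffusers API; the Arweave upload helper and the mint call are hypothetical placeholders, since the concrete interfaces depend on the node and contract setup.

```python
# pip install diffusers transformers torch
import torch
from diffusers import StableDiffusionPipeline

# 1. Generate the artwork with a diffusion model.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
image = pipe("an abstract on-chain landscape").images[0]
image.save("artwork.png")

# 2. Upload to Arweave -- client libraries vary; this helper is a placeholder.
# arweave_tx_id = upload_to_arweave("artwork.png")  # hypothetical helper

# 3. Mint on-chain, pointing the token URI at the Arweave address.
# nft.functions.mint(owner, f"ar://{arweave_tx_id}").transact({...})  # hypothetical
```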
Recommendation: Integrating and deploying custom models on inference networks is still quite complex, and most networks at this stage do not support verification. Applying AI on the front end remains the simpler option for developers. If verification functionality is crucial, the ZKML provider Giza is a good choice.
4. Agent Network

Agent networks allow users to easily customize agents. These networks consist of entities or smart contracts that can autonomously perform tasks, interact with each other, and engage with blockchain networks without direct human intervention. They are primarily focused on LLM technology. For example, they might provide a GPT-based chatbot with deep knowledge of Ethereum. However, the tools available to such chatbots are currently limited, and developers cannot yet build complex applications on this basis.
Source: IOSG Ventures
In the future, agent networks will provide more tools for agents, including not only knowledge but also the ability to call external APIs and perform specific tasks. Developers will be able to connect multiple agents to build workflows. For example, writing a Solidity smart contract may involve several specialized agents, including a protocol design agent, a Solidity development agent, a code security review agent, and a Solidity deployment agent.
Source: IOSG Ventures
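A minimal sketch of such a pipeline chains role-specialized agents by passing each agent's output as the next agent's input. The role prompts and model name here are illustrative assumptions, not any particular agent network's API:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# One system prompt per specialized agent in the workflow.
ROLES = [
    "You are a protocol designer. Write a short spec for the user's request.",
    "You are a Solidity developer. Implement the spec you are given.",
    "You are a security reviewer. Audit the code and apply fixes.",
]

def run_pipeline(task: str) -> str:
    """Chain the agents: each output becomes the next agent's input."""
    artifact = task
    for system_prompt in ROLES:
        reply = client.chat.completions.create(
            model="gpt-4o",  # illustrative model choice
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": artifact},
            ],
        )
        artifact = reply.choices[0].message.content
    return artifact

print(run_pipeline("An ERC-20 token with a 1% transfer fee"))
```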
In practice, this coordination is driven by prompts and scenario definitions. Examples of agent networks include Flock.ai, Myshell, and Theoriq.

Recommendation: Currently, most agents have relatively limited functionality. For specific use cases, Web2 agents can serve better and come with mature orchestration tools such as Langchain and Llamaindex.
5. The Difference Between Agent Networks and Inference Networks

Agent networks focus more on LLMs and provide tools such as Langchain to integrate multiple agents. Typically, developers do not need to develop machine learning models themselves; agent networks simplify model development and deployment. Developers only need to link together the necessary agents and tools, and end-users interact directly with those agents.
Inference networks, on the other hand, serve as the infrastructure support for agent networks. They offer developers lower-level access, and end-users typically don’t interact directly with inference networks. Developers need to deploy their own models, not limited to LLMs, and can access them via off-chain or on-chain endpoints.
Agent networks and inference networks are not entirely separate products. We are beginning to see some vertical integration products that offer both agent and inference capabilities due to their reliance on similar underlying infrastructure.
6. New Lands of Opportunity

In addition to model inference, training, and agent networks, there are several emerging areas in the Web3 space worth exploring:
- Datasets: Transforming blockchain data into machine-learning-ready datasets is crucial, since machine learning developers need more specific and specialized data. For example, Giza offers high-quality DeFi datasets for training. Ideal datasets should go beyond simple tabular formats and include graph data that captures interactions within the blockchain world (a minimal dataset-construction sketch follows this list). This area is still developing, but projects like Bagel and Sahara are addressing the gap by incentivizing the creation of new datasets while preserving privacy.
- Model storage: Storing, distributing, and version-controlling large models matters for on-chain machine learning performance and cost. Projects such as Filecoin, AR, and 0g are making progress in this field.
- Model training: Distributed and verifiable model training presents challenges. Projects like Gensyn, Bittensor, Flock, and Allora have made significant strides in addressing these issues.
- Monitoring: Since model inference occurs both on-chain and off-chain, new infrastructure is needed to help Web3 developers track model usage, detect potential issues, and adjust as needed. Effective monitoring tools help Web3 machine learning developers continuously improve model accuracy.
- RAG infrastructure: Distributed RAG requires a new infrastructure environment with heavy demands on storage, embedding computation, and vector databases, all while ensuring data privacy. This differs from current Web3 AI infrastructure, which often relies on third parties for RAG, such as Firstbatch and Bagel.
- Web3-customized models: Not all models suit Web3 contexts; models often need retraining to fit specific applications like price prediction and recommendation. As AI infrastructure grows, more Web3-native models are expected to emerge. For example, Pond is developing a blockchain GNN for applications including price prediction, recommendations, fraud detection, and anti-money laundering.
- Evaluation networks: Assessing agents without human feedback is challenging. As agent-creation tools proliferate, there will be countless agents on the market, and a system is needed to showcase their capabilities and help users decide which performs best in a given scenario. Neuronets is one participant in this field.
- Consensus mechanisms: PoS is not always the best choice for AI tasks; computational complexity, verification difficulty, and lack of determinism all pose challenges for it. Bittensor has developed a new intelligent consensus mechanism that rewards nodes for contributing to machine learning models and outputs.
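As a taste of the datasets point above, the sketch below turns raw transactions into a transfer graph, the kind of graph-structured data a blockchain GNN could consume. It uses web3.py and networkx; the RPC endpoint is a placeholder.

```python
# pip install web3 networkx
import networkx as nx
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://eth.example/rpc"))  # placeholder RPC endpoint

# Addresses become nodes, ETH transfers become weighted edges --
# a natural graph-structured input for a GNN.
G = nx.DiGraph()
latest = w3.eth.block_number
for n in range(latest - 10, latest + 1):
    block = w3.eth.get_block(n, full_transactions=True)
    for tx in block.transactions:
        if tx["to"] is not None:  # skip contract-creation transactions
            G.add_edge(
                tx["from"], tx["to"],
                value=float(w3.from_wei(tx["value"], "ether")),
            )

print(f"{G.number_of_nodes()} addresses, {G.number_of_edges()} transfers")
```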
7. Future Prospects

We are currently observing a trend toward vertical integration. By establishing a foundational computing layer, networks can support various machine learning tasks, including training, inference, and agent network services. This approach aims to provide Web3 machine learning developers with a comprehensive, one-stop solution.
Currently, on-chain inference, despite its high cost and slower speed, offers excellent verifiability and seamless integration with backend systems such as smart contracts. I believe that the future will move towards a hybrid approach. Some inference processes will be handled on the front end or off-chain, while critical, decision-making inference will be performed on-chain. This model has already been applied in mobile devices, where it leverages the inherent characteristics of these devices to quickly run smaller models locally and offload more complex tasks to the cloud, utilizing larger LLMs.
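This hybrid pattern can be sketched as a simple router: cheap, latency-sensitive queries stay on a small local model, while harder ones are offloaded to a larger hosted LLM. The length-based heuristic and model choices below are illustrative assumptions only.

```python
from openai import OpenAI
from transformers import pipeline

local_model = pipeline("text-generation", model="distilgpt2")  # small, runs on-device
remote = OpenAI()  # larger hosted model for the hard cases

def answer(prompt: str) -> str:
    # Naive routing heuristic (an assumption): short prompts stay local,
    # longer or more complex ones are offloaded to the larger model.
    if len(prompt) < 80:
        return local_model(prompt, max_new_tokens=50)[0]["generated_text"]
    reply = remote.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content

print(answer("gm"))  # short enough to be handled locally
```

The same cost-versus-verifiability trade-off applies on-chain: routine inference can run off-chain like the local branch above, while decisions that must be auditable go through the verifiable on-chain path.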