420 Million! Yuntian Lifei Wins the Zhanjiang AI Inference Thousand-Card Cluster Project
(Source: Yuntian Lifei)
Recently, Yuntian Lifei won the bid for the Zhanjiang City AI Penetration Support New Quality Productivity Infrastructure Construction Project. According to the project plan, the company will build an AI inference computing cluster based on its self-developed domestic AI inference acceleration cards, and promote the adaptation and deployment of domestically produced large models like DeepSeek in relevant application scenarios, providing computing infrastructure support for government and industrial digitalization applications.
Building Inference Computing Infrastructure for Large Model Applications
The AI inference computing cluster constructed in this project will be systematically designed around the requirements of large model inference tasks.
During large model inference, different computational stages have varying system resource needs. The industry commonly adopts a “Prefill–Decode separation” inference architecture, optimizing resource allocation for different stages to improve overall system efficiency.
Under this architecture, the Prefill stage mainly handles long-context understanding and computation, requiring high computing power and bandwidth; the Decode stage continuously generates tokens and is more sensitive to system latency. During the project construction, resource allocation and system optimization will be tailored to the characteristics of each stage.
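To make the division of labor concrete, here is a minimal, purely illustrative sketch of the two stages. The function names and the toy "model" are invented for this example and bear no relation to Yuntian Lifei's actual inference stack:

```python
# Purely illustrative sketch of the "Prefill-Decode separation" idea.
# The toy "model" below is invented for this example only.

def prefill(prompt_tokens):
    """Prefill: process the whole prompt at once (compute/bandwidth-heavy).

    In a real system this is one large batched pass; here we just build
    the per-token key/value state that decode will reuse.
    """
    return [("kv", tok) for tok in prompt_tokens]

def decode(kv_cache, steps):
    """Decode: generate tokens one at a time (latency-sensitive)."""
    out = []
    for _ in range(steps):
        nxt = len(kv_cache)            # stand-in for the model's next-token choice
        out.append(nxt)
        kv_cache.append(("kv", nxt))   # each new token extends the cache
    return out

cache = prefill([10, 11, 12])
print(decode(cache, 3))  # -> [3, 4, 5]
```

The payoff of separating the stages is scheduling freedom: prefill work can be placed on compute- and bandwidth-rich resources, while decode is placed and batched to minimize per-token latency.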
At the same time, as the context length of models increases, a large number of intermediate states need to be stored in KV Cache. In response to this, the system design will optimize the coordination between computation, storage, and network to improve data access efficiency and overall system performance.
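To illustrate why KV Cache pressure grows with context length, the following back-of-envelope estimate computes the cache size for a single long sequence. The model dimensions are hypothetical and not the project's actual configuration:

```python
# Rough KV-cache size estimate for one sequence.
# All model dimensions below are hypothetical examples.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # 2x for keys and values, stored per layer, per head, per position
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# e.g. a hypothetical 32-layer model, 8 KV heads, head_dim 128, fp16 (2 bytes):
gb = kv_cache_bytes(32, 8, 128, 128_000) / 1024**3
print(f"{gb:.1f} GiB for a 128k-token context")  # -> 15.6 GiB
```

Because this state must be read on every decode step, coordinating compute, storage, and network around it (as the system design described above does) directly determines data access efficiency.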
Regarding network architecture, the system will adopt a unified high-speed interconnection architecture, building the cluster’s physical layer network with 400G optical networks to achieve high bandwidth and low latency communication between nodes. It will support scaling from dozens of cards in a single node to thousands of cards in a cluster, meeting the needs of AI applications of different scales.
Once the project is fully completed, it will form a computing infrastructure for large model inference tasks, providing stable computing support for related application scenarios.
Continuously Advancing AI Inference Chip and Computing System Technology R&D
According to the project plan, the AI inference computing cluster will be built in three phases, using Yuntian Lifei’s self-developed domestically produced AI inference acceleration cards.
The first phase will deploy Yuntian Lifei's X6000 inference acceleration cards; in later phases, the company's latest-generation chips will be prioritized for deployment.
In AI inference chip R&D, Yuntian Lifei is actively building out technology for the different inference stages. According to the company's strategic plan, it will gradually launch chips optimized for the Prefill stage and low-latency inference chips designed for the Decode stage, further improving overall inference efficiency through system-level collaborative optimization.
Among these, the company's first chip optimized for long-context inference scenarios, DeepVerse100, is expected to complete tape-out within the year, with plans to deploy it in related computing systems.
In its long-term technology roadmap, the company has proposed the "1001 Plan," with the long-term goal of "one billion tokens for one penny." Through collaborative optimization of chip architecture and computing systems, it aims to continuously drive down the cost of large model inference.
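As a quick sanity check on the scale of that target, the arithmetic below (illustrative only; the currency unit follows the stated slogan) converts "one billion tokens for one penny" into a per-million-token price:

```python
# Back-of-envelope arithmetic for the "one billion tokens for one penny" goal.
# Illustrative only; currency unit taken from the stated slogan.

target_tokens = 1_000_000_000   # one billion tokens
target_cost = 0.01              # one penny

cost_per_million = target_cost / target_tokens * 1_000_000
print(f"{cost_per_million:.8f} per million tokens")  # -> 0.00001000
```

In other words, the goal corresponds to a price of one hundred-thousandth of a penny per million tokens, several orders of magnitude below typical current inference pricing.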
In the future, the company will continue to promote R&D related to AI inference chips and advance the widespread application of artificial intelligence technology across more industries.