Futures
Access hundreds of perpetual contracts
CFD
Gold
One platform for global traditional assets
Options
Hot
Trade European-style vanilla options
Unified Account
Maximize your capital efficiency
Demo Trading
Introduction to Futures Trading
Learn the basics of futures trading
Futures Events
Join events to earn rewards
Demo Trading
Use virtual funds to practice risk-free trading
Launch
CandyDrop
Collect candies to earn airdrops
Launchpool
Quick staking, earn potential new tokens
HODLer Airdrop
Hold GT and get massive airdrops for free
Pre-IPOs
Unlock full access to global stock IPOs
Alpha Points
Trade on-chain assets and earn airdrops
Futures Points
Earn futures points and claim airdrop rewards
Promotions
AI
Gate AI
Your all-in-one conversational AI partner
Gate AI Bot
Use Gate AI directly in your social App
GateClaw
Gate Blue Lobster, ready to go
Gate for AI Agent
AI infrastructure, Gate MCP, Skills, and CLI
Gate Skills Hub
10K+ Skills
From office tasks to trading, the all-in-one skill hub makes AI even more useful.
GateRouter
Smartly choose from 40+ AI models, with 0% extra fees
AI Aunt: Forcing the smart model to reveal its true skills using a simple model
CryptoWorld News reports that AI safety research has proposed a training scheme to address model “sandbagging” (deliberately hiding capabilities). The scheme works by first using supervised fine-tuning (SFT) to break the model’s sandbagging behavior, and then applying reinforcement learning (RL) to coax out the remaining abilities, so the model’s true level can be reliably recovered. The experiments use OpenAI’s open-source GPT-OSS-120B as the “untrustworthy model,” with proctors/supervisors provided by GPT-4O-Mini (weak supervision) and Llama3.1-8B (extremely weak supervision). The approach is tested on three types of tasks: math competitions, graduate-level scientific Q&A (Super GPQA), and competitive programming. On these tasks, using SFT alone under weak supervision can recover 85%-94% of true performance, while under extremely weak supervision it falls to 40%-75%. Using RL alone is ineffective; the model often turns toward reward hacking. With the SFT+RL combination, performance under extremely weak supervision reaches 88% (math), 99% (science), and 93% (programming). The study also notes that once a model develops contextual awareness, it may cooperate during training and continue sandbagging capabilities during deployment.