NVIDIA Open-Sources 120B Agentic Model Nemotron 3 Super, With Up to 5x Higher Throughput

Gate News reports that on March 12, NVIDIA released Nemotron 3 Super, an open-source large language model designed for multi-agent applications. The model has 120 billion total parameters and uses a hybrid Mamba-Transformer MoE architecture, activating only 12 billion parameters per token at inference time. Its core technique, "Latent MoE," compresses token embeddings into a low-rank latent space before routing them to expert networks, activating four experts at roughly the compute cost of a single expert and delivering up to 5x higher inference throughput than the previous-generation Nemotron Super. The model natively supports a 1-million-token context window, suiting autonomous agents that must maintain workflow state over long horizons. On PinchBench, a benchmark for agentic workloads, Nemotron 3 Super scored 85.6%, the highest among comparable open models.

NVIDIA also open-sourced a training dataset of more than 10 trillion tokens, 15 reinforcement-learning training environments, and evaluation recipes, all under the NVIDIA Nemotron Open Model License. The model is available on Hugging Face, build.nvidia.com, Perplexity, OpenRouter, and other platforms, and can be deployed through cloud services including Google Cloud, Oracle, AWS Bedrock, and Azure. Perplexity, CodeRabbit, Cadence, Dassault Systèmes, and Siemens are among the early adopters.
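The "Latent MoE" idea described above can be sketched roughly as follows: project the token embedding into a low-rank latent space, run routing and the selected experts entirely in that cheaper space, then project back. This is a minimal illustrative sketch, not NVIDIA's actual implementation; all dimensions, names, and the expert design are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not the real Nemotron 3 Super dimensions.
D_MODEL = 1024      # full embedding width
D_LATENT = 256      # low-rank latent width
N_EXPERTS = 16
TOP_K = 4           # experts activated per token

W_down = rng.standard_normal((D_MODEL, D_LATENT)) / np.sqrt(D_MODEL)
W_up = rng.standard_normal((D_LATENT, D_MODEL)) / np.sqrt(D_LATENT)
router = rng.standard_normal((D_LATENT, N_EXPERTS)) / np.sqrt(D_LATENT)
# Each expert is a small transform acting on the latent space.
experts = [rng.standard_normal((D_LATENT, D_LATENT)) / np.sqrt(D_LATENT)
           for _ in range(N_EXPERTS)]

def latent_moe(x: np.ndarray) -> np.ndarray:
    """Route one token embedding through top-k experts in latent space."""
    z = x @ W_down                            # compress to latent space
    logits = z @ router                       # routing happens in latent space
    top = np.argsort(logits)[-TOP_K:]         # pick the k highest-scoring experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                  # softmax over the selected experts
    mixed = sum(w * np.tanh(z @ experts[i]) for w, i in zip(weights, top))
    return mixed @ W_up                       # project back to model width

x = rng.standard_normal(D_MODEL)
y = latent_moe(x)
print(y.shape)  # (1024,)
```

Because each expert matmul is D_LATENT x D_LATENT rather than D_MODEL x D_MODEL, running four latent-space experts is far cheaper than one full-width expert, which is the flavor of the "four experts at the cost of one" claim in the article.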
