Baidu AI Cloud Fully Upgrades Its Infrastructure; Robin Li: There Will Be Millions of AI Agents in the Future
lang3344
Posted on 2024-9-25 17:23:16
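Building agents requires AI infrastructure. At the conference, Baidu AI Cloud announced upgrades across computing power, models, and AI applications. It upgraded its two AI infrastructure platforms, the Baige AI Heterogeneous Computing Platform 4.0 and the Qianfan Foundation Model Platform 3.0, along with three AI-native application products: a code assistant, intelligent customer service, and digital humans.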
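Shen Dou, Executive Vice President of Baidu Group and President of Baidu AI Cloud Group, explained in detail what the upgrades achieve and the technical principles behind them. For example, in the model training stage, stability and efficiency are the "hard metrics" for gauging the quality of a GPU cluster: a failure in a single GPU can bring the entire cluster to a halt, and a great deal of time and cost is then wasted on fault recovery and data rollback. The Baige AI Heterogeneous Computing Platform 4.0 addresses this problem, achieving an effective training time ratio of more than 99.5% on a 10,000-GPU cluster. The technical principle is that Baige 4.0 automatically checks cluster status, predicts GPU failures in advance, and migrates workloads in time, thereby reducing how often failures occur.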
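To put the 99.5% figure in perspective, the arithmetic below is a minimal sketch that assumes the ratio means productive training hours divided by total wall-clock hours; the article does not give the exact definition. The function name and the 30-day run length are hypothetical and chosen only for illustration.

```python
def downtime_budget_hours(total_hours: float, effective_ratio: float) -> float:
    """Hours that may be lost to failures, recovery, and rollback.

    Assumes effective_ratio = productive hours / total wall-clock hours,
    which is an interpretation, not a definition taken from the article.
    """
    if not 0.0 <= effective_ratio <= 1.0:
        raise ValueError("effective_ratio must be between 0 and 1")
    return total_hours * (1.0 - effective_ratio)


# Hypothetical 30-day training run at the 99.5% figure quoted in the article.
run_hours = 30 * 24
budget = downtime_budget_hours(run_hours, 0.995)
print(f"{budget:.1f} hours of non-productive time in a {run_hours}-hour run")
```

Under these assumptions, a 30-day run could lose only about 3.6 hours in total to faults, recovery, and rollback.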
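Shen Dou said that large models, together with their supporting computing power management platforms and their model and application development platforms, are rapidly becoming a new generation of infrastructure.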
CandyLake.com is an information publishing platform and only provides information storage space services.
Disclaimer: The views expressed in this article are those of the author only, this article does not represent the position of CandyLake.com, and does not constitute advice, please treat with caution.