
On June 14th local time, Nvidia open-sourced the Nemotron-4 340B (340-billion-parameter) model family. According to Nvidia, developers can use these models to generate synthetic data for training large language models (LLMs) for commercial applications in healthcare, finance, manufacturing, retail, and other industries.
The Nemotron-4 340B family comprises a base model, an instruct model, and a reward model, trained on 9 trillion tokens (units of text). On reasoning benchmarks such as ARC-c, MMLU, and BBH, Nemotron-4 340B-Base is comparable to the Llama-3 70B, Mixtral 8x22B, and Qwen-2 72B models.
CandyLake.com is an information publishing platform and only provides information storage space services.
Disclaimer: The views expressed in this article are those of the author only. This article does not represent the position of CandyLake.com and does not constitute advice; please treat it with caution.
