
To demonstrate his commitment to open-source AI models, Musk made a completely different choice from Altman. On March 17th, Musk announced the open-sourcing of Grok-1, which, at 314 billion parameters, became the open-source large language model with the largest parameter count currently available, far exceeding the 175 billion of OpenAI's GPT-3.5.
Interestingly, the cover image for the Grok-1 open-source announcement was generated by Midjourney, making it a case of "AI helping AI".
Musk, who has repeatedly mocked OpenAI for not being open, naturally took the opportunity to jab at the company on the social platform X: "We want to know more about the open part of OpenAI."
Grok-1's model weights and architecture are released under the Apache 2.0 license. This means users are free to use, modify, and distribute the software, for both personal and commercial purposes, an openness that encourages broader research and application development. Since its release, the project has earned 6.5k stars on GitHub, and its popularity is still growing.
The project description clearly emphasizes that because Grok-1 is a large (314B-parameter) model, a machine with sufficient GPU memory is required to run the example code. Community estimates suggest this may mean around 628 GB of GPU memory.
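That estimate is simple arithmetic: at two bytes per parameter, 314 billion parameters occupy about 628 GB before any activations or framework overhead are counted. A minimal sketch of the calculation (the precision options shown are illustrative assumptions, not a statement of what the repository actually ships):

```python
# Back-of-the-envelope check of the 628 GB figure quoted above.
# Assumption: weight memory only, ignoring activations and overhead.
PARAMS = 314e9  # Grok-1 parameter count

for name, bytes_per_param in [("bf16/fp16", 2), ("fp32", 4), ("int8", 1)]:
    gb = PARAMS * bytes_per_param / 1e9
    print(f"{name}: ~{gb:.0f} GB of GPU memory for weights alone")

# bf16/fp16: ~628 GB  -> matches the community estimate
# fp32:     ~1256 GB
# int8:      ~314 GB
```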
In addition, the MoE layer implementation in this repository is not efficient; this implementation was chosen deliberately, to avoid the need for custom kernels when verifying the model's correctness.
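To see the trade-off being described, here is a minimal, illustrative MoE layer in JAX (the framework Grok-1 is built on); the shapes, routing, and toy single-matrix "experts" are assumptions for the sketch, not xAI's actual code. Every expert runs on every token and the results are mixed by router weights, so correctness can be checked with stock operations, at the cost of computing experts whose router weight is zero:

```python
import jax
import jax.numpy as jnp

def naive_moe(x, w_router, w_experts, top_k=2):
    """Naive MoE layer: runs *all* experts on *all* tokens, then mixes.

    x:         (tokens, d_model)
    w_router:  (d_model, n_experts)
    w_experts: (n_experts, d_model, d_model)  -- toy single-matrix experts

    Verifiable with standard einsum ops (no custom gather/scatter
    kernels), but O(n_experts) compute per token instead of O(top_k).
    """
    logits = x @ w_router                                 # (tokens, n_experts)
    # Keep only the top-k router weights per token; zero out the rest.
    top_vals, top_idx = jax.lax.top_k(logits, top_k)
    mask = jnp.zeros_like(logits).at[
        jnp.arange(x.shape[0])[:, None], top_idx].set(jax.nn.softmax(top_vals))
    # Every expert processes every token -- the wasteful part.
    expert_out = jnp.einsum('td,edf->tef', x, w_experts)  # (tokens, n_experts, d_model)
    return jnp.einsum('te,tef->tf', mask, expert_out)     # (tokens, d_model)

# Example: 4 tokens, d_model=16, 8 experts
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (4, 16))
w_r = jax.random.normal(key, (16, 8))
w_e = jax.random.normal(key, (8, 16, 16))
out = naive_moe(x, w_r, w_e)   # (4, 16)
```

An efficient implementation would instead dispatch each token only to its selected experts, which typically calls for custom gather/scatter kernels — exactly the complexity the repository says it chose to avoid.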
Popular open-source models currently include Meta's Llama 2 and France's Mistral. Generally speaking, releasing a model as open source lets the community test it at scale and provide feedback, which in turn can accelerate the model's own iteration.
Grok-1 is a Mixture-of-Experts (MoE) large model developed over the past four months by xAI, Musk's AI startup. A review of the model's development:
After announcing the founding of xAI, its researchers first trained a 33-billion-parameter prototype language model (Grok-0). On standard language-model benchmarks, this prototype approached the capabilities of LLaMA 2 (70B) while using fewer training resources;
Subsequently, the researchers made significant improvements to the model's reasoning and coding capabilities, culminating in Grok-1, released in November 2023. This more powerful SOTA language model achieved 63.2% on the HumanEval coding task and 73% on MMLU, surpassing all other models in its compute class, including ChatGPT-3.5 and Inflection-1.
What are the advantages of Grok-1 compared to other large models?
xAI emphasizes that Grok-1 is a large model it trained from scratch: starting in October 2023, it was trained on a custom training stack built on JAX and Rust, without fine-tuning for any specific task (such as dialogue);
A unique and fundamental advantage of Grok-1 is that it can understand the world in real time through the X platform, which lets it answer spicy questions that most other AI systems would reject. The training data for the released version of Grok-1 consists of Internet data up to the third quarter of 2023 plus data supplied by xAI's AI trainers;
As a Mixture-of-Experts model with 314 billion parameters, Grok-1 activates roughly 25% of its weights for each token, and its large parameter count gives it powerful language comprehension and generation capabilities.
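The 25% figure is consistent with a top-2-of-8 expert routing scheme, in which two of eight experts participate per token. A quick sanity check, under the simplifying assumption that the ratio applies uniformly to all 314B parameters (in reality the shared, non-expert weights are always active, so the real breakdown differs somewhat):

```python
TOTAL_PARAMS = 314e9   # Grok-1 total parameter count
N_EXPERTS = 8          # assumed experts per MoE layer
TOP_K = 2              # assumed experts activated per token

active_fraction = TOP_K / N_EXPERTS
print(f"Active fraction: {active_fraction:.0%}")   # 25%
print(f"Approx. active params per token: "
      f"{TOTAL_PARAMS * active_fraction / 1e9:.1f}B")  # ~78.5B (rough estimate)
```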
xAI has previously said that Grok-1 will serve as the engine behind Grok for natural language processing tasks, including question answering, information retrieval, creative writing, and coding assistance. Going forward, long-context understanding and retrieval, as well as multimodal capabilities, are among the directions the model will explore.