Google's Big Model Has Finally Taken a Big Step: Gemini vs. GPT-4
阿豆学长长ov
Posted on 2023-12-8 10:00:39
On December 6th, US time, Google officially released the Gemini model. Google CEO Sundar Pichai stated that this is Google's most powerful and versatile model to date.
It has been one year and one week since ChatGPT was released. With that release, OpenAI became the most dazzling company in artificial intelligence, especially in large models. It is also the company that every other technology firm, Google included, is trying to catch up with.
For the past eight years, Google has made "AI first" its corporate strategy, and AlphaGo, which defeated the human Go champion in 2016, was created by Google. It is no exaggeration to say that Google sparked a wave of AI that changed the course of the entire industry, but it now urgently needs to prove itself in large models.
Gemini 1.0 reportedly comes in three sizes: Gemini Ultra, Gemini Pro, and Gemini Nano. Gemini Nano is intended mainly for on-device use, and the Pixel 8 Pro will be the first smartphone to run it. Gemini Pro is suited to scaling across a wide range of tasks, and Google plans to use it to upgrade its chatbot Bard as well as other Google products, including Search, Ads, and Chrome.
As for Gemini Ultra, the most powerful of the three, Google said it is currently undergoing trust and safety checks and is being further refined through fine-tuning and reinforcement learning from human feedback (RLHF). It is expected to be made available to developers and enterprise customers early next year.
Sundar Pichai said that the release of Gemini is an important milestone in the development of artificial intelligence and the beginning of a new era for Google.
Beyond GPT-4?
According to Demis Hassabis, CEO of Google DeepMind, Gemini is a multimodal model that the Google team built from scratch, which means it can generalize across, and seamlessly understand and process, different types of information, including text, code, audio, images, and video.
In benchmark testing, Gemini Ultra exceeded the current state-of-the-art results on 30 of 32 benchmarks for large language models. On MMLU (Massive Multitask Language Understanding), Gemini Ultra scored 90%, becoming the first large model to outperform human experts on that benchmark.
Demis Hassabis said that on image benchmarks, Gemini Ultra surpassed the previous state-of-the-art models without the help of optical character recognition (OCR) systems. These benchmarks highlight Gemini's multimodal ability and also show early signs of more complex reasoning ability.
At present, the standard way to build a multimodal model is to train separate components for different modalities and then stitch them together. The resulting models sometimes handle certain tasks well (such as describing images), but they often struggle with more complex reasoning.
"We designed Gemini as a natively multimodal model, pre-trained on different modalities from the start, and then we fine-tuned it with additional multimodal data to further improve its performance," Demis Hassabis explained. "This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models, and its capabilities are state of the art in almost every domain."
For example, in terms of reasoning, Gemini 1.0 can understand complex written and visual information. By reading, filtering, and understanding information, it can extract insights from hundreds of thousands of documents.
In addition, Gemini 1.0 has been trained to recognize and understand text, images, audio, and more at the same time, so it can better grasp nuanced information and answer questions on complicated topics, including reasoning in demanding subjects such as mathematics and physics.
In terms of coding, Gemini 1.0 can understand, explain, and generate high-quality code in the world's most popular programming languages, such as Python, Java, C++, and Go. Two years ago, Google launched the AI code generation system AlphaCode. With the help of Gemini, it has now been iterated into AlphaCode 2, whose performance is greatly improved: it solves almost twice as many problems as its predecessor.
Still continuously improving safety
Sundar Pichai said that millions of people are now using generative AI in Google products to do things they could not do a year ago, from answering more complex questions to collaborating and creating with new tools. At the same time, developers are using Google's models and infrastructure to build new generative AI applications, and startups and businesses around the world are growing with Google's AI tools.
In his view, this trend is already remarkable, but it is only the beginning.
"We are carrying out this work boldly and responsibly. This means that our research needs to be ambitious, pursuing capabilities that bring enormous benefits to people and society, while also building in safeguards and working with governments and experts to address the risks that arise as AI becomes more capable," said Sundar Pichai.
For this reason, Google also strengthened its safety review work during the development of Gemini. Demis Hassabis said that, building on Google's AI principles and product safety policies, the Google team is adding new protections to account for Gemini's multimodal capabilities.
Not only that, Demis Hassabis also emphasized that at every stage of development, Google considers potential risks and strives to test and mitigate them.
Gemini has reportedly undergone the most comprehensive safety evaluation of any Google AI model to date, including evaluation for bias and harmful content. Meanwhile, to identify blind spots in its internal evaluation methods, Google is also working with a range of external experts and teams to stress-test the Gemini model across a variety of issues.
Another noteworthy point is that Gemini was trained on Google's own Tensor Processing Units (TPUs), versions v4 and v5e. On these TPUs, Gemini runs faster and at lower cost than Google's previous models. Alongside the new model, Google also announced a new TPU system, Cloud TPU v5p, which is designed specifically for training cutting-edge AI models and will also be used for Gemini development.
Industry insiders told reporters that although Google's Gemini has surpassed GPT-4 on many performance measures, Google still trails OpenAI in time. GPT-4 was released more than half a year ago, and OpenAI's next-generation model is presumably already in development.
"So for Google, comparing benchmark results with GPT-4 is only one way of demonstrating its current capabilities. The key is whether it can rely on its own accumulated expertise and powerful resources to close the time gap with OpenAI," the person pointed out. In addition, since Gemini is the new infrastructure Google has built for the era of large models, the real test of its capabilities is whether it can meet the needs of everyday users and enterprise customers, not its benchmark figures.
Demis Hassabis said that Google has started experimenting with Gemini in Search, where it is making the Search Generative Experience (SGE) faster for users, reducing latency by 40% for English searches in the United States while also improving quality.
While accelerating the rollout of Gemini 1.0, Google is also working to extend the capabilities of future versions, including enlarging the context window so the model can process more information and give better responses.
CandyLake.com is an information publishing platform and only provides information storage space services.
Disclaimer: The views expressed in this article are those of the author only, this article does not represent the position of CandyLake.com, and does not constitute advice, please treat with caution.