Google releases its most powerful model to challenge OpenAI, shifting focus to AI agents
老好人啊
Posted 8 hours ago
Fresh from unveiling its most powerful quantum chip, Google has made another major move in AI.
In the early hours of December 12, Beijing time, Google released a new model, Gemini 2.0, shortly before OpenAI announced the official launch of ChatGPT on the iPhone.
Google CEO Sundar Pichai said that this is Google's most powerful model to date. With improvements in multimodal aspects such as native image and native audio output, Gemini 2.0 is able to build new AI agents, taking Google one step closer to its vision of building a universal assistant.
Note that Gemini 2.0 is for now mainly open to developers and trusted testers, while the Gemini 2.0 Flash experimental model is available to all Gemini users.
Gemini 2.0 Flash builds on 1.5 Flash, previously Google's most popular model among developers. Compared with 1.5 Flash, Gemini 2.0 Flash improves performance while keeping the same fast response time. Google claims that 2.0 Flash even outperforms 1.5 Pro on key benchmarks at twice the speed.
2.0 Flash also adds new capabilities. In addition to accepting multimodal inputs such as images, video, and audio, it supports multimodal outputs, such as natively generated content that mixes images and text, and natively generated, steerable multilingual text-to-speech (TTS) audio. It can also natively call tools such as Google Search, code execution, and third-party user-defined functions.
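To make the idea of calling user-defined functions concrete, here is a minimal sketch of how an application might route a model's tool request to its own code. It does not use the Gemini API: the `TOOLS` registry, the `get_weather` function, and the shape of the `request` dictionary are all assumptions made for illustration.

```python
import json


def get_weather(city: str) -> dict:
    # Hypothetical user-defined function; returns canned data, not a real forecast.
    return {"city": city, "forecast": "sunny"}


# Registry mapping tool names to the Python functions that implement them.
TOOLS = {"get_weather": get_weather}


def dispatch(tool_call: dict) -> str:
    """Run the function named in the tool call and return the result as JSON."""
    name = tool_call["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    result = TOOLS[name](**tool_call.get("args", {}))
    return json.dumps(result)


# Assumed shape of a tool request emitted by a model.
request = {"name": "get_weather", "args": {"city": "Beijing"}}
reply = dispatch(request)
```

In a real integration, the JSON in `reply` would be sent back to the model so that it can fold the result into its answer.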
Gemini users worldwide can now try a chat-optimized version of 2.0 Flash on desktop and mobile web, and it will soon arrive in the Gemini mobile app. The new model also powers the Gemini assistant experience. Early next year, Google will expand Gemini 2.0 to more of its products.
The biggest change in Gemini 2.0 is its shift in focus toward AI agents, with the aim of becoming the foundation model for all of them. On top of Gemini 2.0, Google has built a series of prototypes that can help users complete tasks.
Among them, the upgraded Project Astra is a research prototype that explores the future capabilities of a universal AI assistant. Since introducing Project Astra at Google I/O, Google has been collecting feedback from trusted testers who use it on Android phones. The upgraded version can converse in multiple languages and in mixed languages, and can use tools such as Google Search, Google Lens, and Google Maps. It can remember up to 10 minutes of conversation and understand language at a latency close to that of human conversation.
The all-new Project Mariner explores the future of human-agent interaction, starting with the browser. Project Mariner is an early research prototype built with Gemini 2.0 that can understand and reason over the information on a browser page, including pixels and web elements such as text, code, images, and forms, and then help users complete tasks through an experimental Chrome extension. Google says this release improves on the prototype's previously slow speed.
In short, users can use this feature to let the browser help them complete specific tasks, such as batch searching for email addresses on certain websites, thereby achieving a certain degree of "automatic operation" of the browser.
Jules is a coding agent designed for developers, which can be directly integrated into GitHub workflows to assist developers in completing development tasks.
In Google's demonstration video, the presenter enters a long prompt describing a programming problem in detail. Jules analyzes the requirements and proposes a three-step plan. Once the presenter clicks "agree", the model starts programming automatically and generates the code. This should help developers work more efficiently.
At the end of last year, Google released Gemini 1.0, whose main strength was organizing and understanding information. Gemini 2.0 aims to make that information more useful. Sundar Pichai said the progress in Gemini 2.0 stems from Google's 10-year investment in full-stack AI innovation, and that the model is built on Google's custom hardware, the sixth-generation TPU Trillium.
Just as Google was drawing attention with its most powerful model, OpenAI's 12-day product launch event was still under way. On the same day, OpenAI demonstrated the integration of ChatGPT with Apple Intelligence, but the content was fairly unremarkable. The surprise release of Gemini 2.0 clearly stole much of OpenAI's spotlight.
With Gemini 2.0 behind them, Google launched three agent products at once, marking another important step in its competition with Microsoft-backed OpenAI, Amazon, and Anthropic.
Agents have become the central battleground in the large-model race. An agent is a system that can perceive its environment, make decisions, and take actions to achieve specific goals, and it is regarded as a key vehicle for putting Large Language Models (LLMs) into practical use.
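The perceive, decide, act cycle in that definition can be illustrated with a toy example. The following is a minimal sketch in Python, not any vendor's implementation: the thermostat scenario, the `ThermostatAgent` class, and its method names are hypothetical, and an LLM-based agent would replace the hand-written `decide` rule with a model call.

```python
from dataclasses import dataclass


@dataclass
class ThermostatAgent:
    """Toy agent (hypothetical): drives a room temperature toward a goal."""
    target: float            # goal temperature in degrees Celsius
    tolerance: float = 0.5   # how close counts as "goal reached"

    def perceive(self, environment: dict) -> float:
        # Perception: read the current state of the environment.
        return environment["temperature"]

    def decide(self, temperature: float) -> str:
        # Decision: pick the action that moves the state toward the goal.
        if temperature < self.target - self.tolerance:
            return "heat"
        if temperature > self.target + self.tolerance:
            return "cool"
        return "idle"

    def act(self, action: str, environment: dict) -> None:
        # Action: change the environment by one degree per step.
        if action == "heat":
            environment["temperature"] += 1.0
        elif action == "cool":
            environment["temperature"] -= 1.0


def run_agent(agent: ThermostatAgent, environment: dict, max_steps: int = 20) -> list:
    """Repeat perceive -> decide -> act until the agent idles or steps run out."""
    history = []
    for _ in range(max_steps):
        action = agent.decide(agent.perceive(environment))
        history.append(action)
        if action == "idle":
            break
        agent.act(action, environment)
    return history


room = {"temperature": 18.0}
actions = run_agent(ThermostatAgent(target=21.0), room)
```

Starting from 18 degrees with a target of 21, the agent heats three times and then idles once the goal is within tolerance.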
Nearly two months ago, Microsoft released 10 AI agents for sales, operations, and other scenarios, and later announced that its Copilot Studio platform supports building autonomous agents, while also releasing five prebuilt agents. At the just-concluded re:Invent 2024, Amazon released six large models at once, among them Amazon Nova Premier, a multimodal model designed for complex reasoning tasks.
In both consumer and enterprise scenarios, AI agents offer wide scope for new applications and clear commercial prospects. Several industry insiders predict that 2025 will be the year AI agents take off commercially. By then, competition over agents among technology giants such as Google and OpenAI is bound to intensify.
CandyLake.com is an information publishing platform and only provides information storage space services.
Disclaimer: The views expressed in this article are those of the author only, this article does not represent the position of CandyLake.com, and does not constitute advice, please treat with caution.