In the early morning of August 14, Beijing time, Google officially released its intelligent voice assistant Gemini Live at the "Made by Google" event. The feature directly challenges OpenAI's GPT-4o voice mode and marks another step toward more natural, universal, and user-friendly AI interaction.
According to Google, users can hold free-flowing conversations with Gemini Live rather than relying on a traditional input-and-output exchange.
During a conversation, users can interrupt to ask for more details, or pause for a while and resume later.
To make conversations feel more natural, Google also offers ten voices to choose from. Google said, "It's like having a companion in your pocket that you can talk to about new ideas or practice important conversations with."
The advanced voice mode for GPT-4o previously released by OpenAI also lets users interrupt mid-conversation, and it can perceive and respond to emotional shifts. As for voice options, OpenAI offers four voices, all produced in collaboration with professional voice actors.
In addition, Google will connect Gemini Live with other applications and tools. It has announced that extensions for Keep, Tasks, Utilities, Calendar, YouTube Music, and others will launch in the coming weeks.
Google described specific use cases for these features. If a user is hosting a dinner party, for example, Gemini Live can find recipes, add the ingredients to a Keep shopping list, and put together a playlist that "reminds people of the late 1990s." In another example, after the user takes a photo of a concert poster, Gemini Live can check whether the user is free that day and set a reminder to buy tickets.
However, the live demonstration of Gemini Live at the "Made by Google" event hit a minor hiccup. Google executive Dave Citron asked Gemini Live whether there were any events on his schedule, but two attempts in a row got no response. The demonstration succeeded only on the third try, after he switched devices.
Google has so far made the English version available to Gemini premium subscribers on Android phones, and it plans to expand to iOS and add more languages in the coming weeks. Google's newly released Pixel 9 series phones also include Gemini Live.
Industry insiders believe that the release of Gemini Live is an important milestone in the development of AI interaction. By introducing interruptible conversation and a choice of voices, Google is not only competing with OpenAI but also advancing human-computer interaction. This could reshape the competitive landscape of the AI chatbot market and push other companies to build more natural, practical, and appealing AI assistants.
At the same time, innovation in human-computer interaction brings new problems and challenges. For example, how will artificial intelligence handle rapid topic changes while keeping the context coherent and relevant? How will it filter out distracting information without losing important clues? More importantly, as artificial intelligence develops further, where does the boundary between it and real life lie?
However, GPT-4o, which OpenAI publicly introduced three months ago, has not yet been fully rolled out. On August 9th, OpenAI released a blog post about safety, detailing the company's safety efforts in developing GPT-4o and exploring the potential risks these technologies may pose to society.
In the report, OpenAI pointed out the risks that human-like social behavior in artificial intelligence may pose. OpenAI believes that users may form social relationships with artificial intelligence and feel less need for human interaction. This may benefit lonely individuals, but it could also affect healthy interpersonal relationships.
OpenAI revealed that during early testing of GPT-4o, it observed subtle changes in the language users used with the model, such as "This is our last day together." Such seemingly harmless expressions may conceal bigger problems.
In addition, OpenAI mentioned that GPT-4o sometimes unintentionally generates output that mimics the user's voice, which means AI speech engines could be used for fraud.
These safety issues are one of the reasons OpenAI is controlling the pace of GPT-4o's rollout. Whether Google's Gemini Live has addressed similar safety risks has not been disclosed.
All safety-related risks, whether those we are already aware of or others still hidden in Pandora's box, need to be further addressed by the field of artificial intelligence to ensure that technological progress serves humanity.