Nvidia brings a new AI model to 'revolutionize' the audio industry: capable of creating music and modifying vocals
芊芊551
Posted 4 days ago
According to reports, Nvidia has developed a new type of artificial intelligence (AI) model that can create sound effects, change people's pronunciation, and generate music using natural language prompts.
The model is named Fugatto, short for Foundational Generative Audio Transformer Opus 1, and is a research project. Nvidia said it has not announced any plans to release the technology, but it could have a wide-ranging impact on industries from music and entertainment to translation services.
Bryan Catanzaro, Vice President of Applied Deep Learning Research at Nvidia, said in an interview, "The most exciting thing about Fugatto is that it is a model you can ask to make sound in some way, which really opens up your imagination about its range of applications."
He further explained that some other models on the market can synthesize speech and some can add sound effects to music, but Fugatto can do all of these. Catanzaro said it can be seen as a complement to video and image generation models such as Stability AI's Stable Video Diffusion or OpenAI's Sora.
"The most fundamental improvement here is... we are able to use language to synthesize audio, which I believe opens up new prospects for tools that people can use to create amazing audio," he added.
According to Nvidia, Fugatto is the first foundation model with emergent capabilities, which means it can combine elements it was trained on and follow "free-form instructions".
Specifically, the model can generate audio from standard text prompts and can also process audio files you upload. So, if you have a recording of someone speaking, you can translate that person's words into another language while keeping the sound of their voice. You can also take a simple tune and make it sound like an orchestral performance, or add different beats to the music.
You can also upload a document for the model to read aloud in any voice you like. More importantly, you can instruct the model to produce sounds with emotional qualities.
However, Catanzaro added that the model is not always perfect. Moreover, like models that generate images and videos, Fugatto raises concerns among artists, sound engineers, and professionals in related fields. Catanzaro said his hope is that the technology will help musicians.
"I hope this is a new tool for artists to explore," he said. "I think audio has always been a productive field of exploration. You know, when we acquire new audio tools, sometimes we acquire new forms of music."
CandyLake.com is an information publishing platform and only provides information storage space services.
Disclaimer: The views expressed in this article are those of the author only, this article does not represent the position of CandyLake.com, and does not constitute advice, please treat with caution.