Nvidia and other tech giants exposed for illegally using YouTube data from more than 170,000 videos to train AI models
六月清晨搅
Posted on 2024-7-17 15:00:42
According to media reports, several large tech companies, including Apple, NVIDIA, Salesforce, and Anthropic, have been exposed for training their AI models on data taken without authorization from Google's video site YouTube. The companies relied on a dataset supplied by a third party that contains a large volume of subtitle text scraped from YouTube videos, in violation of YouTube's ban on unauthorized scraping of platform content. The report says these companies all used a dataset called "YouTube Subtitles" when training their AI models. The dataset is 5.7GB in size and contains 489 million words from 173,500 videos across more than 48,000 YouTube channels. It consists of plain subtitle text, both captions uploaded by video creators and transcripts generated automatically by YouTube. In addition to English, it usually includes translations into languages such as Japanese, German, and Arabic.
CandyLake.com is an information publishing platform and only provides information storage space services.
Disclaimer: The views expressed in this article are those of the author only. This article does not represent the position of CandyLake.com and does not constitute advice; please treat it with caution.