NVIDIA’s AI team reportedly scraped YouTube, Netflix videos without permission

In the latest example of a troubling industry pattern, NVIDIA appears to have scraped troves of copyrighted content for AI training. On Monday, 404 Media’s Samantha Cole reported that the $2.4 trillion company asked workers to download videos from YouTube, Netflix and other sources, including existing datasets, to develop commercial AI projects. The graphics card maker is among the tech companies that appear to have adopted a “move fast and break things” ethos as they race to establish dominance in this feverish, too-often-shameful AI gold rush.

The training was reportedly to develop models for products like its Omniverse 3D world generator, self-driving car systems and “digital human” efforts.

NVIDIA defended its practice in an email to Engadget. A company spokesperson said its research is “in full compliance with the letter and the spirit of copyright law” while claiming IP laws protect specific expressions “but not facts, ideas, data, or information.” The company equated the practice to a person’s right to “learn facts, ideas, data, or information from another source and use it to make their own expression.” Human, computer… what’s the difference?

YouTube doesn’t appear to agree. Spokesperson Jack Malon pointed us to a Bloomberg story from April, which quoted CEO Neal Mohan as saying that using YouTube videos to train AI models would be a “clear violation” of its terms. “Our previous comment still stands,” the YouTube policy communications manager wrote to Engadget.

That quote from Mohan in April was in response to reports that OpenAI trained its Sora text-to-video generator on YouTube videos without permission. Last month, a report showed that the startup Runway AI followed suit.

NVIDIA employees who raised ethical and legal concerns about the practice were reportedly told by their managers that it had already been green-lit by the company’s highest levels. “This is an executive decision,” Ming-Yu Liu, vice president of research at NVIDIA, replied. “We have an umbrella approval for all of the data.” Others at the company allegedly described its scraping as an “open legal issue” they’d tackle down the road.

It all sounds similar to Facebook’s (Meta’s) old “move fast and break things” motto, which has succeeded admirably at breaking quite a few things. That included the privacy of millions of people.

In addition to the YouTube and Netflix videos, NVIDIA reportedly instructed workers to train on the movie trailer database MovieNet, internal libraries of video game footage and the GitHub video datasets WebVid (now taken down after a cease-and-desist) and InternVid-10M. The latter is a dataset containing 10 million YouTube video IDs.

Some of the data NVIDIA allegedly trained on was marked as eligible only for academic (or otherwise non-commercial) use. HD-VG-130M, a library of 130 million YouTube videos, includes a usage license specifying that it’s meant only for academic research. NVIDIA reportedly brushed aside concerns about academic-only terms, insisting the datasets were fair game for its commercial AI products.

To evade detection and avoid bans from YouTube, NVIDIA reportedly downloaded content using virtual machines (VMs) with rotating IP addresses. In response to a worker’s suggestion to use a third-party IP address-rotating tool, another NVIDIA employee reportedly wrote, “We are on Amazon Web Services and restarting a virtual machine instance gives a new public IP. So, that’s not a problem so far.”

404 Media’s full report on NVIDIA’s practices is worth a read.

This article originally appeared on Engadget at https://www.engadget.com/ai/nvidias-ai-team-reportedly-scraped-youtube-netflix-videos-without-permission-204942022.html?src=rss

