
Is Scraping YouTube Legal? Terms of Service Explained

Devon Tran
YouTube data engineer
the short version
  • Scraping YouTube is not a crime in the US for public data, but it does break YouTube's Terms of Service, which ban accessing the Service "using any automated means (such as robots, botnets or scrapers)."
  • Two different rulebooks apply: public-data law (where hiQ v. LinkedIn held that scraping public data is not a CFAA violation, which leaves a ToS breach as a civil contract matter) and YouTube's contract (which you accept by using the site).
  • Risk climbs fast with three things: logged-in or private data, copyrighted content (videos, and increasingly transcripts), and personal data under GDPR.
  • I keep collection to public metadata, stay off login-gated pages, and route requests through a scraper API so I am not bypassing technical blocks by hand.

I read YouTube’s Terms of Service end to end before I answer the question I get asked most: is scraping YouTube legal? The honest answer is that two separate rulebooks apply, and they give different verdicts. General US law is fairly relaxed about collecting public data. YouTube’s own contract is not. Most of the confusion online comes from people quoting one rulebook and ignoring the other.

This guide separates the two. I cover what YouTube’s Terms of Service actually say about scraping, what the courts have decided about public data, where copyright and transcripts change the math, and how personal data brings GDPR into play. Then I show the setup I use to keep collection on the safe side of that line.

Scraping YouTube is generally not a crime in the United States when the data is publicly visible, but it does breach YouTube’s Terms of Service. The first is a criminal question; the second is a contract question. The two get blurred together constantly, and keeping them apart is the whole point of this article.

The criminal question runs through the Computer Fraud and Abuse Act (CFAA), the federal anti-hacking law. The contract question runs through the agreement you accept by using YouTube. You can be on the right side of the first and the wrong side of the second at the same time, which is exactly where most YouTube scraping lands.

Here is the split in one table:

Question | Governing rule | Verdict for public YouTube data | Consequence if crossed
Is it a crime? | Computer Fraud and Abuse Act | Generally no (public data) | Criminal liability
Does it break the contract? | YouTube Terms of Service | Yes | Account/IP block, civil claim
Does it infringe copyright? | Copyright Act | Depends on what you copy | Statutory damages
Does it break privacy law? | GDPR / state privacy laws | Depends on personal data | Regulatory fines

The rest of this article walks through each row. The most-cited authority for the first row is a case that did not involve YouTube at all. Before getting to it, I lay out what YouTube itself says, because the contract is the rule you agree to first.

What does YouTube’s Terms of Service say about scraping?

YouTube’s Terms of Service prohibit scraping directly. Under the “Permissions and Restrictions” section, the terms say you are not allowed to “access the Service using any automated means (such as robots, botnets or scrapers) except (a) in the case of public search engines, in accordance with YouTube’s robots.txt file; or (b) with YouTube’s prior written permission.” That clause is the single most important sentence for anyone asking whether YouTube scraping is against the terms of service. You can read it in full on the YouTube Terms of Service page.

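Two exceptions are written into the clause, and almost nobody scraping YouTube qualifies for either:

  • Public search engines, and only when they crawl in accordance with YouTube’s robots.txt file.
  • Anyone holding YouTube’s prior written permission.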

The “automated means” language is deliberately broad. It covers the entire “Service,” which means both the public pages and anything behind a login. The terms also restrict using content beyond what the Service expressly authorizes, which is what blocks bulk downloading of videos for reuse. So under YouTube’s own rules, the scraping prohibition is not ambiguous: automated collection without written permission is a terms-of-service violation.

YouTube’s robots.txt file backs this up at the technical layer. It disallows crawling of paths including /results (search results), /youtubei/ (the internal API), /comment, /get_video_info, and /feeds/videos.xml for every user-agent except Google’s own ad crawler. A page being reachable in your browser does not mean YouTube invites a robot to fetch it, and the robots.txt file is YouTube stating which doors it considers off-limits to automated clients.
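As a minimal sketch of how an automated client would read those rules, Python's standard-library `urllib.robotparser` can evaluate a path against a robots.txt file. The excerpt here is an assumption: it hard-codes only the paths named above under a wildcard user-agent instead of fetching YouTube's live file, and `my-research-bot` and `is_allowed` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Assumed excerpt: only the paths named in this article, not the live file.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /comment
Disallow: /feeds/videos.xml
Disallow: /get_video_info
Disallow: /results
Disallow: /youtubei/
"""


def is_allowed(path: str, user_agent: str = "my-research-bot") -> bool:
    """Return True if the excerpt lets `user_agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_EXCERPT.splitlines())
    return parser.can_fetch(user_agent, "https://www.youtube.com" + path)


if __name__ == "__main__":
    for path in ("/results?search_query=lofi", "/youtubei/v1/search", "/comment"):
        verdict = "allowed" if is_allowed(path) else "disallowed"
        print(f"{path} -> {verdict}")
```

A "disallowed" result only reflects the robots.txt layer. A path that robots.txt leaves open is still covered by the "automated means" clause in the Terms of Service.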

That covers scraping the site. YouTube runs a second, separate contract for its official data pipe, and the rules there are stricter in a different way.

How is scraping different from the YouTube Data API?

Scraping pulls data out of YouTube’s web pages without authorization, while the YouTube Data API is an authorized, authenticated pipe that returns structured data within a quota. They are governed by different agreements and fail in different ways. The API is the route YouTube actively wants automated clients to use.

The YouTube API Services Terms of Service tighten the rules for developers. Section 3.1 requires that you “only access (or attempt to access) the YouTube API Services to develop and operate your API Client(s) by the means described in the Agreement, including in accordance with the documentation.” Section 15 adds that you “will not, and will not attempt to, exceed or circumvent use or quota restrictions.” Translated: you take the data the API hands you, through the documented endpoints, inside the quota. Going around the quota by scraping the site instead is the behavior these terms exist to stop.
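To make "the documented endpoints" concrete, here is a hedged sketch that builds a request URL for the Data API v3 `videos` resource. The endpoint and the `part`, `id`, and `key` parameters follow my reading of the public API reference; `build_videos_url` is a hypothetical helper, and the sketch only constructs the URL without sending it.

```python
from urllib.parse import urlencode

# Data API v3 "videos" endpoint, as I understand the public reference.
API_BASE = "https://www.googleapis.com/youtube/v3/videos"


def build_videos_url(video_id: str, api_key: str) -> str:
    """Build a videos.list URL asking for snippet and statistics."""
    params = {
        "part": "snippet,statistics",  # which resource parts to return
        "id": video_id,                # the video to look up
        "key": api_key,                # API key issued by Google Cloud
    }
    return f"{API_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_videos_url("dQw4w9WgXcQ", "YOUR_API_KEY"))
```

Every call made this way is counted against the quota, which is the trade-off the comparison that follows turns on.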

Factor | Scraping the web pages | Official YouTube Data API
Authorization | None (violates main ToS) | Granted via API key + OAuth
Governing contract | YouTube Terms of Service | YouTube API Services Terms of Service
Output | HTML you parse yourself | Structured JSON
Quota | Undocumented, enforced by blocks | 10,000 units/day default (per the docs)
Coverage | Anything on the page | Only fields the API exposes
Typical failure | IP block, CAPTCHA, 429 | Quota exceeded error

The default API quota is 10,000 units per day, and a single search call costs 100 units, so the free tier runs out at roughly 100 searches a day. That ceiling is the reason people look past the API in the first place. When the API does not return a field cleanly (full comment threads, complete transcripts, search result pages at volume), scraping the public page is the fallback, and that fallback is what carries the terms-of-service exposure described above. I walk through the technical mechanics of both routes in my guide on how to scrape YouTube.
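The quota arithmetic is simple enough to check in a few lines. This sketch uses only the two figures stated above (10,000 units per day, 100 units per search); the function names are hypothetical.

```python
DAILY_QUOTA_UNITS = 10_000  # default daily quota stated above
SEARCH_COST_UNITS = 100     # cost of one search call stated above


def calls_per_day(cost_per_call: int, daily_quota: int = DAILY_QUOTA_UNITS) -> int:
    """How many calls of a given unit cost fit into one day's quota."""
    if cost_per_call <= 0:
        raise ValueError("cost_per_call must be positive")
    return daily_quota // cost_per_call


def units_remaining(searches_made: int, daily_quota: int = DAILY_QUOTA_UNITS) -> int:
    """Quota left after a number of search calls, floored at zero."""
    return max(daily_quota - searches_made * SEARCH_COST_UNITS, 0)


if __name__ == "__main__":
    print(calls_per_day(SEARCH_COST_UNITS))  # searches that fit in one day
    print(units_remaining(40))               # units left after 40 searches
```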

So YouTube’s contract says no, and the API says use the front door within limits. The next question is whether breaking that contract is actually illegal, and that is where the courts come in.

What does US case law say about scraping public data?

US case law currently treats scraping publicly accessible data as outside the reach of the Computer Fraud and Abuse Act, which means it is not federal hacking. The leading authority is hiQ Labs v. LinkedIn. On April 18, 2022, the Ninth Circuit Court of Appeals affirmed that the CFAA does not bar scraping data that a website makes available to the general public. The full docket is on CourtListener.

The court’s reasoning was specific: when a network “generally permits public access to its data,” accessing that public data is not access “without authorization” under the CFAA. There are no login walls or permissions to defeat on a public page, so there is nothing to break into in the criminal sense. Law firm Jenner & Block’s analysis of the ruling lays out how narrowly the court read the statute.

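The limits of that ruling matter as much as the holding: it addresses the CFAA only, and only data that is open to the general public, so contract, copyright, and privacy claims are left untouched.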

For YouTube specifically, this means scraping public video metadata is unlikely to be a CFAA crime, but it is very likely a breach of YouTube’s Terms of Service that YouTube can act on by blocking you or, in principle, suing for breach of contract. Criminal exposure and contract exposure are simply two different layers. The layer that bites hardest for media platforms is usually a third one: copyright.

Scraping the video file or the transcript carries copyright exposure that scraping a view count does not, because the video and its spoken content are the creator’s copyrighted work. YouTube’s terms are explicit that creators keep ownership: the Terms of Service state that “you retain all of your ownership rights in your Content.” Scraping a number about a video is different in kind from copying the creative work itself.

The live test of this is happening in court right now. In August 2024, YouTube creator David Millette filed a class action, Millette v. OpenAI, in the US District Court for the Northern District of California. The suit alleges that OpenAI transcribed more than a million hours of YouTube videos using its Whisper speech-recognition model and used those transcripts to train GPT-4, without permission or compensation. TechCrunch reported the filing, which seeks over $5 million in damages and claims the practice infringed copyright and broke YouTube’s terms of service. Bloomberg Law covered the parallel complaint against Google over Gemini training.

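That case is why YouTube’s transcript policy gets so much attention now. The transcript is downstream of a copyrighted video, so reusing transcript text at scale (especially to train models or republish content) is the highest-risk form of YouTube scraping from a copyright standpoint. The practical guidance I follow is to treat data about a video and the creative work itself as different things, and to check YouTube’s policy and the creator’s rights before reusing transcript text commercially.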

Copyright governs the content. One more body of law governs a specific slice of that content: anything that identifies a person.

Does scraping YouTube break privacy laws like GDPR?

Scraping personal data from YouTube can break privacy laws such as the GDPR even when the data is public, because public availability does not strip away data-protection rights. Usernames, channel-owner identities, comment authors, and any contact details are personal data the moment they relate to an identifiable person. Public metadata like aggregate view counts is not.

European regulators have moved on exactly this. France’s data protection authority, the CNIL, confirmed that collecting publicly accessible data can rely on the “legitimate interest” legal basis under GDPR Article 6(1)(f), but only after a strict necessity and balancing test. The European Data Protection Board reached a similar position in its Opinion 28/2024 on AI models. Legitimate interest is a door, and it opens only under conditions.

The enforcement is real. In December 2024 the CNIL fined the company KASPR 240,000 euros for scraping contact details from LinkedIn, including data of users who had restricted its visibility. The lesson for YouTube scraping is direct: collecting public metrics is low risk on the privacy axis, and collecting data that identifies individual users (comment authors, channel operators, emails) pulls you into GDPR obligations and potential fines.
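One way to act on that distinction is to drop identifying fields before a record is stored. This sketch assumes each scraped item is a flat dictionary; the field names in `PERSONAL_FIELDS` and the `strip_personal_fields` helper are hypothetical, and removing fields is a data-minimisation step, not a substitute for a GDPR legal basis.

```python
# Hypothetical field names for data that identifies a person.
PERSONAL_FIELDS = frozenset(
    {"author_name", "author_channel_id", "author_email", "channel_owner"}
)


def strip_personal_fields(record: dict) -> dict:
    """Return a copy of `record` without the identifying fields."""
    return {key: value for key, value in record.items() if key not in PERSONAL_FIELDS}


if __name__ == "__main__":
    scraped = {
        "video_id": "dQw4w9WgXcQ",
        "view_count": 123,
        "author_name": "example-user",
        "author_email": "user@example.com",
    }
    print(strip_personal_fields(scraped))
```

Filtering at the point of collection means the identifying values never reach storage, which is simpler to reason about than deleting them later.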

Stack the three content-level risks together and a clear pattern appears, which points straight at how to scrape within the rules.

What you collect | ToS | Copyright risk | Privacy (GDPR) risk | Overall risk
Public view/like counts, durations | Violation | Low | Low | Low
Video titles, descriptions, tags | Violation | Low to moderate | Low | Low to moderate
Full transcripts / captions | Violation | High | Low | High
Video/audio files | Violation | High | Low | High
Comment text + author identities | Violation | Moderate | High | High
Emails / personal contact data | Violation | Low | High | High
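The table can double as a lookup in a collection script. In this sketch the category keys are hypothetical labels for the six rows, and the rule that overall risk is the higher of the copyright and privacy ratings is my reading of the table, which matches every row, not a formula stated in the source.

```python
# Ratings ordered from lowest to highest, using the table's own wording.
RISK_ORDER = ["low", "low to moderate", "moderate", "high"]

# Hypothetical keys, one per table row; every row is also a ToS violation.
RISK_TABLE = {
    "counts_and_durations": {"copyright": "low", "privacy": "low"},
    "titles_descriptions_tags": {"copyright": "low to moderate", "privacy": "low"},
    "transcripts": {"copyright": "high", "privacy": "low"},
    "media_files": {"copyright": "high", "privacy": "low"},
    "comments_with_authors": {"copyright": "moderate", "privacy": "high"},
    "contact_data": {"copyright": "low", "privacy": "high"},
}


def overall_risk(category: str) -> str:
    """Return the higher of the copyright and privacy ratings for a row."""
    ratings = RISK_TABLE[category]
    return max(ratings.values(), key=RISK_ORDER.index)


if __name__ == "__main__":
    for name in RISK_TABLE:
        print(f"{name}: {overall_risk(name)}")
```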

How do I scrape YouTube data while staying on the safe side?

I lower the risk by controlling three things: what I collect, whether I touch logged-in pages, and whether I bypass technical blocks by hand. None of this makes scraping compliant with YouTube’s Terms of Service, and I am clear-eyed that automated collection breaches the contract. It does keep me well clear of the criminal, copyright, and privacy lines that actually carry penalties. The practices I hold to:

  • Keep collection to public metadata.
  • Stay off login-gated pages.
  • Route requests through a scraper API so I am not bypassing technical blocks by hand.

That last point is where our YouTube API fits. It takes a YouTube URL and returns structured public data, so the request mechanics live on the server side instead of in a script I am babysitting. A public video-metadata request looks like this:

curl "https://api.youtubescraperapi.com/api/v1/youtube/video?url=https://www.youtube.com/watch?v=dQw4w9WgXcQ&api_key=$API_KEY"

For transcript-specific work, the YouTube Transcript API endpoint returns the timed text for a video, and the YouTube video scraper endpoint returns titles, view counts, and other public metadata. You can create a free youtubescraperapi.com key and run the call above against any public video. Using a scraper API does not change YouTube’s terms, and it does not grant permission YouTube has not given. What it does is keep me on public data and out of manual block-circumvention, which is the part that escalates legal risk.
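The same video-metadata request can be sketched in Python. The endpoint and the `url` and `api_key` parameters come from the curl call; the helper names, the `YT_SCRAPER_API_KEY` environment variable, and the assumption that the service accepts a percent-encoded `url` value are mine. The response is returned as raw text because its exact shape is not documented here.

```python
import os
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = "https://api.youtubescraperapi.com/api/v1/youtube/video"


def build_request_url(video_url: str, api_key: str) -> str:
    """Percent-encode the nested YouTube URL so its own '?' and '=' survive."""
    return f"{ENDPOINT}?{urlencode({'url': video_url, 'api_key': api_key})}"


def fetch_video_metadata(video_url: str, api_key: str, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    with urlopen(build_request_url(video_url, api_key), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical environment variable name for the key.
    key = os.environ.get("YT_SCRAPER_API_KEY", "")
    if key:
        print(fetch_video_metadata("https://www.youtube.com/watch?v=dQw4w9WgXcQ", key))
    else:
        print(build_request_url("https://www.youtube.com/watch?v=dQw4w9WgXcQ", "YOUR_KEY"))
```

Encoding the nested URL matters once a video link carries extra parameters such as a timestamp, which would otherwise be read as parameters of the outer request.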

If you are choosing a tool for this, I compared the main options on accuracy and block handling in my roundup of the best YouTube scrapers in 2026. The legal posture is the same whichever you pick: the responsibility for what you collect, and how you use it, stays with you.

The bottom line on YouTube scraping and the law

Scraping public YouTube data is generally not a US crime under the CFAA after hiQ v. LinkedIn, but it does violate YouTube’s Terms of Service, which ban automated access without written permission. The penalties that actually sting come from the other three layers: breach of contract (YouTube blocking or suing you), copyright (the OpenAI transcript litigation), and privacy law (the CNIL’s KASPR fine). Keep collection to public metadata, stay off logged-in and copyrighted content, avoid scraping personal data, and do not bypass security measures. Do that, and the question stops being “is this a crime” and becomes “am I respecting the contract and the content,” which is a question you can manage.

FAQ

Is scraping YouTube illegal?

Scraping publicly visible YouTube data (titles, view counts, public metadata) is generally not a criminal act in the US after hiQ v. LinkedIn, which held that accessing public data does not violate the Computer Fraud and Abuse Act. It still breaches YouTube's Terms of Service, which is a contract issue, and scraping private or copyrighted data carries separate legal risk.

Does YouTube's Terms of Service prohibit scraping?

Yes. YouTube's Terms of Service prohibit accessing the Service "using any automated means (such as robots, botnets or scrapers)" with two exceptions: public search engines following YouTube's robots.txt, and parties with YouTube's prior written permission. The YouTube API Services Terms add that developers may only access data "by the means described" in the documentation.

Can I web scrape YouTube for personal or research use?

You can technically scrape public YouTube pages for personal or research use, and US public-data law generally does not treat that as a crime. It still violates YouTube's Terms of Service, so the practical risk is account or IP blocking. Selling or redistributing the data, or scraping personal data, raises the stakes.

Is scraping YouTube transcripts legal?

Scraping a public transcript sits in the same gray zone as other public metadata, but transcripts carry extra copyright exposure because the underlying video is the creator's work. A 2024 class action (Millette v. OpenAI) alleges that transcribing YouTube videos to train AI models infringed copyright and broke YouTube's terms. Read YouTube's policy and the creator's rights before reusing transcript text commercially.

What is the safest legal way to get YouTube data?

The lowest-risk route is YouTube's official Data API for the fields it exposes, within its quota. For data the API does not return cleanly, limit yourself to public metadata, avoid logged-in pages, and do not bypass security measures. A scraper API handles the request layer without you circumventing technical blocks by hand.

Devon Tran
I've built YouTube data pipelines for years. On youtubescraperapi.com I run YouTube scraping methods against live pages and publish what actually holds up.