Is Scraping YouTube Legal? Terms of Service Explained

by Devon Tran · Updated June 16, 2026 · 12 min read

Devon Tran · YouTube data engineer

The short version

Scraping YouTube is not a crime in the US for public data, but it does break YouTube's Terms of Service, which ban accessing the Service "using any automated means (such as robots, botnets or scrapers)."
Two different rulebooks apply: public-data law (where hiQ v. LinkedIn found that scraping public pages is not hacking under the CFAA, which leaves a ToS breach as a civil contract matter) and YouTube's contract (which you accept by using the site).
Risk climbs fast with three things: logged-in or private data, copyrighted content (videos, and increasingly transcripts), and personal data under GDPR.
I keep collection to public metadata, stay off login-gated pages, and route requests through a scraper API so I am not bypassing technical blocks by hand.

I read YouTube’s Terms of Service end to end before answering the question I get asked most: is scraping YouTube legal? The honest answer is that two separate rulebooks apply, and they give different verdicts. General US law is fairly relaxed about collecting public data. YouTube’s own contract is not. Most of the confusion online comes from people quoting one rulebook and ignoring the other.

This guide separates the two. I cover what YouTube’s Terms of Service actually say about scraping, what the courts have decided about public data, where copyright and transcripts change the math, and how personal data brings GDPR into play. Then I show the setup I use to keep collection on the safe side of that line.

Is scraping YouTube legal in the United States?

Scraping YouTube is generally not a crime in the United States when the data is publicly visible, but it does breach YouTube’s Terms of Service. The first is a criminal question; the second is a contract question. The two get blurred together constantly, and keeping them apart is the whole point of this article.

The criminal question runs through the Computer Fraud and Abuse Act (CFAA), the federal anti-hacking law. The contract question runs through the agreement you accept by using YouTube. You can be on the right side of the first and the wrong side of the second at the same time, which is exactly where most YouTube scraping lands.

Here is the split in one table:

Question	Governing rule	Verdict for public YouTube data	Consequence if crossed
Is it a crime?	Computer Fraud and Abuse Act	Generally no (public data)	Criminal liability
Does it break the contract?	YouTube Terms of Service	Yes	Account/IP block, civil claim
Does it infringe copyright?	Copyright Act	Depends on what you copy	Statutory damages
Does it break privacy law?	GDPR / state privacy laws	Depends on personal data	Regulatory fines

The rest of this article walks through each row. The most-cited authority for the first row is a case that did not involve YouTube at all, so I will get to it after laying out what YouTube itself says, because the contract is the rule you agree to first.

What does YouTube’s Terms of Service say about scraping?

YouTube’s Terms of Service prohibit scraping directly. Under the “Permissions and Restrictions” section, the terms say you are not allowed to “access the Service using any automated means (such as robots, botnets or scrapers) except (a) in the case of public search engines, in accordance with YouTube’s robots.txt file; or (b) with YouTube’s prior written permission.” That clause is the single most important sentence for anyone asking whether YouTube scraping is against the terms of service. You can read it in full on the YouTube Terms of Service page.

Two exceptions are written into the clause, and almost nobody scraping YouTube qualifies for either:

Public search engines following robots.txt. This carve-out is for crawlers like Googlebot that obey youtube.com/robots.txt. A custom script is not a public search engine.
Prior written permission from YouTube. That means an explicit written grant, such as a signed agreement. Silence is not consent, and I would not rely on an informal email reply.

The “automated means” language is deliberately broad. It covers the entire “Service,” which means both the public pages and anything behind a login. The terms also restrict using content beyond what the Service expressly authorizes, which is what blocks bulk downloading of videos for reuse. So under YouTube’s own rules, the scraping prohibition is not ambiguous: automated collection without written permission is a terms-of-service violation.

YouTube’s robots.txt file backs this up at the technical layer. It disallows crawling of paths including /results (search results), /youtubei/ (the internal API), /comment, /get_video_info, and /feeds/videos.xml for every user-agent except Google’s own ad crawler. A page being reachable in your browser does not mean YouTube invites a robot to fetch it, and the robots.txt file is YouTube stating which doors it considers off-limits to automated clients.
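
As a rough sketch of how a client could check a URL against rules like these, Python's standard urllib.robotparser can evaluate them. The excerpt below is rebuilt from the paths named above and is an assumption, not a copy of the live youtube.com/robots.txt file; example-bot is a made-up user-agent.

```python
from urllib.robotparser import RobotFileParser

# Illustrative excerpt only: these are the paths named in the paragraph above,
# not a copy of the live youtube.com/robots.txt file.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /results
Disallow: /youtubei/
Disallow: /comment
Disallow: /get_video_info
Disallow: /feeds/videos.xml
"""


def is_allowed(url: str, user_agent: str = "example-bot") -> bool:
    """Return True if the excerpted rules do not disallow this URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_EXCERPT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.youtube.com/results?search_query=lofi",
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    ):
        verdict = "not disallowed" if is_allowed(url) else "disallowed"
        print(url, "->", verdict)
```

A True result only means the excerpted rules do not disallow the path. It is not permission under the Terms of Service, which restrict automated access regardless of what robots.txt says.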

That covers scraping the site. YouTube runs a second, separate contract for its official data pipe, and the rules there are stricter in a different way.

How is scraping different from the YouTube Data API?

Scraping pulls data out of YouTube’s web pages without authorization, while the YouTube Data API is an authorized, authenticated pipe that returns structured data within a quota. They are governed by different agreements and fail in different ways. The API is the route YouTube actively wants automated clients to use.

The YouTube API Services Terms of Service tighten the rules for developers. Section 3.1 requires that you “only access (or attempt to access) the YouTube API Services to develop and operate your API Client(s) by the means described in the Agreement, including in accordance with the documentation.” Section 15 adds that you “will not, and will not attempt to, exceed or circumvent use or quota restrictions.” Translated: you take the data the API hands you, through the documented endpoints, inside the quota. Going around the quota by scraping the site instead is the behavior these terms exist to stop.

Factor	Scraping the web pages	Official YouTube Data API
Authorization	None (violates main ToS)	Granted via API key + OAuth
Governing contract	YouTube Terms of Service	YouTube API Services Terms of Service
Output	HTML you parse yourself	Structured JSON
Quota	Undocumented, enforced by blocks	10,000 units/day default
Coverage	Anything on the page	Only fields the API exposes
Typical failure	IP block, CAPTCHA, 429	Quota exceeded error
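
To make the right-hand column concrete, here is a minimal sketch of building a request URL for the official route. It assumes the videos.list method of the YouTube Data API v3 with its part, id, and key query parameters, as I understand the public documentation. The function name and the DEMO_KEY value are placeholders of mine, and the sketch only builds the URL; it does not send anything.

```python
from urllib.parse import urlencode

# Assumed endpoint: the videos.list method of the YouTube Data API v3.
VIDEOS_ENDPOINT = "https://www.googleapis.com/youtube/v3/videos"


def build_videos_url(video_id: str, api_key: str) -> str:
    """Build a videos.list URL asking for snippet and statistics fields."""
    params = {
        "part": "snippet,statistics",  # title/description plus public counts
        "id": video_id,
        "key": api_key,
    }
    return f"{VIDEOS_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # DEMO_KEY is a placeholder, not a working credential.
    print(build_videos_url("dQw4w9WgXcQ", "DEMO_KEY"))
```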

The default API quota is 10,000 units per day, and a single search call costs 100 units, so the free tier runs out at roughly 100 searches a day. That ceiling is the reason people look past the API in the first place. When the API does not return a field cleanly (full comment threads, complete transcripts, search result pages at volume), scraping the public page is the fallback, and that fallback is what carries the terms-of-service exposure described above. I walk through the technical mechanics of both routes in my guide on how to scrape YouTube.
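
The ceiling is simple integer division, sketched here with the two figures quoted above (10,000 units per day, 100 units per search). The function name is my own, and costs for other call types would need to come from the API documentation.

```python
DAILY_QUOTA_UNITS = 10_000  # default daily quota cited above
SEARCH_COST_UNITS = 100     # cost of one search call cited above


def calls_per_day(cost_per_call: int, daily_quota: int = DAILY_QUOTA_UNITS) -> int:
    """Return how many whole calls of a given cost fit into the daily quota."""
    if cost_per_call <= 0:
        raise ValueError("cost_per_call must be a positive number of units")
    return daily_quota // cost_per_call


if __name__ == "__main__":
    print(calls_per_day(SEARCH_COST_UNITS))  # 10,000 // 100 = 100 searches
```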

So YouTube’s contract says no, and the API says use the front door within limits. The next question is whether breaking that contract is actually illegal, and that is where the courts come in.

What does US case law say about scraping public data?

US case law currently treats scraping publicly accessible data as outside the reach of the Computer Fraud and Abuse Act, which means it is not federal hacking. The leading authority is hiQ Labs v. LinkedIn. On April 18, 2022, the Ninth Circuit Court of Appeals, ruling on a preliminary injunction, concluded that the CFAA likely does not bar scraping data that a website makes available to the general public. The full docket is on CourtListener.

The court’s reasoning was specific: when a network “generally permits public access to its data,” accessing that public data is not access “without authorization” under the CFAA. There are no login walls or permissions to defeat on a public page, so there is nothing to break into in the criminal sense. Law firm Jenner & Block’s analysis of the ruling lays out how narrowly the court read the statute.

The limits of that ruling matter as much as the holding:

Public only. The protection covers data available without logging in. The moment you authenticate or bypass a barrier, the CFAA analysis changes.
CFAA only. hiQ resolved the federal hacking question. It did not bless breach of contract. The same litigation later produced a separate finding that hiQ had breached LinkedIn’s User Agreement, summarized by Morgan Lewis.
Civil exposure remains. A site can still send a cease-and-desist, block you, and pursue civil claims for contract violations or other torts.

For YouTube specifically, this means scraping public video metadata is unlikely to be a CFAA crime, and is very likely a breach of YouTube’s Terms of Service that YouTube can act on by blocking you or, in principle, suing for breach of contract. Criminal exposure and contract exposure are simply two different layers. The layer that bites hardest for media platforms is usually a third one: copyright.

Is scraping YouTube transcripts and videos legal under copyright?

Scraping the video file or the transcript carries copyright exposure that scraping a view count does not, because the video and its spoken content are the creator’s copyrighted work. YouTube’s terms are explicit that creators keep ownership: the Terms of Service state that “you retain all of your ownership rights in your Content.” Scraping a number about a video is different in kind from copying the creative work itself.

The live test of this is happening in court right now. In August 2024, YouTube creator David Millette filed a class action, Millette v. OpenAI, in the US District Court for the Northern District of California. The suit alleges that OpenAI transcribed more than a million hours of YouTube videos using its Whisper speech-recognition model and used those transcripts to train GPT-4, without permission or compensation. TechCrunch reported the filing, which seeks over $5 million in damages and claims the practice infringed copyright and broke YouTube’s terms of service. Bloomberg Law covered the parallel complaint against Google over Gemini training.

That case is why YouTube’s transcript policy gets so much attention now. The transcript is downstream of a copyrighted video, so reusing transcript text at scale (especially to train models or republish content) is the highest-risk form of YouTube scraping from a copyright standpoint. The practical guidance I follow:

Treat transcripts as the creator’s copyrighted text, and assume reuse needs the same care as copying the video.
Personal analysis, search, and research use sit on far safer ground than commercial republication or model training.
If you need transcripts programmatically, pull them for your own analysis and keep the copyrighted text out of products you redistribute. My walkthrough on how to get a YouTube transcript covers the extraction side.

Copyright governs the content. One more body of law governs a specific slice of that content: anything that identifies a person.

Does scraping personal data from YouTube break privacy law?

Scraping personal data from YouTube can break privacy laws such as the GDPR even when the data is public, because public availability does not strip away data-protection rights. Usernames, channel-owner identities, comment authors, and any contact details are personal data the moment they relate to an identifiable person. Public metadata like aggregate view counts is not.

European regulators have moved on exactly this. France’s data protection authority, the CNIL, confirmed that collecting publicly accessible data can rely on the “legitimate interest” legal basis under GDPR Article 6(1)(f), but only after a strict necessity and balancing test. The European Data Protection Board reached a similar position in its Opinion 28/2024 on AI models. Legitimate interest is a door, and it opens only under conditions.

The enforcement is real. In December 2024 the CNIL fined the company KASPR 240,000 euros for scraping contact details from LinkedIn, including data of users who had restricted its visibility. The lesson for YouTube scraping is direct: collecting public metrics is low risk on the privacy axis, and collecting data that identifies individual users (comment authors, channel operators, emails) pulls you into GDPR obligations and potential fines.

Stack the three content-level risks together and a clear pattern appears, which points straight at how to scrape within the rules.

What you collect	ToS	Copyright risk	Privacy (GDPR) risk	Overall risk
Public view/like counts, durations	Violation	Low	Low	Low
Video titles, descriptions, tags	Violation	Low to moderate	Low	Low to moderate
Full transcripts / captions	Violation	High	Low	High
Video/audio files	Violation	High	Low	High
Comment text + author identities	Violation	Moderate	High	High
Emails / personal contact data	Violation	Low	High	High
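
One way to use this table in a collection pipeline is as a lookup consulted before a job runs. The sketch below encodes only the "Overall risk" column. The category keys and function name are hypothetical labels of mine, and the ratings are this article's judgment, not a legal standard.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Ordered so that max() picks the most severe rating."""
    LOW = 1
    LOW_TO_MODERATE = 2
    HIGH = 3


# "Overall risk" column of the table above; keys are hypothetical labels.
OVERALL_RISK = {
    "counts_durations": Risk.LOW,
    "titles_descriptions_tags": Risk.LOW_TO_MODERATE,
    "transcripts": Risk.HIGH,
    "media_files": Risk.HIGH,
    "comments_with_authors": Risk.HIGH,
    "contact_data": Risk.HIGH,
}


def overall_risk(categories: list[str]) -> Risk:
    """Return the highest table rating among the categories a job collects."""
    if not categories:
        raise ValueError("no categories given")
    # An unknown category raises KeyError rather than being guessed at.
    return max(OVERALL_RISK[name] for name in categories)


if __name__ == "__main__":
    print(overall_risk(["counts_durations", "titles_descriptions_tags"]).name)
    print(overall_risk(["counts_durations", "transcripts"]).name)
```

A job inherits the rating of its riskiest category, so adding transcripts to a metadata-only job moves the whole job to high risk.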

How do I scrape YouTube data while staying on the safe side?

I lower the risk by controlling three things: what I collect, whether I touch logged-in pages, and whether I bypass technical blocks by hand. None of this makes scraping compliant with YouTube’s Terms of Service, and I am clear-eyed that automated collection breaches the contract. It does keep me well clear of the criminal, copyright, and privacy lines that actually carry penalties. The practices I hold to:

Stay on public data. Public metadata only. I do not log in to scrape, and I do not touch private or unlisted content, which keeps my collection inside the public-data scenario the hiQ ruling covers.
Respect the high-risk categories. I treat transcripts and video files as copyrighted, and I avoid harvesting comment-author identities or emails as a default, because that is where copyright and GDPR exposure live.
Do not circumvent security. Solving CAPTCHAs to defeat a block, or faking authentication, is the kind of “bypassing technical protections” that turns a gray case darker. I avoid it.
Prefer the official API first. For any field the YouTube Data API exposes inside quota, that is the front door, and I use it.
Use a scraper API for the rest. For public data the API does not return cleanly, a managed scraper API handles the request layer so I am not hand-rolling proxy rotation or block evasion.

That last point is where our YouTube API fits. It takes a YouTube URL and returns structured public data, so the request mechanics live on the server side instead of in a script I am babysitting. A public video-metadata request looks like this:

curl -G "https://api.youtubescraperapi.com/api/v1/youtube/video" --data-urlencode "url=https://www.youtube.com/watch?v=dQw4w9WgXcQ" --data-urlencode "api_key=$API_KEY"

For transcript-specific work, the YouTube Transcript API endpoint returns the timed text for a video, and the YouTube video scraper endpoint returns titles, view counts, and other public metadata. You can create a free youtubescraperapi.com key and run the call above against any public video. Using a scraper API does not change YouTube’s terms, and it does not grant permission YouTube has not given. What it does is keep me on public data and out of manual block-circumvention, which is the part that escalates legal risk.
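
For anyone calling the endpoint from a script, here is a minimal Python sketch that builds the same request URL with the video link percent-encoded, so characters such as ? and = inside it are not read as part of the outer query string. The endpoint and the url and api_key parameters come from the curl example above; the function name is mine. The sketch does not send the request, because the response format is not described here.

```python
import os
from urllib.parse import urlencode

# Endpoint and parameter names taken from the curl example above.
ENDPOINT = "https://api.youtubescraperapi.com/api/v1/youtube/video"


def build_request_url(video_url: str, api_key: str) -> str:
    """Return the endpoint URL with both parameters percent-encoded."""
    query = urlencode({"url": video_url, "api_key": api_key})
    return f"{ENDPOINT}?{query}"


if __name__ == "__main__":
    # Reads the key from the API_KEY environment variable, as in the curl call.
    key = os.environ.get("API_KEY", "YOUR_API_KEY")
    print(build_request_url("https://www.youtube.com/watch?v=dQw4w9WgXcQ", key))
```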

If you are choosing a tool for this, I compared the main options on accuracy and block handling in my roundup of the best YouTube scrapers in 2026. The legal posture is the same whichever you pick: the responsibility for what you collect, and how you use it, stays with you.

The bottom line on YouTube scraping and the law

Scraping public YouTube data is generally not a US crime under the CFAA after hiQ v. LinkedIn, but it does violate YouTube’s Terms of Service, which ban automated access without written permission. The penalties that actually sting come from three other sources: breach of contract (YouTube blocking or suing you), copyright (the OpenAI transcript litigation), and privacy law (the CNIL’s KASPR fine). Keep collection to public metadata, stay off logged-in and copyrighted content, avoid scraping personal data, and do not bypass security measures. Do that, and the question stops being “is this a crime” and becomes “am I respecting the contract and the content,” which is a question you can manage.

FAQ

Is scraping YouTube illegal?

Scraping publicly visible YouTube data (titles, view counts, public metadata) is generally not a criminal act in the US after hiQ v. LinkedIn, which held that accessing public data does not violate the Computer Fraud and Abuse Act. It still breaches YouTube's Terms of Service, which is a contract issue, and scraping private or copyrighted data carries separate legal risk.

Does YouTube's Terms of Service prohibit scraping?

Yes. YouTube's Terms of Service prohibit accessing the Service "using any automated means (such as robots, botnets or scrapers)" with two exceptions: public search engines following YouTube's robots.txt, and parties with YouTube's prior written permission. The YouTube API Services Terms add that developers may only access the API Services "by the means described in the Agreement, including in accordance with the documentation."

Can I web scrape YouTube for personal or research use?

You can technically scrape public YouTube pages for personal or research use, and US public-data law generally does not treat that as a crime. It still violates YouTube's Terms of Service, so the practical risk is account or IP blocking. Selling or redistributing the data, or scraping personal data, raises the stakes.

Is scraping YouTube transcripts legal?

Scraping a public transcript sits in the same gray zone as other public metadata, but transcripts carry extra copyright exposure because the underlying video is the creator's work. A 2024 class action (Millette v. OpenAI) alleges that transcribing YouTube videos to train AI models infringed copyright and broke YouTube's terms. Read YouTube's policy and the creator's rights before reusing transcript text commercially.

What is the safest legal way to get YouTube data?

The lowest-risk route is YouTube's official Data API for the fields it exposes, within its quota. For data the API does not return cleanly, limit yourself to public metadata, avoid logged-in pages, and do not bypass security measures. A scraper API handles the request layer without you circumventing technical blocks by hand.