
YouTube Transcript Scraper API

Our YouTube transcript API takes a video ID and returns the full transcript as clean JSON: timestamped segments, the language, whether the captions are auto-generated or human, and the complete text in one field. Proxies, anti-bot, and rendering are handled, so one request with your API key is the whole integration.

JSON structured output
2.6s median response
99.2% success rate
1,000 free requests

Why YouTube Transcript data is hard to get

The official YouTube Data API v3 only returns caption text for videos you own: captions.download requires OAuth and edit permission on the video, so it returns a 403 for anyone else’s public video. The open-source libraries that read the public caption track work locally but raise IpBlocked when run from cloud IPs. That is the gap a hosted YouTube transcript API fills.


The YouTube Transcript Scraper API in one request

cURL
curl "https://api.youtubescraperapi.com/api/v1/youtube/transcript?video_id=dQw4w9WgXcQ&api_key=$API_KEY"
Python
import os

import requests

resp = requests.get(
    "https://api.youtubescraperapi.com/api/v1/youtube/transcript",
    params={
        "video_id": "dQw4w9WgXcQ",   # YouTube video id
        # "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",  # or pass the full watch URL
        # "format": "both",          # both | segments | text  (default both)
        # "units": "seconds",        # seconds | ms            (default seconds)
        "api_key": os.environ["API_KEY"],
    },
    timeout=30,
)
resp.raise_for_status()  # fail fast on HTTP-level errors before parsing the body
data = resp.json()

if not data["transcript_available"]:
    # no captions: a reason, not an error (age_restricted, members_only,
    # transcripts_disabled, none_found, video_unavailable)
    print("no transcript:", data["reason"])
else:
    print(data["language_name"], "| auto-generated:", data["is_generated"], "| source:", data["source"])
    print(data["segment_count"], "segments,", data["word_count"], "words")
    for seg in data["segments"][:3]:
        print(f"{seg['start']:.2f}s  {seg['text']!r}")

    # full transcript as one string, ready for an LLM
    transcript = data["text"]

Parameters

| Parameter | Required | Default | Notes |
| --- | --- | --- | --- |
| video_id | required | - | YouTube video id, e.g. dQw4w9WgXcQ. Required unless you pass url. |
| url | optional | - | Full watch URL as an alternative to the id. |
| format | optional | both | both, segments, or text. Choose timestamped segments, one combined text string, or both. |
| units | optional | seconds | seconds or ms. Sets the unit for each segment's start and duration. |
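
The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: the helper name build_params and its argument names are made up for illustration, and it assumes the allowed values are exactly the ones listed in the table.

```python
ALLOWED_FORMATS = {"both", "segments", "text"}
ALLOWED_UNITS = {"seconds", "ms"}


def build_params(api_key, video_id=None, url=None, fmt="both", units="seconds"):
    """Build the query parameters for the transcript endpoint (hypothetical helper)."""
    if not video_id and not url:
        raise ValueError("pass either video_id or url")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if units not in ALLOWED_UNITS:
        raise ValueError(f"units must be one of {sorted(ALLOWED_UNITS)}")

    params = {"api_key": api_key, "format": fmt, "units": units}
    # Prefer the id when both are supplied; otherwise fall back to the watch URL.
    if video_id:
        params["video_id"] = video_id
    else:
        params["url"] = url
    return params
```

The returned dict can be passed as params= to requests.get, exactly as in the quickstart.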

What the YouTube Transcript Scraper API returns

200 OK
{
  "video_id": "dQw4w9WgXcQ",
  "language": "en",
  "language_name": "English",
  "is_generated": false,
  "source": "native",
  "format": "both",
  "units": "seconds",
  "segment_count": 61,
  "word_count": 487,
  "char_count": 2089,
  "segments": [
    {
      "start": 18.64,
      "duration": 3.6,
      "text": "We're no strangers to love"
    },
    {
      "start": 22.24,
      "duration": 3.84,
      "text": "You know the rules and so do I"
    }
  ],
  "text": "We're no strangers to love You know the rules and so do I ...",
  "available_languages": ["en", "es", "fr", "de", "pt", "ja"],
  "transcript_available": true
}

| Field | Type | Description |
| --- | --- | --- |
| video_id | string | The video id the transcript belongs to |
| language | string | Language code of the returned transcript, e.g. en |
| language_name | string | Human-readable language name, e.g. English |
| is_generated | boolean | true for auto-generated (ASR) captions, false for human-uploaded |
| source | string | native when the captions are a real uploaded track, asr when they are YouTube's auto-generated speech recognition |
| format | string | Echoes the requested format: both, segments, or text |
| units | string | The time unit used for each segment, seconds or ms |
| segment_count | integer | Number of timed segments in the transcript |
| word_count | integer | Total words across the transcript, handy for token and cost estimates |
| char_count | integer | Total characters in the combined transcript text |
| segments | array | Timed lines, each with start, duration, and text. Returned when format is both or segments. |
| text | string | The full transcript joined into one string. Returned when format is both or text. |
| available_languages | array | Language codes that have a caption track for this video |
| transcript_available | boolean | true when captions came back. false for caption-less videos, paired with a reason instead of an error. |
| reason | string | Present only when transcript_available is false: age_restricted, members_only, transcripts_disabled, none_found, or video_unavailable |
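
Because segments and text depend on the requested format, and reason only appears when no transcript came back, it can help to normalize the response into one small object. This is a sketch under the field list above; the TranscriptResult class and parse_transcript function are illustrative names, not part of the API, and the sketch does not assume which fields accompany a transcript_available: false response beyond reason.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TranscriptResult:
    video_id: str
    available: bool
    reason: Optional[str] = None        # set only when no transcript came back
    language: Optional[str] = None
    is_generated: Optional[bool] = None
    text: str = ""                      # empty when format=segments
    segments: list = field(default_factory=list)  # empty when format=text


def parse_transcript(data: dict) -> TranscriptResult:
    """Map the JSON body onto TranscriptResult, tolerating optional fields."""
    if not data.get("transcript_available"):
        return TranscriptResult(
            video_id=data.get("video_id", ""),
            available=False,
            reason=data.get("reason"),
        )
    return TranscriptResult(
        video_id=data["video_id"],
        available=True,
        language=data.get("language"),
        is_generated=data.get("is_generated"),
        text=data.get("text", ""),
        segments=data.get("segments", []),
    )
```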

What you can build using the API

AI summaries and RAG

Pass the text field straight into an LLM to summarize a talk or ground a retrieval pipeline on real video content.

Searchable video archives

Index the transcript so users can search inside hours of footage and jump to the timestamped segment.

Subtitles and captions

Use the timed segments to build SRT or VTT files, or to repurpose long videos into clipped captions.

Content repurposing

Turn a video transcript into a blog draft, show notes, or a newsletter without retyping a word.

Multilingual workflows

Read available_languages, then request the track you need for translation or localization.

Research and analytics

Mine transcripts across a channel for keywords, topics, and mentions at scale.

Why teams choose our YouTube Transcript Scraper API

Our YouTube transcript API runs on our infrastructure: rotating residential proxies, anti-bot and CAPTCHA handling, and JS rendering behind one youtube/transcript call that returns validated JSON in about 2.6 seconds. The free plan covers 1,000 requests, and you only pay for successful calls.

Segments or one text block

The format param lets you ask for timestamped segments, a single combined text string, or both in one call, so you skip stitching or splitting on your side.

Auto-caption (ASR) fallback

When a video has no human captions, the endpoint returns YouTube's auto-generated track with source: "asr" and is_generated: true, so most spoken-word videos still come back with text.

Clear no-caption reasons

Caption-less videos return transcript_available: false and a reason (age_restricted, members_only, transcripts_disabled, none_found, video_unavailable) rather than an error you have to catch.

Seconds or milliseconds

The units param sets each segment's start and duration to seconds or ms, matching whatever your player or subtitle tooling expects.

Proxy rotation and anti-bot

Requests route through residential IPs with anti-bot handling, which sidesteps the datacenter-IP blocking that throws IpBlocked from cloud servers.

Pay for success

Failed requests are not charged, so retries and blocks cost you nothing and the bill tracks real transcripts returned.
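
Since the response echoes the units it used, downstream code can normalize timings without tracking what was requested. This is a small sketch, assuming that start and duration are plain numbers in either unit; segments_in_seconds is a hypothetical helper name.

```python
def segments_in_seconds(data: dict) -> list:
    """Return the segments with start and duration expressed in seconds."""
    # The response's units field says how to read the numbers.
    scale = 1000.0 if data.get("units") == "ms" else 1.0
    return [
        {
            "start": seg["start"] / scale,
            "duration": seg["duration"] / scale,
            "text": seg["text"],
        }
        # segments is absent when format=text, so default to an empty list
        for seg in data.get("segments", [])
    ]
```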

YouTube Transcript Scraper API vs the official YouTube API

| Approach | Setup | Maintenance | Transcript text | Limits |
| --- | --- | --- | --- | --- |
| requests + parsing | high | constant | yes, if it loads | IpBlocked on cloud IPs |
| Headless browser (Playwright) | high | high | yes | slow, rate-limited |
| YouTube Data API v3 | OAuth setup | low | owner videos only | 403 for others, 10k units/day |
| Our transcript API | one call | none | any public video | handled, pay per success |

Start free, scale when ready

| Plan | Price | Best for |
| --- | --- | --- |
| Free | 1,000 requests | Testing and small jobs |
| Pro | $0.60 / 1k | Production workloads |
| Pay-as-you-go | $0.90 / 1k | Spiky or one-off volume |

Median response 2.6s. You only pay for successful requests.
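
For budgeting, the per-1,000 rates translate into a simple linear estimate: 25,000 successful requests come to 25 × $0.60 = $15.00 on Pro, or 25 × $0.90 = $22.50 pay-as-you-go. The sketch below encodes that arithmetic. It assumes billing is strictly proportional to successful requests, and because the page does not say whether free requests offset paid usage, that is left as an explicit free_remaining argument defaulting to zero.

```python
from decimal import Decimal

# Rates per 1,000 successful requests, taken from the pricing table.
RATES_PER_1K = {"pro": Decimal("0.60"), "payg": Decimal("0.90")}


def estimate_cost(successful_requests: int, plan: str, free_remaining: int = 0) -> Decimal:
    """Estimate the bill in dollars; failed requests are excluded by the caller."""
    billable = max(successful_requests - free_remaining, 0)
    cost = RATES_PER_1K[plan] * billable / 1000
    return cost.quantize(Decimal("0.01"))
```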

FAQ

What is a YouTube transcript API?

A YouTube transcript API is an endpoint that takes a video ID and returns the video's transcript as structured data. Ours reads the public caption track and gives you back timestamped segments, the language, whether the captions are auto-generated, and the full transcript text in one JSON response, typically in about 2.6 seconds. A format param lets you choose timestamped segments, one combined text string, or both.

Does the official YouTube Data API return transcripts?

No. The YouTube Data API v3 only downloads captions for videos you own, through a captions.download method that requires OAuth with the youtube.force-ssl scope and edit permission on the video. It returns a 403 for any public video you do not control, so it cannot fetch transcripts for arbitrary videos. That gap is the reason a dedicated transcript API exists.

Can I get a transcript for any YouTube video?

You can get a transcript for any YouTube video that has a caption track, whether the creator uploaded it or YouTube auto-generated it. When no human captions exist, the endpoint falls back to the auto-generated (ASR) track, returning source: "asr" and is_generated: true, so most spoken-word videos work. If a video genuinely has no captions, the response is not an error: it comes back with transcript_available: false and a reason such as none_found, transcripts_disabled, members_only, age_restricted, or video_unavailable.

How do I get captions in a specific language?

Each response includes an available_languages array listing the language codes that have a caption track for that video. Read that field to see what exists. The response also returns the chosen track's language and language_name, so you know exactly which transcript you received.
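
Choosing among the listed tracks can be as simple as walking a preference list. The helper below is an illustrative sketch with a made-up name, pick_language. It stops at selecting a code, because this page does not name the request parameter used to ask for a specific track.

```python
def pick_language(available, preferred):
    """Return the first preferred language code that the video offers, or None."""
    available_set = set(available)
    for code in preferred:
        if code in available_set:
            return code
    return None
```

With the sample response's available_languages, a preference list of ["it", "fr", "en"] resolves to "fr".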

Do I need my own proxies or an API key from Google?

No. You do not need a Google Cloud project, an OAuth flow, or your own proxy pool. Proxy rotation, anti-bot handling, and retries run on our servers. You send one request with your API key and the video ID, and the parsed transcript comes back.

What does it cost to pull a transcript?

The free plan includes 1,000 requests. Paid usage costs $0.60 per 1,000 requests on the Pro plan, or $0.90 per 1,000 pay-as-you-go. Failed requests are never charged, so you only pay for transcripts that actually come back.

Get YouTube transcripts as JSON
Free plan, 1,000 requests. No credit card required.