The episode stream is a long-lived Server-Sent Events (SSE) connection that pushes episodes to you the moment they reach a chosen stage of ingestion — no polling. Open one connection, pick the level of hydration you care about, optionally filter to the podcasts you follow, and receive each new episode as it crosses that point.
The episode stream is an Enterprise feature. Access requires an API key belonging to an organization on the Enterprise plan. Authenticate exactly as you do elsewhere — the X-API-Key header (recommended) or an Authorization: Bearer token.
GET  https://api.particle.pro/v1/podcasts/episodes/stream
POST https://api.particle.pro/v1/podcasts/episodes/stream
Use GET for no filter, a popularity threshold, or a small explicit set of podcasts. Use POST (with a JSON body) when you want to filter on a large podcast_ids set — a long list would exceed query-string limits on a GET. The two forms are otherwise identical.

Pick a milestone

Episodes move through ingestion in stages. You subscribe to exactly one milestone and receive each episode once, when it reaches that stage. The milestones are strictly ordered — each builds on the previous — so picking a later milestone means you wait longer but the episode arrives with more data already populated.
| milestone | Delivered when… | What’s populated on the episode |
| --- | --- | --- |
| discovered | the episode first appears in the feed | Title, URL, publish date, podcast, basic metadata. No transcript/segments/clips yet. |
| transcribed (default) | speech-to-text + speaker identification finish | Full diarized transcript and identified speakers. |
| segmented | the transcript is broken into structural segments | Segments (intros, ad reads, topic blocks). |
| fully_ingested | ingestion is complete | Clips and the full enrichment set. This is the terminal contract: anything added to “fully ingested” in future automatically flows to subscribers of this milestone. |
If you don’t pass milestone, you get transcribed. Choose the single milestone that matches the data you need — a later one implies all earlier stages already happened. Expect real latency between stages: transcription and enrichment take minutes to hours.

Filter the podcasts

By default the stream delivers every episode in the catalog. Narrow it two ways, which combine as a union (an episode is delivered if it matches either):
  • podcast_ids — an explicit set of podcasts, each given as a slug (pivot) or ID. An episode is delivered if its podcast is in the set. Unknown values are ignored, so a single bad slug won’t break the stream — but if none of the supplied ids match a known podcast, the request fails immediately with an error event rather than leaving you waiting on a stream that can never produce anything.
  • popularity_threshold — a number in (0, 1). Podcast popularity is normalized 0–1 across the catalog (a percentile), so 0.9 ≈ the top 10% most popular podcasts. Use this to follow “the popular stuff” without enumerating ids.
Pass a large podcast_ids set via the POST body (see below). On GET, podcast_ids is capped at 100; beyond that you’ll get an error event telling you to use POST.

Parameters

milestone, cursor, since, and include are always query parameters. podcast_ids and popularity_threshold are query parameters on GET and JSON body fields on POST.
| Parameter | Description |
| --- | --- |
| milestone | One of discovered, transcribed, segmented, fully_ingested. Defaults to transcribed. |
| podcast_ids | Slugs or IDs to filter to (union with popularity_threshold). GET: comma-separated, ≤100. POST: JSON array, ≤1000. |
| popularity_threshold | Number in (0, 1). Deliver only podcasts at or above this popularity percentile. |
| cursor | Opaque resume token from a previously received event. See Resuming. |
| since | ISO 8601 date or date-time to backfill from when you have no cursor. Ignored if cursor is set. |
| include | Heavy relations to embed in each episode (comma-separated): transcript, segments, clips, or all. Omitted by default. See Hydrate the payload. |
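The GET/POST split and the query-versus-body placement described above can be sketched as a small request builder. This is an illustration, not part of any official SDK: buildStreamRequest and its option names (podcastIds, popularityThreshold) are hypothetical, and the only facts it encodes are the documented caps (100 ids on GET, 1000 on POST) and parameter placement.

```javascript
// Hypothetical helper: returns the arguments for fetch(url, init).
const STREAM_URL = "https://api.particle.pro/v1/podcasts/episodes/stream";
const GET_MAX_IDS = 100;   // documented GET cap on podcast_ids
const POST_MAX_IDS = 1000; // documented POST cap on podcast_ids

function buildStreamRequest(apiKey, opts = {}) {
  const { milestone, cursor, since, include, podcastIds = [], popularityThreshold } = opts;
  if (podcastIds.length > POST_MAX_IDS) {
    throw new RangeError(`podcast_ids is limited to ${POST_MAX_IDS} entries`);
  }

  const url = new URL(STREAM_URL);
  // milestone, cursor, since, and include are query parameters on GET and POST alike.
  if (milestone) url.searchParams.set("milestone", milestone);
  if (cursor) url.searchParams.set("cursor", cursor);
  else if (since) url.searchParams.set("since", since); // since is ignored when cursor is set
  if (include) url.searchParams.set("include", include);

  const headers = { "X-API-Key": apiKey };

  if (podcastIds.length > GET_MAX_IDS) {
    // Too many ids for a query string: the filters move into a JSON body.
    const body = { podcast_ids: podcastIds };
    if (popularityThreshold !== undefined) body.popularity_threshold = popularityThreshold;
    headers["Content-Type"] = "application/json";
    return { url: url.toString(), init: { method: "POST", headers, body: JSON.stringify(body) } };
  }

  // No ids, or a small set: the filters stay in the query string.
  if (podcastIds.length > 0) url.searchParams.set("podcast_ids", podcastIds.join(","));
  if (popularityThreshold !== undefined) {
    url.searchParams.set("popularity_threshold", String(popularityThreshold));
  }
  return { url: url.toString(), init: { method: "GET", headers } };
}
```

A caller would then do something like const { url, init } = buildStreamRequest(key, { milestone: "segmented", podcastIds: ["pivot"] }) and pass both to fetch.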

Open the stream

A simple GET — all transcribed episodes, live:
# -N disables curl's output buffering so events print as they arrive.
curl -N "https://api.particle.pro/v1/podcasts/episodes/stream?milestone=transcribed" \
  -H "X-API-Key: $PARTICLE_API_KEY"
Filtered by popularity, or by a handful of shows (GET):
# Top ~10% most popular podcasts, at the segmented milestone:
curl -N "https://api.particle.pro/v1/podcasts/episodes/stream?milestone=segmented&popularity_threshold=0.9" \
  -H "X-API-Key: $PARTICLE_API_KEY"

# A few specific shows by slug:
curl -N "https://api.particle.pro/v1/podcasts/episodes/stream?podcast_ids=pivot,lex-fridman,all-in" \
  -H "X-API-Key: $PARTICLE_API_KEY"
A large explicit set (POST with a JSON body):
curl -N -X POST "https://api.particle.pro/v1/podcasts/episodes/stream?milestone=transcribed" \
  -H "X-API-Key: $PARTICLE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "podcast_ids": ["pivot", "lex-fridman", "QpMz7GYKfSNuUa6zKXA4Q", "... up to 1000 ..."] }'
With no cursor or since, the stream is live-only: you receive episodes that reach your milestone from the moment you connect forward.

Event format

Each message is an SSE event. There are two event types.

event: episode (an episode reached your milestone). The data is a JSON envelope:
event: episode
data: {
  "milestone": "transcribed",
  "cursor": "g3Qk9m8...",          // opaque resume token — store this
  "episode": {
    "id": "78cgekLUjCJBUZbj3s5K8Y",
    "title": "WHCD Shooting Aftermath, Musk and Altman Face-Off…",
    "podcast": { "id": "QpMz7GYKfSNuUa6zKXA4Q", "title": "Pivot" },
    "published_at": "2026-04-28T10:00:00Z",
    "has_transcript": true
    // …same shape as a /v1/podcasts/episodes list entry
  }
}
The episode object is the same list-shaped representation returned by list episodes and the feed, hydrated to the level implied by your milestone (has_transcript, segment_count, etc. reflect the stage reached). For the full per-episode detail (topics, all entities, videos), fetch GET /v1/podcasts/episodes/{id}, or embed the heavy relations inline with include.

event: error (a terminal error, e.g. a filter that matched no podcasts, too many ids for a GET, or an invalid cursor). The server sends one and closes the connection:
event: error
data: { "message": "no podcasts matched the provided filter (podcast_ids / popularity_threshold)" }

Hydrate the payload

By default each episode carries only its metadata, counts, and flags (has_transcript, segment_count, clip_count) — the heavy relations are not shipped, so a consumer that only needs to know an episode reached a milestone never pays for transcript bytes. To embed those relations directly — and avoid a follow-up request per delivered episode — pass include:
| include value | Embeds | Available at milestone |
| --- | --- | --- |
| transcript | episode.transcript: the dialogue transcript, identical to GET /v1/podcasts/episodes/{id}/transcript?format=dialogue | transcribed |
| segments | episode.segments | segmented |
| clips | episode.clips | fully_ingested |
| all | everything available at the chosen milestone | any (milestone-relative) |
Combine values with commas: include=transcript,clips (valid at fully_ingested). A relation can only be embedded at a milestone that guarantees it: each relation first becomes available at the milestone listed for it above, and because milestones are ordered, it remains available at every later milestone. Asking for clips at milestone=transcribed is a contradiction (the event would be delivered before clips exist) and is rejected with a terminal error event. all is milestone-relative: it expands to exactly the relations your milestone guarantees, so it never conflicts (e.g. all at transcribed embeds just the transcript).
# Segments at the segmented milestone, embedded in each event:
curl -N "https://api.particle.pro/v1/podcasts/episodes/stream?milestone=segmented&include=segments" \
  -H "X-API-Key: $PARTICLE_API_KEY"

# Everything available at the terminal milestone:
curl -N "https://api.particle.pro/v1/podcasts/episodes/stream?milestone=fully_ingested&include=all" \
  -H "X-API-Key: $PARTICLE_API_KEY"
Word-level transcripts are paginated and can’t be embedded inline; fetch them from GET /v1/podcasts/episodes/{id}/transcript/words.
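The include rules above can be checked on the client before opening a connection, which avoids a round trip that would end in a terminal error event. The sketch below is one way to do that; resolveInclude is a hypothetical helper, and its tables simply restate the documented milestone order and the "Available at milestone" column.

```javascript
// Milestones in their documented order, earliest first.
const MILESTONES = ["discovered", "transcribed", "segmented", "fully_ingested"];
// Earliest milestone at which each relation is guaranteed.
const AVAILABLE_AT = { transcript: "transcribed", segments: "segmented", clips: "fully_ingested" };

// Hypothetical client-side check: returns the relations that would be embedded,
// or throws if the combination would be rejected.
function resolveInclude(milestone, include) {
  const reached = MILESTONES.indexOf(milestone);
  if (reached === -1) throw new Error(`unknown milestone: ${milestone}`);

  // Relations whose milestone is at or before the one subscribed to.
  const allowed = Object.keys(AVAILABLE_AT).filter(
    (rel) => MILESTONES.indexOf(AVAILABLE_AT[rel]) <= reached
  );

  const requested = include.split(",").map((s) => s.trim()).filter(Boolean);
  if (requested.includes("all")) return allowed; // all is milestone-relative

  for (const rel of requested) {
    if (!Object.hasOwn(AVAILABLE_AT, rel)) throw new Error(`unknown include value: ${rel}`);
    if (!allowed.includes(rel)) {
      throw new Error(`${rel} requires milestone=${AVAILABLE_AT[rel]} or later`);
    }
  }
  return requested;
}
```

For example, resolveInclude("transcribed", "clips") throws, while resolveInclude("fully_ingested", "transcript,clips") returns both relations. The server remains the authority; this only catches obvious mismatches early.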

Manage the stream lifecycle

Programming against the stream is mostly about three things: store the cursor, dedupe on episode id, and reconnect.
Disconnections are normal, so design for automatic reconnection from day one. A long-lived stream will be interrupted periodically, and most often it’s not your network: we ship frequently, and every deploy does a rolling restart of the serving pods, which closes all open streams. Idle proxies and load balancers also recycle long connections. This is routine operation, not an error and not data loss: the durable log and your cursor guarantee a gap-free resume.

Treat “the connection ended” as an ordinary, expected event your client handles silently, not an exception to alert on. Build the reconnect-with-backoff loop in from your very first implementation (see A resilient consumer); a client that assumes one connection stays open indefinitely will break during the next deploy.

The cursor

Every episode event carries an opaque cursor. Treat it as a black box — don’t parse it. Persist the cursor of the last event you have fully processed. It’s your resume point.
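As a minimal sketch of cursor persistence, the two functions below store the cursor in a local file using Node's fs/promises module. The function names match the helpers used in the consumer example on this page, but the implementations and the file path are assumptions; a database row or key-value entry would serve equally well.

```javascript
// CommonJS form; in an ES module, use: import fs from "node:fs/promises";
const fs = require("node:fs/promises");

// Hypothetical location for the stored cursor.
const CURSOR_FILE = "./episode-stream.cursor";

// Returns the stored cursor, or null if none has been saved yet.
async function loadSavedCursor(file = CURSOR_FILE) {
  try {
    const text = (await fs.readFile(file, "utf8")).trim();
    return text || null;
  } catch (err) {
    if (err.code === "ENOENT") return null; // first run: no file yet
    throw err;
  }
}

// Writes to a temporary file and renames it over the target, so a crash
// mid-write does not leave a truncated cursor in place.
async function saveCursor(cursor, file = CURSOR_FILE) {
  const tmp = `${file}.tmp`;
  await fs.writeFile(tmp, cursor, "utf8");
  await fs.rename(tmp, file);
}
```

The cursor is stored verbatim, as an opaque string, and is never inspected.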

Delivery is at-least-once

You may occasionally receive the same episode more than once — most commonly right after a reconnect. Dedupe on episode.id and make your processing idempotent. You will not silently miss episodes (see below), but you should expect the rare duplicate rather than assume exactly-once.
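One way to dedupe on episode.id in a long-running process without unbounded memory growth is to remember only the most recent ids. The class below is a sketch under that assumption: RecentIds and its default limit of 10000 are hypothetical, and the approach relies on duplicates arriving close together (for instance right after a reconnect). Idempotent processing, such as a unique key on the episode id in your own store, is still what protects you across restarts.

```javascript
// Remembers the most recently seen ids; the oldest is evicted first.
class RecentIds {
  constructor(limit = 10000) {
    this.limit = limit;
    this.ids = new Set(); // a Set iterates in insertion order
  }

  // Returns true the first time an id is seen, false for a remembered duplicate.
  markNew(id) {
    if (this.ids.has(id)) return false;
    this.ids.add(id);
    if (this.ids.size > this.limit) {
      const oldest = this.ids.values().next().value;
      this.ids.delete(oldest);
    }
    return true;
  }
}
```

In an event loop this would read as: if (recent.markNew(episode.id)) await handleEpisode(episode);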

Resuming after a disconnect

Connections end — network blips, your deploys, our rolling restarts. To resume without gaps, reconnect and pass the last cursor you stored as ?cursor=:
GET /v1/podcasts/episodes/stream?milestone=transcribed&cursor=g3Qk9m8...
The stream first replays every episode after that cursor (catch-up), then transitions seamlessly to live. (On POST, send the same body and the updated ?cursor=.) If you’ve never connected before and want history, use since instead of cursor.
If your consumer falls too far behind to keep up, the server ends the connection deliberately. This is not data loss: reconnect from your last stored cursor and the catch-up replay fills the gap. The golden rule is simply always reconnect from your last processed cursor.

A resilient consumer

The pattern in any language: connect → on each episode event, dedupe and process, then store its cursor → on error or disconnect, back off and reconnect with the stored cursor. Use exponential backoff with jitter, capped at a ceiling (e.g. 1s → 30s), and reset the delay to its minimum after a connection stays up and delivers — so a routine deploy reconnects within a second or two, while a sustained outage doesn’t hammer the API.
JavaScript
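// loadSavedCursor, saveCursor, handleEpisode, and parseSSE are your own helpers.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

let cursor = await loadSavedCursor(); // null on first run
const seen = new Set(); // in-memory dedupe; grows for the life of the process
let backoff = 1000; // ms; grows on repeated failure, resets on success
const MAX_BACKOFF = 30000;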

while (true) {
  const url = new URL("https://api.particle.pro/v1/podcasts/episodes/stream");
  url.searchParams.set("milestone", "transcribed");
  if (cursor) url.searchParams.set("cursor", cursor);

  try {
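    const res = await fetch(url, {
      headers: { "X-API-Key": process.env.PARTICLE_API_KEY },
    });
    // A non-2xx response carries no event stream; treat it like a failed connection.
    if (!res.ok) throw new Error(`stream request failed: HTTP ${res.status}`);
    for await (const evt of parseSSE(res.body)) {
      if (evt.event === "error") {
        // Terminal: the server closes the connection after this event.
        console.error("stream error:", JSON.parse(evt.data).message);
        break; // reconnect from `cursor`
      }
      if (evt.event !== "episode") continue;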

      const { episode, cursor: next } = JSON.parse(evt.data);
      if (!seen.has(episode.id)) {
        seen.add(episode.id);
        await handleEpisode(episode); // idempotent
      }
      cursor = next;          // advance only after successful processing
      await saveCursor(cursor);
      backoff = 1000;         // healthy connection — reset backoff
    }
  } catch (err) {
    // network error / stream closed (e.g. a deploy) — fall through to reconnect
  }
  // Exponential backoff with jitter, capped. The disconnect itself is expected;
  // this just avoids reconnect storms during a longer outage.
  const delay = Math.min(backoff, MAX_BACKOFF) * (0.5 + Math.random() / 2);
  await sleep(delay);
  backoff = Math.min(backoff * 2, MAX_BACKOFF);
}
parseSSE is any standard SSE line parser (split on blank lines; read event: and data: fields). Persisting cursor to durable storage lets you resume cleanly across process restarts, not just transient drops.
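For reference, a parseSSE along those lines might look like the sketch below. It is not an official client: it assumes the body is an async iterable of Uint8Array chunks (as a fetch response body is in Node 18+), handles only the event: and data: fields, and expects lines to end in "\n" or "\r\n". Lines beginning with ":" are SSE comments, which some servers use as keep-alives, and are skipped.

```javascript
// Minimal SSE parser sketch: yields { event, data } for each complete event.
async function* parseSSE(body) {
  const decoder = new TextDecoder();
  let buffer = "";
  let event = "message"; // SSE default when no event: field is present
  let data = [];

  for await (const chunk of body) {
    buffer += decoder.decode(chunk, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop(); // keep the trailing partial line for the next chunk

    for (const raw of lines) {
      const line = raw.endsWith("\r") ? raw.slice(0, -1) : raw;
      if (line === "") {
        // A blank line ends the event; dispatch only if it carried data.
        if (data.length > 0) yield { event, data: data.join("\n") };
        event = "message";
        data = [];
        continue;
      }
      if (line.startsWith(":")) continue; // comment line

      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // one optional leading space

      if (field === "event") event = value;
      else if (field === "data") data.push(value);
      // id: and retry: are ignored in this sketch.
    }
  }
}
```

Because the buffer carries partial lines across chunks, an event split over several network reads is still delivered as a single object. Multiple data: lines within one event are joined with newlines before being yielded.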

Stream vs. poll

Not on Enterprise, or prefer polling to a long-lived connection? The episode feed is the all-plans pull alternative — the same episodes, milestones, and filters, returned by a resumable GET you poll on your own schedule. Reach for the stream when you want push-based, low-latency delivery without managing a poll loop. For plain catalog browsing, list episodes (which also accepts fully_ingested=true) is simpler still.
  • Episodes — the same episode shape, by query or by ID
  • Transcripts — dialogue available once an episode reaches transcribed
  • Segments & clips — available at segmented and fully_ingested