How to Access Your Substack Post Data via API

If you're a Substack publisher who wants to back up your work or analyze your content programmatically, you may wonder how to get at the underlying data. Substack's web app uses an internal API that returns your posts as structured JSON. This guide explains the process step by step, from finding the right endpoint to decoding the response. Whether you're automating workflows or just curious, these Q&As walk you through the essentials.

What endpoint do I use to fetch Substack post data?

The key endpoint is /api/v1/drafts/${post_id}, where post_id is an integer representing your specific post. For example, to retrieve data for post 12345 on an account called mynewsletter, you would call https://mynewsletter.substack.com/api/v1/drafts/12345. This endpoint returns a detailed JSON object containing all metadata and the body of the post. The post_id can be found in the URL of your Substack post editor or by inspecting your dashboard. Note that this is an authenticated endpoint, so you'll need to include your session cookie in the request.
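As a minimal sketch of the URL pattern described above, the helper below assembles the endpoint from an account name and a post ID. The function name draftUrl and its integer check are illustrative assumptions, not part of Substack's API.

```javascript
// Hypothetical helper: builds the drafts endpoint URL for one post.
// accountName is the subdomain of the newsletter, postId the integer post ID.
function draftUrl(accountName, postId) {
  if (!Number.isInteger(postId)) {
    throw new TypeError("postId must be an integer");
  }
  return `https://${accountName}.substack.com/api/v1/drafts/${postId}`;
}

console.log(draftUrl("mynewsletter", 12345));
// prints "https://mynewsletter.substack.com/api/v1/drafts/12345"
```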

Source: dev.to

How do I authenticate the API request?

You must include a valid session cookie in the HTTP headers. Without it, the server will reject the request. The essential headers are:

  • Accept: */*
  • Accept-Encoding: gzip, deflate, br, zstd
  • Accept-Language: en-US
  • Cookie: your full cookie string from the browser (e.g., connect.sid=...; gdpr=...; ...)
  • User-Agent: your browser's user agent (e.g., Mozilla/5.0 ...)
  • sec-ch-ua, sec-ch-ua-mobile, sec-ch-ua-platform, sec-fetch-*: typical values that mimic a browser request.

Set these headers in your HTTP client (curl, Postman, etc.). With curl, pass each one with its own -H flag, for example -H "Cookie: ..." and -H "User-Agent: ...". If the session cookie is missing or expired, expect a 401 Unauthorized response.

Can I see a complete curl example?

Yes! Here's a typical curl command to fetch a Substack post's data:

curl --compressed \
  -H "Accept: */*" \
  -H "Accept-Language: en-US" \
  -H "Cookie: YOUR_COOKIE_HERE" \
  -H "User-Agent: YOUR_USER_AGENT" \
  -H "sec-ch-ua: \"Chromium\";v=\"124\", \"Google Chrome\";v=\"124\"" \
  -H "sec-ch-ua-mobile: ?0" \
  -H "sec-ch-ua-platform: \"Linux\"" \
  -H "sec-fetch-dest: empty" \
  -H "sec-fetch-mode: cors" \
  -H "sec-fetch-site: same-origin" \
  -H "sec-gpc: 1" \
  -H "Priority: u=1, i" \
  https://ACCOUNT_NAME.substack.com/api/v1/drafts/POST_ID

Replace ACCOUNT_NAME, POST_ID, YOUR_COOKIE_HERE, and YOUR_USER_AGENT with your actual details. The --compressed flag takes the place of a hand-written Accept-Encoding header: curl requests the encodings it supports and decompresses the response for you, so the output is readable JSON rather than raw compressed bytes. If successful, the server returns a JSON response with a 200 OK status.
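The same request can be sketched in JavaScript. This is an illustration rather than a definitive client: it assumes Node.js 18 or later (for the built-in fetch), the function names buildHeaders and fetchDraft are made up for this example, and it sends only the core headers from the list above. Accept-Encoding is left to the runtime so that only encodings it can decode are advertised; the sec-* headers could be added to the same object if the server turns out to need them.

```javascript
// Hypothetical helper: the core request headers, given a cookie string
// copied from the browser and a user agent string.
function buildHeaders(cookie, userAgent) {
  if (!cookie) {
    throw new Error("A session cookie is required");
  }
  return {
    "Accept": "*/*",
    "Accept-Language": "en-US",
    "Cookie": cookie,
    "User-Agent": userAgent,
  };
}

// Hypothetical helper: fetches one post and returns the parsed JSON object.
async function fetchDraft(accountName, postId, cookie, userAgent) {
  const url = `https://${accountName}.substack.com/api/v1/drafts/${postId}`;
  const response = await fetch(url, { headers: buildHeaders(cookie, userAgent) });
  if (!response.ok) {
    // For example, a 401 when the cookie is missing or expired.
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

A call would look like fetchDraft("mynewsletter", 12345, cookieString, userAgentString), which returns a promise for the post object.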

What does the JSON response contain?

The JSON object is quite large and includes many fields such as title, published_at, canonical_url, visibility, SEO metadata, and more. The actual article content is stored in a field called draft_body, which contains a stringified JSON object. This nested JSON uses Substack's proprietary Document Model, a structured format that describes paragraphs, images, embeds, headlines, etc. A trimmed response looks like this:

{
  ...
  "draft_body": "[{\"type\":\"paragraph\",\"content\":[{\"type\":\"text\",\"text\":\"Hello world\"}]}]",
  ...
}

You'll need to parse draft_body with JSON.parse() to access the actual content blocks.
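A small sketch of that parsing step, using the sample response shown above. The wrapper parseDraftBody is a hypothetical name; the only real call involved is the standard JSON.parse.

```javascript
// Hypothetical helper: turns the stringified draft_body into an array of blocks.
function parseDraftBody(response) {
  if (typeof response.draft_body !== "string") {
    throw new TypeError("response.draft_body is missing or not a string");
  }
  return JSON.parse(response.draft_body);
}

// Sample object mirroring the trimmed response above.
const sampleResponse = {
  draft_body: '[{"type":"paragraph","content":[{"type":"text","text":"Hello world"}]}]',
};

const parsedBlocks = parseDraftBody(sampleResponse);
console.log(parsedBlocks[0].type); // prints "paragraph"
console.log(parsedBlocks[0].content[0].text); // prints "Hello world"
```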

How do I extract the post body from draft_body?

Once you receive the API response, extract the draft_body string and parse it as JSON. In JavaScript, let body = JSON.parse(response.draft_body) does this. The resulting array contains objects representing each block in your post (paragraphs, images, headings, etc.). Each block has a type (e.g., paragraph, image, heading), and text-bearing blocks have a content array that holds the actual text or media references. For bold or italic text, the content may contain nested type: "strong" or type: "em" elements. To get a plain text version, iterate through all blocks and concatenate their text nodes.
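The plain-text extraction described above can be sketched as a short recursive walk. The block shapes in the sample data follow this article's description (nested strong elements, blocks without content) and are assumptions about the format rather than a verified schema; nodeText and toPlainText are hypothetical names.

```javascript
// Collects the text of one node, descending into nested content
// (for example a "strong" or "em" element wrapping a text node).
function nodeText(node) {
  if (typeof node.text === "string") {
    return node.text;
  }
  if (Array.isArray(node.content)) {
    return node.content.map(nodeText).join("");
  }
  return ""; // nodes without text, such as images
}

// Joins the text of every non-empty block with a blank line between blocks.
function toPlainText(blocks) {
  return blocks
    .map(nodeText)
    .filter((text) => text.length > 0)
    .join("\n\n");
}

// Sample data in the assumed shape.
const sampleBlocks = [
  {
    type: "paragraph",
    content: [
      { type: "text", text: "Hello " },
      { type: "strong", content: [{ type: "text", text: "world" }] },
    ],
  },
  { type: "image" },
  { type: "paragraph", content: [{ type: "text", text: "Second paragraph." }] },
];

console.log(toPlainText(sampleBlocks));
// prints "Hello world", a blank line, then "Second paragraph."
```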

Are there any community resources for the Substack Document format?

Yes! The Substack Document Model used in draft_body is well-documented by the open-source community. You can check the DeepWiki page on Substack for a detailed schema description. Another excellent resource is the can3p/substack-api-notes repository, which includes reverse-engineered documentation of the JSON structure. These references help you understand how to handle embeds, images with captions, and custom elements like pull quotes.

What are some practical use cases for this API?

Beyond simple backup, you can use the API to automate content migration to other platforms, perform text analysis, create custom archives, or build a searchable index of your posts. Developers often integrate this with static site generators (e.g., Hugo, Jekyll) by converting the JSON content to Markdown. You could also monitor changes in your posts over time by comparing JSON snapshots. Because the response includes publishing dates and SEO metadata, you can generate sitemaps or RSS feeds tailored to your needs. The same data can serve anything from personal content management to third-party tools such as newsletter analytics dashboards.
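As one illustration of the snapshot comparison mentioned above, the sketch below reports which top-level fields differ between two saved responses. The function name changedFields and the sample field values are hypothetical; the field names title, draft_body, and visibility come from the response description earlier in this guide.

```javascript
// Returns the sorted names of top-level fields whose values differ
// between two snapshots of the same post.
function changedFields(oldSnapshot, newSnapshot) {
  const keys = new Set([
    ...Object.keys(oldSnapshot),
    ...Object.keys(newSnapshot),
  ]);
  return [...keys]
    .filter(
      (key) =>
        JSON.stringify(oldSnapshot[key]) !== JSON.stringify(newSnapshot[key])
    )
    .sort();
}

// Two hypothetical snapshots of one post, taken at different times.
const snapshotBefore = { title: "Hello", draft_body: "[]", visibility: "everyone" };
const snapshotAfter = { title: "Hello, world", draft_body: "[]", visibility: "everyone" };

console.log(changedFields(snapshotBefore, snapshotAfter)); // prints [ 'title' ]
```

Comparing serialized values keeps the check simple for nested fields, at the cost of treating a reordering of keys inside a nested object as a change.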
