
X Articles vs Bluesky Long-Form 2026

June 15, 2026 · 9 min read · Comparison

In late May 2026, Bluesky launched a long-form content feature designed to compete directly with X Articles. For the first time, two major social platforms offer native long-form publishing — and content scraping tools like ThreadGrab have a new frontier to cover.

This is not another "which platform is better for writers" comparison. This is a technical guide to scraping, archiving, and repurposing content from both platforms in 2026. If you are a researcher, an AI trainer, or a content creator who wants to own your data, here is what changed and how to adapt.

TL;DR. Both X Articles and Bluesky long-form can be saved as Markdown using ThreadGrab. X uses a proprietary API with stricter rate limits. Bluesky uses the open AT Protocol (free, no API key). For batch archiving, Bluesky is easier to scrape. For single high-value articles, both work identically through ThreadGrab.

What Changed: Bluesky Long-Form Content (May 2026)

Bluesky's long-form feature, announced on May 28, 2026, lets users write and publish posts exceeding the traditional 300-character limit. Similar to X Articles, these long-form posts support rich text, headers, lists, and embedded media. The difference is in the underlying protocol: Bluesky builds on the AT Protocol, an open, decentralized standard that any developer can query without authentication.

X Articles, by contrast, sits inside X's proprietary ecosystem. To scrape them programmatically, you need either the X API (paid tiers starting at $200/month) or a third-party tool like ThreadGrab that reverse-engineers the public web interface.

| Feature | X Articles | Bluesky Long-Form |
| --- | --- | --- |
| Launch date | Late 2024 (public) | May 28, 2026 |
| Protocol | Proprietary (X API) | Open (AT Protocol) |
| Auth required for scraping | Yes (API key or web scraping) | No (public API) |
| Rate limits | Strict (100 req / 15 min) | Generous (AT Protocol) |
| Markdown output via ThreadGrab | Yes | Yes |
| Best for scraping | Single articles, individual saves | Batch feeds, research archives |

How to Scrape X Articles in 2026

X Articles are structured as HTML documents rendered inside X's web interface. The key challenge is that X serves articles as part of a React application, meaning the raw HTML source contains minimal content — most of the text is loaded dynamically via JavaScript.

ThreadGrab handles this by rendering the page server-side and extracting the article body from the DOM tree. The result is clean Markdown with no boilerplate, no sidebar, no suggested posts.

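```bash
# Save an X Article as Markdown (via ThreadGrab API)
# -f makes curl exit non-zero on HTTP errors instead of piping an error page to jq
curl -fsS "https://threadgrab.com/api/x/article/some-article-title" \
  | jq -r '.text' > article.md

# Or use the profile API to get the latest article from a user
# (assumes the profile endpoint lists the newest items first)
curl -fsS "https://threadgrab.com/api/profile/paulg" \
  | jq -r 'first(.[] | select(.type == "article")) | .text' > paulg-latest.md
```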

Pro tip. X rate-limits anonymous page views aggressively in 2026. If you scrape X Articles directly with curl or Playwright, expect frequent CAPTCHAs and temporary IP blocks. ThreadGrab rotates user agents and proxies so you do not have to.
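If you do fetch pages directly, spacing requests evenly is the simplest way to stay under a fixed limit. The following is a minimal sketch, assuming the roughly 100 requests per 15 minutes figure quoted in this article holds for your IP. `min_interval` and `throttled_fetch` are hypothetical helper names for illustration, not part of ThreadGrab or any X tooling.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: space requests to stay under a fixed-window limit.

# min_interval MAX_REQUESTS WINDOW_SECONDS
# Prints the smallest whole number of seconds to wait between requests
# (rounded up) so that at most MAX_REQUESTS happen in one window.
min_interval() {
  local max_requests=$1 window=$2
  echo $(( (window + max_requests - 1) / max_requests ))
}

# throttled_fetch DELAY_SECONDS URL...
# Fetches each URL in turn and sleeps DELAY_SECONDS after every request.
# A failed request is reported on stderr and the loop continues.
throttled_fetch() {
  local delay=$1; shift
  local url
  for url in "$@"; do
    curl -fsS "$url" || echo "warning: failed to fetch $url" >&2
    sleep "$delay"
  done
}
```

For the limit above, `min_interval 100 900` prints 9, so `throttled_fetch "$(min_interval 100 900)" URL1 URL2` waits nine seconds after each request. This does nothing about CAPTCHAs or IP blocks; it only keeps the request rate down.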

How to Scrape Bluesky Long-Form Content

Bluesky's AT Protocol makes scraping dramatically simpler. Every post — including long-form content — is stored as an AT Protocol record. You can query these records directly through any AT Protocol relay or through Bluesky's public API without authentication.

```bash
# Fetch a Bluesky user's recent posts (including long-form) via AT Protocol
# -f makes curl exit non-zero on HTTP errors instead of piping an error body to jq
curl -fsS "https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed?actor=username.bsky.social" \
  | jq -r '.feed[] | .post.record.text' > bsky-archive.md

# ThreadGrab supports Bluesky natively
curl -fsS "https://threadgrab.com/api/profile/username.bsky.social" \
  | jq -r '.[] | .text' > bsky-threadgrab.md
```
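A single `getAuthorFeed` call returns one page of posts. The response carries a `cursor` value that you pass back to get the next page. The sketch below shows one way to walk several pages. `feed_url` and `fetch_all_posts` are hypothetical helper names, and the page cap is an arbitrary safety limit chosen for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: page through app.bsky.feed.getAuthorFeed.

# feed_url ACTOR [LIMIT] [CURSOR]
# Prints the request URL. CURSOR must already be URL-encoded.
feed_url() {
  local actor=$1 limit=${2:-50} cursor=${3:-}
  local url="https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed?actor=${actor}&limit=${limit}"
  if [ -n "$cursor" ]; then
    url+="&cursor=${cursor}"
  fi
  printf '%s\n' "$url"
}

# fetch_all_posts ACTOR [MAX_PAGES]
# Prints the text of each post, following the cursor for up to MAX_PAGES pages.
fetch_all_posts() {
  local actor=$1 max_pages=${2:-5} cursor="" page body
  for (( page = 0; page < max_pages; page++ )); do
    body=$(curl -fsS "$(feed_url "$actor" 100 "$cursor")") || return 1
    jq -r '.feed[].post.record.text' <<<"$body"
    # URL-encode the cursor with jq; an absent cursor means this was the last page.
    cursor=$(jq -r '.cursor // empty | @uri' <<<"$body")
    [ -n "$cursor" ] || break
  done
}
```

Running `fetch_all_posts username.bsky.social 10 > bsky-archive.md` would collect up to ten pages of 100 posts each. Nothing runs until you call the function, so sourcing the file is safe.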

A critical advantage: Bluesky posts are records in a cryptographically signed repository hosted on the author's Personal Data Server (PDS). Deleting a post removes the record from that repository, but copies that relays or third-party archives have already synced may persist. Combined with open access, that makes Bluesky the better platform for long-term content preservation.
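Because each post is a record in the author's repository, you can also request one record on its own and keep the raw JSON. The sketch below turns an `at://` URI into a `com.atproto.repo.getRecord` request URL. `record_url` is a hypothetical helper name, and the host you pass in (the author's PDS or another service that answers this call) is an assumption you need to supply yourself.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a com.atproto.repo.getRecord URL from an at:// URI.

# record_url HOST AT_URI
# AT_URI has the form at://REPO/COLLECTION/RKEY, where REPO is a DID or handle.
record_url() {
  local host=$1 uri=$2
  local rest=${uri#at://}
  if [ "$rest" = "$uri" ]; then
    echo "not an at:// URI: $uri" >&2
    return 1
  fi
  local repo collection rkey
  IFS=/ read -r repo collection rkey <<<"$rest"
  if [ -z "$repo" ] || [ -z "$collection" ] || [ -z "$rkey" ]; then
    echo "expected at://REPO/COLLECTION/RKEY: $uri" >&2
    return 1
  fi
  printf '%s/xrpc/com.atproto.repo.getRecord?repo=%s&collection=%s&rkey=%s\n' \
    "$host" "$repo" "$collection" "$rkey"
}
```

A typical use would be `curl -fsS "$(record_url "$PDS_HOST" "$POST_URI")" > post.json`, with both variables set by you. Saving the JSON alongside the Markdown keeps the original record intact in case you need fields the Markdown export drops.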

Side-by-Side: Scraping Comparison

| Criteria | X Articles | Bluesky Long-Form | ThreadGrab (both) |
| --- | --- | --- | --- |
| Scraping difficulty | High (JS rendering, CAPTCHAs) | Low (open API, no CAPTCHA) | Minimal (one endpoint) |
| Programmatic access | X API (paid) or scraping | AT Protocol (free, public) | Free API, no auth |
| Rate limit handling | Manual throttling required | Generous limits | Built-in retry + proxy |
| LLM-ready output | Depends on tool | Depends on tool | Clean Markdown by default |
| Long-term preservation | Content can be deleted | Signed records on PDS | Save to local .md files |
| Batch support | Per-article or per-profile | Per-feed or per-profile | Per-profile (both platforms) |

Building a Cross-Platform Archiving Pipeline

The true power of ThreadGrab is treating X and Bluesky as interchangeable sources. Here is a real-world pipeline that archives both platforms into a single Markdown vault:

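```bash
#!/bin/bash
# Cross-platform content archive -- runs daily via cron
set -uo pipefail  # catch unset variables; a failed curl fails the whole pipe

USERS_X=("paulg" "kelseyhightower" "levelsio")
USERS_BSKY=("jack.bsky.social" "tante.bsky.social")

OUTPUT_DIR="$HOME/archive/social-content"
TODAY="$(date +%Y-%m-%d)"  # computed once so every file from a run shares a date
mkdir -p "$OUTPUT_DIR"

echo "=== Archiving X Articles ==="
for user in "${USERS_X[@]}"; do
  # Keep only long-form articles from the X profile
  curl -fsS "https://threadgrab.com/api/profile/$user" \
    | jq -r '.[] | select(.type == "article") | "## \(.author)\n\(.text)\n"' \
    > "$OUTPUT_DIR/x-$user-$TODAY.md" \
    || echo "warning: failed to archive X user $user" >&2
done

echo "=== Archiving Bluesky Long-Form ==="
for user in "${USERS_BSKY[@]}"; do
  # No type filter here: every post returned for the profile is archived
  curl -fsS "https://threadgrab.com/api/profile/$user" \
    | jq -r '.[] | "## \(.author)\n\(.text)\n"' \
    > "$OUTPUT_DIR/bsky-$user-$TODAY.md" \
    || echo "warning: failed to archive Bluesky user $user" >&2
done

echo "Archived to $OUTPUT_DIR"
```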

This pipeline generates one Markdown file per platform per user per day. You can feed these files into Obsidian, Notion, or any LLM knowledge base. The jq filter select(.type == "article") keeps only long-form posts from X profiles. The Bluesky loop applies no type filter, so it archives every post the profile endpoint returns, short or long.
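If the job runs more than once a day, or a handle contains unexpected characters, it helps to compute file names in one place and skip files that already have content. The following sketch uses the naming scheme described above. `archive_path` and `needs_fetch` are hypothetical helper names, not part of ThreadGrab.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for the per-platform, per-user, per-day naming scheme.

# archive_path OUTPUT_DIR PLATFORM USER [DATE]
# Prints OUTPUT_DIR/PLATFORM-USER-DATE.md. DATE defaults to today (YYYY-MM-DD).
archive_path() {
  local dir=$1 platform=$2 user=$3 day=${4:-$(date +%Y-%m-%d)}
  # Replace anything outside a conservative character set (including "/")
  # so a handle can never point outside the output directory.
  local safe_user=${user//[^A-Za-z0-9._-]/_}
  printf '%s/%s-%s-%s.md\n' "$dir" "$platform" "$safe_user" "$day"
}

# needs_fetch FILE
# Succeeds when FILE is missing or empty, i.e. when it is worth fetching again.
needs_fetch() {
  [ ! -s "$1" ]
}
```

Inside either loop you could write `out=$(archive_path "$OUTPUT_DIR" x "$user")` and wrap the curl call in `if needs_fetch "$out"; then ... fi`. That way a second run on the same day only retries the users whose earlier fetch failed or returned nothing.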

What the Bluesky Long-Form Launch Means for Scraping Tools

The launch of Bluesky long-form content reshapes the content scraping landscape in three important ways:

Note. Bluesky long-form content is less than three weeks old as of this writing. The AT Protocol relay infrastructure is still maturing, and some long-form posts may take minutes to propagate across relays. For production archiving, use ThreadGrab's API, which queries multiple relays and falls back gracefully.
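If you query the AT Protocol directly instead, a post that has not propagated yet usually just needs another try a little later. Here is a minimal sketch of retrying with exponential backoff. `retry` is a hypothetical helper name, and the attempt count and delays are placeholders to tune for your own setup.

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry a command with exponential backoff.

# retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# The delay doubles after every failed attempt.
retry() {
  local max_attempts=$1 delay=$2; shift 2
  local attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if (( attempt >= max_attempts )); then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}
```

For example, `retry 4 30 curl -fsS -o post.json "$URL"` tries up to four times, waiting 30, 60, and then 120 seconds between attempts. That covers a propagation delay of a few minutes without hammering the relay.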

Which Platform Should You Scrape — Based on Your Use Case

| Your goal | Best platform | Recommended method |
| --- | --- | --- |
| LLM training data | Both (diverse sources) | ThreadGrab API + jq filter |
| Personal research archive | Bluesky (open, permanent) | AT Protocol direct query |
| Journalism / fact-checking | X Articles (more authors) | ThreadGrab with CAPTCHA bypass |
| Monitoring competitors | Both (cross-reference) | ThreadGrab cron pipeline |
| Building a knowledge base | Both (max coverage) | ThreadGrab + Obsidian vault |
| Occasional single-article save | Either | ThreadGrab web interface |

FAQ

Does Bluesky long-form content require authentication to scrape?

No. Bluesky's AT Protocol is public by default. You can query posts, feeds, and profiles without an API key or account. This is a major advantage over X, which requires authentication for programmatic access.

Can ThreadGrab save both X Articles and Bluesky long-form posts?

Yes. ThreadGrab supports both platforms through a single API endpoint. Use the profile API to fetch all recent content from a user, regardless of whether they post on X, Bluesky, or both.

Is Bluesky long-form content permanent once archived via AT Protocol?

Bluesky posts are stored as records on the author's Personal Data Server (PDS). If the author deletes a post, the record is removed from their repository, although copies already synced by relays or third-party archives may persist. For guaranteed permanence, always save a local copy as Markdown or JSON.

What are the rate limits for scraping X Articles in 2026?

X's anonymous rate limits are approximately 100 page views per 15 minutes per IP. For heavy scraping, use a rotating proxy service or route through ThreadGrab which manages rate limits automatically.

Can I automate a daily archive of both X and Bluesky content?

Yes. Use the cron pipeline shown above. ThreadGrab's API handles both platforms in the same request pattern. Schedule it with a simple cron job — no API keys, no OAuth, no platform-specific code.

Start saving X Articles and Bluesky long-form content as Markdown today — no account needed.


The Scraping Frontier Is Open

The battle between X Articles and Bluesky long-form content is just beginning. For creators, researchers, and archivists, the outcome is already a win: two major platforms competing on long-form means more content to discover, more perspectives to archive, and more incentive for tools like ThreadGrab to support both.

Bluesky's open protocol makes it the easier platform to scrape technically. X Articles has the larger existing library of content. Together, they cover the full spectrum of social long-form publishing in 2026. The smartest archiving strategy uses both.