maitty8879/Article-Reader

Article-Reader

A Claude Code skill that fetches web articles and saves them as clean Markdown files locally. It auto-detects the URL type and picks the best scraping strategy.

Supported Sources

Source                             Strategy
X (Twitter) posts & articles       fxtwitter API, falls back to Playwright
WeChat Official Account articles   Playwright (mobile UA)
Everything else                    Jina Reader, falls back to Playwright
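
The routing in the table can be sketched in a few lines of Python. This is only an illustration of the idea: the function name `pick_strategy` and the strategy labels are hypothetical, and the skill's own routing logic lives in SKILL.md and may differ.

```python
from urllib.parse import urlparse

# Hosts taken from the Supported Sources table; the labels returned below
# are hypothetical names chosen for this sketch.
X_HOSTS = {"x.com", "twitter.com"}
WECHAT_HOST = "mp.weixin.qq.com"


def pick_strategy(url: str) -> str:
    """Return a label for the scraper that should handle the URL."""
    host = (urlparse(url).hostname or "").lower()
    # Treat "www." and "mobile." variants like the bare domain.
    for prefix in ("www.", "mobile."):
        if host.startswith(prefix):
            host = host[len(prefix):]
    if host in X_HOSTS:
        return "fxtwitter"
    if host == WECHAT_HOST:
        return "wechat-playwright"
    return "jina"
```

Matching on the parsed hostname rather than on a substring avoids misrouting a generic URL that merely mentions x.com in its path or query string.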

Features

  • Auto URL routing - detects x.com, twitter.com, mp.weixin.qq.com, or generic URLs and applies the right scraper
  • English-to-Chinese translation - automatically detects English articles and translates them to Chinese before saving
  • Clean Markdown output - preserves headings, bold/italic, links, images, blockquotes, code blocks, and lists
  • Configurable save path - asks once on first use, remembers for all subsequent fetches
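
How the skill decides that an article is in English is not documented here. One simple heuristic, shown purely as an assumption, is to measure the share of ASCII letters among all letters; `looks_english` and its threshold are hypothetical.

```python
def looks_english(text: str, threshold: float = 0.9) -> bool:
    """Guess whether text is mostly English (hypothetical heuristic).

    Counts ASCII letters among all alphabetic characters. CJK characters
    count as letters too, so Chinese text scores close to zero.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    ascii_letters = sum(1 for ch in letters if ch.isascii())
    return ascii_letters / len(letters) >= threshold
```

This check cannot tell English from other Latin-script languages, so it is at best a first filter before translation.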

Prerequisites

  • Claude Code CLI
  • Python 3.8+
  • Playwright (for WeChat and fallback scraping):
    pip install playwright
    playwright install chromium

Installation

Copy the Article-Reader folder into your Claude Code skills directory:

~/.claude/skills/Article-Reader/
├── SKILL.md
└── scripts/
    ├── scrape_tweet.py
    └── fetch_wechat.py

Or place it anywhere and register it in your Claude Code configuration.

Usage

In Claude Code, say (帮我读一下 means "read this for me"):

帮我读一下 https://mp.weixin.qq.com/s/xxxxx
帮我读一下 https://x.com/elonmusk/status/123456789
帮我读一下 https://example.com/some-article

The skill will:

  1. Ask for a save directory (first time only)
  2. Detect the URL type and fetch the content
  3. Translate to Chinese if the article is in English
  4. Save as {title}.md to your chosen directory
  5. Show a preview with title, author, and the first 500 characters
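
Steps 4 and 5 can be sketched as follows. The helper names, the character blacklist, and the file layout (title heading, author line, body) are assumptions made for illustration, not the skill's actual output format.

```python
import re
from pathlib import Path


def safe_filename(title: str, max_len: int = 100) -> str:
    """Turn an article title into a file name that is safe on common file systems."""
    # Replace path separators and other reserved characters with spaces.
    cleaned = re.sub(r'[\\/:*?"<>|\r\n\t]+', " ", title)
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    return (cleaned[:max_len].rstrip() or "untitled") + ".md"


def save_article(title: str, author: str, body: str, save_dir: str) -> Path:
    """Write the article to {title}.md inside save_dir and return the path."""
    directory = Path(save_dir).expanduser()
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / safe_filename(title)
    # Hypothetical layout: heading, author line, then the Markdown body.
    path.write_text(f"# {title}\n\nAuthor: {author}\n\n{body}\n", encoding="utf-8")
    return path


def preview(title: str, author: str, body: str, limit: int = 500) -> str:
    """Build the preview: title, author, and the first `limit` characters."""
    return f"Title: {title}\nAuthor: {author}\n\n{body[:limit]}"
```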

How It Works

X (Twitter)

scripts/scrape_tweet.py first calls the fxtwitter API (api.fxtwitter.com) to get tweet data including long-form article content, images, and engagement metrics. If the API fails, it falls back to Playwright headless browser scraping.
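
A minimal sketch of the API-first step, assuming the fxtwitter endpoint mirrors the tweet path as `/{user}/status/{id}`. The function names are hypothetical and the response is returned as raw JSON without assuming its fields; `scrape_tweet.py` may do this differently.

```python
import json
import re
import urllib.request
from typing import Optional

TWEET_URL = re.compile(
    r"https?://(?:www\.|mobile\.)?(?:x|twitter)\.com/(?P<user>[^/]+)/status/(?P<id>\d+)"
)


def fxtwitter_api_url(tweet_url: str) -> str:
    """Map an x.com or twitter.com status URL to the api.fxtwitter.com URL."""
    match = TWEET_URL.match(tweet_url)
    if match is None:
        raise ValueError(f"not a tweet URL: {tweet_url}")
    return f"https://api.fxtwitter.com/{match['user']}/status/{match['id']}"


def fetch_tweet(tweet_url: str, timeout: float = 15.0) -> Optional[dict]:
    """Return the parsed JSON, or None so the caller can fall back to Playwright."""
    request = urllib.request.Request(
        fxtwitter_api_url(tweet_url),
        headers={"User-Agent": "Article-Reader-sketch"},  # placeholder UA
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except (OSError, ValueError):
        # Network errors, HTTP errors, and invalid JSON all trigger the fallback.
        return None
```

Returning `None` instead of raising keeps the fallback decision in one place: the caller only has to check the result before launching the browser.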

WeChat

scripts/fetch_wechat.py uses Playwright with a mobile Safari user agent to render the WeChat article page, then walks the DOM tree to convert HTML into structured Markdown while preserving formatting.
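
The DOM-to-Markdown conversion can be illustrated without a browser. The sketch below uses Python's standard `html.parser` on an HTML string as a stand-in for walking the rendered page in Playwright; it covers only a small subset of tags, and the class and function names are hypothetical.

```python
import re
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Convert a small HTML subset to Markdown (illustrative only)."""

    INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*", "code": "`"}
    HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []      # Markdown fragments in document order
        self.href = None   # target of the link currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.HEADINGS:
            self.out.append("\n\n" + "#" * self.HEADINGS[tag] + " ")
        elif tag in ("p", "ul", "ol"):
            self.out.append("\n\n")
        elif tag == "blockquote":
            self.out.append("\n\n> ")  # flat quotes only, no nesting
        elif tag == "li":
            self.out.append("\n- ")    # ordered lists are not numbered here
        elif tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag == "a":
            self.href = attrs.get("href")
            self.out.append("[")
        elif tag == "img":
            # Assumption: lazy-loaded images keep the real URL in data-src.
            src = attrs.get("data-src") or attrs.get("src") or ""
            self.out.append(f"![{attrs.get('alt') or ''}]({src})")

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag == "a":
            self.out.append(f"]({self.href or ''})")
            self.href = None

    def handle_data(self, data):
        # Collapse runs of whitespace but keep single spaces between words.
        self.out.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html: str) -> str:
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    lines = [line.strip() for line in "".join(parser.out).splitlines()]
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()
```

Emitting fragments on tag open and close, then normalizing blank lines at the end, keeps the converter small at the cost of ignoring nested blocks and preformatted code.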

Generic URLs

Generic URLs are fetched through Jina Reader by prefixing the article URL with r.jina.ai. If Jina fails or returns incomplete content, the skill falls back to Playwright.
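
A sketch of the Jina-first step with a fallback signal. The length-based completeness check is an assumption, since the README does not say how incomplete content is detected, and all function names are hypothetical.

```python
import urllib.request
from typing import Optional

JINA_PREFIX = "https://r.jina.ai/"


def jina_url(article_url: str) -> str:
    """Prefix the article URL with the Jina Reader endpoint."""
    return JINA_PREFIX + article_url


def looks_incomplete(markdown: str, min_chars: int = 200) -> bool:
    """Hypothetical check: treat very short output as a failed extraction."""
    return len(markdown.strip()) < min_chars


def fetch_with_jina(article_url: str, timeout: float = 30.0) -> Optional[str]:
    """Return Markdown from Jina Reader, or None to signal a Playwright fallback."""
    try:
        with urllib.request.urlopen(jina_url(article_url), timeout=timeout) as response:
            text = response.read().decode("utf-8", errors="replace")
    except OSError:
        # Covers connection failures and HTTP error responses.
        return None
    return None if looks_incomplete(text) else text
```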

Scripts

Both scripts can also be used standalone:

# Fetch a tweet
python3 scripts/scrape_tweet.py <tweet_url> [output_dir]

# Fetch a WeChat article
python3 scripts/fetch_wechat.py <article_url> [output_dir]

License

MIT
