Readability Bot v2

A simple web service that extracts readable content from web articles using Mozilla's Readability.js. It cleans up cluttered web pages, removes ads, navigation, and other distractions, making articles easier to read. The service includes custom fixes for specific websites and can handle iframes by converting them to clickable links.

Hosted on Vercel, the app provides a clean interface for users to input a URL and view a readable version of the article. It also exposes an API endpoint for programmatic access.

Features

  • Article Extraction: Uses Readability.js to parse and clean web pages.

  • Custom Site Fixes: Special handling for specific websites.

  • Iframe Conversion: Optionally converts embedded iframes (e.g., YouTube, VK, Rutube) to plain links for better readability.

  • API Endpoint: /api/readability?url=<URL>&format=<html|json>&changeiframe=<true|false>

  • Frontend Interface: Built with Svelte, allowing users to submit URLs via a simple form.

  • Formats Supported: HTML (default), JSON (metadata + content).
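
To make the iframe conversion feature listed above more concrete, here is a minimal sketch of the general idea: each embedded iframe is replaced by a plain link to its source. This is an illustration only, not the project's actual code. The function name iframesToLinks is hypothetical, and the regular expression assumes simple, well-formed markup with a quoted src attribute.

```javascript
// Illustrative only: the project's actual conversion logic may differ.
// Replaces each <iframe ... src="...">...</iframe> with an anchor pointing at the same URL.
function iframesToLinks(html) {
  // Capture group 1 is the quote character, group 2 is the src value.
  const iframePattern =
    /<iframe\b[^>]*?\bsrc\s*=\s*(["'])(.*?)\1[^>]*>[\s\S]*?<\/iframe>/gi;
  return html.replace(
    iframePattern,
    (_match, _quote, src) => `<a href="${src}">${src}</a>`
  );
}

const sample =
  '<p>Intro</p><iframe width="560" src="https://www.youtube.com/embed/abc" allowfullscreen></iframe>';
console.log(iframesToLinks(sample));
```

A regular expression keeps the sketch self-contained. For real-world HTML, a DOM parser would be a more robust choice.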

Demo

Try it out at: readability-bot-v2.vercel.app

Example API call:
https://readability-bot-v2.vercel.app/api/readability?url=https://example.com/article
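
The example call above passes the article URL unencoded, which is fine for simple addresses. If the article URL carries its own query string, percent-encoding it is the safer option. The sketch below builds a request URL with the standard URL API. The helper name buildReadabilityUrl is hypothetical, and it assumes the service decodes query parameters in the usual way.

```javascript
// Base URL of the public demo instance named in this README.
const DEFAULT_BASE = "https://readability-bot-v2.vercel.app";

// Hypothetical helper (not part of the repository): builds an API request URL.
// The parameter names url, format and changeiframe come from the API description.
function buildReadabilityUrl(articleUrl, options = {}, base = DEFAULT_BASE) {
  const { format = "html", changeiframe = false } = options;
  const endpoint = new URL("/api/readability", base);
  // searchParams.set percent-encodes the value, so "&" and "?" inside
  // the article URL cannot be mistaken for API parameters.
  endpoint.searchParams.set("url", articleUrl);
  endpoint.searchParams.set("format", format);
  endpoint.searchParams.set("changeiframe", String(changeiframe));
  return endpoint.toString();
}

console.log(
  buildReadabilityUrl("https://example.com/article?id=1&lang=en", { format: "json" })
);
```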

Installation

To run locally:

  1. Clone the repository:

    git clone https://github.com/McFev/readability-bot-v2.git
    cd readability-bot-v2
    
  2. Install dependencies:

    npm install
    
  3. Run in development mode:

    npm run dev
    

    This starts a local server with live reloading. Open http://localhost:5000 in your browser.

  4. Build for production:

    npm run build
    
  5. Start the production server:

    npm start
    

Usage

Frontend

  • Visit the homepage.
  • Enter a URL in the input field.
  • Click "Read" to view the cleaned article.

API

  • Endpoint: /api/readability
  • Query Parameters:
    • url (required): The URL of the article to process.
    • format (optional, default: html): Output format (html, json).
    • changeiframe (optional, default: false): Set to true to convert iframes to links.
  • Example Response (JSON):
    {
      "title": "Article Title",
      "byline": "Author Name",
      "content": "<div>Readable content...</div>",
      "textContent": "Plain text version...",
      "length": 1234,
      "excerpt": "Short summary...",
      "siteName": "Site Name",
      "lang": "en",
      "publishedTime": "2023-01-01T00:00:00.000Z"
    }
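
As a rough sketch of how a client might consume the JSON format described above, the following function requests an article and picks out a few of the fields shown in the example response. The function name fetchArticle and the injectable fetchImpl parameter are assumptions made for illustration. The error handling is generic, since the API's error responses are not documented here.

```javascript
// Hypothetical client helper; not part of the repository.
// fetchImpl is injectable so the function can be exercised without network access.
async function fetchArticle(articleUrl, fetchImpl = globalThis.fetch) {
  const endpoint = new URL("https://readability-bot-v2.vercel.app/api/readability");
  endpoint.searchParams.set("url", articleUrl);
  endpoint.searchParams.set("format", "json");

  const response = await fetchImpl(endpoint.toString());
  if (!response.ok) {
    throw new Error(`Request failed with HTTP status ${response.status}`);
  }

  const article = await response.json();
  // Field names follow the example response above.
  return {
    title: article.title,
    byline: article.byline,
    excerpt: article.excerpt,
    length: article.length,
  };
}

// Usage (needs network access and a runtime with a global fetch, e.g. Node.js 18+):
// fetchArticle("https://example.com/article").then((a) => console.log(a.title));
```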

Deployment

Vercel

  1. Fork this repository.
  2. Go to Vercel and create a new project.
  3. Import your forked repo.
  4. Deploy! Vercel will handle the build and hosting automatically.

Environment Variables (optional):

  • APP_URL: Custom app URL (defaults to Vercel URL).
  • READABILITY_API_URL: Custom API base URL.

Other Platforms

The app is a standard Node.js/Svelte project bundled with Rollup, so it can be deployed to any Node.js-compatible host, such as Heroku, AWS, or DigitalOcean.

Dependencies

See package.json for the full list.


Powered by Mozilla Readability. If you find this useful, star the repo! ⭐
