feat: enhanced scrapeLikedTweets scraper with rich data extraction #13

Closed
nj-io wants to merge 1 commit into nirholas:main from nj-io:feat/enhanced-liked-tweets

Conversation

nj-io commented Apr 2, 2026

Summary

Replaces the broken xeepy-based x_get_likes handler with a proper scrapeLikedTweets() scraper, following the same pattern as scrapeBookmarks().

What changed

  • New scrapeLikedTweets() in src/scrapers/twitter/index.js — scroll-based scraper with rich data extraction
  • Wired through src/scrapers/index.js → src/mcp/local-tools.js (same pattern as bookmarks)
  • Removed x_get_likes from xeepyTools array and deleted the broken executeXeepyTool handler that called the non-existent localTools.getPage()

Rich data per tweet

| Field | Source |
| --- | --- |
| text | Full text with "Show more" expansion |
| author, handle | User-Name + first a[href] |
| timestamp, link | time[datetime], first /status/ link |
| images | a[href*="/photo/"], attributed to the correct author by handle matching |
| quotedTweet | Detected via multiple UserAvatar-Container-* elements (X doesn't always use the quoteTweet testid) |
| article | article-cover-image + nextElementSibling for title/description |
| card | card.wrapper for link previews |
| replies, retweets, likes, views | Parsed from role="group" aria-label |
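
As an illustration of the last row, here is a minimal sketch of how an engagement aria-label could be turned into numeric fields. The label format ("12 replies, 3 reposts, ...") and the helper name `parseEngagement` are assumptions for illustration, not the code in this PR. In the browser the label would typically come from something like `tweet.querySelector('[role="group"]')?.getAttribute('aria-label')`.

```javascript
// Hypothetical helper (not from the PR): parses an aria-label such as
// "12 replies, 3 reposts, 1,204 likes, 15384 views" into numeric counts.
// The exact wording X uses in the label is an assumption.
function parseEngagement(ariaLabel) {
  const stats = { replies: 0, retweets: 0, likes: 0, views: 0 };

  // Map the matched word stem to the field names listed in the table above.
  const fieldFor = {
    repl: 'replies',
    repost: 'retweets',
    retweet: 'retweets',
    like: 'likes',
    view: 'views',
  };

  // A number (optionally with thousands separators) followed by a known stem.
  const pattern = /([\d,]+)\s+(repl|repost|retweet|like|view)/gi;

  for (const match of (ariaLabel || '').matchAll(pattern)) {
    const count = Number.parseInt(match[1].replace(/,/g, ''), 10);
    stats[fieldFor[match[2].toLowerCase()]] = count;
  }
  return stats;
}
```

Fields missing from the label stay at 0. Abbreviated counts such as "1.2K" are not handled by this sketch and would need their own rule.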

Bug fixes

  • "Show more" clicks one at a time: X re-renders the DOM after each click, which detaches every other button reference. The previous approach collected all buttons first and then iterated over them, which caused "Node is detached from document" errors.
  • Article URL construction: article.url is built only for direct articles, not for quoted tweets where the article belongs to a different author. article.tweetUrl is always included for reliable resolution.
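
The "one at a time" fix can be sketched roughly as below. This is an illustration rather than the PR's code: `expandShowMore`, its options, and the selector are assumptions. The only API relied on is a Puppeteer-style `page.$(selector)` that resolves to an element handle or `null`.

```javascript
// Assumed selector for the "Show more" control; the PR does not state which one it uses.
const SHOW_MORE_SELECTOR = '[data-testid="tweet-text-show-more-link"]';

// Hypothetical helper: clicks "Show more" buttons one at a time.
// `page` is any object with a Puppeteer-style `$(selector)` method.
async function expandShowMore(
  page,
  { selector = SHOW_MORE_SELECTOR, maxClicks = 50, settleMs = 300 } = {}
) {
  let clicks = 0;
  while (clicks < maxClicks) {
    // Re-query on every pass: handles obtained before a click may be
    // detached once the page re-renders.
    const button = await page.$(selector);
    if (!button) break;

    await button.click();
    clicks += 1;

    // Give the page a moment to re-render before querying again.
    await new Promise((resolve) => setTimeout(resolve, settleMs));
  }
  return clicks;
}
```

The `maxClicks` cap is a safety net for cases where a click does not remove the button. Without it the loop could spin indefinitely.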

Relation to #7

This overlaps with the scrapeLikedTweets portion of #7. The scraper here is more comprehensive (quote tweets, articles, cards, engagement stats, image attribution) vs #7's basic fields. Both follow the same architectural pattern requested by the maintainer: scraper in twitter/index.js, routed through local-tools.js, removed from xeepyTools.

Test plan

  • x_get_likes returns rich data with quote tweets, articles, cards, engagement stats
  • "Show more" expansion works for truncated tweets
  • Scrolling collects beyond the initial viewport
  • Deduplication prevents duplicate tweets
  • Existing tools unaffected (no changes to other handlers)
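
For the scrolling and deduplication items, a minimal sketch of the collection loop follows. It assumes each tweet's `link` is a usable unique key. The function name, the callbacks `extractVisible` and `scrollDown`, and the idle-round stop condition are hypothetical and may differ from what the scraper does.

```javascript
// Hypothetical collection loop (not the PR's code): scrolls until `limit`
// tweets are collected or several scrolls in a row yield nothing new.
async function collectTweets({ extractVisible, scrollDown, limit = 100, maxIdleRounds = 3 }) {
  // Keyed by tweet link, so a tweet seen in two overlapping viewports is
  // stored once. A Map preserves insertion order.
  const byLink = new Map();
  let idleRounds = 0;

  while (byLink.size < limit && idleRounds < maxIdleRounds) {
    const before = byLink.size;

    for (const tweet of await extractVisible()) {
      if (tweet.link && !byLink.has(tweet.link) && byLink.size < limit) {
        byLink.set(tweet.link, tweet);
      }
    }

    // Count consecutive rounds that added nothing; reset on any progress.
    idleRounds = byLink.size === before ? idleRounds + 1 : 0;
    await scrollDown();
  }
  return [...byLink.values()];
}
```

In a real scraper, `extractVisible` would wrap a `page.evaluate(...)` call that reads the tweets currently in the DOM, and `scrollDown` would scroll the page and wait for new content to load.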

🤖 Generated with Claude Code

Replace the broken xeepy-based x_get_likes handler with a proper scraper
in src/scrapers/twitter/index.js, following the same pattern as
scrapeBookmarks.

The new scraper extracts rich data per tweet:
- text, author, handle, timestamp, link
- images (attributed to correct author)
- quoted tweets (detected via multiple UserAvatar-Container elements)
- X Articles (title, description, cover image via article-cover-image)
- link cards (via card.wrapper)
- engagement stats (replies, retweets, likes, views from role="group")

Also fixes:
- "Show more" expansion — clicks buttons one at a time since X
  re-renders the DOM after each click, detaching other button refs
- Scroll-based pagination with deduplication
- Removes x_get_likes from xeepyTools, routes through local-tools.js

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@nj-io nj-io requested a review from nirholas as a code owner April 2, 2026 09:18
vercel Bot commented Apr 2, 2026

@nj-io is attempting to deploy a commit to the kaivocmenirehtacgmailcom's projects Team on Vercel.

A member of the Team first needs to authorize it.

nj-io commented Apr 5, 2026

Superseded by an updated PR with timestamp filtering, auth checks, and a proper scraper pattern.
