Add feed-type endpoints for all/videos/shorts/live #22

Merged
DisabledAbel merged 3 commits into main from codex/add-youtube-feed-separation on May 14, 2026

@DisabledAbel DisabledAbel commented May 14, 2026

Owner

Motivation

  • Provide hidden separation of YouTube feeds so subscribers can choose all, videos, shorts, or live feeds while keeping existing combined behavior.
  • Support the new feed types for all supported URL forms (channel, @handle, /c/, /user/, video URLs) and expose typed feed links in API output.

Description

  • Added feed_type support to get_rss_feed(...) and wired it into scraping so the scanner fetches the appropriate YouTube tab. The signature is now get_rss_feed(..., feed_type: str = "all"). get_channel_videos(...) also accepts feed_type and uses a page_path_by_type map to fetch the videos, shorts, or streams page, together with a shared _extract_video_ids_from_page(...) helper.
  • Updated api/app.py routing to the new endpoints GET /feed/all/<channel>, /feed/videos/<channel>, /feed/shorts/<channel>, and /feed/live/<channel>. Invalid feed types return 400, and the cached route now includes the request path in its cache key (key_prefix=lambda: f"feed_{request.path}") so each feed type is cached separately.
  • Updated generated api_endpoints returned by get_rss_feed(...) when include_api_endpoints is enabled to advertise typed feed URLs (atom_feed_path, videos_feed, shorts_feed, live_feed).
  • Updated README.md to document the new /feed/<type>/<channel_url> pattern and example endpoints.
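
The validation and per-path cache key described above can be sketched roughly as follows. This is a minimal illustration, not the code in api/app.py: check_feed_type and cache_key_for are hypothetical helper names, and the error message text is an assumption.

```python
from typing import Optional, Tuple

# Feed types named in this PR description.
VALID_FEED_TYPES = ("all", "videos", "shorts", "live")


def check_feed_type(feed_type: str) -> Optional[Tuple[str, int]]:
    """Return (error message, HTTP status) for an unsupported type, else None.

    Hypothetical helper; the real route may validate inline.
    """
    if feed_type not in VALID_FEED_TYPES:
        allowed = ", ".join(VALID_FEED_TYPES)
        return f"Invalid feed type '{feed_type}'. Use one of: {allowed}", 400
    return None


def cache_key_for(path: str) -> str:
    """Mirror key_prefix=lambda: f"feed_{request.path}" for a given path."""
    return f"feed_{path}"


# Two feed types for the same channel get different cache keys.
videos_key = cache_key_for("/feed/videos/%40example")
shorts_key = cache_key_for("/feed/shorts/%40example")
```

Because the feed type is part of the path, keying on the path is enough to keep the videos and shorts responses apart in the cache.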

Testing

  • Compiled files successfully with python -m py_compile rss_scanner.py api/app.py (succeeded).
  • Exercised Flask routes with app.test_client() using a stubbed rss_scanner.get_rss_feed to confirm valid feed types (all, videos, shorts, live) return 200 and invalid type returns 400 (succeeded).
  • Ran a quick unit check of the new ID extractor (_extract_video_ids_from_page) to confirm the limit parameter works (10 IDs returned) (succeeded).
  • Attempted live network integration tests via app.test_client() against real YouTube pages but outbound requests failed in this environment (Tunnel connection failed: 403 Forbidden), so full live fetch validation could not complete (network-limited failure).
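
For reference, a limit check like the one in the third bullet could look like the sketch below. The regex and the function body are assumptions, since the real _extract_video_ids_from_page is not shown in this conversation; only the idea of returning unique 11-character IDs up to a limit comes from the PR.

```python
import re
from typing import List, Optional

# Assumption: IDs appear in the page source as "videoId":"<11 chars>".
# The actual pattern used by rss_scanner.py may differ.
VIDEO_ID_RE = re.compile(r'"videoId":"([A-Za-z0-9_-]{11})"')


def extract_video_ids(html: str, limit: Optional[int] = None) -> List[str]:
    """Return unique video IDs in first-seen order, at most `limit` of them."""
    seen = set()
    ids: List[str] = []
    for match in VIDEO_ID_RE.finditer(html):
        if limit is not None and len(ids) >= limit:
            break
        video_id = match.group(1)
        if video_id in seen:
            continue  # the same ID usually appears several times per page
        seen.add(video_id)
        ids.append(video_id)
    return ids


# Synthetic page: 15 distinct IDs, each appearing twice.
sample_ids = [f"vid{n:08d}" for n in range(15)]
sample_html = "".join(
    f'{{"videoId":"{v}"}}{{"videoId":"{v}"}}' for v in sample_ids
)

limited = extract_video_ids(sample_html, limit=10)
assert len(limited) == 10
```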

Summary by CodeRabbit

  • New Features

    • RSS endpoint now supports filtered feed types via /feed/<feed_type>/<channel_url> (all, videos, shorts, live).
    • UI adds a “Feed Type” selector and returns corresponding feed links in results.
    • API accepts a feed_type parameter when requesting feeds.
  • Documentation

    • README updated with new path examples, supported feed types, and revised URL-encoding guidance.

vercel Bot commented May 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| you-tube-rss-feed-scanner | Ready | Preview, Comment | May 14, 2026 2:04am |

coderabbitai Bot commented May 14, 2026

Contributor

Walkthrough

Adds feed_type filtering (all, videos, shorts, live) end-to-end: scanner scraping, get_rss_feed signature, Flask routing (typed /feed/<feed_type>/<channel_url> with validation and caching), UI selectors, and README examples updated.

Changes

Feed Type Support throughout Stack

| Layer | File(s) | Summary |
| --- | --- | --- |
| Scanner feed type handling | rss_scanner.py | Introduces _extract_video_ids_from_page helper and refactors get_channel_videos to accept feed_type and limit, choosing the appropriate channel subpage (videos/shorts/streams) to scrape. get_rss_feed signature now accepts feed_type and forwards it; api_endpoints now includes per-type Atom feed paths. |
| API routing and validation | api/app.py | Adds /feed/ usage endpoint and replaces prior untyped feed routes with a cached /feed/<feed_type>/<path:channel_url> route. The route validates feed_type, reconstructs the channel URL, calls get_rss_feed(..., feed_type=...), and caches responses keyed by request.path. /api/feed handler accepts and forwards feed_type from request data. |
| UI wiring for feed_type selection | api/index.html, index.html, templates/index.html | Adds a Feed Type <select> to pages, includes the selected feed_type in /api/feed requests, and conditionally renders the selected feed link (data.api_endpoints.atom_feed_path) when present. |
| Endpoint documentation and examples | README.md | Documents the new /feed/<type>/<channel_url> path with supported types all, videos, shorts, live; updates fetch examples to use typed paths like /feed/videos/${encodedUrl} and recommends pre-encoded path usage (e.g., /feed/all/<encodedUrl>). |
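
As an illustration of the subpage selection summarized for rss_scanner.py, the sketch below maps a feed type to a channel tab URL. The dictionary contents, the function name channel_page_url, and the handling of "all" (returning the bare channel URL) are assumptions; the PR only states that a page_path_by_type map selects the videos, shorts, or streams page.

```python
from typing import Dict

# Assumed shape of the feed-type -> channel tab mapping.
PAGE_PATH_BY_TYPE: Dict[str, str] = {
    "videos": "videos",
    "shorts": "shorts",
    "live": "streams",  # live content sits under the "streams" tab
}


def channel_page_url(channel_id: str, feed_type: str = "all") -> str:
    """Build the channel page URL to scrape for a feed type (hypothetical)."""
    base = f"https://www.youtube.com/channel/{channel_id}"
    if feed_type == "all":
        # Assumption: "all" keeps the existing combined behavior; it is
        # represented here by the channel landing page.
        return base
    if feed_type not in PAGE_PATH_BY_TYPE:
        raise ValueError(f"unsupported feed_type: {feed_type!r}")
    return f"{base}/{PAGE_PATH_BY_TYPE[feed_type]}"
```

Note that the public feed type is "live" while the scraped tab is "streams", so the mapping cannot simply reuse the feed type as the path segment.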

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Flask API
  participant rss_scanner
  participant YouTube Page
  
  Client->>Flask API: GET /feed/<feed_type>/<channel_url>
  Flask API->>Flask API: validate feed_type
  Flask API->>rss_scanner: get_rss_feed(url, feed_type="all"|"videos"|"shorts"|"live")
  rss_scanner->>rss_scanner: get_channel_videos(channel_id, feed_type)
  rss_scanner->>YouTube Page: scrape /channel/<id>/<feed_type>
  YouTube Page-->>rss_scanner: HTML with video IDs
  rss_scanner->>rss_scanner: extract unique 11-char video IDs via regex
  rss_scanner-->>Flask API: Atom feed + api_endpoints (feed/all/, videos_feed, shorts_feed, live_feed)
  Flask API-->>Client: Atom XML response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I nibble through pages, URLs in line,
Sorting shorts, lives, and videos fine.
Paths now typed, I hop and weave,
feed/all, feed/videos—how clever to cleave! 🎧

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding feed-type endpoints for all/videos/shorts/live, which matches the core objective of supporting multiple feed types. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

coderabbitai Bot commented May 14, 2026

Contributor

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes Applied Successfully

Fixed 3 file(s) based on 3 unresolved review comments.

Files modified:

  • README.md
  • api/app.py
  • rss_scanner.py

Commit: b3295c92dd14329343bffa3fbfc97764f311d777

The changes have been pushed to the codex/add-youtube-feed-separation branch.

Time taken: 2m 26s

Fixed 3 file(s) based on 3 unresolved review comments.

Co-authored-by: CodeRabbit <noreply@coderabbit.ai>

@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (2)
api/app.py (2)

171-171: ⚡ Quick win

Remove redundant channel_id or channel_id expression.

The expression channel_id or channel_id always evaluates to channel_id. This looks like a copy-paste error: either the expression should be just channel_id, or one of the two operands was meant to be a different fallback value.

♻️ Suggested fix
                atom_feed = atom_feed.replace(
-                    f">{channel_id or channel_id} - YouTube Videos",
+                    f">{channel_id} - YouTube Videos",
                    f">{channel_name} - YouTube Videos"
                )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/app.py` at line 171, The f-string currently uses a redundant expression
`channel_id or channel_id`; replace it with a single `channel_id` (i.e.,
`f"{channel_id} - YouTube Videos"`) in the same place where the f-string is
constructed so the title uses the actual channel_id value (or, if a fallback is
desired, use a proper fallback like `channel_id or "<unknown>"` instead).

165-165: ⚡ Quick win

Replace unused video_count with _.

The video_count variable is unpacked but never used in this function. Replace it with _ to signal the value is intentionally ignored.

♻️ Suggested fix
-        _, channel_id, channel_name, atom_feed, video_count, _, _ = rss_scanner.get_rss_feed(full_url, feed_type=feed_type)
+        _, channel_id, channel_name, atom_feed, _, _, _ = rss_scanner.get_rss_feed(full_url, feed_type=feed_type)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/app.py` at line 165, Unpack from rss_scanner.get_rss_feed already binds
an unused video_count variable; change the tuple unpack to replace video_count
with _ so the call becomes: _, channel_id, channel_name, atom_feed, _, _, _ when
calling rss_scanner.get_rss_feed(full_url, feed_type=feed_type) — update the
unpacking in the function that calls rss_scanner.get_rss_feed to use _ for the
unused value to signal intentional ignore.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 04bc8b30-ef51-40db-9dff-f36766a16144

📥 Commits

Reviewing files that changed from the base of the PR and between 2d1e3da and b3295c9.

📒 Files selected for processing (6)
  • README.md
  • api/app.py
  • api/index.html
  • index.html
  • rss_scanner.py
  • templates/index.html
🚧 Files skipped from review as they are similar to previous changes (2)
  • README.md
  • rss_scanner.py

@devin-ai-integration devin-ai-integration Bot left a comment

Contributor

Devin Review found 2 new potential issues.

View 5 additional findings in Devin Review.

Comment thread rss_scanner.py
Comment on lines +329 to +330
"atom_feed_path": f"{base_url.rstrip('/')}/feed/all/{encoded_url}",
"atom_feed_query": f"{base_url.rstrip('/')}/feed/all/{urllib.parse.quote(url, safe='')}",

Contributor

🟡 atom_feed_path and atom_feed_query produce identical URLs

The PR changed atom_feed_query from a distinct query-parameter-based URL format (/feed/?channel_url=...) to the same path-based format as atom_feed_path. Since encoded_url = urllib.parse.quote(url, safe="") at rss_scanner.py:326 and the inline urllib.parse.quote(url, safe='') at line 330 produce the same result, both keys now generate the exact same URL. The CLI at rss_scanner.py:416-417 still prints both as "Atom Feed (path)" and "Atom Feed (query)" — showing identical URLs with different labels, which is confusing to users.

Prompt for agents
The atom_feed_path and atom_feed_query entries in the api_endpoints dictionary (rss_scanner.py:329-330) now produce identical URLs since both use the same path-based format. The old code had atom_feed_query using a query-parameter format (/feed/?channel_url=...) which was a distinct alternative.

Since the query-parameter route was removed in api/app.py, either:
1. Remove the atom_feed_query key entirely and update all references (rss_scanner.py:417 CLI print, and any UI references).
2. Or, if the intent is to show the currently-selected feed type's URL rather than always 'all', replace atom_feed_query with a URL that uses the selected feed_type parameter, e.g. f"{base_url}/feed/{feed_type}/{encoded_url}".

Also update rss_scanner.py:416-417 (the CLI output) to stop printing the duplicate entry.
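
The duplication and option 2 can be reproduced with a small standalone sketch. build_endpoints and the selected_feed key are hypothetical names used only for illustration, and the base URL is a placeholder; the two atom_feed_* expressions are copied from the lines under review.

```python
import urllib.parse


def build_endpoints(base_url: str, url: str, feed_type: str = "all") -> dict:
    """Rebuild the two flagged entries plus a feed-type-aware alternative."""
    root = base_url.rstrip("/")
    encoded_url = urllib.parse.quote(url, safe="")
    return {
        # As in the PR: both keys are built from the same encoding.
        "atom_feed_path": f"{root}/feed/all/{encoded_url}",
        "atom_feed_query": f"{root}/feed/all/{urllib.parse.quote(url, safe='')}",
        # Option 2 (hypothetical key name): follow the selected feed type.
        "selected_feed": f"{root}/feed/{feed_type}/{encoded_url}",
    }


endpoints = build_endpoints(
    "https://example.invalid/",  # placeholder base URL
    "https://www.youtube.com/@example",
    feed_type="shorts",
)

# The two existing keys are indistinguishable.
assert endpoints["atom_feed_path"] == endpoints["atom_feed_query"]
print(endpoints["selected_feed"])
# https://example.invalid/feed/shorts/https%3A%2F%2Fwww.youtube.com%2F%40example
```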

Comment thread api/app.py


@app.route('/feed/<feed_type>/<path:channel_url>')
@cache.cached(timeout=300, key_prefix=lambda: f"feed_{request.path}")

Contributor

🟡 Cache stores transient error responses (404/500) for 5 minutes

The @cache.cached decorator at api/app.py:146 caches all responses from get_feed, including Response("No videos found", status=404) at line 180 and Response(f"Error: {str(e)}", status=500) at line 182. If YouTube scraping temporarily fails (e.g., due to rate limiting or network issues), the error response is served from cache for 5 minutes, even after the transient issue resolves. The old code effectively had no working cache (the cached route get_cached_feed was shadowed by the identically-patterned uncached get_feed route), so this is a new behavior introduced by the PR.

Suggested change
-@cache.cached(timeout=300, key_prefix=lambda: f"feed_{request.path}")
+@cache.cached(timeout=300, key_prefix=lambda: f"feed_{request.path}", response_filter=lambda response: response.status_code == 200)
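
To show the effect of filtering on status code without depending on Flask-Caching, here is a minimal standard-library sketch of the same idea. SuccessOnlyCache and flaky_feed are hypothetical, and a (body, status) tuple stands in for a Flask response; this is not how Flask-Caching is implemented.

```python
import time
from typing import Callable, Dict, List, Tuple

Response = Tuple[str, int]  # (body, status code), standing in for a Flask response


class SuccessOnlyCache:
    """Tiny TTL cache that stores only 200 responses (illustrative only)."""

    def __init__(self, timeout: float = 300.0) -> None:
        self.timeout = timeout
        self._store: Dict[str, Tuple[float, Response]] = {}

    def get_or_call(self, key: str, producer: Callable[[], Response]) -> Response:
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.timeout:
            return entry[1]  # fresh cached success
        response = producer()
        if response[1] == 200:  # same idea as the response_filter above
            self._store[key] = (now, response)
        return response


calls: List[int] = []


def flaky_feed() -> Response:
    """Fail on the first call, succeed afterwards."""
    calls.append(1)
    return ("Error: upstream", 500) if len(calls) == 1 else ("<feed/>", 200)


cache = SuccessOnlyCache()
first = cache.get_or_call("feed_/feed/all/x", flaky_feed)   # 500, not stored
second = cache.get_or_call("feed_/feed/all/x", flaky_feed)  # 200, stored
third = cache.get_or_call("feed_/feed/all/x", flaky_feed)   # served from cache
```

With the filter in place the transient 500 is retried on the next request instead of being served for the full five-minute timeout.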

@DisabledAbel DisabledAbel merged commit ee543ff into main May 14, 2026
5 checks passed
@DisabledAbel DisabledAbel deleted the codex/add-youtube-feed-separation branch May 14, 2026 02:12