Add advanced feed monitoring features #32

Merged: DisabledAbel merged 3 commits into main from add-feed-monitoring-5498580811796658517 on Jun 9, 2026
DisabledAbel (Owner) commented Jun 9, 2026

This PR adds advanced feed monitoring capabilities to the YouTube RSS Feed Scanner.

Key changes:

  • Created a new utility module api/monitoring_utils.py that handles:
    • XML parsing for both Atom and RSS 2.0 formats.
    • Automatic detection of the latest update timestamp from various fields (<updated>, <pubDate>, etc.).
    • Calculation of human-readable relative time (e.g., "3 hours ago").
    • Health scoring and status detection based on response time, XML validity, and upload recency.
    • Response time measurement using urllib.request.
  • Added a new API route /api/monitor in api/app.py to expose these features.
  • Updated README.md with documentation and examples for the new endpoint.

The implementation is written in Python to maintain consistency with the existing codebase and is compatible with Vercel deployment.
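The timestamp-detection bullet above can be illustrated with a minimal sketch. This is not the code from `monitoring_utils.py`; the names `latest_timestamp` and `_parse_timestamp` are hypothetical, and the set of tags checked is an assumption. It uses the standard-library parser only to stay self-contained; for untrusted feeds a hardened parser such as `defusedxml` is the safer choice.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

ATOM_NS = "{http://www.w3.org/2005/Atom}"
# Assumed set of date-bearing tags; the real module may check others.
DATE_TAGS = {ATOM_NS + "updated", ATOM_NS + "published", "pubDate", "lastBuildDate"}


def _parse_timestamp(text):
    """Parse an ISO 8601 (Atom) or RFC 822 (RSS) date into an aware UTC datetime."""
    text = text.strip()
    try:
        # Older Python versions reject a trailing "Z", so normalize it first.
        parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    except ValueError:
        try:
            parsed = parsedate_to_datetime(text)
        except (TypeError, ValueError):
            return None
    if parsed.tzinfo is None:
        # Assumption: treat naive timestamps as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def latest_timestamp(xml_bytes):
    """Return the newest timestamp found in an Atom or RSS 2.0 document, or None."""
    root = ET.fromstring(xml_bytes)
    found = []
    for element in root.iter():
        if element.tag in DATE_TAGS and element.text:
            parsed = _parse_timestamp(element.text)
            if parsed is not None:
                found.append(parsed)
    return max(found) if found else None
```

Taking the maximum over every candidate field avoids relying on entry order, which feeds do not guarantee.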


PR created automatically by Jules for task 5498580811796658517 started by @DisabledAbel

Summary by CodeRabbit

  • New Features

    • Added /api/monitor endpoint for monitoring RSS/Atom feeds with health checks and scoring
    • Provides feed health status, response times, and timestamps (absolute and relative)
    • Detects stale feeds (no update in more than 30 days)
    • Auto-normalizes URLs with https:// prefix
  • Documentation

    • Added endpoint documentation with request parameters and example payloads

- Implement `/api/monitor` endpoint for health checks and scoring
- Add `api/monitoring_utils.py` for XML parsing and health calculation
- Support latest update timestamp detection (Atom/RSS)
- Calculate human-readable relative time
- Measure response time in milliseconds
- Document new monitoring features in README.md

vercel Bot commented Jun 9, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| you-tube-rss-feed-scanner | Ready | Preview, Comment | Jun 9, 2026 1:31am |

coderabbitai Bot (Contributor) commented Jun 9, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 71bb489a-7589-4073-a24a-54d9cd5109a4

📥 Commits

Reviewing files that changed from the base of the PR and between 542c129 and 7a8d263.

📒 Files selected for processing (6)
  • README.md
  • api/app.py
  • api/index.html
  • monitoring_utils.py
  • requirements.txt
  • templates/index.html
Walkthrough

This pull request adds a new /api/monitor endpoint that enables clients to monitor RSS/Atom feed health. The implementation includes core utilities for HTTP-based feed fetching, XML parsing (supporting both Atom and RSS 2.0 formats), timestamp extraction with timezone normalization, and health scoring based on feed age, response time, and parsing success.

Changes

Feed Monitoring Endpoint

| Layer / File(s) | Summary |
| --- | --- |
| Core monitoring utilities: api/monitoring_utils.py | Utilities to fetch feeds over HTTP, parse Atom/RSS 2.0 XML content, extract latest timestamps with multi-format support, convert timestamps to relative time strings, and compute health status and score (0–100) based on fetch/parse results, item presence, response-time penalties, and staleness (>30 days). |
| API endpoint and integration: api/app.py | Flask route handler at /api/monitor (GET/POST) that validates the url parameter, normalizes non-HTTP URLs by prefixing https://, orchestrates the monitoring utilities, and returns JSON with the feed URL, response time, health status/reason, and last-updated timestamps in both ISO and relative formats. |
| Endpoint documentation: README.md | Documents the /api/monitor endpoint's purpose, required url parameter, and example request/response payload structure with fields for feed URL, response time, score, health status/reason, and last-updated timestamps. |
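The scoring described in the summary (a 0–100 score driven by fetch/parse results, item presence, response-time penalties, and a 30-day staleness cutoff) could look roughly like the sketch below. Only the 0–100 range and the 30-day threshold come from the summary; the function name `health_and_score`, the status labels, the penalty sizes, and the response-time thresholds are all hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=30)  # staleness threshold stated in the summary


def health_and_score(fetch_ok, parse_ok, has_items, response_ms, latest, now=None):
    """Return (status, reason, score); weights and labels are illustrative only."""
    now = now or datetime.now(timezone.utc)
    if not fetch_ok:
        return "down", "fetch failed", 0
    if not parse_ok:
        return "broken", "invalid XML", 0

    score = 100
    reasons = []
    if not has_items:
        score -= 40  # hypothetical penalty
        reasons.append("no items")
    if response_ms > 3000:
        score -= 30  # hypothetical penalty
        reasons.append("very slow response")
    elif response_ms > 1000:
        score -= 10  # hypothetical penalty
        reasons.append("slow response")
    if latest is None or now - latest > STALE_AFTER:
        score -= 30  # hypothetical penalty
        reasons.append("stale")

    score = max(0, score)
    status = "healthy" if not reasons else "degraded"
    return status, "; ".join(reasons) or "ok", score
```

Passing `now` explicitly keeps the function deterministic in tests; in a request handler it can be left as `None`.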

Sequence Diagram

sequenceDiagram
  participant Client
  participant api_monitor as /api/monitor handler
  participant utils as monitoring_utils
  Client->>api_monitor: GET/POST url=...
  api_monitor->>api_monitor: validate & normalize url
  api_monitor->>utils: fetch_feed(url)
  utils-->>api_monitor: content, response_ms, error, status
  alt fetch successful
    api_monitor->>utils: parse_xml(content)
    utils-->>api_monitor: parsed_feed
    api_monitor->>utils: get_latest_timestamp(parsed_feed)
    utils-->>api_monitor: latest_timestamp
    api_monitor->>utils: get_relative_time(latest_timestamp)
    utils-->>api_monitor: relative_time
    api_monitor->>utils: calculate_health_and_score(...)
    utils-->>api_monitor: status, reason, score
  else fetch error
    api_monitor->>api_monitor: use error details for health
  end
  api_monitor-->>Client: {feedUrl, responseTime, health, lastUpdated, ...}
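The `get_relative_time` step in the diagram turns a timestamp into a string such as "3 hours ago". A minimal sketch of that conversion follows; the function name `relative_time`, the unit boundaries, and the wording for edge cases are assumptions and may differ from the merged code.

```python
from datetime import datetime, timezone

# Largest unit first so the first match wins.
UNITS = (("day", 86400), ("hour", 3600), ("minute", 60))


def relative_time(then, now=None):
    """Format the gap between two aware datetimes as a human-readable string."""
    now = now or datetime.now(timezone.utc)
    seconds = int((now - then).total_seconds())
    if seconds < 0:
        return "in the future"
    if seconds < 60:
        return "just now"
    for unit, size in UNITS:
        if seconds >= size:
            count = seconds // size
            plural = "" if count == 1 else "s"
            return f"{count} {unit}{plural} ago"
    return "just now"  # unreachable: seconds >= 60 always matches "minute"
```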

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

codex

Poem

🐰 A rabbit hops through XML streams,
Parsing feeds and timestamps' dreams,
Health scores bloom like garden greens,
Monitoring flows with tested routines,
Hop, hop—the monitor thrives! 🌱

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 77.78% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add advanced feed monitoring features' is directly related to the main changes in the pull request, which introduces a comprehensive monitoring system with utilities, a new API endpoint, and documentation. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/app.py`:
- Around line 29-33: The JSON payload read into data via
request.get_json(silent=True) may be a non-dict (e.g., a list), so guard its
type before calling data.get('url') to avoid a 500; in the handler around the
data variable (and where url is assigned) check if isinstance(data, dict) and if
not replace/normalize data with an empty dict (or a dict fallback) so subsequent
use of data.get('url') is safe (update the code around request.get_json and the
url assignment to use this normalized data).

In `@api/monitoring_utils.py`:
- Line 223: The code currently forces UTF-8 by doing content =
response.read().decode('utf-8'); change this to keep the raw bytes from
response.read() (do not call .decode()) and return/pass those bytes out so the
XML parser can detect and respect the feed's declared encoding; update any
callers expecting a str to accept bytes (or decode using the parser) and adjust
variable uses of content accordingly.
- Around line 219-223: The fetch_feed function currently opens untrusted URLs
(variable url) directly via urllib.request.urlopen(req), causing SSRF risk;
before creating the Request or opening the connection, validate and sanitize
url: ensure scheme is http or https using urllib.parse, resolve the hostname to
IPs with socket.getaddrinfo/gethostbyname_ex, and use the ipaddress module to
reject any resolved IP that is loopback, RFC1918 (private), link-local,
multicast, or IPv6 unique local addresses; also forbid non-network schemes
(file, gopher, etc.) and raise/return an error if any disallowed address is
found so that urllib.request.urlopen(req, timeout=timeout) is only called for
safe, public addresses.
- Line 16: The XML parsing call ET.fromstring(content) in
api/monitoring_utils.py must be made safe for untrusted input: replace direct
xml.etree.ElementTree parsing with a secure parser from defusedxml (e.g. import
defusedxml.ElementTree as ET and use ET.fromstring(content)) or otherwise
configure the parser to disable DTD/entity expansion to prevent
XXE/entity‑expansion attacks; update the call replacing ET.fromstring(content)
and any related XML parsing logic so it uses the defusedxml API (or an
equivalent secure parsing approach) everywhere that processes remote feed
content.

In `@README.md`:
- Around line 98-99: Update the README example to match the API's actual
timestamp format (which uses Python's datetime.isoformat() and yields "+00:00"
for UTC) by changing "2026-06-08T10:30:00Z" to "2026-06-08T10:30:00+00:00"; if
you prefer to normalize to a trailing Z instead, modify the serialization where
timestamps are produced (the code calling datetime.isoformat()) to convert UTC
offsets of "+00:00" to "Z" (e.g., replace("+00:00","Z")) so docs and output stay
consistent.
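The README finding above comes down to how Python serializes UTC datetimes. The sketch below shows the default `isoformat()` output and one way to normalize it to a trailing `Z`; the helper name `to_iso_z` is hypothetical and not taken from the PR.

```python
from datetime import datetime, timezone


def to_iso_z(moment):
    """Serialize an aware datetime as ISO 8601 in UTC with a trailing "Z"."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")


stamp = datetime(2026, 6, 8, 10, 30, tzinfo=timezone.utc)
print(stamp.isoformat())  # 2026-06-08T10:30:00+00:00
print(to_iso_z(stamp))    # 2026-06-08T10:30:00Z
```

Either form is valid ISO 8601; what matters is that the documentation and the API output use the same one.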

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 81b48352-1a2e-4b93-ba55-88c5aed410ec

📥 Commits

Reviewing files that changed from the base of the PR and between 554b8a5 and 542c129.

📒 Files selected for processing (3)
  • README.md
  • api/app.py
  • api/monitoring_utils.py

Comment thread api/app.py Outdated
Comment thread api/monitoring_utils.py Outdated
Comment thread api/monitoring_utils.py Outdated
Comment thread api/monitoring_utils.py Outdated
Comment thread README.md Outdated
- Implement `/api/monitor` endpoint for health checks and scoring
- Support latest update timestamp detection (Atom/RSS)
- Calculate human-readable relative time
- Measure response time in milliseconds
- Implement SSRF protection for outbound feed requests
- Use `defusedxml` for secure XML parsing (prevent XXE)
- Ensure robust handling of non-dictionary JSON payloads
- Update README.md with new endpoint documentation
- Move `monitoring_utils.py` to root for reliable imports
- Fix JavaScript event delegation and copy button logic in frontend
- Address "The string did not match the expected pattern" error by improving DOM interactions
- Simplify SSRF IP validation logic
- Ensure consistent data-encoded attribute usage for copy buttons
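The SSRF protection mentioned in these commit messages follows the approach the review suggested: allow only http/https, resolve the hostname, and reject non-public addresses. The sketch below illustrates that approach; it is not the merged implementation, and the names `is_public_ip`, `resolve_host`, and `is_safe_feed_url` are hypothetical.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_ip(ip_text):
    """Return True only for addresses outside private, loopback, and similar ranges."""
    # Drop an IPv6 zone identifier such as "%eth0" before parsing.
    address = ipaddress.ip_address(ip_text.split("%", 1)[0])
    return not (
        address.is_private
        or address.is_loopback
        or address.is_link_local
        or address.is_multicast
        or address.is_reserved
        or address.is_unspecified
    )


def resolve_host(host):
    """Resolve a hostname to the sorted set of IP addresses it maps to."""
    return sorted({info[4][0] for info in socket.getaddrinfo(host, None)})


def is_safe_feed_url(url, resolver=resolve_host):
    """Allow only http(s) URLs whose host resolves exclusively to public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        addresses = resolver(parsed.hostname)
    except (OSError, UnicodeError):
        return False  # an unresolvable host is treated as unsafe
    return bool(addresses) and all(is_public_ip(a) for a in addresses)
```

The `resolver` parameter exists so the check can be tested without DNS. A check like this runs before the fetch, so the hostname is resolved again when the request is made; connecting to the already-validated IP would close that gap but is outside the scope of this sketch.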
DisabledAbel merged commit 9e14fbe into main on Jun 9, 2026
4 checks passed
DisabledAbel deleted the add-feed-monitoring-5498580811796658517 branch on June 9, 2026 01:34