Add advanced feed monitoring features #32
- Implement `/api/monitor` endpoint for health checks and scoring
- Add `api/monitoring_utils.py` for XML parsing and health calculation
- Support latest update timestamp detection (Atom/RSS)
- Calculate human-readable relative time
- Measure response time in milliseconds
- Document new monitoring features in README.md
📝 Walkthrough
This pull request adds a new …
Changes
Feed Monitoring Endpoint
Sequence Diagram
sequenceDiagram
participant Client
participant api_monitor as /api/monitor handler
participant utils as monitoring_utils
Client->>api_monitor: GET/POST url=...
api_monitor->>api_monitor: validate & normalize url
api_monitor->>utils: fetch_feed(url)
utils-->>api_monitor: content, response_ms, error, status
alt fetch successful
api_monitor->>utils: parse_xml(content)
utils-->>api_monitor: parsed_feed
api_monitor->>utils: get_latest_timestamp(parsed_feed)
utils-->>api_monitor: latest_timestamp
api_monitor->>utils: get_relative_time(latest_timestamp)
utils-->>api_monitor: relative_time
api_monitor->>utils: calculate_health_and_score(...)
utils-->>api_monitor: status, reason, score
else fetch error
api_monitor->>api_monitor: use error details for health
end
api_monitor-->>Client: {feedUrl, responseTime, health, lastUpdated, ...}
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/app.py`:
- Around line 29-33: The JSON payload read into data via
request.get_json(silent=True) may be a non-dict (e.g., a list), so guard its
type before calling data.get('url') to avoid a 500; in the handler around the
data variable (and where url is assigned) check if isinstance(data, dict) and if
not replace/normalize data with an empty dict (or a dict fallback) so subsequent
use of data.get('url') is safe (update the code around request.get_json and the
url assignment to use this normalized data).
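The guard described above could look roughly like the following. This is a minimal sketch, not the handler from the PR: `extract_url` is a hypothetical helper name, and it takes the already-decoded value that `request.get_json(silent=True)` would return, so it runs without Flask.

```python
from typing import Any, Optional


def extract_url(payload: Any) -> Optional[str]:
    """Return the 'url' field of a decoded JSON body, or None if unusable.

    `payload` stands in for the result of request.get_json(silent=True),
    which may be None, a list, a string, a number, or a dict.
    """
    # Normalize anything that is not a dict to an empty dict so .get() is safe.
    data = payload if isinstance(payload, dict) else {}
    url = data.get("url")
    # Only accept non-empty strings; other types are treated as missing.
    if isinstance(url, str) and url.strip():
        return url.strip()
    return None
```

With this shape, a list or `null` body is treated as a missing `url` rather than raising `AttributeError`, which would otherwise surface as a 500.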
In `@api/monitoring_utils.py`:
- Line 223: The code currently forces UTF-8 by doing content =
response.read().decode('utf-8'); change this to keep the raw bytes from
response.read() (do not call .decode()) and return/pass those bytes out so the
XML parser can detect and respect the feed's declared encoding; update any
callers expecting a str to accept bytes (or decode using the parser) and adjust
variable uses of content accordingly.
- Around line 219-223: The fetch_feed function currently opens untrusted URLs
(variable url) directly via urllib.request.urlopen(req), causing SSRF risk;
before creating the Request or opening the connection, validate and sanitize
url: ensure scheme is http or https using urllib.parse, resolve the hostname to
IPs with socket.getaddrinfo/gethostbyname_ex, and use the ipaddress module to
reject any resolved IP that is loopback, RFC1918 (private), link-local,
multicast, or IPv6 unique/local addresses; also forbid non-network schemes
(file, gopher, etc.) and raise/return an error if any disallowed address is
found so that urllib.request.urlopen(req, timeout=timeout) is only called for
safe, public addresses.
- Line 16: The XML parsing call ET.fromstring(content) in
api/monitoring_utils.py must be made safe for untrusted input: replace direct
xml.etree.ElementTree parsing with a secure parser from defusedxml (e.g. import
defusedxml.ElementTree as ET and use ET.fromstring(content)) or otherwise
configure the parser to disable DTD/entity expansion to prevent
XXE/entity‑expansion attacks; update the call replacing ET.fromstring(content)
and any related XML parsing logic so it uses the defusedxml API (or an
equivalent secure parsing approach) everywhere that processes remote feed
content.
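As an illustration of the SSRF check requested for `fetch_feed`, here is a standard-library sketch. The names `is_public_http_url`, `_resolve_ips`, and `_is_disallowed` are hypothetical and not taken from the PR; the set of rejected address classes follows the comment above, with reserved and unspecified addresses added as an assumption.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def _is_disallowed(ip) -> bool:
    """True for loopback, private, link-local, multicast, and similar ranges."""
    return (ip.is_private or ip.is_loopback or ip.is_link_local
            or ip.is_multicast or ip.is_reserved or ip.is_unspecified)


def _resolve_ips(hostname: str) -> list:
    """Return the IP addresses a hostname maps to (no DNS for IP literals)."""
    try:
        return [ipaddress.ip_address(hostname)]
    except ValueError:
        pass  # Not an IP literal, so fall through to name resolution.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Strip any IPv6 zone suffix such as "%eth0" before parsing.
    return [ipaddress.ip_address(info[4][0].split("%", 1)[0]) for info in infos]


def is_public_http_url(url: str) -> bool:
    """Accept only http(s) URLs whose host resolves solely to public addresses."""
    try:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            return False
        ips = _resolve_ips(parsed.hostname)
    except (OSError, UnicodeError, ValueError):
        # Malformed URL or failed resolution: refuse rather than guess.
        return False
    return bool(ips) and not any(_is_disallowed(ip) for ip in ips)
```

A caller would run this before building the `Request` and return an error when it yields `False`. Note that `urlopen` resolves the hostname again and follows redirects by default, so a check of this kind narrows the exposure but does not by itself rule out DNS rebinding or a redirect to an internal address.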
In `@README.md`:
- Around line 98-99: Update the README example to match the API's actual
timestamp format (which uses Python's datetime.isoformat() and yields "+00:00"
for UTC) by changing "2026-06-08T10:30:00Z" to "2026-06-08T10:30:00+00:00"; if
you prefer to normalize to a trailing Z instead, modify the serialization where
timestamps are produced (the code calling datetime.isoformat()) to convert UTC
offsets of "+00:00" to "Z" (e.g., replace("+00:00","Z")) so docs and output stay
consistent.
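If the second option (a trailing `Z`) is chosen, the serialization could be normalized along these lines. This is a sketch only: `to_iso_z` is a hypothetical helper, and treating naive datetimes as UTC is an assumption that may not match the PR's code.

```python
from datetime import datetime, timezone


def to_iso_z(dt: datetime) -> str:
    """Serialize a datetime as ISO 8601 in UTC with a trailing 'Z'."""
    # Assumption: a naive datetime is already in UTC.
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    text = dt.astimezone(timezone.utc).isoformat()
    # isoformat() renders UTC as "+00:00"; swap that suffix for "Z".
    return text[:-6] + "Z" if text.endswith("+00:00") else text


stamp = datetime(2026, 6, 8, 10, 30, tzinfo=timezone.utc)
print(stamp.isoformat())  # 2026-06-08T10:30:00+00:00
print(to_iso_z(stamp))    # 2026-06-08T10:30:00Z
```

Either form is valid ISO 8601; the point of the review comment is only that the README and the endpoint should agree on one.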
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 81b48352-1a2e-4b93-ba55-88c5aed410ec
📒 Files selected for processing (3)
- README.md
- api/app.py
- api/monitoring_utils.py
- Implement `/api/monitor` endpoint for health checks and scoring
- Support latest update timestamp detection (Atom/RSS)
- Calculate human-readable relative time
- Measure response time in milliseconds
- Implement SSRF protection for outbound feed requests
- Use `defusedxml` for secure XML parsing (prevent XXE)
- Ensure robust handling of non-dictionary JSON payloads
- Update README.md with new endpoint documentation
- Move `monitoring_utils.py` to root for reliable imports
- Fix JavaScript event delegation and copy button logic in frontend
- Address "The string did not match the expected pattern" error by improving DOM interactions
- Simplify SSRF IP validation logic
- Ensure consistent data-encoded attribute usage for copy buttons
This PR adds advanced feed monitoring capabilities to the YouTube RSS Feed Scanner.
Key changes:
- `api/monitoring_utils.py`, which handles:
  - … (`<updated>`, `<pubDate>`, etc.).
  - … `urllib.request`.
- `/api/monitor` in `api/app.py` to expose these features.
- `README.md` with documentation and examples for the new endpoint.

The implementation is written in Python to maintain consistency with the existing codebase and is compatible with Vercel deployment.
PR created automatically by Jules for task 5498580811796658517 started by @DisabledAbel
Summary by CodeRabbit
New Features
- `/api/monitor` endpoint for monitoring RSS/Atom feeds with health checks and scoring
Documentation