Conversation
📝 WalkthroughWalkthroughA single council scraper module was refactored to replace Selenium WebDriver automation with direct HTTP requests, removing browser interactions and cookie banner handling. Input validation for UPRN and postcode was added, and the parse_data method signature was updated to accept kwargs for flexible parameter passing. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@uk_bin_collection/uk_bin_collection/councils/BroadlandDistrictCouncil.py`:
- Around line 44-48: The requests.get call in BroadlandDistrictCouncil (where
you call requests.get(...)) is using a hardcoded
"https://area.southnorfolkandbroadland.gov.uk/" instead of the configured URL
stored in the local variable url (obtained from kwargs via url =
kwargs.get("url")); update that requests.get invocation to use the url variable
(keeping headers and timeout as-is) so the configured URL (e.g.,
".../FindAddress") from input.json is honored.
🧹 Nitpick comments (6)
uk_bin_collection/uk_bin_collection/councils/BroadlandDistrictCouncil.py (6)
104-104: Remove duplicate import.
parsefromdateutil.parseris already imported at line 8. This redundant import inside the function should be removed.Proposed fix
- from dateutil.parser import parse - - parsed_date = parse(cleaned_date_text, fuzzy=True) + parsed_date = parse(cleaned_date_text, fuzzy=True)
17-17: Fix implicit Optional type hint.Per PEP 484,
page: str = Noneshould explicitly declare the Optional type. The unusedpageparameter is understood since this council handles its own request.Proposed fix
- def parse_data(self, page: str = None, **kwargs) -> dict: + def parse_data(self, page: str | None = None, **kwargs) -> dict:
23-23: Remove unused variable.The
urlvariable is assigned but never used. This was likely left over from the Selenium refactor.Proposed fix
uprn = kwargs.get("uprn") postcode = kwargs.get("postcode") - url = kwargs.get("url")
57-57: Remove unused variable.
current_dateis assigned but never used in the parsing logic.Proposed fix
- # Initialize current date - current_date = datetime.now() - - # Process collection details + # Process collection details
60-60: Consider removing or replacing debug print statements.Multiple
print()calls remain in the code for debugging. For production code, consider removing them or replacing with proper logging (e.g.,_LOGGER.debug()) as used elsewhere in the codebase.Also applies to: 78-78, 96-96, 116-116
117-119: Consider using logging instead of print for error reporting.The error message uses
print()which is inconsistent with production practices. Consider using a logger for consistent error reporting.Proposed fix
except Exception as e: - print(f"An error occurred: {e}") + import logging + logging.getLogger(__name__).error(f"An error occurred: {e}") raiseOr import
_LOGGERif available in the common module.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
uk_bin_collection/uk_bin_collection/councils/BroadlandDistrictCouncil.py
🧰 Additional context used
🧬 Code graph analysis (1)
uk_bin_collection/uk_bin_collection/councils/BroadlandDistrictCouncil.py (2)
uk_bin_collection/uk_bin_collection/get_bin_data.py (1)
AbstractGetBinDataClass(43-146)uk_bin_collection/uk_bin_collection/common.py (2)
check_uprn(67-78)check_postcode(36-49)
🪛 Ruff (0.14.11)
uk_bin_collection/uk_bin_collection/councils/BroadlandDistrictCouncil.py
11-11: from uk_bin_collection.uk_bin_collection.common import * used; unable to detect undefined names
(F403)
17-17: Unused method argument: page
(ARG002)
17-17: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
23-23: Local variable url is assigned to but never used
Remove assignment to unused variable url
(F841)
25-25: check_uprn may be undefined, or defined from star imports
(F405)
26-26: check_postcode may be undefined, or defined from star imports
(F405)
🔇 Additional comments (1)
uk_bin_collection/uk_bin_collection/councils/BroadlandDistrictCouncil.py (1)
25-27: Input validation and UPRN normalization look good.The
check_uprnfunction validates that the UPRN exists and is not empty, whilecheck_postcodevalidates UK postcode formatting against the postcodes.io API. The zero-padding to 12 digits is correct for UK UPRNs.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| r = requests.get( | ||
| "https://area.southnorfolkandbroadland.gov.uk/", | ||
| headers=headers, | ||
| timeout=30 | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, find configuration files that might contain URL definitions
find . -name "*.json" -o -name "*.yaml" -o -name "*.yml" | head -20Repository: robbrad/UKBinCollectionData
Length of output: 851
🏁 Script executed:
# Search for BroadlandDistrictCouncil in configuration files
rg "BroadlandDistrictCouncil" -t json -A 5 2>/dev/null || echo "No JSON matches"
rg "BroadlandDistrictCouncil" -A 5 --type-list | grep -E "\.(json|yaml|yml)" && rg "BroadlandDistrictCouncil" -A 5 || echo "Search complete"Repository: robbrad/UKBinCollectionData
Length of output: 949
🏁 Script executed:
# Check the actual BroadlandDistrictCouncil implementation to see how it uses kwargs
cat -n uk_bin_collection/uk_bin_collection/councils/BroadlandDistrictCouncil.py | head -60Repository: robbrad/UKBinCollectionData
Length of output: 2215
🏁 Script executed:
# Look at how other council classes handle URLs - check if they use kwargs.get("url")
rg "kwargs\.get\(.*url" -A 2 -B 2Repository: robbrad/UKBinCollectionData
Length of output: 17682
URL from kwargs is retrieved but never used; requests.get() uses a hardcoded URL instead.
At line 23, url = kwargs.get("url") retrieves the configured URL (https://area.southnorfolkandbroadland.gov.uk/FindAddress from input.json), but at line 45, the code uses a different hardcoded URL (https://area.southnorfolkandbroadland.gov.uk/). This means configuration changes to the URL will be ignored. Use the url variable from kwargs in the request instead of the hardcoded value.
🤖 Prompt for AI Agents
In `@uk_bin_collection/uk_bin_collection/councils/BroadlandDistrictCouncil.py`
around lines 44 - 48, The requests.get call in BroadlandDistrictCouncil (where
you call requests.get(...)) is using a hardcoded
"https://area.southnorfolkandbroadland.gov.uk/" instead of the configured URL
stored in the local variable url (obtained from kwargs via url =
kwargs.get("url")); update that requests.get invocation to use the url variable
(keeping headers and timeout as-is) so the configured URL (e.g.,
".../FindAddress") from input.json is honored.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1814 +/- ##
=======================================
Coverage 86.67% 86.67%
=======================================
Files 9 9
Lines 1141 1141
=======================================
Hits 989 989
Misses 152 152 ☔ View full report in Codecov by Sentry. |
Summary by CodeRabbit
Performance
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.