fix: Council Fix Pack - March 2026 #1841
@@ -30,7 +30,8 @@ def parse_data(self, page: str, **kwargs) -> dict:
         data = {"bins": []}

         # Get our initial session running
-        driver = create_webdriver(web_driver, headless, None, __name__)
+        user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
+        driver = create_webdriver(web_driver, headless, user_agent, __name__)
Comment on lines +33 to +34
Contributor

🧩 Analysis chain

🏁 Script executed:
#!/bin/bash
# Check all hardcoded user_agent strings in council scrapers
rg -n 'user_agent\s*=' --type=py uk_bin_collection/uk_bin_collection/councils/ -A 1 | head -50

Repository: robbrad/UKBinCollectionData
Length of output: 5269

🏁 Script executed:
#!/bin/bash
# Check what Chrome versions are used in the codebase
rg 'Chrome/[0-9]+' --type=py uk_bin_collection/uk_bin_collection/councils/ | sort | uniq -c

Repository: robbrad/UKBinCollectionData
Length of output: 13582

🏁 Script executed:
#!/bin/bash
# Check if there are any shared constants or utilities for user_agent
rg -n 'user_agent|USER_AGENT' --type=py uk_bin_collection/uk_bin_collection/ | grep -E '(const|CONSTANT|=)' | head -20

Repository: robbrad/UKBinCollectionData
Length of output: 3004

Hardcoded user-agent references Chrome 120, which is outdated (Chrome 143+ is in active use elsewhere in the codebase).

Chrome 120 was released in December 2023. Multiple other council scrapers in the codebase have since moved to Chrome 121–143. Consider updating to a more current version (e.g., Chrome 141–143) to stay aligned with contemporary releases, or extract this into a shared constant similar to the pattern in …

Example UA bump:
- user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
+ user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36"

🧰 Tools
🪛 Ruff (0.15.0)
[error] 34-34: (F405)
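One way the shared-constant option might look is sketched below. This is only an illustration: the names `CHROME_MAJOR_VERSION`, `build_user_agent`, and `DEFAULT_USER_AGENT` are hypothetical and are not taken from the repository, and the version number 143 is simply the one used in the example bump.

```python
# Hypothetical shared module; none of these names come from the repository.
CHROME_MAJOR_VERSION = 143  # assumed value; bump here instead of in each scraper


def build_user_agent(chrome_major: int = CHROME_MAJOR_VERSION) -> str:
    """Return a desktop Chrome user-agent string for the given major version."""
    return (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        f"Chrome/{chrome_major}.0.0.0 Safari/537.36"
    )


# Single value that individual scrapers could import.
DEFAULT_USER_AGENT = build_user_agent()
```

A scraper would then pass `DEFAULT_USER_AGENT` as the third argument of the `create_webdriver(web_driver, headless, user_agent, __name__)` call shown in the diff, so a future version bump touches one line rather than every council file.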

         driver.get(kwargs.get("url"))

         wait = WebDriverWait(driver, 30)

Add skip_get_url: true to avoid a prefetch that can fail before parsing.

The parser performs its own request; without skip_get_url, a failed prefetch to the base URL can block the flow or add unnecessary latency. Consider adding skip_get_url: true so parse_data runs directly (the auto-generated wiki entry will then include -s).

🛠️ Suggested change
 "LondonBoroughHammersmithandFulham": {
     "postcode": "W12 0BQ",
+    "skip_get_url": true,
     "url": "https://www.lbhf.gov.uk/",
     "wiki_command_url_override": "https://www.lbhf.gov.uk/",
     "wiki_name": "Hammersmith & Fulham",
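The effect of the flag can be pictured with a minimal sketch. It assumes a caller that consults `skip_get_url` before prefetching; the function `prefetch_page` and its `fetch` parameter are hypothetical and do not reflect the project's actual code path.

```python
from typing import Callable, Optional


def prefetch_page(entry: dict, fetch: Callable[[str], str]) -> Optional[str]:
    """Fetch entry["url"] unless the entry opts out via skip_get_url.

    Hypothetical helper: `entry` stands in for one council's JSON config
    and `fetch` for whatever HTTP getter the caller uses.
    """
    if entry.get("skip_get_url", False):
        # The parser makes its own request, so no prefetch is attempted
        # and a failing base URL cannot interrupt the run.
        return None
    return fetch(entry["url"])


def failing_fetch(url: str) -> str:
    """Simulates a base URL that cannot be fetched."""
    raise ConnectionError(f"could not reach {url}")


entry = {"postcode": "W12 0BQ", "skip_get_url": True, "url": "https://www.lbhf.gov.uk/"}
page = prefetch_page(entry, failing_fetch)  # fetch is never called, so no error is raised
```

With the flag absent or false, the same call would propagate the `ConnectionError` before `parse_data` had a chance to run, which is the failure mode the comment describes.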