Summary
In extract_interests, the JS scraper runs before the correct tab is activated, so each interest record is scraped from the tab activated in the previous iteration but labeled with the current iteration's category.
Location
- File: src/scrapers/person.rs
- Line(s): 434–498
Severity
Medium
Details
The loop iterates ["companies", "groups", "schools", ...]. In each iteration it (1) runs the JS scraper on the currently active tab, then (2) clicks the tab for that category, which only takes effect for the next iteration's scrape. Result: items from the "companies" tab are labeled "groups", items from the "groups" tab are labeled "schools", and so on. The first iteration scrapes whichever tab happens to be active at that point, so the source of its records is undefined.
Suggested Fix
Reverse the order — click the tab first, sleep, then scrape:
for category in categories {
    // 1. click tab for this category
    // 2. sleep for page to load
    // 3. run JS scraper and label results with this category
}
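The corrected ordering can be sketched in Rust as follows. This is a minimal illustration, not the code in src/scrapers/person.rs: the `Page` trait, the `Interest` struct, `FakePage`, `fake_page`, and the sample tab contents are all hypothetical names introduced here so the loop can be exercised without a browser. `wait_for_load` stands in for whatever sleep or wait the real scraper uses.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the browser page the scraper drives.
trait Page {
    fn click_tab(&mut self, category: &str);
    fn wait_for_load(&mut self);
    /// Returns the item names visible on the currently active tab.
    fn scrape_active_tab(&self) -> Vec<String>;
}

/// Hypothetical record type: one scraped item plus its category label.
#[derive(Debug, PartialEq)]
struct Interest {
    name: String,
    category: String,
}

/// Click first, wait, then scrape, so the label always matches the active tab.
fn extract_interests<P: Page>(page: &mut P, categories: &[&str]) -> Vec<Interest> {
    let mut interests = Vec::new();
    for &category in categories {
        page.click_tab(category); // 1. activate the tab for this category
        page.wait_for_load(); // 2. give the tab content time to load
        for name in page.scrape_active_tab() {
            // 3. label results with the category that was just activated
            interests.push(Interest {
                name,
                category: category.to_string(),
            });
        }
    }
    interests
}

/// In-memory fake used only to demonstrate the ordering.
struct FakePage {
    tabs: HashMap<String, Vec<String>>,
    active: String,
}

impl Page for FakePage {
    fn click_tab(&mut self, category: &str) {
        self.active = category.to_string();
    }
    fn wait_for_load(&mut self) {} // nothing to wait for in the fake
    fn scrape_active_tab(&self) -> Vec<String> {
        self.tabs.get(&self.active).cloned().unwrap_or_default()
    }
}

/// Builds a fake page whose initially active tab is deliberately not the first category.
fn fake_page() -> FakePage {
    let mut tabs = HashMap::new();
    tabs.insert("companies".to_string(), vec!["Acme".to_string()]);
    tabs.insert("groups".to_string(), vec!["Rustaceans".to_string()]);
    tabs.insert("schools".to_string(), vec!["Example University".to_string()]);
    FakePage {
        tabs,
        active: "schools".to_string(),
    }
}
```

Because the click happens before the scrape, the result no longer depends on which tab was active when the function was entered. Calling `extract_interests(&mut fake_page(), &["companies", "groups", "schools"])` labels each sample item with the tab it came from.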
Automated finding by repo-monitor