[repo-monitor] High: URL validation is substring-only — SSRF to arbitrary hosts

## Summary
Each scraper validates LinkedIn URLs with only a `.contains("/in/")` substring check. The hostname is never validated, so a crafted URL can direct the scraper to an arbitrary host (SSRF).

## Location
- **File**: `src/scrapers/person.rs` (and `company.rs`, `job.rs`, `company_posts.rs`)
- **Line(s)**: 21

## Severity
**High**

## Details
A URL like `"https://evil.com/in/victim"` passes validation because it contains `/in/`, and the browser navigates to `evil.com`. The scraper then constructs sub-page URLs via `format!("{}/details/experience/", profile_url...)`, driving further authenticated requests to the same external host. This could expose session cookies and scraped data to that host.
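The bypass can be reproduced in isolation. The following is a minimal sketch, not the project's code: `passes_substring_check` and `experience_url` are hypothetical stand-ins for the check and the sub-page construction described above, and the elided part of the `format!` call is left out rather than guessed.

```rust
// Hypothetical stand-in for the substring-only validation described in this issue.
fn passes_substring_check(url: &str) -> bool {
    url.contains("/in/")
}

// Hypothetical stand-in for the sub-page URL construction.
// The real call has an elided suffix on `profile_url`; it is omitted here.
fn experience_url(profile_url: &str) -> String {
    format!("{}/details/experience/", profile_url)
}

fn main() {
    let attacker = "https://evil.com/in/victim";

    // The path contains "/in/", so the check accepts a non-LinkedIn host.
    assert!(passes_substring_check(attacker));

    // Every derived sub-page URL inherits the attacker-controlled host.
    let sub_page = experience_url(attacker);
    assert!(sub_page.starts_with("https://evil.com/"));
    println!("{sub_page}"); // prints "https://evil.com/in/victim/details/experience/"
}
```

Because the check only inspects the path, no variation of the substring test can close the gap; the host has to be compared explicitly.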

## Suggested Fix
Parse the URL and validate the hostname:
```rust
use url::Url;

// Parse first so the check runs against the real host, not a substring of the path.
let parsed = Url::parse(linkedin_url).map_err(|_| ScraperError::InvalidUrl(...))?;
// Accept only LinkedIn's own hostnames; any other host is rejected.
if !matches!(parsed.host_str(), Some("www.linkedin.com" | "linkedin.com")) {
    return Err(ScraperError::InvalidUrl("URL must be on linkedin.com".to_string()));
}
```


---
*Automated finding by repo-monitor*
