LinkedIn Profile Scraper

A Python script to scrape LinkedIn profiles based on search queries. It supports two modes for data extraction: a reliable "summary" mode that scrapes from the main profile page and a "detailed" mode that navigates to specific sub-pages for more comprehensive data.

Setup

Choose either the uv (recommended) or pip setup method.

1. Using `uv` (Recommended)

Prerequisites: uv must be installed.

Create and activate a virtual environment:

```
uv venv
source .venv/bin/activate  # On Windows, use: .venv\Scripts\activate
```

Install dependencies:
```
uv pip install -r requirements.txt
```

2. Using `pip`

Create and activate a virtual environment:

```
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate
```

Install dependencies:
```
pip install -r requirements.txt
```

Usage

The scraper requires you to be authenticated with a LinkedIn account.

1. Authenticate and Save Cookies

Run the login command to authenticate. This will open a browser window where you can enter your LinkedIn credentials. Upon successful login, your session cookies will be saved to cookies.json, allowing the scraper to run without needing to log in again.

```
python main.py login
```
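
To confirm that the login step left a usable cookie file, something like the following could be used. This is a minimal sketch, not part of the project: `load_cookies` is a hypothetical helper, and it assumes cookies.json contains a JSON list of cookie objects with `name` and `value` keys, which is a common export format but may not match what auth.py actually writes.

```python
import json
from pathlib import Path


def load_cookies(path="cookies.json"):
    """Return the saved cookies as a list of dicts, or [] if the file is unusable.

    Assumption: the file is a JSON list of objects with "name" and "value" keys.
    """
    cookie_file = Path(path)
    if not cookie_file.exists():
        return []
    try:
        data = json.loads(cookie_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    # Keep only entries that look like cookies.
    return [c for c in data if isinstance(c, dict) and "name" in c and "value" in c]


if __name__ == "__main__":
    cookies = load_cookies()
    if cookies:
        print(f"Found {len(cookies)} saved cookies; login can be skipped.")
    else:
        print("No usable cookies found; run: python main.py login")
```

If the check reports no usable cookies, running the login command again should regenerate the file.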

2. Configure the Scraper

Edit the config.py file to customize the scraper's behavior:

- `SEARCH_KEYWORDS`: The job title or keyword to search for (e.g., "Data Scientist").
- `LOCATION`: The geographical location to search within (e.g., "United States").
- `PROFILES_TO_SCRAPE`: The total number of profiles to collect.
- `SCRAPE_MODE`:
  - "SUMMARY" (Default): Faster and more reliable. Scrapes data visible on the main profile page.
  - "DETAILED": Slower but more comprehensive. Navigates to the details/experience, details/education, and details/skills sub-pages.
- `HEADLESS`:
  - True (Default): Runs the browser in the background without a visible UI.
  - False: Opens a visible browser window, which can be useful for debugging.
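
As an illustration of the settings listed above, a config.py might look roughly like this. The setting names and the two defaults come from this README; the value of `PROFILES_TO_SCRAPE` is an arbitrary placeholder, and `validate_settings` is a hypothetical helper added here only to show which values are meaningful. The real file may be laid out differently.

```python
# config.py (illustrative values; adjust to your own search)

SEARCH_KEYWORDS = "Data Scientist"  # job title or keyword to search for
LOCATION = "United States"          # geographical area to search within
PROFILES_TO_SCRAPE = 25             # placeholder value, not a project default
SCRAPE_MODE = "SUMMARY"             # "SUMMARY" (default) or "DETAILED"
HEADLESS = True                     # False opens a visible browser window


def validate_settings(scrape_mode, profiles_to_scrape, headless):
    """Hypothetical sanity check: return a list of problems (empty if none)."""
    problems = []
    if scrape_mode not in ("SUMMARY", "DETAILED"):
        problems.append('SCRAPE_MODE must be "SUMMARY" or "DETAILED"')
    # bool is a subclass of int, so exclude it explicitly.
    if (
        not isinstance(profiles_to_scrape, int)
        or isinstance(profiles_to_scrape, bool)
        or profiles_to_scrape < 1
    ):
        problems.append("PROFILES_TO_SCRAPE must be a positive integer")
    if not isinstance(headless, bool):
        problems.append("HEADLESS must be True or False")
    return problems


if __name__ == "__main__":
    for problem in validate_settings(SCRAPE_MODE, PROFILES_TO_SCRAPE, HEADLESS):
        print("Config problem:", problem)
```

Setting `HEADLESS = False` together with a small `PROFILES_TO_SCRAPE` is a reasonable starting point when debugging a run.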

3. Run the Scraper

Once authenticated and configured, run the scrape command:

```
python main.py scrape
```

As the scraper runs, it prints the extracted data to the terminal in real time. The final results are saved to linkedin_profiles.csv.
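
Once a run has finished, the output file can be inspected with Python's standard `csv` module. The sketch below assumes linkedin_profiles.csv starts with a header row; the column names are not documented here, so the code reads whatever headers it finds. `read_profiles` is a hypothetical helper, not part of the project.

```python
import csv
from pathlib import Path


def read_profiles(path="linkedin_profiles.csv"):
    """Return each CSV row as a dict keyed by the file's own header row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    if Path("linkedin_profiles.csv").exists():
        profiles = read_profiles()
        print(f"Loaded {len(profiles)} profiles")
        if profiles:
            # Show which columns the scraper actually wrote.
            print("Columns:", ", ".join(profiles[0].keys()))
    else:
        print("linkedin_profiles.csv not found; run: python main.py scrape")
```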

A Note on Delays

The script includes intentional delays (time.sleep()) between requests and actions to mimic human behavior and reduce the risk of being blocked by LinkedIn. If you are on a very fast and reliable network, you may be able to slightly reduce these delays in scraper.py and utils.py, but this is not recommended.
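
For illustration, a delay of this kind is often written with a randomized pause rather than a fixed one. The sketch below is an assumption about the general technique, not a copy of the code in scraper.py or utils.py: `human_delay` and its default bounds are hypothetical, and the project may use fixed `time.sleep()` values instead.

```python
import random
import time


def human_delay(min_seconds=2.0, max_seconds=5.0, sleep=time.sleep):
    """Pause for a random duration in [min_seconds, max_seconds].

    Returns the chosen duration. The `sleep` parameter exists so the
    function can be exercised without actually waiting.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("require 0 <= min_seconds <= max_seconds")
    pause = random.uniform(min_seconds, max_seconds)
    sleep(pause)
    return pause


if __name__ == "__main__":
    waited = human_delay(0.5, 1.0)
    print(f"Waited {waited:.2f} seconds")
```

Lowering the bounds shortens a run but makes the traffic look less like a person browsing, which is why reducing the delays is discouraged.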
