Liohtml/RUSTlinkedin_scraper

linkedin-scraper

A fast, async LinkedIn scraper built in Rust

Features · Installation · Usage · API Reference · Configuration


Features

  • Person Profiles — Name, location, about, experiences, education, interests, accomplishments, contact info
  • Company Pages — Company details, about, overview, employee listings
  • Job Listings — Job title, company, location, description, applicant count
  • Job Search — Keyword + location based search returning job URLs
  • Company Posts — Feed posts with reactions, comments, images, videos
  • Contact Info Overlay — Extracts LinkedIn URL, websites, email, phone, birthday
  • Multiple Auth Methods — Credentials, li_at cookie, or manual browser login
  • Rate Limit Detection — Automatically detects checkpoints and CAPTCHAs
  • Stealth Mode — Anti-detection browser flags out of the box
  • Fully Async — Built on Tokio + chromiumoxide for non-blocking I/O

Installation

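Add to your Cargo.toml. The crate is referenced by path, so point path at your local clone of this repository (path = "." only resolves from inside the repository itself):

[dependencies]
linkedin-scraper = { path = "." }  # replace "." with the path to your clone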
tokio = { version = "1", features = ["full"] }

Prerequisites

  • Rust 1.70+
  • Chromium or Google Chrome installed on the system

Usage

Quick Start

use linkedin_scraper::{AuthMethod, BrowserManager, Credentials, PersonScraper};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let credentials = Credentials::from_env()?;
    let browser = BrowserManager::new(false).await?;

    // Authenticate with credentials; on any failure (for example a
    // security checkpoint), fall back to manual login
    let auth = browser.authenticate(AuthMethod::Credentials(credentials)).await;
    if auth.is_err() {
        browser.authenticate(AuthMethod::Manual).await?;
    }

    let scraper = PersonScraper::new(browser.page());
    let person = scraper.scrape("https://www.linkedin.com/in/someone/").await?;

    println!("{}", person.to_json()?);
    browser.close().await?;
    Ok(())
}

Authentication Methods

// 1. Email + Password (from .env or environment)
let creds = Credentials::from_env()?;
browser.authenticate(AuthMethod::Credentials(creds)).await?;

// 2. li_at session cookie
browser.authenticate(AuthMethod::Cookie("AQEDAx...".to_string())).await?;

// 3. Manual login (opens browser, waits up to 5 min)
browser.authenticate(AuthMethod::Manual).await?;

Scrape a Company

use linkedin_scraper::{BrowserManager, CompanyScraper};

// Assumes `browser` is an authenticated BrowserManager (see Quick Start)
let scraper = CompanyScraper::new(browser.page());
let company = scraper.scrape("https://www.linkedin.com/company/google/").await?;

println!("Name: {:?}", company.name);
println!("Industry: {:?}", company.industry);
println!("Size: {:?}", company.company_size);
println!("Website: {:?}", company.website);

Search & Scrape Jobs

use linkedin_scraper::{JobSearchScraper, JobScraper};

let search = JobSearchScraper::new(browser.page());
let urls = search.search("Rust Developer", "Berlin", 10).await?;

let scraper = JobScraper::new(browser.page());
for url in &urls {
    let job = scraper.scrape(url).await?;
    println!("{:?} at {:?}", job.job_title, job.company);
}
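
To illustrate how a keyword + location search can map onto a URL, here is a minimal sketch. It is not the crate's implementation: `encode_query_value` and `build_job_search_url` are hypothetical helpers, and the `/jobs/search/` endpoint with `keywords` and `location` parameters is an assumption about LinkedIn's public search URL, not something this README confirms.

```rust
/// Percent-encode a query value, leaving only RFC 3986 unreserved
/// characters (letters, digits, '-', '_', '.', '~') untouched.
fn encode_query_value(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            // Everything else, including each byte of a multi-byte
            // UTF-8 character, becomes %XX.
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Hypothetical helper: the endpoint and parameter names are assumptions.
fn build_job_search_url(keywords: &str, location: &str) -> String {
    format!(
        "https://www.linkedin.com/jobs/search/?keywords={}&location={}",
        encode_query_value(keywords),
        encode_query_value(location)
    )
}
```

Calling `build_job_search_url("Rust Developer", "Berlin")` encodes the space as `%20`. Non-ASCII locations such as "München" are encoded byte by byte from their UTF-8 form.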

Scrape Company Posts

use linkedin_scraper::CompanyPostsScraper;

let scraper = CompanyPostsScraper::new(browser.page());
let posts = scraper.scrape("https://www.linkedin.com/company/google/", 10).await?;

for post in &posts {
    println!("{:?} - {} reactions", post.text, post.reactions_count.unwrap_or(0));
}

API Reference

Data Models

| Model | Fields |
|---|---|
| Person | name, location, about, open_to_work, experiences, educations, interests, accomplishments, contacts |
| Experience | position_title, institution_name, from_date, to_date, duration, location, description |
| Education | institution_name, degree, from_date, to_date, description |
| Contact | contact_type, value, label |
| Interest | name, category, linkedin_url |
| Accomplishment | category, title, issuer, issued_date, credential_id, description |
| Company | name, about_us, website, headquarters, industry, company_size, headcount, employees |
| Job | job_title, company, location, posted_date, applicant_count, job_description |
| Post | urn, text, posted_date, reactions_count, comments_count, image_urls, video_url |
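
The usage examples call `unwrap_or(0)` on `reactions_count` and print fields with `{:?}`, which suggests most fields are optional. The sketch below shows how a model like Post might be shaped under that assumption. The field names come from the Post row above, but the concrete types and the `summarize` helper are guesses for illustration, not the crate's actual definitions.

```rust
/// Illustrative stand-in for the Post model; field types are assumptions.
#[derive(Debug, Clone, Default)]
struct Post {
    urn: Option<String>,
    text: Option<String>,
    posted_date: Option<String>,
    reactions_count: Option<u32>,
    comments_count: Option<u32>,
    image_urls: Vec<String>,
    video_url: Option<String>,
}

/// Hypothetical helper: render a one-line summary, substituting
/// defaults for anything the scraper could not extract.
fn summarize(post: &Post) -> String {
    let text = post.text.as_deref().unwrap_or("<no text>");
    format!(
        "{} ({} reactions, {} comments, {} images)",
        text,
        post.reactions_count.unwrap_or(0),
        post.comments_count.unwrap_or(0),
        post.image_urls.len()
    )
}
```

Modelling missing values as `Option` lets callers tell "not present on the page" apart from "zero" when that distinction matters.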

Error Types

pub enum ScraperError {
    Authentication(String),
    RateLimit { suggested_wait_seconds: u64 },
    ElementNotFound { selector: String },
    ProfileNotFound { url: String },
    Network(String),
    Scraping(String),
    Browser(String),
    InvalidUrl(String),
    Timeout(String),
}
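
A caller will usually want to decide which of these errors are worth retrying. The following sketch mirrors the enum locally so that it compiles on its own; in real code you would import `ScraperError` from the crate instead. The `retry_delay` function and the 5-second fallback for network and timeout errors are hypothetical choices, not behaviour of the library.

```rust
use std::time::Duration;

/// Local mirror of the enum above, so the sketch is self-contained.
#[derive(Debug)]
enum ScraperError {
    Authentication(String),
    RateLimit { suggested_wait_seconds: u64 },
    ElementNotFound { selector: String },
    ProfileNotFound { url: String },
    Network(String),
    Scraping(String),
    Browser(String),
    InvalidUrl(String),
    Timeout(String),
}

/// Hypothetical policy: Some(delay) if the operation is worth retrying,
/// None if retrying cannot help.
fn retry_delay(error: &ScraperError) -> Option<Duration> {
    match error {
        // Honour the wait time carried by the error itself.
        ScraperError::RateLimit { suggested_wait_seconds } => {
            Some(Duration::from_secs(*suggested_wait_seconds))
        }
        // Transient failures: short fixed pause (assumed value).
        ScraperError::Network(_) | ScraperError::Timeout(_) => Some(Duration::from_secs(5)),
        // Everything else needs a different input or a fresh login.
        ScraperError::Authentication(_)
        | ScraperError::ElementNotFound { .. }
        | ScraperError::ProfileNotFound { .. }
        | ScraperError::Scraping(_)
        | ScraperError::Browser(_)
        | ScraperError::InvalidUrl(_) => None,
    }
}
```

Listing every variant explicitly, rather than using a `_` catch-all, means the compiler flags this function if a new variant is added later.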

Configuration

Environment Variables

Create a .env file in the project root:

LINKEDIN_EMAIL=your_email@example.com
LINKEDIN_PASSWORD=your_password
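
For readers curious how a file in this format can be read, here is a rough sketch of a KEY=VALUE parser using only the standard library. `parse_dotenv` is a hypothetical function. How `Credentials::from_env` loads its values is not documented here, and it may well use a dedicated dotenv crate instead.

```rust
use std::collections::HashMap;

/// Hypothetical helper: parse KEY=VALUE lines, skipping blank lines and
/// '#' comments. Quoting and escapes are deliberately not handled.
fn parse_dotenv(contents: &str) -> HashMap<String, String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        // Split on the first '=' only, so values may contain '='.
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}
```

Splitting on the first `=` matters for passwords, which can legitimately contain that character.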

Browser Options

// Headless mode (no visible browser window)
let browser = BrowserManager::new(true).await?;

// Headed mode (visible, useful for debugging)
let browser = BrowserManager::new(false).await?;

// Custom viewport and user agent
let browser = BrowserManager::with_config(
    true,                          // headless
    Some((1920, 1080)),           // viewport
    Some("Custom User Agent"),     // user agent
).await?;

Architecture

src/
├── lib.rs                 # Public API exports
├── core/
│   ├── auth.rs           # Authentication (credentials, cookie, manual)
│   ├── browser.rs        # Chromium lifecycle & session management
│   ├── exceptions.rs     # Error types (thiserror)
│   └── utils.rs          # Scrolling, rate-limit detection, DOM helpers
├── models/
│   ├── person.rs         # Person, Experience, Education, Contact, etc.
│   ├── company.rs        # Company, CompanySummary, Employee
│   ├── job.rs            # Job
│   └── post.rs           # Post
└── scrapers/
    ├── base.rs           # BaseScraper (shared logic)
    ├── person.rs         # PersonScraper
    ├── company.rs        # CompanyScraper
    ├── job.rs            # JobScraper
    ├── job_search.rs     # JobSearchScraper
    └── company_posts.rs  # CompanyPostsScraper

Important Notes

  • LinkedIn restricts data for non-connections on free accounts. Experience, education, and contact info may only be available for your connections or with a Premium account.
  • Rate limiting — LinkedIn will temporarily block scraping after too many requests. The scraper detects this automatically and returns a RateLimit error with a suggested wait time.
  • Security checkpoints — New logins from unfamiliar browsers often trigger CAPTCHAs or email verification. Use AuthMethod::Manual as a fallback.
  • Locale-aware — The scraper handles both English and German LinkedIn UIs.
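
As an illustration of what handling two UI languages can involve, the sketch below maps section headings from either locale onto one canonical value. It is not taken from the crate: the `Section` enum, the `classify_heading` function, and the specific German labels are all assumptions made for this example.

```rust
/// Hypothetical canonical profile sections.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Section {
    About,
    Experience,
    Education,
}

/// Hypothetical helper: map a heading in English or German to a Section.
/// The label strings are assumed, not verified against the live site.
fn classify_heading(heading: &str) -> Option<Section> {
    match heading.trim().to_lowercase().as_str() {
        "about" | "info" => Some(Section::About),
        "experience" | "berufserfahrung" => Some(Section::Experience),
        "education" | "ausbildung" => Some(Section::Education),
        _ => None,
    }
}
```

Normalising headings once, near the DOM layer, keeps the rest of the scraping logic independent of the account's display language.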

License

Licensed under the Apache License 2.0.

