🐍 FreshersWorld Python Jobs Analytics & Scraper

Python Streamlit Pandas Status

A web scraper and analytics dashboard that extracts Python job listings from FreshersWorld. Built with Requests, BeautifulSoup, and Streamlit, it sends realistic request headers and waits a random interval between requests, and it can scrape up to 200 pages of listings per run.


🎥 Project Demo

WhatsApp.Video.2026-02-04.at.3.26.05.PM.1.mp4

🚀 Features

🖥️ Interactive Dashboard

  • Controls: Set strict page limits (1-200) via the sidebar.
  • Live Feedback: Real-time progress bar and streaming logs.
  • Instant Export: Download data in Excel, CSV, or JSON immediately.
  • Data Metrics: Quick view of total jobs and unique companies found.
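The metrics and export features above can be illustrated with a minimal sketch. It uses only the Python standard library rather than the project's own Pandas code, which is not shown here; the function names (`summarize`, `to_csv`, `to_json`), the `FIELDS` list, and the sample rows are all hypothetical, and Excel export is left out because it needs a third-party writer.

```python
import csv
import io
import json

# Column names taken from the "Extracted Data" section of this README.
FIELDS = ["Role", "Company", "Location", "Experience", "Salary", "Link"]


def summarize(jobs):
    """Return the two dashboard metrics: total jobs and unique companies."""
    # Normalize case and whitespace so "Acme Labs" and "acme labs " count once.
    companies = {job["Company"].strip().lower() for job in jobs if job.get("Company")}
    return {"total_jobs": len(jobs), "unique_companies": len(companies)}


def to_csv(jobs, fields=FIELDS):
    """Serialize the rows to CSV text, one column per field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(jobs)
    return buffer.getvalue()


def to_json(jobs):
    """Serialize the rows to pretty-printed JSON text."""
    return json.dumps(jobs, indent=2, ensure_ascii=False)


# Made-up rows for illustration only; these are not real scraped listings.
sample_jobs = [
    {"Role": "Python Developer", "Company": "Acme Labs", "Location": "Pune",
     "Experience": "0-1 years", "Salary": "Not disclosed", "Link": "https://example.com/1"},
    {"Role": "Data Analyst", "Company": "acme labs ", "Location": "Remote",
     "Experience": "0-2 years", "Salary": "Not disclosed", "Link": "https://example.com/2"},
    {"Role": "Backend Intern", "Company": "Globex", "Location": "Delhi",
     "Experience": "Fresher", "Salary": "Not disclosed", "Link": "https://example.com/3"},
]

if __name__ == "__main__":
    print(summarize(sample_jobs))
    print(to_csv(sample_jobs))
```

In a Streamlit app, strings like these could be handed to `st.download_button` and the counts to `st.metric`; whether `app.py` does it this way is an assumption.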

🛡️ Smart Scraping Core

  • Anti-Bot Evasion: Uses requests.Session() with realistic headers and random delays (1.5-3.5s).
  • Retry Logic: Automatically retries failed requests up to 3 times.
  • Resilience: Skips broken entries without crashing the entire process.
  • Logging: Detailed logging to both the UI and fresherworld_scraper.log.

📥 Extracted Data

| Field      | Description                        |
|------------|------------------------------------|
| Role       | Job Title (e.g., Python Developer) |
| Company    | Hiring Organization                |
| Location   | City / Remote status               |
| Experience | Years required                     |
| Salary     | Compensation range                 |
| Link       | Direct application URL             |
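One way to model a row with the fields above, while skipping broken entries instead of crashing, is sketched below. The `JobListing` class and `build_listings` helper are hypothetical names, and treating Role, Company, and Link as required while the other fields are optional is an assumption, not something the project states.

```python
import logging
from dataclasses import dataclass
from typing import Iterable, List, Optional

logger = logging.getLogger("fresherworld_scraper")


@dataclass
class JobListing:
    role: str
    company: str
    link: str
    location: Optional[str] = None
    experience: Optional[str] = None
    salary: Optional[str] = None


def build_listings(raw_entries: Iterable[dict]) -> List[JobListing]:
    """Convert raw dicts to JobListing objects, skipping malformed ones."""
    listings = []
    for index, raw in enumerate(raw_entries):
        try:
            listings.append(
                JobListing(
                    role=raw["Role"].strip(),
                    company=raw["Company"].strip(),
                    link=raw["Link"].strip(),
                    location=raw.get("Location"),
                    experience=raw.get("Experience"),
                    salary=raw.get("Salary"),
                )
            )
        except (KeyError, AttributeError) as exc:
            # Missing key or None value: log it and move on to the next entry.
            logger.warning("Skipping entry %d: %r", index, exc)
    return listings


if __name__ == "__main__":
    # Made-up input: one valid entry and two broken ones.
    raw = [
        {"Role": " Python Developer ", "Company": "Acme", "Link": "https://example.com/1"},
        {"Company": "Missing role"},
        {"Role": None, "Company": "Acme", "Link": "https://example.com/2"},
    ]
    for listing in build_listings(raw):
        print(listing)
```

Catching only `KeyError` and `AttributeError` keeps unexpected bugs visible while still tolerating the two most likely defects in scraped rows: a missing field and an empty one.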

🛠️ Installation & Usage

  1. Clone the Repository

    git clone https://github.com/AryanPrajapati9456/freshersworld-scraper.git
    cd freshersworld-scraper
  2. Install Dependencies

    pip install -r requirements.txt
  3. Run the Dashboard

    streamlit run app.py

    Open the URL shown in your terminal (usually http://localhost:8501).


📁 Project Structure

├── app.py                    # 🎨 Streamlit App (Entry Point)
├── scraping.py               # 🧠 Core Scraping Logic
├── fresherworld_scraper.log  # 📝 Runtime logs
├── requirements.txt          # 📦 Dependencies
└── README.md                 # 📄 Documentation

⚠️ Ethical Note

This tool is for educational and portfolio purposes. Please respect FreshersWorld's robots.txt and Terms of Service. Do not scrape aggressively.


👨‍💻 Author

Aryan Prajapati
Python Developer • Web Scraper • Automation Engineer
