JMartinezRuiz/EmailCrawler

Email Crawler

Email Crawler is a Python desktop and CLI tool for finding public email addresses on a list of websites. It drives Chrome through Selenium and undetected-chromedriver so that it can render JavaScript-heavy pages, and it keeps the original Google and Facebook fallback strategies.

Features

  • Tkinter GUI for pasting many websites and reviewing results.
  • CLI for quick terminal runs.
  • Uses undetected-chromedriver.
  • Checks the main page, contact paths, clicked contact/support links, about pages, mailto: links, Facebook pages, and Google search results.
  • Optional headless Chrome mode.
  • Configurable wait time and Chrome major version.
  • Exclusion patterns for noisy addresses such as Sentry or example domains.
  • Best-match mode or show-all mode.
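
To make the exclusion-pattern and best-match features concrete, here is a minimal sketch of how addresses might be pulled from page markup, filtered, and ranked. It is not the code in email_crawler.py: the function names, the example exclusion patterns, and the scoring rules are all assumptions chosen for illustration.

```python
import re
from fnmatch import fnmatchcase
from html import unescape

# Simple address pattern; it also picks up addresses inside mailto: hrefs
# because it scans the raw markup.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical exclusion patterns (the project's real defaults may differ).
EXCLUDE_PATTERNS = ["*@sentry.io", "*@*.sentry.io", "*@example.com"]


def extract_emails(html):
    """Return unique, lowercased addresses in order of first appearance."""
    text = unescape(html)  # decode entities such as &#64; before matching
    found = []
    for address in EMAIL_RE.findall(text):
        address = address.lower()
        if address not in found:
            found.append(address)
    return found


def is_excluded(address, patterns=EXCLUDE_PATTERNS):
    """True if the address matches any shell-style exclusion pattern."""
    return any(fnmatchcase(address, pattern) for pattern in patterns)


def score(address, site_domain):
    """Assumed ranking: prefer the site's own domain, then generic inboxes."""
    local, _, domain = address.partition("@")
    points = 0
    if domain == site_domain or domain.endswith("." + site_domain):
        points += 2
    if local in ("info", "contact", "hello", "support"):
        points += 1
    return points


def best_match(html, site_domain):
    """Return the highest-scoring non-excluded address, or None."""
    candidates = [a for a in extract_emails(html) if not is_excluded(a)]
    if not candidates:
        return None
    return max(candidates, key=lambda a: score(a, site_domain))
```

In this sketch, show-all mode would correspond to returning the filtered candidate list instead of only the top-scoring entry.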

Requirements

  • Python 3.10+
  • Google Chrome installed

Setup

python -m venv .venv

Windows PowerShell:

.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

macOS/Linux:

source .venv/bin/activate
pip install -r requirements.txt

Run The GUI

python emailcrawlerGUI.py

On Windows:

.\LaunchEmailCrawler.bat

Run From The CLI

Best match per website:

python email_crawler.py example.com --headless

All matches:

python email_crawler.py example.com another-site.com --all --headless

Force a Chrome major version if undetected-chromedriver cannot auto-detect it:

python email_crawler.py example.com --chrome-version-main 149

Disable one fallback when debugging:

python email_crawler.py example.com --no-google
python email_crawler.py example.com --no-facebook
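
For readers who want to see how the flags above could map onto settings, the following is a hedged sketch of an argparse parser that accepts the same command lines. The flag names come from the examples in this README; the destination names (show_all, use_google, use_facebook), the defaults, and the build_parser helper are assumptions, and the actual parser in email_crawler.py may differ.

```python
import argparse


def build_parser():
    """Build a parser that accepts the command lines shown in this README."""
    parser = argparse.ArgumentParser(prog="email_crawler.py")
    # One or more websites to crawl.
    parser.add_argument("websites", nargs="+")
    # Show every match instead of the best match per website.
    parser.add_argument("--all", action="store_true", dest="show_all")
    # Run Chrome without a visible window.
    parser.add_argument("--headless", action="store_true")
    # Chrome major version to force when auto-detection fails.
    parser.add_argument("--chrome-version-main", type=int, default=None)
    # Fallbacks are assumed to be on by default; each flag switches one off.
    parser.add_argument("--no-google", action="store_false", dest="use_google")
    parser.add_argument("--no-facebook", action="store_false", dest="use_facebook")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```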

Project Layout

.
├── email_crawler.py       # Core Selenium/UC crawler and CLI
├── emailcrawlerGUI.py     # Tkinter GUI
├── LaunchEmailCrawler.bat # Windows launcher
├── requirements.txt       # Runtime dependencies
└── tests/                 # Unit tests for parsing/ranking helpers

Development

Run tests:

python -m unittest discover -s tests

Compile-check:

python -m py_compile email_crawler.py emailcrawlerGUI.py

License

MIT. See LICENSE.
