Email Scrubber 3000

Python utility that validates email addresses by resolving MX records and performing a minimal SMTP handshake (RCPT check). It can validate individual addresses, scan a file, or generate a "scrubbed" output file containing only addresses that appear deliverable.

Features

  • MX lookup using dnspython
  • IPv4 resolution with fallback when name resolution fails
  • Minimal SMTP handshake on port 25 for deliverability hints
  • Concurrent validation for large lists (ThreadPoolExecutor)
  • Scrub mode writes only valid emails to a new file
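
The concurrent mode can be pictured with a short sketch. This is an illustration only, not the code in valid.py: `validate_concurrently` and the `check_one` callable are hypothetical names, and the default of 10 workers is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def validate_concurrently(emails, check_one, max_workers=10):
    """Run check_one(email) for every address on a thread pool.

    check_one is a hypothetical callable returning (is_valid, detail).
    Returns a dict mapping each email to its result tuple.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the address it is checking.
        futures = {pool.submit(check_one, email): email for email in emails}
        for future in as_completed(futures):
            email = futures[future]
            try:
                results[email] = future.result()
            except Exception as exc:  # one failed check should not abort the run
                results[email] = (False, f"error: {exc}")
    return results
```

Threads suit this workload because each check spends most of its time waiting on DNS and SMTP replies rather than using the CPU.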

Quickstart

  • Requirements: Python 3.8+.
  • Install deps: pip install -r requirements.txt.
  • Run: python3 valid.py.

Usage

Run the interactive menu:

python3 valid.py

Menu options:

  • Validate a single email
  • Validate emails from a file (.txt or .csv)
  • Scrub a file (produce a file with only valid addresses)

File Formats

  • .txt: one email per line
  • .csv: uses the first column of each row
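
A minimal sketch of how the two formats might be read. `load_emails` is a hypothetical helper, not a function from valid.py; skipping blank entries and assuming UTF‑8 with no header row are assumptions made for the example.

```python
import csv
from pathlib import Path


def load_emails(path):
    """Return addresses from a .txt (one per line) or .csv (first column) file."""
    path = Path(path)
    emails = []
    if path.suffix.lower() == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.reader(handle):
                # Only the first column is used; empty cells are skipped.
                if row and row[0].strip():
                    emails.append(row[0].strip())
    else:
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                if line.strip():
                    emails.append(line.strip())
    return emails
```

If a CSV file has a header row, its first cell would be returned as if it were an address, so such files may need the header removed first.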

Output

  • Invalid addresses are printed with reasons and appended to error_log.txt.
  • Scrubbed results are saved next to the source: name_scrub.txt or name_scrub.csv.
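
The naming rule for scrubbed files can be expressed in a few lines. `scrub_path` is a hypothetical helper shown only to illustrate the rule; the tool may build the name differently.

```python
from pathlib import Path


def scrub_path(source):
    """Return the sibling path with `_scrub` inserted before the extension."""
    source = Path(source)
    return source.with_name(f"{source.stem}_scrub{source.suffix}")
```

For example, `scrub_path("emails.csv")` yields `emails_scrub.csv` in the same folder as the input.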

Examples

  • Validate single address:

    1. Choose option 1 and enter someone@example.com.
    2. The tool prints ✅/❌ with the SMTP response code/message.
  • Validate from file:

    1. Place emails.txt with one address per line in this folder.
    2. Choose option 2, pick the file from the list.
  • Scrub file:

    1. Place emails.csv (address in first column) here.
    2. Choose option 3; the scrubbed list is saved as emails_scrub.csv.

How It Works

For each address:

  1. Split into local@domain.
  2. Look up the domain's MX record.
  3. Resolve the MX host to IPv4 and connect on port 25.
  4. Issue HELO, MAIL FROM, and RCPT TO for the target.
  5. Interpret the SMTP code: 250 implies likely deliverable; other codes are treated as invalid.
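
The steps above can be sketched with dnspython and the standard library's smtplib. This is an approximation under stated assumptions, not the code in valid.py: the function names, the `probe@example.com` sender, the `example.com` HELO name, and the 10 second timeout are all placeholders chosen for the example.

```python
import smtplib
import socket


def split_address(email):
    """Step 1: split 'local@domain' on the last '@'."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a valid address: {email!r}")
    return local, domain


def lookup_mx(domain):
    """Step 2: return the MX host with the lowest preference value."""
    import dns.resolver  # dnspython; imported here so the other helpers run without it

    answers = dns.resolver.resolve(domain, "MX")
    best = min(answers, key=lambda record: record.preference)
    return str(best.exchange).rstrip(".")


def is_deliverable(code):
    """Step 5: only 250 counts as likely deliverable."""
    return code == 250


def probe(email, sender="probe@example.com", helo_name="example.com", timeout=10):
    """Steps 3-4: connect to the MX over IPv4 and issue HELO, MAIL FROM, RCPT TO.

    The sender and HELO name are hypothetical placeholders.
    """
    _, domain = split_address(email)
    mx_host = lookup_mx(domain)
    # Restrict resolution to IPv4 (AF_INET) and take the first address.
    ipv4 = socket.getaddrinfo(mx_host, 25, socket.AF_INET, socket.SOCK_STREAM)[0][4][0]
    with smtplib.SMTP(ipv4, 25, timeout=timeout) as server:
        server.helo(helo_name)
        server.mail(sender)
        code, message = server.rcpt(email)
    return is_deliverable(code), code, message.decode(errors="replace")
```

No message is sent: the session ends after RCPT TO, and leaving the `with` block issues QUIT.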

Notes & caveats:

  • Providers may greylist, tarpit, or accept all recipients; expect both false positives and false negatives.
  • Port 25 may be blocked by your ISP/cloud; connectivity failures are inconclusive.
  • Some servers only validate after DATA; RCPT 250 is not a guarantee.

Best Practices

  • Legal/ethical: Validate only lists you have the right to test. Respect anti‑abuse laws and provider terms.
  • Warm‑up: For large lists, test in smaller batches to avoid rate limits.
  • Backoff: Insert delays if you see repeated timeouts or 4xx deferrals.
  • Retries: The tool retries transient network issues; consider re‑running on failures.
  • Timeout hygiene: Default timeouts are conservative; adjust only if you understand trade‑offs.
  • Data handling: Avoid storing full SMTP transcripts that contain real user data unless necessary.
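
The warm‑up and backoff advice could be applied with a wrapper like the one below. It is a sketch under assumptions: `chunks`, `validate_in_batches`, and the `check_batch` callable are hypothetical names, and the batch size and delays are arbitrary starting points rather than values taken from the tool.

```python
import time


def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def validate_in_batches(emails, check_batch, batch_size=50, base_delay=1.0,
                        max_delay=60.0, sleep=time.sleep):
    """Validate batch by batch, doubling the pause after a troubled batch.

    check_batch is a hypothetical callable returning a dict that maps each
    email to its SMTP code, or None when the check timed out.
    """
    results = {}
    delay = base_delay
    batches = list(chunks(emails, batch_size))
    for index, batch in enumerate(batches):
        codes = check_batch(batch)
        results.update(codes)
        deferred = any(code is None or 400 <= code < 500 for code in codes.values())
        # Back off exponentially after timeouts or 4xx replies, else reset.
        delay = min(delay * 2, max_delay) if deferred else base_delay
        if index < len(batches) - 1:  # no pause needed after the last batch
            sleep(delay)
    return results
```

Passing `sleep` as a parameter keeps the pacing logic easy to exercise without real waiting.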

Troubleshooting

  • Port 25 blocked: Try a network where SMTP egress is allowed.
  • Name resolution issues: Ensure DNS works and outbound port 53 is permitted (UDP, plus TCP for large responses).
  • Slow validations: Reduce max_workers in validate_emails_concurrently or limit input size.
  • IPv6 pitfalls: The script uses IPv4 resolution explicitly to avoid IPv6 routing problems.
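
IPv4‑only resolution of this kind can be done with the standard library by requesting the `AF_INET` family. `resolve_ipv4` is a hypothetical helper for illustration; the script's own resolution and fallback logic may differ.

```python
import socket


def resolve_ipv4(host, port=25):
    """Return the IPv4 addresses for `host`, in resolver order without duplicates."""
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        # For AF_INET, sockaddr is an (address, port) pair.
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses
```

If the host has no A record, `socket.getaddrinfo` raises `socket.gaierror`, which a caller would need to treat as an inconclusive result.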

Security

  • No credentials required; uses a fake MAIL FROM for probing.
  • Do not run from sensitive networks without approvals; outbound SMTP can be monitored/blocked.
  • Review error_log.txt contents before sharing.

Development

  • Main entry: valid.py
  • Deps: requirements.txt
  • Style: keep changes minimal and focused; prefer readability over micro‑optimizations.

Local Setup

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python3 valid.py

Releasing

  • Tag a release after notable changes: git tag -a vX.Y.Z -m "..." && git push --tags.
  • Update this README with behavior changes.

License

This project is licensed under the MIT License. See LICENSE for details.
