Async Update! v1.0.0

This major update brings full asynchronous support to the HLTV.org scraper, drastically improving speed and efficiency by enabling parallel scraping sessions. Key improvements include:

Complete rewrite of all scripts to use asyncio and AsyncCamoufox for faster, non-blocking requests
Enhanced Cloudflare bypass with improved session and cookie management
Support for proxy rotation and dynamic User-Agent switching for better stealth
Modular and cleaner codebase for easier customization and extension
Robust error handling and retry logic for more reliable scraping
Detailed logging and progress reporting to keep track of scraping status
Continued use of BeautifulSoup and pandas for parsing and data handling
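
As a rough illustration of how parallel sessions, identity rotation and retry logic can fit together, here is a minimal standard-library sketch. It is not the project's actual code: `fake_fetch` is a stand-in for a real AsyncCamoufox session, and the proxy addresses, User-Agent strings, limits and function names are all hypothetical.

```python
import asyncio
import itertools
import random

# Hypothetical values for illustration; none of these come from the project.
PROXIES = ["http://proxy-a:8080", "http://proxy-b:8080"]
USER_AGENTS = ["ExampleAgent/1.0", "ExampleAgent/2.0"]
MAX_PARALLEL = 3   # upper bound on simultaneous fetches
MAX_ATTEMPTS = 4   # total tries per URL before giving up


async def fake_fetch(url, proxy, user_agent):
    """Stand-in for a real browser session; fails randomly to exercise retries."""
    await asyncio.sleep(0.01)
    if random.random() < 0.3:
        raise ConnectionError(f"simulated failure for {url}")
    return f"<html>{url} via {proxy} as {user_agent}</html>"


async def fetch_with_retry(url, identities, semaphore, fetch=fake_fetch):
    """Fetch one URL, retrying with exponential backoff and a fresh identity."""
    async with semaphore:  # limits how many fetches run at once
        for attempt in range(1, MAX_ATTEMPTS + 1):
            proxy, user_agent = next(identities)  # rotate on every attempt
            try:
                return await fetch(url, proxy, user_agent)
            except ConnectionError:
                if attempt == MAX_ATTEMPTS:
                    raise  # out of attempts, surface the error
                await asyncio.sleep(0.01 * 2 ** attempt)  # exponential backoff


async def scrape_all(urls, fetch=fake_fetch):
    """Fetch all URLs concurrently; results keep the order of `urls`."""
    semaphore = asyncio.Semaphore(MAX_PARALLEL)
    # Endless stream of (proxy, user_agent) pairs shared by all tasks.
    identities = itertools.cycle(itertools.product(PROXIES, USER_AGENTS))
    tasks = [fetch_with_retry(u, identities, semaphore, fetch) for u in urls]
    # return_exceptions=True returns failures as values instead of raising.
    return await asyncio.gather(*tasks, return_exceptions=True)


if __name__ == "__main__":
    pages = asyncio.run(scrape_all([f"/team/{i}" for i in range(5)]))
    failed = sum(isinstance(p, Exception) for p in pages)
    print(f"{len(pages) - failed} pages fetched, {failed} failed")
```

The semaphore keeps the number of simultaneous fetches bounded, and collecting exceptions as results means one persistently failing URL does not discard the pages that did load.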

Please leave feedback, either here on GitHub or on Discord :)

Note: This is my first released project ever, so please be kind :)

Features:

Scrapes HLTV.org for team URLs, team data, match URLs and match data

Uses Camoufox for stealthy scraping

Parses HTML with BeautifulSoup

Outputs data as pandas DataFrames for easy analysis

Easy-to-understand code that is fairly customizable
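
To give a feel for the parse-then-tabulate step, the sketch below pulls team links out of a page and turns them into row dictionaries. It deliberately swaps in the standard library's `html.parser` for BeautifulSoup so it runs without dependencies. The URL pattern, the sample HTML and all names here are assumptions for illustration, not taken from the project.

```python
import re
from html.parser import HTMLParser

# Assumed URL shape for team pages: /team/<numeric id>/<slug>
TEAM_HREF = re.compile(r"^/team/(\d+)/([\w-]+)$")


class TeamLinkParser(HTMLParser):
    """Collects one row per unique team link found in the page."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._seen = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = TEAM_HREF.match(href)
        if match and href not in self._seen:  # skip repeated links
            self._seen.add(href)
            self.rows.append({
                "team_id": int(match.group(1)),
                "slug": match.group(2),
                "url": "https://www.hltv.org" + href,
            })


def extract_team_rows(html):
    """Return a list of row dicts, one per distinct team link in `html`."""
    parser = TeamLinkParser()
    parser.feed(html)
    return parser.rows


# Made-up markup, only loosely modelled on a ranking page.
SAMPLE_HTML = """
<div class="ranked-team"><a href="/team/1/alpha">Alpha</a></div>
<div class="ranked-team"><a href="/team/2/beta-squad">Beta Squad</a></div>
<a href="/team/1/alpha">Alpha (duplicate link)</a>
<a href="/matches/123/alpha-vs-beta">not a team link</a>
"""

rows = extract_team_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

A list of dictionaries like `rows` can be passed straight to `pandas.DataFrame(rows)`, which yields one column per key.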

How to use:

See README.md for setup

Disclaimer:
This is a very basic version with a restricted feature set, and it will be improved considerably. I think it is already a pretty good foundation, even if you want to add functionality yourself.

First release of my Python HLTV.org scraper!

Releases: ClassicalClemi/python-hltv-scraper