Releases: ClassicalClemi/python-hltv-scraper
Async Update! v1.0.0
This major update brings full asynchronous support to the HLTV.org scraper, drastically improving speed and efficiency by enabling parallel scraping sessions. Key improvements include:
- Complete rewrite of all scripts to use asyncio and AsyncCamoufox for faster, non-blocking requests
- Enhanced Cloudflare bypass with improved session and cookie management
- Support for proxy rotation and dynamic User-Agent switching for better stealth
- Modular and cleaner codebase for easier customization and extension
- Robust error handling and retry logic for more reliable scraping
- Detailed logging and progress reporting to keep track of scraping status
- Continued use of BeautifulSoup and pandas for parsing and data handling
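As a rough illustration of the parallel scraping and retry logic described above, here is a minimal sketch using only the standard library. It is not the project's actual code: `fetch_with_retry`, `scrape_all`, and `fake_fetch` are hypothetical names, and `fake_fetch` is a stand-in for whatever coroutine really loads a page (for example, one built on AsyncCamoufox).

```python
import asyncio


async def fetch_with_retry(fetch, url, retries=3, base_delay=0.1):
    """Await `fetch(url)`, retrying with exponential backoff on any exception."""
    for attempt in range(1, retries + 1):
        try:
            return await fetch(url)
        except Exception:
            if attempt == retries:
                raise  # out of attempts: let the caller see the error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


async def scrape_all(fetch, urls, max_concurrency=3):
    """Fetch all URLs concurrently, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url):
        async with semaphore:
            return await fetch_with_retry(fetch, url)

    # gather() returns results in the same order as `urls`;
    # failed URLs come back as exception objects instead of raising.
    return await asyncio.gather(*(worker(u) for u in urls), return_exceptions=True)


async def fake_fetch(url):
    """Hypothetical stand-in for a real page load (assumption, not project code)."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"


if __name__ == "__main__":
    pages = asyncio.run(scrape_all(fake_fetch, ["team/1", "team/2"]))
    print(pages)
```

The semaphore caps how many sessions run at once, which is usually kinder to the target site and to proxies than launching every request simultaneously.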
Please leave feedback, either here on GitHub or on Discord :)
First Release!
First release of my Python HLTV.org scraper!
Note: This is my first released project ever, so please be kind :)
Features:
- Scrapes HLTV.org for team URLs, team data, match URLs and match data
- Uses Camoufox for stealthy scraping
- Parses HTML with BeautifulSoup
- Outputs data as pandas DataFrames for easy analysis
- Easy-to-understand and fairly customizable code
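To give a feel for the BeautifulSoup-to-pandas step, here is a small sketch. The HTML in `SAMPLE_HTML` is invented for illustration and does not reflect HLTV.org's real markup, and `parse_teams` is a hypothetical helper, not a function from this repository.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Made-up markup for illustration only; the real HLTV.org pages look different.
SAMPLE_HTML = """
<div class="team"><a href="/team/1/alpha">Alpha</a><span class="rank">1</span></div>
<div class="team"><a href="/team/2/bravo">Bravo</a><span class="rank">2</span></div>
"""


def parse_teams(html):
    """Extract one row per team block and return the rows as a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.find_all("div", class_="team"):
        link = block.find("a")
        rank = block.find("span", class_="rank")
        rows.append(
            {
                "name": link.get_text(strip=True),
                "url": link["href"],
                "rank": int(rank.get_text(strip=True)),
            }
        )
    # Fixing the column order keeps the frame's shape stable even with no rows.
    return pd.DataFrame(rows, columns=["name", "url", "rank"])


if __name__ == "__main__":
    print(parse_teams(SAMPLE_HTML))
```

Once the data is in a DataFrame, the usual pandas tools (sorting, filtering, `to_csv`) can be used for analysis or export.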
How to use:
- See README.md for setup
Disclaimer:
This release is very basic and its features are currently quite limited, but both will improve a lot over time. I think it is already a pretty good foundation, even if you want to add functionality yourself.