Releases: ClassicalClemi/python-hltv-scraper

Async Update! v1.0.0

06 Jun 19:09
f47eba9

Async Update! v1.0.0

This major update brings full asynchronous support to the HLTV.org scraper, drastically improving speed and efficiency by enabling parallel scraping sessions. Key improvements include:

  • Complete rewrite of all scripts to use asyncio and AsyncCamoufox for faster, non-blocking requests
  • Enhanced Cloudflare bypass with improved session and cookie management
  • Support for proxy rotation and dynamic User-Agent switching for better stealth
  • Modular and cleaner codebase for easier customization and extension
  • Robust error handling and retry logic for more reliable scraping
  • Detailed logging and progress reporting to keep track of scraping status
  • Continued use of BeautifulSoup and pandas for parsing and data handling

Please leave feedback, either here on GitHub or on Discord :)

First Release!

03 Jun 20:04
ae49d8d

First release of my Python HLTV.org scraper!

Note: This is my first released project ever, so please be kind :)

Features:

  • Scrapes HLTV.org for team URLs, team data, match URLs and match data
  • Uses Camoufox for stealthy scraping
  • Parses HTML with BeautifulSoup
  • Outputs data as pandas DataFrames for easy analysis
  • Easy-to-understand, fairly customizable code

How to use:

  • See README.md for setup

Disclaimer:
This is a very basic first version with a limited feature set, and it will be improved a lot. I think it is already a pretty good foundation, even if you want to add functionality yourself.