JupiterBroadcasting Show Scraper

🚨**ATTENTION:**🚨

Commit to main with great caution: this branch is used in "production" by the jupiterbroadcasting.com GitHub Action.

Scraper written in Python that converts episodes hosted on Fireside or jupiterbroadcasting.com (WordPress) into Hugo files.

Originally based on the Self-Hosted show-notes scraper.

Data

All the scraped data is saved into the ./data folder.

config.yml contains:

usernames_map - Fireside to Hugo username translations
data_dont_override - data filenames (sponsors or people) which shouldn't be overridden when scraping Fireside
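
As a rough illustration of how these two settings could be used, here is a minimal sketch. It assumes usernames_map behaves like a plain mapping and data_dont_override like a collection of filenames; the helper names and the sample values are hypothetical and are not taken from scraper.py or the real config.yml.

```python
# Hypothetical in-memory stand-in for the two config.yml keys.
# The real file's exact shape and values may differ; these are placeholders.
config = {
    "usernames_map": {"fireside-user": "hugo-user"},
    "data_dont_override": {"example-person.md"},
}


def to_hugo_username(fireside_username, usernames_map):
    """Translate a Fireside username, falling back to the input if unmapped."""
    return usernames_map.get(fireside_username, fireside_username)


def may_overwrite(filename, dont_override):
    """Return False for data files that scraping should leave untouched."""
    return filename not in dont_override


print(to_hugo_username("fireside-user", config["usernames_map"]))
print(may_overwrite("example-person.md", config["data_dont_override"]))
```

Falling back to the original username when no mapping exists is a design choice of this sketch, not a documented behaviour of the scraper.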

Run using Docker

make scrape

Run without Docker

Set up Python venv

Install pipenv:

pip3 install pipenv

Install all the dependencies:

pipenv install -d

Activate your pipenv shell:

pipenv shell

Run

Make sure you have activated the pipenv virtual environment. Running which python should point to the binary inside the pipenv venv dir.
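
The same check can be done from inside Python. This is a small sketch using only the standard library; the function name in_virtualenv is made up for illustration and is not part of this repository.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs from a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


# Path of the interpreter in use; it should live inside the pipenv venv dir.
print(sys.executable)
print(in_virtualenv())
```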

Run the script from the root dir:

python scraper.py

You can set these env variables:

LOG_LVL: Integer severity value for the loguru library (see the severity levels table in the loguru documentation; for example, 10 is DEBUG and 30 is WARNING). Defaults to 20 (INFO).
LATEST_ONLY: Set to true to scrape only the latest episode of each show defined in config.yml. This mode is used for automatically scraping new episodes with GitHub Actions. By default, all episodes and all data are scraped.
DATA_DIR: The location where all the scraped files are saved. Defaults to ./data.

Example:

LOG_LVL=1 LATEST_ONLY=1 python scraper.py
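
For readers who want to see how such variables are typically consumed, here is a minimal sketch. The function read_settings and the set of values treated as "true" are assumptions made for illustration; scraper.py may parse these variables differently.

```python
import os


def read_settings(env=None):
    """Read the three variables described above, applying the documented defaults."""
    env = os.environ if env is None else env
    return {
        # Defaults to 20, which is loguru's INFO level.
        "log_lvl": int(env.get("LOG_LVL", "20")),
        # Assumption: "1", "true" and "yes" (any case) all enable this mode.
        "latest_only": env.get("LATEST_ONLY", "").strip().lower() in {"1", "true", "yes"},
        "data_dir": env.get("DATA_DIR", "./data"),
    }


# Mirrors the example command above.
print(read_settings({"LOG_LVL": "1", "LATEST_ONLY": "1"}))
```

Passing a plain dict instead of os.environ keeps the sketch easy to try out without touching the real environment.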
