- Python scripts
- PostgreSQL for metadata
- MinIO for PDF storage
docker compose up -d

- PG_HOST, PG_PORT, PG_DB, PG_USER, PG_PASSWORD
- MINIO_ENDPOINT, MINIO_ACCESS_KEY, MINIO_SECRET_KEY, MINIO_BUCKET

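
The variables listed above could be read and validated in one place before the scripts connect to anything. The following is only a minimal sketch, assuming they are supplied as environment variables and that all nine are required; the `Settings` class and the `load_settings` helper are hypothetical names used for illustration, not part of this repository.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names taken from the list above.
PG_VARS = ("PG_HOST", "PG_PORT", "PG_DB", "PG_USER", "PG_PASSWORD")
MINIO_VARS = (
    "MINIO_ENDPOINT",
    "MINIO_ACCESS_KEY",
    "MINIO_SECRET_KEY",
    "MINIO_BUCKET",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the connection settings."""

    pg_host: str
    pg_port: int
    pg_db: str
    pg_user: str
    pg_password: str
    minio_endpoint: str
    minio_access_key: str
    minio_secret_key: str
    minio_bucket: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, failing early if anything is unset."""
    names = PG_VARS + MINIO_VARS
    # Assumption: every variable is required and must be non-empty.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    values = {name.lower(): env[name] for name in names}
    # The port arrives as a string; convert it so callers get an int.
    values["pg_port"] = int(values["pg_port"])
    return Settings(**values)
```

Calling `load_settings()` with no arguments reads from the process environment; passing a plain dictionary instead makes the helper easy to exercise in tests without touching real credentials.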
python scripts/crawl/crawl_metadata/crawl.py
python scripts/crawl/crawl_cv/get_cv.py