Scrape the PEPPOL BIS Billing code list pages and export the code lists as JSON files.
This repository is a small utility project: it contains a single Python script that downloads the public PEPPOL BIS Billing code lists (such as country codes, currency codes, tax categories, and invoice type codes) and converts them into machine-readable JSON that you can reuse in your own applications, validators, or integration layers.
- Python 3.8+ (any modern 3.x version should work)
Clone the repository:
git clone https://github.com/Toenn-Vaot/peppol_extract_codes_list.git
cd peppol_extract_codes_list
Install the required Python packages with pip:
pip install requests beautifulsoup4
From the root of the repository, run:
python scrape_peppol_codelists.py
The script will:
- Connect to the official PEPPOL BIS Billing 3.0 documentation site.
- Follow the links to the individual code list pages (e.g. ISO 3166-1 country codes, ISO 4217 currency codes, tax category codes, invoice type codes).
- Parse the HTML tables that define each code list.
- Export the data to one or more JSON files.
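The parsing step above can be illustrated with a minimal sketch. It is not the code from scrape_peppol_codelists.py: the names `TableRowParser` and `parse_code_table`, the sample HTML, and the two-column (code, name) layout are all assumptions made for illustration. It also uses the standard library's `html.parser` rather than BeautifulSoup, purely so the example runs without extra packages.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_code_table(html):
    """Return [{'code': ..., 'name': ...}, ...] from a two-column table.

    Assumes the first row is a header and that column 1 holds the code
    and column 2 the name; real PEPPOL pages may be laid out differently.
    """
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return [
        {"code": row[0], "name": row[1]}
        for row in parser.rows[1:]        # skip the header row
        if len(row) >= 2 and row[0]       # ignore incomplete rows
    ]


# Hypothetical sample input standing in for a downloaded code list page.
SAMPLE_HTML = """
<table>
  <tr><th>Code</th><th>Name</th></tr>
  <tr><td>EUR</td><td>Euro</td></tr>
  <tr><td>USD</td><td>US Dollar</td></tr>
</table>
"""

codes = parse_code_table(SAMPLE_HTML)
print(codes)
```

In a real run the HTML string would come from an HTTP response (for example `requests.get(url).text`) instead of an inline sample.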
The exact output format (file names, directory layout, field names) is defined inside scrape_peppol_codelists.py. Common patterns are:
- one JSON file per code list, or
- a single JSON file containing several named lists.
Open the script to see the current behaviour and adjust the README if you change it.
A typical JSON structure for a single code list might look like:
{
  "code_list": "ISO 4217 Currency codes",
  "source": "https://docs.peppol.eu/poacc/billing/3.0/codelist/",
  "codes": [
    { "code": "EUR", "name": "Euro" },
    { "code": "USD", "name": "US Dollar" }
  ]
}
This is only an example; adapt it to the actual output produced by the script.
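To show how such a file might be consumed, here is a small sketch that turns the example structure into a lookup table for validation. The field names `codes`, `code`, and `name` are taken from the illustrative JSON above, and the helpers `load_codes` and `is_valid_code` are hypothetical; adjust both if the script's real output differs.

```python
import json

# Inline copy of the illustrative structure; in practice this text would
# be read from one of the exported JSON files.
SAMPLE = """
{
  "code_list": "ISO 4217 Currency codes",
  "source": "https://docs.peppol.eu/poacc/billing/3.0/codelist/",
  "codes": [
    { "code": "EUR", "name": "Euro" },
    { "code": "USD", "name": "US Dollar" }
  ]
}
"""


def load_codes(text):
    """Parse a code list document and return a {code: name} dictionary."""
    data = json.loads(text)
    return {entry["code"]: entry["name"] for entry in data["codes"]}


def is_valid_code(code, lookup):
    """Return True if the code appears in the loaded code list."""
    return code in lookup


currencies = load_codes(SAMPLE)
print(is_valid_code("EUR", currencies))
print(is_valid_code("XXX", currencies))
```

A dictionary keyed by code makes membership checks constant-time, which is convenient when validating many invoice fields against the same list.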
This project is licensed under the MIT License. See the LICENSE file for details.