A Django-based backend service that scrapes, normalizes, and aggregates pricing data for Large Language Models (LLMs) across multiple providers such as OpenAI, Google Gemini, Anthropic Claude, and AWS Bedrock.
LLM pricing data is scattered, inconsistent, and hard to compare: each provider publishes pricing in a different format (HTML tables, Markdown docs, CSVs, or APIs). This service ingests all of them and exposes a clean, queryable pricing API.

- 🔎 Multi-provider pricing ingestion
  - OpenAI (Playwright-based web scraping)
  - Google Gemini (Markdown parsing)
  - Anthropic Claude (Markdown tables)
  - AWS Bedrock (CSV import + AWS Pricing API)
- Normalized pricing model
  - Input vs. output token pricing
  - Modality-aware pricing (text, image, audio, video)
  - Unified pricing units (`mtok`, `images`, `seconds`)
- Pricing aggregation APIs
  - List models by provider and modality
  - Aggregate input/output prices per model
- Extensible design
  - Easy to add new providers or pricing sources
  - Decoupled scraping, parsing, and persistence layers

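The unit normalization and input/output aggregation listed above can be sketched in plain Python. This is a minimal illustration, not the project's code: `PriceRow`, `per_1k_to_mtok`, and `aggregate_prices` are hypothetical names, and the sketch assumes each stored row carries a direction (`input` or `output`) and a price per unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceRow:
    """One normalized price point (hypothetical shape, for illustration only)."""
    model: str
    provider: str
    direction: str   # "input" or "output"
    modality: str    # "text", "image", "audio", "video"
    unit: str        # "mtok", "images", "seconds"
    price: float     # price per unit


def per_1k_to_mtok(price_per_1k_tokens):
    """Convert a price quoted per 1,000 tokens to a price per million tokens."""
    return price_per_1k_tokens * 1000


def aggregate_prices(rows, modality="text"):
    """Collapse input/output rows into one record per (provider, model)."""
    result = {}
    for row in rows:
        if row.modality != modality:
            continue
        if row.direction not in ("input", "output"):
            raise ValueError(f"unknown direction: {row.direction!r}")
        entry = result.setdefault((row.provider, row.model), {
            "model": row.model,
            "provider": row.provider,
            "input_token_price": None,
            "output_token_price": None,
            "modality": row.modality,
            "unit": row.unit,
        })
        entry[f"{row.direction}_token_price"] = row.price
    return list(result.values())


rows = [
    # A source quoting per 1K tokens is converted to the unified "mtok" unit.
    PriceRow("gpt-4o", "OpenAI", "input", "text", "mtok", per_1k_to_mtok(0.005)),
    PriceRow("gpt-4o", "OpenAI", "output", "text", "mtok", 15.0),
]
print(aggregate_prices(rows))
```

Keeping one row per direction and merging at query time means a provider that publishes only an input price still produces a valid record, with the missing side left as `None`.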
- Provider – OpenAI, Google, Anthropic, AWS, etc.
- LlmModel – Canonical model (e.g. GPT-4o, Claude 3.5)
- ProviderModel – Provider-specific API model mapping
- ModelCapability – What the model can do (input/output, text/image/audio)
- ProviderModelPricing – Actual price per unit
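
One way to picture how these five entities relate is a plain-dataclass sketch. The project presumably defines them as Django models; the field names and the choice to hang pricing off a capability are assumptions made here for illustration, not taken from the codebase.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Provider:
    name: str                      # e.g. "OpenAI"


@dataclass(frozen=True)
class LlmModel:
    name: str                      # canonical name, e.g. "GPT-4o"


@dataclass(frozen=True)
class ProviderModel:
    provider: Provider
    llm_model: LlmModel
    api_name: str                  # provider-specific identifier (assumed field)


@dataclass(frozen=True)
class ModelCapability:
    provider_model: ProviderModel
    direction: str                 # "input" or "output"
    modality: str                  # "text", "image", "audio", ...


@dataclass(frozen=True)
class ProviderModelPricing:
    capability: ModelCapability    # assumed link: one price per capability
    unit: str                      # "mtok", "images", or "seconds"
    price: Decimal                 # price per unit


openai = Provider("OpenAI")
gpt4o = ProviderModel(openai, LlmModel("GPT-4o"), api_name="gpt-4o")
text_in = ModelCapability(gpt4o, direction="input", modality="text")
price = ProviderModelPricing(text_in, unit="mtok", price=Decimal("5.0"))
print(price.capability.provider_model.provider.name, price.unit, price.price)
```

`Decimal` is used for the price so that small per-unit amounts are not subject to binary floating-point rounding.
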
| Provider | Source | Method |
|---|---|---|
| OpenAI | Pricing Docs | Playwright HTML scraping |
| Google Gemini | Markdown Docs | Regex + table parsing |
| Anthropic Claude | Markdown Tables | Structured parsing |
| AWS Bedrock | CSV + AWS Pricing API | boto3 |
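
The Markdown-table approach used for the Gemini and Claude sources might look roughly like the sketch below. `parse_markdown_table` and `parse_usd` are hypothetical helpers, and the sample table is made-up data, not real provider pricing.

```python
import re
from typing import Dict, List, Optional


def parse_markdown_table(markdown: str) -> List[Dict[str, str]]:
    """Parse pipe-delimited table lines into row dicts.

    Assumes the text contains a single table whose second line is the
    |---|---| separator.
    """
    lines = [ln.strip() for ln in markdown.splitlines() if ln.strip().startswith("|")]
    if len(lines) < 2:
        return []

    def cells(line: str) -> List[str]:
        return [c.strip() for c in line.strip("|").split("|")]

    header = cells(lines[0])
    rows = []
    for line in lines[2:]:  # skip the header and the separator line
        values = cells(line)
        if len(values) == len(header):  # ignore malformed rows
            rows.append(dict(zip(header, values)))
    return rows


def parse_usd(text: str) -> Optional[float]:
    """Extract the first dollar amount from a cell such as '$3.00 / MTok'."""
    match = re.search(r"\$\s*([0-9]+(?:\.[0-9]+)?)", text)
    return float(match.group(1)) if match else None


# Made-up sample, only to show the expected input shape.
sample = """
| Model | Input | Output |
|---|---|---|
| Example Model | $3.00 / MTok | $15.00 / MTok |
"""

for row in parse_markdown_table(sample):
    print(row["Model"], parse_usd(row["Input"]), parse_usd(row["Output"]))
```

Splitting table parsing from price extraction keeps the regex small and lets the same table parser serve sources whose cells use different price notations.
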
| Endpoint | Method | Description |
|---|---|---|
| `/api/scrape/openai/` | GET / POST | Scrape OpenAI pricing |
| `/api/scrape/gemini/` | GET / POST | Scrape Google Gemini pricing |
| `/api/scrape/claude/` | GET / POST | Scrape Anthropic Claude models |
| `/api/scrape/bedrock/` | GET | Load Bedrock pricing from CSV |
| `/api/scrape/bedrock/aws/` | GET / POST | Fetch Bedrock pricing from AWS API |
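
A small client for triggering these endpoints could be sketched with the standard library alone. The paths come from the table above; the base URL `http://localhost:8000`, the source keys, the helper names, and the assumption that responses are JSON are all illustrative.

```python
import json
import urllib.request

# Paths are taken from the endpoint table; the dictionary keys are made up.
SCRAPE_PATHS = {
    "openai": "/api/scrape/openai/",
    "gemini": "/api/scrape/gemini/",
    "claude": "/api/scrape/claude/",
    "bedrock-csv": "/api/scrape/bedrock/",
    "bedrock-aws": "/api/scrape/bedrock/aws/",
}


def build_scrape_request(source, base_url="http://localhost:8000"):
    """Build (but do not send) a request for one scrape endpoint."""
    path = SCRAPE_PATHS[source]
    # The CSV loader is listed as GET-only; the others accept GET or POST.
    method = "GET" if source == "bedrock-csv" else "POST"
    return urllib.request.Request(base_url.rstrip("/") + path, method=method)


def trigger_scrape(source, base_url="http://localhost:8000", timeout=120):
    """Send the request and decode the body, assuming the service returns JSON."""
    request = build_scrape_request(source, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_scrape_request("openai")
print(req.get_method(), req.full_url)
```

With the service running locally, `trigger_scrape("openai")` would start a scrape. The generous timeout reflects that browser-based scraping can take a while.
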

    GET /api/models?provider=OpenAI&modality=text

Response:

    [
      {
        "id": "gpt-4o",
        "name": "GPT-4o",
        "provider": "OpenAI",
        "modalities": ["text", "image"],
        "context_length": 128000,
        "status": "active"
      }
    ]

    GET /api/pricing?provider=OpenAI&modality=text

Response:

    [
      {
        "model": "gpt-4o",
        "provider": "OpenAI",
        "input_token_price": 5.0,
        "output_token_price": 15.0,
        "modality": "text",
        "unit": "mtok"
      }
    ]

MIT License

Built to make LLM pricing sane, comparable, and queryable. 🚀