Skip to content

Latest commit

 

History

History
251 lines (179 loc) · 5.91 KB

File metadata and controls

251 lines (179 loc) · 5.91 KB

GridSync

A self-hosted pipeline that downloads ERCOT's public Electric Service Identifier (ESIID) data, archives raw files to object storage, loads them into Postgres, and serves a fast fuzzy search API.

Built with Rust. No AWS, GCP, or Azure required.


What it does

ERCOT publishes a daily feed of every electric service point in Texas (~3–5 million records) across all Transmission and Distribution Service Providers (TDSPs). This project:

  1. Downloads new files from the ERCOT MIS API on a schedule
  2. Archives raw ZIP files to Backblaze B2 (S3-compatible)
  3. Parses and loads CSV data into Postgres with upsert logic
  4. Exposes a REST API for fuzzy address and ESIID search

Stack

Layer Choice Why
Language Rust Performance, safety, single binaries
Database Supabase / Postgres Portable, open-source, fuzzy search via pg_trgm
Object storage Backblaze B2 S3-compatible, no egress fees, no vendor lock-in
Search pg_trgm + Postgres FTS Built into Postgres, no extra service needed
API Axum Fast, async, minimal
Scheduler System cron or pg_cron No external dependency

Prerequisites

  • mise — manages Rust and dev tools
  • A Postgres database (Supabase free tier works)
  • A Backblaze B2 bucket with an application key

Getting Started

1. Clone and configure

git clone <repo-url>
cd gridsync
cp .env.example .env

Edit .env with your credentials:

DATABASE_URL=postgresql://postgres:password@db.your-project.supabase.co:5432/postgres

B2_KEY_ID=your-key-id
B2_APPLICATION_KEY=your-application-key
B2_BUCKET_NAME=gridsync
B2_ENDPOINT=https://s3.us-east-005.backblazeb2.com
B2_REGION=us-east-005

2. Install tools and dependencies

mise install              # installs Rust stable
mise run install          # installs sqlx-cli and cargo-watch

3. Run migrations

mise run db:migrate

4. Build

mise run build

5. Download ERCOT data

# Download all available files (first run may take a while)
mise run download

# Or limit to a specific TDSP
mise run download -- --tdsp ONCOR_ELEC

# Preview what would be downloaded without fetching
mise run download -- --dry-run

6. Run the ETL for a downloaded file

mise run etl -- \
  --key raw/ercot/esiid/2026/03/22/ext.00000203...ONCOR_ELEC_FUL.zip \
  --utility ONCOR_ELEC \
  --file-type FUL

7. Start the API

mise run api
# or with live reload during development
mise run watch

API

GET /search

Fuzzy search across all ESIID records.

Query parameters:

Parameter Type Default Description
q string Free-text query — fuzzy address, city, zip, or ESIID
zip string Filter by exact zip code
utility string Filter by TDSP name (e.g. ONCOR_ELEC)
premise_type string Filter by premise type (RES, COM, etc.)
load_zone string Filter by settlement load zone
active_only bool true Only return active service points
limit int 25 Results per page (max 100)
offset int 0 Pagination offset

Example:

curl "http://localhost:3000/search?q=123+main+st&zip=78701&limit=10"

Response:

{
  "total": 42,
  "limit": 10,
  "offset": 0,
  "results": [
    {
      "id": 1,
      "esiid": "10443720000123456",
      "address": "123 MAIN ST",
      "city": "AUSTIN",
      "state": "TX",
      "zip_code": "78701",
      "utility": "ONCOR_ELEC",
      "premise_type": "RES",
      ...
    }
  ]
}

When q is provided, results are ranked by full-text relevance then trigram similarity. Without q, results are returned ordered by ESIID.

GET /health

Returns 200 ok. Use for uptime checks.


Data Source

ERCOT publishes Report 203 (TDSP ESIID Extract) — a public feed requiring no authentication.

File types:

  • _FUL — Full snapshot of all ESIIDs for a TDSP. Published on the first Tuesday following the first Monday of each month.
  • _DAILY — Incremental delta. Published every day.

TDSPs covered: CENTERPOINT, ONCOR_ELEC, LUBBOCK, NUECES_ELEC, TNMP, AEP_CENTRAL, AEP_NORTH, ENTERGY_GULF, SHARYLAND, AEP_TEXAS_SP, SWEPCO_ENERG

Raw files are archived to B2 at:

raw/{source}/{dataset}/{YYYY}/{MM}/{DD}/{original-filename}.zip

Examples:

raw/ercot/esiid/2026/03/22/ext.00000203...ONCOR_ELEC_FUL.zip
raw/ercot/settlement/2026/03/22/file.zip

Scheduling

To keep data current, schedule the downloader to run daily via system cron:

# Run at 6am UTC every day
0 6 * * * cd /path/to/gridsync && mise run cron >> /var/log/gridsync-cron.log 2>&1

Or use pg_cron if running inside Supabase/Postgres.

The downloader skips files already present in B2, so reruns are safe. FUL files are picked up automatically when they appear in the ERCOT feed.


Development

mise run check      # cargo check + clippy
mise run fmt        # cargo fmt
mise run test       # cargo test
mise run watch      # live-reload API server
mise run clean      # cargo clean (remove build artifacts)

After adding or modifying sqlx::query! calls, regenerate the offline query cache so CI can build without a live database:

mise run db:prepare   # writes .sqlx/ — commit this directory

Migrations

mise run db:migrate   # apply pending
mise run db:status    # show applied/pending
mise run db:revert    # revert last migration

Project Layout

crates/
├── common/       # Shared config, DB pool, B2 client, models
├── downloader/   # ERCOT → B2 pipeline
├── etl/          # B2 → Postgres loader
└── api/          # Axum search API
db/
└── migrations/   # Versioned SQL migrations
.mise/tasks/      # All runnable project tasks

License

GPL-3.0 — see LICENSE for details.