A dedicated CLI tool driven by @domus/sync to orchestrate data synchronization between the central DUK system and a local parish database. Designed to be deployed on a mini PC within the parish network.
The crawler performs four main tasks:
- Seed: Populates local regional data (Provinces, Regencies, Districts, Villages) from BPS sources.
- Scrape: Fetches reference data and parishioner records from DUK into a staging area.
- Transform: Processes staged data into production-ready tables in the local database.
- Sync-Back: Pushes local updates (made in Domus) back to the DUK system.
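The four stages above can be sketched as a simple command dispatcher. This is a minimal illustration only: the stage bodies are empty stand-ins, and the assumption that `crawl` runs seed → scrape → transform in order is inferred from the task descriptions, not taken from the actual `@domus/sync` implementation.

```typescript
// Illustrative pipeline runner; the stage functions are hypothetical
// stand-ins for the real @domus/sync implementations.
type Stage = () => Promise<void>;

const stages: Record<string, Stage> = {
  seed: async () => { /* populate Provinces/Regencies/Districts/Villages */ },
  scrape: async () => { /* fetch DUK data into staging tables */ },
  transform: async () => { /* promote staged rows to production tables */ },
  syncback: async () => { /* push local Domus edits back to DUK */ },
};

// "crawl" is assumed here to run the forward pipeline in order;
// sync-back is invoked as its own command.
async function run(command: string): Promise<void> {
  if (command === "crawl") {
    for (const name of ["seed", "scrape", "transform"]) {
      await stages[name]();
    }
    return;
  }
  const stage = stages[command];
  if (!stage) throw new Error(`Unknown command: ${command}`);
  await stage();
}
```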
The crawler requires the following environment variables. Create a `.env` file in this directory, or pass them to the container via Docker's `--env-file` flag.
| Variable | Description | Example |
|---|---|---|
| `DATABASE_URL` | Local PostgreSQL connection string. | `postgresql://user:pass@localhost:5432/domus` |
| `SYNC_TARGET_URL` | Base URL of the DUK system. | `https://duk-target.com` |
| `SYNC_USERNAME` | Username for DUK system login. | `parish_admin` |
| `SYNC_PASSWORD` | Password for DUK system login. | `********` |
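Putting the variables together, a minimal `.env` file might look like this (all values are placeholders taken from the examples above):

```dotenv
DATABASE_URL=postgresql://user:pass@localhost:5432/domus
SYNC_TARGET_URL=https://duk-target.com
SYNC_USERNAME=parish_admin
SYNC_PASSWORD=********
```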
Running from the monorepo root:

```sh
# Show help
pnpm --filter @domus/crawler start help

# Run full pipeline
pnpm --filter @domus/crawler start crawl

# Individual commands
pnpm --filter @domus/crawler start seed
pnpm --filter @domus/crawler start scrape
pnpm --filter @domus/crawler start transform
pnpm --filter @domus/crawler start syncback
```

For production deployment on a mini PC:
- Build the image (run from monorepo root):

  ```sh
  docker build -t domus-crawler -f apps/crawler/Dockerfile .
  ```

- Run the container:

  ```sh
  docker run --rm --env-file .env domus-crawler [command]
  ```

Example: run a full crawl:

```sh
docker run --rm --env-file .env domus-crawler crawl
```
The Dockerfile uses a multi-stage build:
- Base: Installs all monorepo dependencies.
- Run: Uses `tsx` to execute the TypeScript source directly, ensuring full compatibility with the monorepo's ESM structure without complex build-time transpilation issues.
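The multi-stage pattern described above might be structured roughly like this. This is a sketch only, not the actual `apps/crawler/Dockerfile`: the base image, stage names, and the `src/index.ts` entry path are assumptions for illustration.

```dockerfile
# Base stage: install all monorepo dependencies with pnpm.
FROM node:20-alpine AS base
RUN corepack enable
WORKDIR /app
COPY . .
RUN pnpm install --frozen-lockfile

# Run stage: execute the TypeScript source directly with tsx,
# skipping a separate build-time transpilation step.
FROM base AS run
WORKDIR /app/apps/crawler
# src/index.ts is an assumed entry point for this sketch.
ENTRYPOINT ["pnpm", "exec", "tsx", "src/index.ts"]
```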
Built with ❤️ for Kristus Raja Barong Tongkok Parish.