feat: add edgenode-harvester container variant #137
Open
evangineer wants to merge 18 commits into main
Introduces a new container variant (Dockerfile.harvester) that bundles log-harvester (unytco/log-harvester) instead of log-sender. The harvester reads from log-collector, aggregates usage data, and parks invoices on Unyt Agreements via an embedded Holochain conductor.
Key changes:
- Dockerfile.harvester: single-stage wolfi-base build, clones and builds log-harvester from source via GITHUB_TOKEN build secret, bakes in unyt.happ from latest unytco/unyt-sandbox release (pinnable via UNYT_HAPP_VERSION build arg), exposes ports 4444 and 4445
- s6-overlay-harvester/: self-contained s6 service tree (conductor, log-harvester, logrotate-cron, setup) with no dependency on the base s6-overlay directory
- log-harvester s6 service: waits for conductor readiness, installs unyt.happ, attaches app websocket on 4445, init/refreshes harvester config, runs harvester in loop mode
- CI: build-and-push-harvester-image job publishing ghcr.io/holo-host/edgenode-harvester using HARVESTER_REPO_TOKEN secret
- docker-compose.yml: edgenode-harvester service for local testing
- Docs: LOG_HARVESTER_QUICKSTART.md, updated README.md, docker/README.md, docker/CHANGELOG.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds BATS test files and a test runner for the harvester container variant:
- harvester_startup.bats: verifies conductor start, unyt.happ install, app-ws on 4445, config initialization, and log-harvester service start
- harvester_process.bats: verifies holochain and node run as nonroot
- run_harvester_tests.sh: runner analogous to run_tests_multi.sh, builds the harvester image with GITHUB_TOKEN secret and waits for full startup
- pr-checks.yml: adds test-harvester-image CI job running on docker/** PRs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
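The nonroot check described for harvester_process.bats could be approached roughly as follows. This is a sketch, not the PR's test code: the helper name `all_owned_by` is hypothetical, and it assumes the container's `ps` can print `USER COMMAND` pairs.

```shell
# Hypothetical helper, not taken from the PR.
# Reads "USER COMMAND" lines on stdin and succeeds only if at least one
# process named $2 exists and every such process is owned by user $1.
all_owned_by() {
    awk -v user="$1" -v name="$2" '
        $2 == name { found = 1; if ($1 != user) bad = 1 }
        END { if (found && !bad) exit 0; exit 1 }
    '
}
```

A test might feed it the process list from inside the container, for example `docker exec edgenode-harvester ps -o user,comm | all_owned_by nonroot holochain`. The container name and the exact `ps` flags are assumptions and depend on the image.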
Adds harvester_e2e.bats which submits real signed metrics via log-sender, verifies they reach D1, then runs the harvester (--today --dry-run) and asserts "fetched metrics count" > 0 and "Successfully invoiced logs." Trims harvester_integration.bats to two lightweight connectivity checks. Extends run_harvester_tests.sh to start edgenode and wait for its Holochain conductor before running the e2e test. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
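The two e2e assertions can be sketched as small shell helpers. Only the quoted strings "fetched metrics count" and "Successfully invoiced logs." come from the commit message; the helper names and the assumption that the count appears as the first integer after the phrase are illustrative.

```shell
# Hypothetical helper: print the first integer that follows the phrase
# "fetched metrics count" in harvester output read from stdin.
# The surrounding log format is an assumption.
fetched_metrics_count() {
    sed -n 's/.*fetched metrics count[^0-9]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeeds when the output reports a positive count and the success message.
assert_invoiced() {
    output="$1"
    count="$(printf '%s\n' "$output" | fetched_metrics_count)"
    [ -n "$count" ] && [ "$count" -gt 0 ] || return 1
    printf '%s\n' "$output" | grep -qF 'Successfully invoiced logs.'
}
```

In a BATS test the captured harvester output would be passed as the single argument to `assert_invoiced`.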
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- CONFIG_PATH: /etc/log-harvester → /data/log-harvester (volume-mounted, survives restarts; drops now-redundant holo-config-harvester volume)
- s6 run scripts: shebang → #!/command/with-contenv so HC_* env vars are available inside the service
- log-harvester/run: remove add-app-ws call (handled by harvester init); add chown after both init and refresh paths
- Dockerfile.harvester: GITHUB_TOKEN secret now optional; falls back to unauthenticated clone when git credentials already grant access
- pr-checks.yml: switch to docker/build-push-action with GHA layer cache for edgenode, edgenode-harvester, and log-collector builds
- run_tests_multi.sh: build log-collector separately before compose up to preserve layer cache across repeat runs
- harvester_startup.bats: update config path and websocket test to match
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
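The init-versus-refresh decision in the run script might look something like the sketch below. The config file name and the `config_action` helper are hypothetical; only the `/data/log-harvester` path and the `with-contenv` shebang come from this commit.

```shell
# In the container the run script's first line would be
#   #!/command/with-contenv sh
# so that s6-overlay exports the container environment (HC_* etc.) to it.

CONFIG_PATH="${CONFIG_PATH:-/data/log-harvester}"
# Hypothetical file name; the PR does not name the harvester's config file.
CONFIG_FILE="${CONFIG_FILE:-config.json}"

# Print "refresh" if a config already exists on the volume, "init" otherwise.
config_action() {
    if [ -f "${CONFIG_PATH}/${CONFIG_FILE}" ]; then
        echo refresh
    else
        echo init
    fi
}
```

Because the directory is volume-mounted, a restarted container would take the refresh branch. Per the commit, either branch is followed by a chown of the config directory.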
The docker/log-collector directory is gitignored (it's a separate repo). Add an actions/checkout step to clone it into place before the build. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The log-collector Dockerfile and entrypoint are ours, not upstream's. Store as Dockerfile.log-collector in the docker/ directory so CI can find them after checking out unytco/log-collector as the build context. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
secret-envs is not a valid build-push-action parameter; the correct field is secrets which mounts the value directly as a build secret. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
build-push-action secrets: does not reliably mount the github_token secret into the BuildKit RUN --mount. Revert to an explicit docker buildx build run: step with --secret id=github_token,env=GITHUB_TOKEN, which is the known-working approach. GHA layer cache flags are retained. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rets
Docker BuildKit secret mounting is unreliable in GHA. Mirror the log-collector approach: checkout unytco/log-harvester into docker/log-harvester-src/ before the build, then COPY it in.
- Dockerfile.harvester: replace RUN --mount=type=secret git clone with COPY log-harvester-src
- pr-checks.yml: add actions/checkout step for log-harvester
- run_harvester_tests.sh: auto-clone log-harvester-src if absent
- .gitignore: add docker/log-harvester-src/
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
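The auto-clone step in run_harvester_tests.sh could be sketched as below. The function name is hypothetical and the clone URL is an assumption derived from the repository name unytco/log-harvester. Cloning a private repository also requires git credentials that already grant access.

```shell
# Assumed clone URL for the unytco/log-harvester repository named in this PR.
HARVESTER_REPO_URL="${HARVESTER_REPO_URL:-https://github.com/unytco/log-harvester.git}"

# Clone the harvester source into $1 unless the directory already exists,
# so the Dockerfile's COPY step always finds it in the build context.
ensure_harvester_src() {
    src_dir="$1"
    if [ -d "$src_dir" ]; then
        echo "reusing ${src_dir}"
        return 0
    fi
    git clone --depth 1 "$HARVESTER_REPO_URL" "$src_dir"
}
```

The runner would call `ensure_harvester_src docker/log-harvester-src` before `docker build`, while CI fills the same directory with an actions/checkout step.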
docker compose up -d was starting all services including edgenode-harvester, which requires log-harvester-src in the build context. Explicitly name only the services needed for the edgenode test suite. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
In CI the wrangler state directory is empty (no persistent local volume), so drone_registrations and other tables don't exist. Run wrangler d1 execute --local --file=schema.sql in the entrypoint before starting wrangler dev. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
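A minimal sketch of that entrypoint step, assuming a D1 database name that does not appear in this PR (`log-collector` below is a placeholder). The `WRANGLER` override exists only so the command can be dry-run.

```shell
# WRANGLER can be overridden (for example WRANGLER=echo) to dry-run.
WRANGLER="${WRANGLER:-wrangler}"
# Hypothetical database name; use the name bound in wrangler.toml.
D1_DATABASE="${D1_DATABASE:-log-collector}"

# Create the tables in the local D1 state before the worker starts.
apply_schema() {
    "$WRANGLER" d1 execute "$D1_DATABASE" --local --file=schema.sql
}
```

The entrypoint would call `apply_schema` before launching `wrangler dev`, so tables such as drone_registrations exist even when the wrangler state directory starts empty.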
Wrangler dev reads vars from wrangler.toml, not Docker env vars.
Pass --var ADMIN_SECRET:${ADMIN_SECRET} so the worker picks up the
secret configured in docker-compose.yml.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
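Passing the secret through could be sketched like this. The function name and the `WRANGLER` override are illustrative, and the fail-fast check on an unset `ADMIN_SECRET` is an addition, not something the commit describes.

```shell
# WRANGLER can be overridden (for example WRANGLER=echo) to dry-run.
WRANGLER="${WRANGLER:-wrangler}"

# Start the worker with ADMIN_SECRET injected as a wrangler variable.
start_worker() {
    # Fail early rather than start a worker with an empty secret.
    : "${ADMIN_SECRET:?ADMIN_SECRET must be set}"
    "$WRANGLER" dev --var "ADMIN_SECRET:${ADMIN_SECRET}"
}
```

With `ADMIN_SECRET` set in docker-compose.yml, the entrypoint would end by calling `start_worker`.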
run_tests_multi.sh was running all *.bats files including harvester_*, which fail when edgenode-harvester isn't started. Exclude harvester files explicitly — they belong to run_harvester_tests.sh. Also add setup() skip guards to harvester_startup.bats and harvester_process.bats which were missing them. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
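A skip guard of the kind described might look like the following sketch. `name_in_list` is a hypothetical helper, and the container name `edgenode-harvester` is an assumption based on the compose service name.

```shell
# Succeeds when container name $1 appears as a whole line in list $2.
name_in_list() {
    printf '%s\n' "$2" | grep -qxF -- "$1"
}

# BATS calls setup() before every test; `skip` is provided by BATS.
setup() {
    running="$(docker ps --format '{{.Names}}' 2>/dev/null || true)"
    if ! name_in_list "edgenode-harvester" "$running"; then
        skip "edgenode-harvester is not running"
    fi
}
```

Matching whole lines keeps a running `edgenode` container from satisfying the guard for `edgenode-harvester`.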
- Add restart: unless-stopped to log-collector service so Docker auto-recovers when wrangler dev crashes under concurrent D1 load
- Tighten healthcheck (15s interval, 5 retries, 30s start_period)
- Run each .bats file separately in run_tests_multi.sh with a log-collector health-check wait between files, so a wrangler crash in integration_data_pipeline.bats doesn't cascade to later files
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
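The per-file isolation could be sketched as below. The function names are hypothetical and the health URL is left as a parameter, since the PR does not state which endpoint the runner polls.

```shell
# Print every .bats file in $1 except the harvester_* ones, one per line.
list_edgenode_tests() {
    for f in "$1"/*.bats; do
        [ -e "$f" ] || continue
        case "${f##*/}" in
            harvester_*) continue ;;
        esac
        printf '%s\n' "$f"
    done
}

# Poll URL $1 with curl until it answers successfully, up to $2 attempts.
wait_for_collector() {
    attempts="${2:-30}"
    while [ "$attempts" -gt 0 ]; do
        curl -fsS -o /dev/null "$1" && return 0
        attempts=$((attempts - 1))
        sleep 2
    done
    return 1
}

# Run each file in its own bats invocation, re-checking health in between.
# Assumes test paths contain no whitespace.
run_isolated() {
    status=0
    for f in $(list_edgenode_tests "$1"); do
        wait_for_collector "$2" || return 1
        bats "$f" || status=1
    done
    return "$status"
}
```

Because the health check runs before every file, a wrangler dev crash that Docker recovers via the restart policy costs at most the file that triggered it.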
- CHANGELOG: add all Unreleased entries for harvester variant work (Dockerfile.log-collector, run_harvester_tests.sh, BATS files, test runner isolation, restart policy)
- TESTING: note per-file test isolation and harvester_* naming convention
- LOG_HARVESTER_QUICKSTART: remove stale --secret flag from local build snippet; replace with git clone of log-harvester-src (required for COPY)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dockerfile was renamed/consolidated; latest-unyt tag is not published. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- `Dockerfile.harvester`: a new edge node container variant that bundles `log-harvester` (unytco/log-harvester) instead of `log-sender`, for Unyt invoice aggregation
- `unyt.happ` is baked in from the latest `unytco/unyt-sandbox` release (pinnable via `UNYT_HAPP_VERSION` build arg)
- `s6-overlay-harvester/` service tree manages conductor, log-harvester, logrotate-cron, and setup
- `Dockerfile.log-collector` added: our Dockerfile for the unytco/log-collector Cloudflare Worker, used in CI and local integration testing
- `build-and-push-harvester-image` publishes `ghcr.io/holo-host/edgenode-harvester` on release; requires `HARVESTER_REPO_TOKEN` secret (PAT with repo read access to `unytco/log-harvester` and `unytco/log-collector`)
- `docker-compose.yml` updated with `edgenode-harvester` service and `restart: unless-stopped` on log-collector
- `LOG_HARVESTER_QUICKSTART.md` and updated `README.md`, `docker/README.md`, `docker/CHANGELOG.md`, `docker/TESTING.md`
Architecture
Notable fixes
- `CONFIG_PATH` moved from `/etc/log-harvester` to `/data/log-harvester` so it survives restarts on the volume-mounted path; drops the now-redundant `holo-config-harvester` volume
- s6 run scripts use the `#!/command/with-contenv` shebang so `HC_*` env vars are visible inside services
- log-harvester source is checked out via `actions/checkout` in CI (and `git clone` locally by `run_harvester_tests.sh`), then `COPY`'d into the image, so no build-time secrets are needed in the Dockerfile
- `run_tests_multi.sh` runs each `.bats` file individually with a log-collector health-check wait between files, so a wrangler dev crash in one file doesn't cascade to subsequent files
Test suite
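Four BATS files run via `./run_harvester_tests.sh` (builds image, starts all services, waits for readiness):
| File | Coverage |
| --- | --- |
| `harvester_startup.bats` | Conductor start, `unyt.happ` install, app websocket, config initialization, log-harvester service start |
| `harvester_process.bats` | holochain and node run as `nonroot` |
| `harvester_integration.bats` | Lightweight connectivity checks; `/logs` returns success |
| `harvester_e2e.bats` | Submits signed metrics via log-sender, runs the harvester (`--today --dry-run`), asserts "Successfully invoiced logs." |
Test plan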
- Local run: `run_harvester_tests.sh`
- CI jobs: `test-docker-image` and `test-harvester-image`
- `HARVESTER_REPO_TOKEN` secret is set in repo Actions secrets before merging (required for CI build)
- `docker compose up edgenode-harvester` with valid `COLLECTOR_URL`, `ADMIN_SECRET`, `LAIR_PASSWORD` and verify harvester reaches "Starting log-harvester service..." in startup log
- `lastInvoice` is preserved across a container restart
🤖 Generated with Claude Code