Arq Signals is a read-only PostgreSQL diagnostic collector. It runs on your infrastructure, collects statistics from your databases, and packages them as portable snapshots. No data leaves your machine. No AI. No cloud. Just structured evidence from the views PostgreSQL already exposes.
From Elevarq — PostgreSQL tools for engineering teams.
Read-only by design — three independent enforcement layers prevent any write operations. Unsafe roles (superuser, replication) are blocked before collection starts. Every SQL query is in the source.
No cloud, no phone-home — all data stays on your machine. No telemetry, no analytics, no external network calls.
No AI inside — Arq Signals is a pure data collector. No language models, no scoring, no recommendations. What you collect is what you get.
Built for restricted environments — runs airgapped, as a non-root container, with no internet dependency. Suitable for networks where third-party monitoring agents are not permitted.
git clone https://github.com/elevarq/arq-signals.git
cd arq-signals
docker compose -f examples/docker-compose.yml up -d

This starts Arq Signals alongside PostgreSQL 16 with a pre-configured monitoring role. Collection begins automatically.
# Trigger an immediate collection
curl -X POST http://localhost:8081/collect/now \
-H "Authorization: Bearer test-token"
# Download your first snapshot
curl -o snapshot.zip http://localhost:8081/export \
-H "Authorization: Bearer test-token"
# Inspect the contents
unzip -l snapshot.zip

Your snapshot contains raw PostgreSQL statistics in structured JSON — nothing more. See examples/snapshot-example/ for what the output looks like.
Every PostgreSQL instance exposes diagnostic data through built-in statistics views. But collecting this data consistently, safely, and in a format you can actually use takes tooling that most teams end up building themselves.
Arq Signals handles the collection part so you don't have to. It connects with a read-only role, runs approved SQL queries on a schedule, and writes structured results to local storage. When you need the data elsewhere, it packages everything as a portable ZIP snapshot.
The project is open source because we think data collection should be transparent. You can read every SQL query Arq Signals will run. You can audit the binary. You own the output.
- Connects to one or more PostgreSQL instances (14+)
- Runs 73 read-only diagnostic collectors covering:
- Server configuration, identity, and cluster fingerprint
- Session activity and connection pressure
- Table, index, and I/O statistics (incl. pg_stat_io / pg_stat_wal)
- Schema metadata: columns, constraints, indexes, sequences, triggers, views, materialised views, partitions, functions
- Storage placement: tablespaces, per-relation storage, per-attribute storage
- Query intelligence (via pg_stat_statements, self-filtered to exclude Signals' own probe queries and scoped to the connected database)
- Transaction wraparound risk and prepared-transaction age
- Vacuum and autovacuum health
- In-flight operation progress — six pg_stat_progress_* collectors covering vacuum, analyze, create_index, cluster, basebackup, copy
- Index hygiene — derived findings for unused, invalid, redundant, and duplicate indexes
- Bloat estimation — statistical table and index bloat without pgstattuple (runs on managed PG)
- Replication: physical (pg_stat_replication), slot risk (pg_replication_slots), and logical-slot health (pg_stat_replication_slots)
- Checkpoint, background writer, and checkpointer pressure
- Storage growth, largest relations, temp I/O pressure
- Per-role / per-database configuration overrides
- Foreign data wrappers and partition topology
- Vector / pgvector column inventory
- Role capabilities and login-role surface
- Stores results locally in SQLite as structured NDJSON
- Schedules collection with configurable cadences (5m to 7d per query)
- Packages snapshots as portable ZIP archives
- Exposes a local HTTP API for triggering collection, pausing / resuming targets, and reloading configuration
- Provides a CLI (arqctl) for operations, including pre-flight diagnostics (arqctl doctor) and a classified connection test (arqctl connect test)
- Per-target sensitivity profiles, per-class retention, and a per-target circuit breaker for operator safety during incidents
Arq Signals is developed using STDD — a methodology where the specification and tests define the system, and code is a replaceable artifact that must satisfy both.
The repository contains:
- Formal specification — 104 numbered requirements covering collection, safety, configuration, API, persistence, and diagnostics (specification.md)
- Acceptance tests — 240+ test cases derived directly from the specification (per-collector acceptance files under specifications/collectors/, plus the cross-cutting acceptance-tests.md)
- Traceability matrix — every requirement mapped to executable tests with evidence classification (behavioral, structural, or integration) (traceability.md)
- Language-neutral contracts — API and configuration schemas defined as appendices, independent of the Go implementation (Appendix A, Appendix B)
This approach matters for a tool that connects to production databases. Every safety guarantee — read-only enforcement, role validation, credential handling — is formally specified, tested, and traceable. You can verify the claims without reading the implementation.
- All PostgreSQL queries execute inside READ ONLY transactions, enforced at three independent layers
- Role safety validation blocks superuser, replication, and bypassrls roles before any query runs
- Defensive session timeouts (statement_timeout, lock_timeout, idle_in_transaction_session_timeout) prevent runaway queries
- The collector never performs write operations on PostgreSQL — this is enforced by static SQL linting, session configuration, and transaction access mode
- Credentials are never stored in snapshots, export metadata, API responses, or log output
- If an unsafe role override is used, it is explicitly recorded in export metadata with the specific bypassed checks
- The entire safety model is formally specified and covered by 800+ automated tests across the module
For the full safety model, see docs/runtime-safety-model.md.
| Example | Description |
|---|---|
| Local safe role | Recommended production setup with arq_signals monitoring role |
| Local superuser override | Dev/test setup with postgres superuser (unsafe override) |
| Docker | Container build, run, and export workflow |
| Docker Compose | Quick start with PostgreSQL 16 |
| Helm | Kubernetes deployment with the starter Helm chart |
| Snapshot inspection | How to inspect and understand export output |
| Snapshot example | Static reference snapshot for offline review |
Arq Signals has first-class catalog support for PostgreSQL 14, 15,
16, 17, and 18. Each major has its own catalog file
(internal/pgqueries/catalog_pgN.go) that carries the SQL needed
when a pg_stat_* view's column shape differs from the version-
agnostic default. Logical collector IDs (e.g. pg_stat_io_v1) stay
stable across majors — only the SQL underneath changes.
A per-cycle discovery probe runs first on each target and returns
the server's version, server_version_num, installed extensions,
current database, and current user. Catalog selection is driven by
that probe, not by configured assumptions. Version-specific collectors
(e.g. checkpointer_stats_v1 on PG 17+) and extension-gated
collectors (e.g. pg_stat_statements_v1) are included or skipped
automatically.
PostgreSQL 19 is treated as experimental: the daemon falls back to the highest supported catalog (PG 18) and logs a startup warning so the experimental status is visible. PostgreSQL versions below 14 are out of scope.
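The fallback rule above can be sketched as a small version-mapping function. This is an illustrative sketch, not the actual Go implementation; `select_catalog` and its return shape are invented for the example, and only the documented rules are assumed (majors 14 through 18 supported, PG 19 falls back to the PG 18 catalog, below 14 out of scope).

```python
# Hypothetical sketch of version-driven catalog selection. Names are
# illustrative, not Arq Signals internals.
SUPPORTED_MAJORS = [14, 15, 16, 17, 18]  # first-class catalog support
HIGHEST_SUPPORTED = max(SUPPORTED_MAJORS)

def select_catalog(server_version_num: int):
    """Map a probed server_version_num to (catalog_major, experimental)."""
    major = server_version_num // 10000
    if major < min(SUPPORTED_MAJORS):
        return None, False  # out of scope (< PG 14)
    if major > HIGHEST_SUPPORTED:
        # e.g. PG 19: fall back to the highest supported catalog and
        # flag the experimental status for a startup warning.
        return HIGHEST_SUPPORTED, True
    return major, False
```

PostgreSQL encodes `server_version_num` as `major * 10000 + minor`, which is why integer division by 10000 recovers the major.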
Arq Signals is local-first by design:
- No data egress. The daemon writes snapshots only to local storage (a SQLite file under database.path) and serves them via the HTTP API on the operator's listener. There is no outbound telemetry, no LLM call, no analytics ping. The repository's boundary tests assert this at every build.
- Read-only PostgreSQL access. Three layers: a static SQL linter rejects DDL/DML at registration; the session is set to default_transaction_read_only=on; every collector query runs inside a BEGIN ... READ ONLY transaction. Roles with rolsuper, rolreplication, or rolbypassrls are refused.
- No secrets in artifacts. Passwords, API tokens, and DSNs never appear in logs, exports, or the metrics endpoint. A central audit-event denylist filters secret-shaped attribute keys before any slog record is emitted (R078).
- Off-by-default surfaces. The Prometheus /metrics endpoint (R079), the high-sensitivity collector pack (R075), and the per-collector export view (R080) are all opt-in and have no effect unless explicitly enabled in signals.yaml.
Releases are published as multi-arch container images at
ghcr.io/elevarq/arq-signals with cosign keyless signatures, an
SPDX SBOM (OCI attestation and downloadable file), and a SLSA
build provenance attestation (mode=max).
Quick signature verification:
cosign verify ghcr.io/elevarq/arq-signals:<VERSION> \
  --certificate-identity-regexp='github.com/Elevarq/Arq-Signals/.github/workflows/release.yml@' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com'

Inspect the SBOM (registry attestation):

cosign download sbom ghcr.io/elevarq/arq-signals:<VERSION> > sbom.spdx.json

Confirm the image is multi-arch:

docker buildx imagetools inspect ghcr.io/elevarq/arq-signals:<VERSION>

Full operator checklist (provenance, Trivy re-scan, OCI labels, etc.): docs/release-verification.md.
git clone https://github.com/elevarq/arq-signals.git
cd arq-signals
docker compose -f examples/docker-compose.yml up -d

Or, with plain Docker:

docker run -d --name arq-signals \
-e ARQ_SIGNALS_TARGET_HOST=host.docker.internal \
-e ARQ_SIGNALS_TARGET_USER=arq_monitor \
-e ARQ_SIGNALS_TARGET_DBNAME=postgres \
-e ARQ_SIGNALS_TARGET_PASSWORD_ENV=PG_PASSWORD \
-e PG_PASSWORD=your_password \
-e ARQ_ALLOW_INSECURE_PG_TLS=true \
-e ARQ_ENV=dev \
-v arq-data:/data \
-p 8081:8081 \
ghcr.io/elevarq/arq-signals:latest

To build from source:

git clone https://github.com/elevarq/arq-signals.git
cd arq-signals
make build # produces bin/arq-signals and bin/arqctl
./bin/arq-signals --config signals.yaml

See examples/signals.yaml for a complete
annotated configuration file.
Arq Signals is designed to run using a dedicated monitoring role, not
the PostgreSQL superuser. For production use, create a role such as
arq_signals and grant the pg_monitor predefined role:
CREATE ROLE arq_signals LOGIN;
GRANT pg_monitor TO arq_signals;
GRANT CONNECT ON DATABASE your_database TO arq_signals;
-- Optional: enable query-level statistics
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

The default postgres role is a superuser and will be rejected by the
safety model unless the operator explicitly enables unsafe override
mode (ARQ_SIGNALS_ALLOW_UNSAFE_ROLE=true). This behavior is
intentional — it prevents accidental execution with elevated
privileges in production.
For a full discussion of the role posture — including what additional
access (if any) the high-sensitivity collector pack needs and how to
audit the role — see docs/postgres-role.md.
Arq Signals can expose an opt-in /metrics endpoint with operational
counters and gauges (collection outcomes, export outcomes, sqlite
persistence health, high-sensitivity gate state). The endpoint never
publishes collected PostgreSQL data. Off by default; enable with
signals.metrics_enabled: true. See
docs/prometheus.md for the safety scope
guarantees and
docs/metrics-consumer-guide.md
for the full metric inventory, scrape configuration, and
recommended alerting rules.
# Via CLI
arqctl collect now
# Via API
curl -X POST http://localhost:8081/collect/now \
-H "Authorization: Bearer $ARQ_SIGNALS_API_TOKEN"

# Via CLI
arqctl export --output snapshot.zip
# Via API
curl -o snapshot.zip http://localhost:8081/export \
-H "Authorization: Bearer $ARQ_SIGNALS_API_TOKEN"

# Check config, store path, target reachability, role safety,
# collector prerequisites, and snapshot freshness in one pass.
arqctl doctor
# Test one connection (or an ad-hoc DSN) with classified
# failure reasons: ok / dns / tcp / tls / auth / startup / role /
# password_resolve / config.
arqctl connect test prod-db
arqctl connect test --dsn "host=db.example.com port=5432 dbname=app user=arq sslmode=require password_env=APP_DB_PW"

# Stop collecting from a target without taking the daemon down.
# State is in-memory; daemon restart resumes all targets.
arqctl collect pause --target=prod-db --reason="investigating incident #4321"
# Bring it back.
arqctl collect resume --target=prod-db

The daemon also auto-pauses (state open) a target that fails
3 consecutive collection cycles and auto-recovers after a
cooldown (default 5 minutes). See R097 in
features/arq-signals/specification.md
for the full state-machine spec.
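The auto-pause behaviour sketches naturally as a small state machine. The class below is a hypothetical illustration of the documented rules (3 consecutive failures open the breaker; a 5-minute cooldown auto-recovers); it is not the collector's actual code, and the names are invented.

```python
import time

# Illustrative per-target circuit breaker: 3 consecutive failures open
# it (target auto-paused); after the cooldown it closes again.
class TargetBreaker:
    def __init__(self, threshold: int = 3, cooldown_s: float = 300.0):
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = None

    def record(self, ok: bool, now=None):
        now = time.monotonic() if now is None else now
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.threshold and self.opened_at is None:
                self.opened_at = now  # state: open -> target auto-paused

    def allows_collection(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.cooldown_s:
            # cooldown elapsed: auto-recover and allow a probe cycle
            self.opened_at = None
            self.failures = 0
            return True
        return False
```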
# SIGHUP path
kill -HUP $(pidof arq-signals)
# HTTP path
curl -X POST http://localhost:8081/reload \
-H "Authorization: Bearer $ARQ_SIGNALS_API_TOKEN"

v1 reload scope is the target list — add / remove / modify
connection params or collectors.profile. poll_interval,
retention, and circuit thresholds remain set-at-construction
(documented as future scope).
arqctl status

Arq Signals produces snapshots in the arq-snapshot.v1 format:
snapshot.zip
├── metadata.json # collector version, timestamp, PG version
├── query_catalog.json # which queries were executed
├── query_runs.ndjson # execution metadata (timing, row counts, errors)
├── query_results.ndjson # the actual data (one JSON object per row)
└── snapshots.ndjson # legacy combined format
Example metadata.json:
{
"schema_version": "arq-snapshot.v1",
"collector_version": "0.1.0",
"collector_commit": "abc1234",
"collected_at": "2026-03-14T10:30:00Z",
"instance_id": "a1b2c3d4e5f6"
}

Example query_results.ndjson (one line per query):

{"run_id":"01JD...","payload":[{"name":"max_connections","setting":"100","unit":"","source":"configuration file"},{"name":"shared_buffers","setting":"16384","unit":"8kB","source":"configuration file"}]}

The format is versioned. Breaking changes will bump schema_version.
A complete example snapshot is available at
examples/snapshot-example/ — you can
inspect exactly what Arq Signals collects without running it.
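Because each line of query_results.ndjson is a standalone JSON object with a payload array, downstream tooling can flatten it in a few lines. This sketch uses the sample line shown above; `flatten` is a hypothetical helper, not part of Arq Signals.

```python
import json

# Sample line mirroring the query_results.ndjson example above.
line = ('{"run_id":"01JD...","payload":['
        '{"name":"max_connections","setting":"100","unit":"",'
        '"source":"configuration file"},'
        '{"name":"shared_buffers","setting":"16384","unit":"8kB",'
        '"source":"configuration file"}]}')

def flatten(ndjson_line: str) -> list:
    """Turn one NDJSON line into per-row records tagged with run_id."""
    rec = json.loads(ndjson_line)
    return [{"run_id": rec["run_id"], **row} for row in rec["payload"]]

rows = flatten(line)
```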
Arq Signals includes 73 read-only collectors. Grouped by domain:
- Baseline & runtime — server config, sessions, databases, tables, indexes, table / index I/O, query stats (pg_stat_statements)
- Schema model — columns, constraints, indexes, partitions, sequences, schemas, triggers, views, materialised views, functions, planner stats, extended statistics, vector columns
- Definitions — view, materialised-view, function, and trigger definitions (DDL bodies)
- Storage placement — tablespaces, per-relation storage, per-attribute storage
- In-flight operations — six pg_stat_progress_* collectors (vacuum, analyze, create_index, cluster, basebackup, copy)
- Index hygiene — derived findings: unused, invalid, redundant, duplicate
- Bloat estimation — statistical table-bloat and index-bloat estimates without pgstattuple
- Wraparound risk — XID age at database / relation level, freeze blockers, prepared-transaction age
- Vacuum / checkpointer / bgwriter — autovacuum health, checkpointer stats (PG 17+), bgwriter pressure
- Replication — pg_stat_replication, pg_replication_slots, pg_stat_replication_slots (logical slot health)
- Operational pressure — connection utilisation, blocking locks, long-running transactions, idle-in-transaction offenders, temp I/O, lock summary
- Identity & configuration — server identity, cluster identity (network fingerprint), extension inventory, role capabilities, login roles, per-role / per-database GUC overrides
- Foreign data wrappers — wrappers, servers, user mappings, foreign tables
Collectors requiring unavailable extensions or unsupported PostgreSQL
versions are skipped, with the reason surfaced in
collector_status.json. Replication collectors return empty
results on standalone instances.
See docs/collectors.md for the full inventory
with query IDs, PostgreSQL sources, and cadences. Every query is
visible in internal/pgqueries/.
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | /health | No | Liveness probe, always 200 |
| GET | /status | Bearer | Collector status, targets, last collection |
| POST | /collect/now | Bearer | Trigger immediate collection (optional JSON body to narrow targets) |
| GET | /export | Bearer | Download snapshot ZIP |
Set ARQ_SIGNALS_API_TOKEN to configure the bearer token. If unset, a
random token is generated at startup and logged (fingerprint only;
the value is never logged).
The body is optional. An empty / missing body keeps the historical
"collect every enabled target" behaviour. When present, the body
may carry an optional targets subset, an optional request_id
correlation identifier, and an optional reason label.
# 1. No body — collect every enabled target.
curl -s -X POST http://127.0.0.1:8081/collect/now \
-H "Authorization: Bearer ${ARQ_SIGNALS_API_TOKEN}"
# 2. Narrow to a subset of configured targets.
curl -s -X POST http://127.0.0.1:8081/collect/now \
-H "Authorization: Bearer ${ARQ_SIGNALS_API_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"targets":["prod-main"]}'
# 3. Caller-supplied correlation id and reason.
curl -s -X POST http://127.0.0.1:8081/collect/now \
-H "Authorization: Bearer ${ARQ_SIGNALS_API_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"targets": ["prod-main", "prod-reporting"],
"request_id": "scheduled_run_2026_04_25",
"reason": "automated_cycle"
}'

A successful response (HTTP 202):
{
"status": "collection triggered",
"request_id": "scheduled_run_2026_04_25",
"accepted_targets": ["prod-main", "prod-reporting"]
}

A rejection (HTTP 400) — invalid target name:
{
"error": "one or more targets cannot be collected",
"accepted_targets": ["prod-main"],
"rejected_targets": [
{"name": "does-not-exist", "reason": "unknown_target"}
]
}

The cycle is not triggered if any target is rejected.
For the full request schema, validation rules, and audit-trace
behaviour, see docs/control-plane.md.
POST /collect/now accepts an optional JSON body that lets a caller
narrow the cycle to a configured + enabled subset of targets. The
configured target list in signals.yaml is the authoritative
ceiling — no caller can introduce a database name that wasn't
already configured.
Two optional correlation fields ride along with the request:
- request_id (regex ^[A-Za-z0-9_-]{1,32}$) — caller-supplied correlation identifier. When absent, Arq Signals generates a ULID.
- reason (regex ^[A-Za-z0-9_-]{1,64}$) — short tag-style label surfaced in audit events.
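The two validation patterns can be applied exactly as documented. A minimal sketch, assuming only the regexes above (`validate_body` is an invented helper, not the daemon's actual validator):

```python
import re

# Patterns taken verbatim from the documented validation rules.
REQUEST_ID_RE = re.compile(r"^[A-Za-z0-9_-]{1,32}$")
REASON_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")

def validate_body(request_id, reason):
    """Return a list of validation errors (empty list means valid)."""
    errors = []
    if request_id is not None and not REQUEST_ID_RE.fullmatch(request_id):
        errors.append("invalid_request_id")
    if reason is not None and not REASON_RE.fullmatch(reason):
        errors.append("invalid_reason")
    return errors
```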
Every accepted request produces a deterministic audit trace keyed
by request_id:
collect_now_requested → collection_started → collection_completed (per target)
Validation failures emit collect_now_rejected; requests that
queue but can't run (channel full, or cycle overlap) emit
collect_now_dropped. See "Audit guarantees" below.
Operators who want the commercial Arq control plane to drive this endpoint additionally enable Mode B authentication — see the next section. The endpoint itself works in both modes.
Reference: docs/control-plane.md.
Arq Signals supports two modes, configured by signals.mode in
signals.yaml (default standalone).
A single bearer token (api.token) authorises every request.
Matched-token audit events carry actor=local_operator. This is
the only mode every open-source deployment needs to know about.
Adds a second bearer token, the Arq control-plane token,
distinct from api.token. The matched token determines the audit
identity:
| Bearer matched | actor |
|---|---|
| api.token | local_operator |
| arq_control_plane_token | arq_control_plane |
The actor is sourced from which token matched — it is never
inferred from request shape. A caller holding only api.token
cannot acquire the arq_control_plane identity by adding a
request_id or any other body field.
The control-plane token is supplied via file (preferred) or environment-variable indirection:
signals:
mode: arq_managed
arq_control_plane_token_file: /etc/arq/control-plane.token
# or:
  # arq_control_plane_token_env: ARQ_CONTROL_PLANE_TOKEN

The file is re-read on every authentication attempt, so rotation is a single file write — no daemon restart required. The token length floor is 32 characters; the two tokens must be distinct (constant-time check at startup).
Mode B has no licence-validation surface in Arq Signals. The
collector remains open source; the commercial value lives in the
Arq control plane's analysis layer, not in obscured collector
behaviour. See docs/authentication.md for the full Mode B model,
rotation behaviour, and security posture.
Reference: docs/authentication.md.
Arq Signals emits structured slog records keyed
audit_event=<name> for every operationally significant lifecycle
moment. The contract:
No silent request loss. Every accepted /collect/now request
reaches a terminal outcome for its request_id along exactly one
of three branches:
| Branch | Terminal records | When |
|---|---|---|
| rejected | one collect_now_rejected | validation failed; cycle never queued |
| dropped | one collect_now_dropped | queued but cycle never ran (channel full, or cycle overlap) |
| ran | one collection_started per target + one collection_completed per target | cycle ran |
The "ran" branch is per-target: a request that narrows to two
targets emits two started/completed pairs sharing the same
request_id; a request that omits targets emits one pair per
enabled target. There is no aggregate "cycle complete" record. If
a request_id appears on collect_now_requested but the audit
log shows no records on any of the three branches, that's a bug.
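The "exactly one terminal branch" contract is mechanically checkable from an audit log. A hypothetical sketch that groups records by request_id and reports any id with no terminal outcome (the record shape and function name are invented for illustration):

```python
# Map terminal audit events to the branch they close. collection_started
# is deliberately absent: only collection_completed terminates "ran".
TERMINALS = {"collect_now_rejected": "rejected",
             "collect_now_dropped": "dropped",
             "collection_completed": "ran"}

def classify(records):
    """Return (outcome_by_request_id, request_ids_with_no_terminal)."""
    outcome = {}
    for rec in records:
        branch = TERMINALS.get(rec["audit_event"])
        if branch:
            outcome[rec["request_id"]] = branch
    requested = {r["request_id"] for r in records
                 if r["audit_event"] == "collect_now_requested"}
    # a non-empty "missing" list would indicate silent request loss
    missing = sorted(requested - outcome.keys())
    return outcome, missing
```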
Token values never logged. A centralised denylist filter in
internal/safety/audit.go rejects audit attributes whose key
contains password, secret, api_token, token, dsn,
connection_string, payload, or query_result. A small
hand-curated allow-list overrides the substring match for keys
that carry only metadata about a configured value (booleans /
fingerprints), never the secret value itself — as of today the
allow-list has exactly one entry, the boolean
arq_control_plane_token_configured on the mode_configured
startup event.
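The denylist-plus-allowlist rule can be sketched in a few lines. The substrings and the single allow-list key are taken from the text above; the helper names are invented, and the real filter lives in internal/safety/audit.go.

```python
# Substring denylist on attribute keys, with the one documented
# allow-list override (a metadata-only boolean, never a secret value).
DENY_SUBSTRINGS = ("password", "secret", "api_token", "token", "dsn",
                   "connection_string", "payload", "query_result")
ALLOW_KEYS = {"arq_control_plane_token_configured"}

def allowed(key: str) -> bool:
    if key in ALLOW_KEYS:
        return True  # exact-match override of the substring denylist
    return not any(sub in key for sub in DENY_SUBSTRINGS)

def filter_attrs(attrs: dict) -> dict:
    """Drop any attribute whose key contains a denylisted substring."""
    return {k: v for k, v in attrs.items() if allowed(k)}
```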
Correlation by request_id. When a caller supplies (or the
daemon generates) a request_id, that value is propagated through
to every per-target collection_started / collection_completed
audit record so the full sequence is greppable as one trail.
For the full event catalogue, attribute schemas, and the
secret-handling proof points, see
docs/audit-model.md.
- Static linting — every SQL query is validated at startup. DDL (CREATE, ALTER, DROP), DML (INSERT, UPDATE, DELETE), and dangerous functions (pg_terminate_backend, pg_sleep) cause the process to abort immediately.
- Session-level — all connections set default_transaction_read_only=on.
- Per-query — each query runs inside BEGIN ... READ ONLY.
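The static-linting layer amounts to scanning each registered query for forbidden tokens. A simplified, hypothetical sketch in that spirit (the real linter is part of the Go implementation and is stricter than a single regex):

```python
import re

# Forbidden DDL/DML keywords and dangerous functions, per the list above.
FORBIDDEN = re.compile(
    r"\b(CREATE|ALTER|DROP|INSERT|UPDATE|DELETE|"
    r"pg_terminate_backend|pg_sleep)\b",
    re.IGNORECASE,
)

def lint(sql: str):
    """Return the first forbidden token found, or None if the query passes.

    The real collector aborts the whole process at startup instead of
    returning a value.
    """
    m = FORBIDDEN.search(sql)
    return m.group(0) if m else None
```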
Before collecting from any target, Arq Signals validates the connected role's safety posture. Collection is blocked if the role has:
- Superuser privileges (rolsuper=true)
- Replication privileges (rolreplication=true)
- Bypass RLS privileges (rolbypassrls=true)
This is enforced by default with no configuration needed. Use a
dedicated monitoring role with pg_monitor for safe collection.
See docs/runtime-safety-model.md for
details.
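The gate itself reduces to checking three boolean role attributes. A hypothetical sketch using field names that mirror the pg_roles columns (`role_is_safe` is invented for illustration):

```python
# Unsafe role attributes, matching the pg_roles column names above.
UNSAFE_ATTRS = ("rolsuper", "rolreplication", "rolbypassrls")

def role_is_safe(role: dict, allow_unsafe_override: bool = False) -> bool:
    """Block collection when any unsafe attribute is set, unless the
    operator explicitly enabled the unsafe override."""
    violations = [a for a in UNSAFE_ATTRS if role.get(a)]
    return not violations or allow_unsafe_override
```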
- Passwords are read from file or environment variable at connection time
- Passwords are never cached in memory beyond a single connection attempt
- Passwords are never written to SQLite
- Passwords never appear in snapshots or exports
- Password rotation is supported (re-read on each new connection)
- Both bearer tokens (the local api.token and the optional Mode B arq_control_plane_token) are compared in constant time via crypto/subtle.
- Token values never appear in audit logs, metrics, error messages, or HTTP responses. The auto-generated api.token logs only its SHA-256 fingerprint at startup.
- Audit-attribute filtering is centralised: a denylist on attribute key names (password, secret, api_token, token, dsn, connection_string, payload, query_result) drops any record whose key contains a denylisted substring before it leaves the process. A small hand-curated allow-list permits a single configuration-status boolean (arq_control_plane_token_configured) on the mode_configured startup event — never a token value.
- The control-plane token (when configured) is re-read from file on every authentication attempt. Rotation is a single file write; no daemon restart is required. See docs/authentication.md.
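The comparison-and-fingerprint posture translates directly to other languages. A sketch in Python, where `hmac.compare_digest` plays the role of Go's crypto/subtle (function names are invented for illustration):

```python
import hashlib
import hmac

def token_matches(presented: str, configured: str) -> bool:
    # constant-time comparison, avoiding timing side channels
    return hmac.compare_digest(presented.encode(), configured.encode())

def fingerprint(token: str) -> str:
    # short SHA-256 fingerprint for log lines; the token value itself
    # is never logged
    return hashlib.sha256(token.encode()).hexdigest()[:12]
```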
- Arq Signals makes no outbound network connections except to your PostgreSQL targets
- No telemetry, no analytics, no phone-home
- The HTTP API binds to loopback by default (127.0.0.1:8081)
When deployed via Docker, Arq Signals runs as a non-root user
(UID 10001) on a minimal Alpine 3.21 base. The image contains
BusyBox (used by the wget-based healthcheck and tini init) and
no Bash, sh, or other full shell beyond BusyBox's ash applet.
For deployments that require a shell-free runtime, build against
a distroless base — the binary is statically linked and CGO-free
so it runs without glibc.
Arq Signals reads configuration from (in order):
1. the --config flag
2. /etc/arq/signals.yaml
3. ./signals.yaml
Environment variables override file-based config. See
examples/signals.yaml for a complete
annotated example.
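The lookup order reduces to a first-match resolution. A hypothetical sketch (`resolve_config_path` is invented; the paths are the documented ones), with the filesystem check injectable for testing:

```python
from pathlib import Path

def resolve_config_path(flag_value,
                        exists=lambda p: Path(p).is_file()):
    """Return the config path per the documented lookup order:
    --config flag, then /etc/arq/signals.yaml, then ./signals.yaml."""
    if flag_value:
        return flag_value  # explicit flag always wins
    for candidate in ("/etc/arq/signals.yaml", "./signals.yaml"):
        if exists(candidate):
            return candidate
    return None  # fall back to built-in defaults / env vars
```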
| Environment variable | Description | Default |
|---|---|---|
| ARQ_ENV | Environment: dev, lab, prod | dev |
| ARQ_ALLOW_INSECURE_PG_TLS | Allow weak TLS in non-prod | false |
| ARQ_SIGNALS_ALLOW_UNSAFE_ROLE | Allow unsafe role attributes (lab/dev only) | false |
| ARQ_SIGNALS_TARGET_HOST | PostgreSQL host | -- |
| ARQ_SIGNALS_TARGET_PORT | PostgreSQL port | 5432 |
| ARQ_SIGNALS_TARGET_DBNAME | Database name | postgres |
| ARQ_SIGNALS_TARGET_USER | Username | -- |
| ARQ_SIGNALS_TARGET_NAME | Target name | default |
| ARQ_SIGNALS_TARGET_PASSWORD_FILE | Path to password file | -- |
| ARQ_SIGNALS_TARGET_PASSWORD_ENV | Env var containing the password | -- |
| ARQ_SIGNALS_TARGET_PGPASS_FILE | Path to pgpass file | -- |
| ARQ_SIGNALS_TARGET_SSLMODE | TLS mode | -- |
| ARQ_SIGNALS_POLL_INTERVAL | Collection interval | 5m |
| ARQ_SIGNALS_RETENTION_DAYS | Days to retain data | 30 |
| ARQ_SIGNALS_LOG_LEVEL | Log level: debug, info, warn, error | info |
| ARQ_SIGNALS_LOG_JSON | JSON log format | false |
| ARQ_SIGNALS_MAX_CONCURRENT_TARGETS | Max parallel targets | 4 |
| ARQ_SIGNALS_TARGET_TIMEOUT | Per-target timeout | 60s |
| ARQ_SIGNALS_QUERY_TIMEOUT | Per-query timeout | 10s |
| ARQ_SIGNALS_LISTEN_ADDR | API listen address | 127.0.0.1:8081 |
| ARQ_SIGNALS_DB_PATH | SQLite database path | /data/arq-signals.db |
| ARQ_SIGNALS_WRITE_TIMEOUT | API write timeout | 180s |
| ARQ_SIGNALS_API_TOKEN | Bearer token for API auth | auto-generated |
Arq Signals is the open-source collection layer of the Arq platform. It is a complete, standalone tool — not a crippled free tier.
┌─────────────────┐
│ Arq Signals │ Collects diagnostic signals from PostgreSQL.
│ (open source) │ Produces portable snapshots. This repository.
└────────┬────────┘
│ snapshot (ZIP / NDJSON)
▼
┌─────────────────┐
│ Arq │ Analyzes signals. Scores health. Generates
│ (private) │ findings and recommendations.
└────────┬────────┘
│ findings
▼
┌─────────────────┐
│ Arq Workbench │ Presents results to engineers.
│ (private) │ Interactive UI for DBA workflows.
└─────────────────┘
The snapshot format (arq-snapshot.v1) is the stable contract between
layers. Each layer is independently deployable and separately
maintained.
Arq Signals is fully usable on its own. You do not need Arq or Arq Workbench to collect, export, or inspect your PostgreSQL diagnostics. Many teams use Arq Signals purely for data collection, feeding the snapshots into their own scripts, dashboards, or analysis workflows.
The boundary between Signals and the rest of the platform is intentional, not accidental:
| Capability | Where it lives | Why not in Signals |
|---|---|---|
| Database analysis | Arq | Interpretation is a separate concern from evidence collection |
| Health scoring | Arq | Scoring requires domain judgment that evolves independently |
| AI / LLM | Arq | Language models are not needed for safe data collection |
| Recommendations | Arq | Remediation advice requires analysis context |
| Cloud services | None | No component phones home or uploads data |
| Telemetry | None | No usage tracking exists anywhere in the platform |
This separation keeps the collector small, auditable, and safe to run in restricted environments where third-party analysis tools may not be permitted.
Arq Signals v0.5.0 — the collection engine, safety model, and snapshot format are stable and tested (800+ automated tests, 104 STDD requirements). Smoke-tested against PostgreSQL 14, 15, 16, 17, and 18. Released container images are published to GHCR and Docker Hub with SBOM (SPDX) and SLSA provenance.
Roadmap:
- Kubernetes deployment examples
- Community-contributed collectors
- bloat_exact_v1 / index_bloat_exact_v1 — pgstattuple-gated precision variants of the existing statistical bloat collectors
This project follows STDD — Specification & Test-Driven Development. Specifications and tests define correct behavior. Implementation is written to satisfy those rules. The development policy is defined in CLAUDE.md.
We welcome contributions. See CONTRIBUTING.md for guidelines and GOVERNANCE.md for project governance.
In scope: new collectors, bug fixes, performance, documentation. Out of scope: analysis, scoring, AI (those belong in a downstream analyzer).
- Collector inventory — all 73 collectors with sources and cadences
- Runtime safety model — read-only enforcement details
- Adoption guide — production deployment guidance
- FAQ — common questions
- Changelog — release history
- Security policy — vulnerability reporting
- Citation — how to cite this project
- Elevarq — PostgreSQL tools for engineering teams
- Arq — commercial PostgreSQL intelligence platform; Arq Signals is its open-source collection layer
- pgAgroal Container — production-ready container distribution of pgagroal, a high-performance PostgreSQL connection pooler
BSD-3-Clause. See LICENSE.
Free to use, modify, and distribute for any purpose, including commercial use.