Kubernetes-style health probes and pluggable metrics for Plone.
- Liveness, readiness, and startup probes on a separate HTTP port
- Pluggable metrics endpoint (@@metrics) with Prometheus and JSON output
- Extensible via ZCA: custom health checks, metric providers, and formatters
Add plone.observability to your package dependencies:
[project]
dependencies = [
    "plone.observability",
]

Then include it in your ZCML:
<include package="plone.observability" />

The package registers itself and starts the health server automatically when Zope starts, via an IProcessStarting subscriber.
All configuration is done via environment variables.
| Variable | Default | Description |
|---|---|---|
| PLONE_OBSERVABILITY_HEALTH_HOST | 0.0.0.0 | Bind address for the health probe server |
| PLONE_OBSERVABILITY_HEALTH_PORT | 8081 | Port for the health probe server. Set to 0 to disable. |
| PLONE_OBSERVABILITY_METRICS_ALLOWLIST | (empty, open) | Comma-separated CIDRs allowed to access @@metrics. Empty means all IPs are allowed. |
| PLONE_OBSERVABILITY_TRUSTED_PROXIES | 127.0.0.1,::1 | Comma-separated CIDRs of trusted reverse proxies for X-Forwarded-For resolution. |
| PLONE_OBSERVABILITY_METRICS_CACHE_TTL | 60 | Seconds to cache content catalog metrics (expensive to collect). |
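To make the allowlist and trusted-proxy settings more concrete, here is a minimal sketch of how comma-separated CIDR values could be parsed and applied. It is not the package's actual implementation: the function names `parse_cidrs`, `resolve_client_ip`, and `is_allowed` are hypothetical, and the right-to-left walk over X-Forwarded-For is one common approach that is assumed here, not confirmed by the package.

```python
import ipaddress
import os


def parse_cidrs(raw):
    """Parse a comma-separated CIDR string into a list of network objects."""
    return [
        ipaddress.ip_network(part.strip(), strict=False)
        for part in raw.split(",")
        if part.strip()
    ]


def resolve_client_ip(remote_addr, forwarded_for, trusted_proxies):
    """Walk X-Forwarded-For right to left, skipping trusted proxies."""

    def is_trusted(addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in trusted_proxies)

    if not is_trusted(remote_addr):
        # The direct peer is not a trusted proxy, so its header is ignored.
        return remote_addr
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    # Every hop was a trusted proxy; fall back to the direct peer.
    return remote_addr


def is_allowed(client_ip, allowlist):
    """An empty allowlist means every address is allowed."""
    if not allowlist:
        return True
    ip = ipaddress.ip_address(client_ip)
    return any(ip in net for net in allowlist)


# Read the two variables with the defaults listed in the table above.
TRUSTED = parse_cidrs(
    os.environ.get("PLONE_OBSERVABILITY_TRUSTED_PROXIES", "127.0.0.1,::1")
)
ALLOWLIST = parse_cidrs(
    os.environ.get("PLONE_OBSERVABILITY_METRICS_ALLOWLIST", "")
)
```

A malformed address in the header would raise `ValueError` in this sketch; a production version would need to handle that case explicitly.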
The health server runs on a dedicated port (default 8081) in a background daemon thread, separate from the Zope WSGI server. This means it answers even when all Zope threads are busy.
| Path | Purpose |
|---|---|
| /live | Liveness check: is the process alive? |
| /ready | Readiness check: can the process serve requests? |
| /startup | Startup check: has the process finished initializing? |
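The dedicated-port design can be illustrated with a small standard-library sketch: an HTTP server running in a daemon thread, answering the three probe paths independently of the main application threads. This is an assumption-laden illustration, not the package's code. The `CHECKS` registry, the check names, and the `"error"` status string for failures are all hypothetical; only the 200/503 status codes and the `(ok, message)` check contract come from this README.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical registry: path -> {check name: callable returning (ok, message)}.
CHECKS = {
    "/live": {"process": lambda: (True, "process alive")},
    "/ready": {"zodb": lambda: (True, "ZODB connection ok")},
    "/startup": {"startup": lambda: (True, "startup complete")},
}


def run_checks(checks):
    """Run every check and return (http_status, response_body_dict)."""
    results = {}
    for name, check in checks.items():
        try:
            ok, message = check()
        except Exception as exc:  # a crashing check counts as a failure
            ok, message = False, str(exc)
        results[name] = {"ok": ok, "message": message}
    all_ok = all(result["ok"] for result in results.values())
    # "error" is an assumed failure label; the README only shows "ok".
    body = {"status": "ok" if all_ok else "error", "checks": results}
    return (200 if all_ok else 503), body


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        checks = CHECKS.get(self.path)
        if checks is None:
            self.send_error(404)
            return
        status, body = run_checks(checks)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the logs


def start_health_server(host="0.0.0.0", port=8081):
    """Serve probes from a daemon thread and return the server object."""
    server = ThreadingHTTPServer((host, port), HealthHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Because the thread is a daemon, it does not keep the process alive on shutdown, and because it has its own listening socket, a saturated application thread pool does not delay probe responses.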
All endpoints return JSON with a 200 on success or 503 on failure:
{
  "status": "ok",
  "checks": {
    "zodb": {"ok": true, "message": "ZODB connection ok"}
  }
}

Example Kubernetes probe configuration:

livenessProbe:
  httpGet:
    path: /live
    port: 8081
  initialDelaySeconds: 10
  periodSeconds: 30
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8081
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3
startupProbe:
  httpGet:
    path: /startup
    port: 8081
  failureThreshold: 30
  periodSeconds: 10

Expose the probe port alongside the main Zope port:
ports:
  - name: http
    containerPort: 8080
  - name: health
    containerPort: 8081

The @@metrics endpoint is a browser view registered on the application root (OFS.interfaces.IApplication). It collects metrics from all registered IMetricProvider adapters and serialises them using an IMetricFormatter utility.
http://your-plone-host/@@metrics
http://your-plone-host/@@metrics?format=json
The default format is Prometheus text. Pass ?format=json or an Accept: application/json header to get JSON.
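The format selection described above could look roughly like the following sketch. The function `choose_format` is hypothetical, as is the formatter name `"prometheus"` for the default; the precedence of the query parameter over the Accept header is also an assumption, since the README does not say which one wins.

```python
def choose_format(query_format=None, accept_header=""):
    """Pick a formatter name from the query string or the Accept header."""
    if query_format:
        # Assumption: an explicit ?format= value wins over the Accept header.
        return query_format.lower()
    # Compare media types only, ignoring parameters such as ";q=0.9".
    media_types = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    if "application/json" in media_types:
        return "json"
    # Hypothetical name for the default Prometheus text formatter.
    return "prometheus"
```

This sketch ignores quality values entirely; a fuller implementation would rank the media types by their `q` parameter.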
| Metric | Type | Scope | Description |
|---|---|---|---|
| plone_uptime_seconds | gauge | instance | Process uptime |
| plone_info | info | instance | Python, Zope, and Plone version labels |
| plone_threads_active | gauge | instance | Active Python threads |
| plone_process_rss_bytes | gauge | instance | Resident set size |
| plone_process_cpu_seconds | counter | instance | Total CPU time (user + system) |
| plone_requests_total | counter | instance | Total HTTP requests served |
| plone_request_duration_seconds_sum | counter | instance | Cumulative request duration |
| plone_request_duration_seconds_bucket | counter | instance | Request duration histogram buckets |
| plone_request_errors | counter | instance | HTTP errors by status code |
| plone_zodb_object_count | gauge | global | Total objects in ZODB |
| plone_zodb_db_size_bytes | gauge | global | ZODB file size |
| plone_zodb_connections | gauge | instance | Open ZODB connections |
| plone_zodb_cache_size | gauge | instance | Objects in the ZODB object cache |
| plone_zodb_cache_size_bytes | gauge | instance | ZODB object cache size in bytes |
| plone_content_total | gauge | global | Content objects by portal type and site |
| plone_content_by_state | gauge | global | Content objects by workflow state and site |
Metrics carry a scope label with value "global" or "instance".
- global — the value is the same across all Plone instances sharing the same ZODB (e.g. object count, content totals). When aggregating in Prometheus, avoid double-counting by filtering to a single instance.
- instance — the value is specific to this process (e.g. request counts, RSS). Sum across instances when aggregating.
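As an illustration of how the scope label might appear on the wire, the sketch below renders metrics in the Prometheus text exposition format. The `Metric` dataclass here is a stand-in whose field names follow the `Metric(...)` call in the custom metric provider example of this README; the real class and the real formatter may differ. Escaping of special characters in help text and label values is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    # Stand-in type; the field names mirror the README's provider example.
    name: str
    value: float
    type: str
    scope: str
    help: str = ""


def to_prometheus_text(metrics):
    """Render metrics as Prometheus text, one sample per metric."""
    lines = []
    for m in metrics:
        if m.help:
            lines.append(f"# HELP {m.name} {m.help}")
        lines.append(f"# TYPE {m.name} {m.type}")
        # The scope travels as an ordinary label on the sample line.
        lines.append(f'{m.name}{{scope="{m.scope}"}} {m.value}')
    return "\n".join(lines) + "\n"
```

With this layout, a query such as `plone_zodb_object_count{scope="global"}` can select global metrics by label.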
Example Prometheus scrape configuration:

scrape_configs:
  - job_name: plone
    static_configs:
      - targets: ["plone-host:8080"]
    metrics_path: /@@metrics

Total requests across all instances:
sum(plone_requests_total{job="plone"})
Request rate per instance (5-minute window):
rate(plone_requests_total{job="plone"}[5m])
ZODB object count (global metric — pick one instance to avoid double-counting):
plone_zodb_object_count{scope="global"} * on(instance) group_left()
(plone_info{instance=~"plone-0.*"})
Or simply query a single instance:
plone_zodb_object_count{instance="plone-0:8080", scope="global"}
Median request duration (p50 approximation from histogram):
histogram_quantile(0.5,
sum(rate(plone_request_duration_seconds_bucket[5m])) by (le, instance)
)
Memory usage per instance (MB):
plone_process_rss_bytes{job="plone"} / 1024 / 1024
The plone_requests_total and plone_request_duration_seconds_* metrics are populated by the ObservabilityMiddleware WSGI middleware. You must add it to your WSGI pipeline to get request metrics.
[pipeline:main]
pipeline =
    egg:plone.observability#observability
    ...
    Zope

[filter:observability]
use = egg:plone.observability#observability

Or wrap the WSGI application directly in Python:

from plone.observability.metrics.providers.request import ObservabilityMiddleware

application = ObservabilityMiddleware(application)

All components are registered via ZCA and can be extended or replaced by third-party packages.
Implement ILivenessCheck and register it as a named utility. Liveness checks MUST NOT access ZODB or block.
from zope.interface import implementer
from plone.observability.interfaces import ILivenessCheck


@implementer(ILivenessCheck)
class MyLivenessCheck:
    name = "myapp"

    def __call__(self):
        # Return (ok: bool, message: str)
        return True, "all good"

<utility
    factory=".checks.MyLivenessCheck"
    provides="plone.observability.interfaces.ILivenessCheck"
    name="myapp"
    />

Implement IReadinessCheck. Readiness checks may access ZODB.
from zope.interface import implementer
from plone.observability.interfaces import IReadinessCheck


@implementer(IReadinessCheck)
class MyReadinessCheck:
    name = "myapp"

    def __call__(self):
        # Check a dependency; _check_external_service() is a placeholder
        # for your own code and must return a bool.
        ok = _check_external_service()
        return ok, ("service ok" if ok else "service unavailable")

<utility
    factory=".checks.MyReadinessCheck"
    provides="plone.observability.interfaces.IReadinessCheck"
    name="myapp"
    />

Implement IMetricProvider as an adapter on OFS.interfaces.IApplication.
from zope.interface import implementer
from plone.observability.interfaces import IMetricProvider
from plone.observability.metric import Metric


@implementer(IMetricProvider)
class MyMetricProvider:
    name = "myapp"
    scope = "instance"

    def __init__(self, context):
        self.context = context

    def collect(self):
        # get_queue_length() is a placeholder for your own application code.
        yield Metric(
            name="myapp_queue_length",
            value=get_queue_length(),
            type="gauge",
            scope="instance",
            help="Number of items in the processing queue",
        )

<adapter
    factory=".metrics.MyMetricProvider"
    provides="plone.observability.interfaces.IMetricProvider"
    for="OFS.interfaces.IApplication"
    name="myapp"
    />

Implement IMetricFormatter as a named utility to support additional wire formats.
import csv
import io

from zope.interface import implementer
from plone.observability.interfaces import IMetricFormatter


@implementer(IMetricFormatter)
class CSVFormatter:
    content_type = "text/csv"

    def format(self, metrics):
        # csv.writer quotes fields that contain commas, quotes, or newlines,
        # which a plain string join would not.
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["name", "value", "type", "scope", "help"])
        for m in metrics:
            writer.writerow([m.name, m.value, m.type, m.scope, m.help])
        return buffer.getvalue()

<utility
    factory=".formatters.CSVFormatter"
    provides="plone.observability.interfaces.IMetricFormatter"
    name="csv"
    />

Access it via @@metrics?format=csv.
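On the consuming side, a client could fetch and parse that CSV output along these lines. This is a sketch using only the standard library: the helper names `fetch_metrics_csv` and `parse_metrics_csv` are hypothetical, and the column layout assumes the `name,value,type,scope,help` header produced by the example formatter above.

```python
import csv
import io
import urllib.request


def fetch_metrics_csv(base_url):
    """Download the CSV output; base_url is e.g. "http://your-plone-host"."""
    with urllib.request.urlopen(f"{base_url}/@@metrics?format=csv") as resp:
        return resp.read().decode("utf-8")


def parse_metrics_csv(text):
    """Turn the CSV text into a dict keyed by metric name."""
    reader = csv.DictReader(io.StringIO(text))
    # Later rows overwrite earlier ones that share the same metric name.
    return {row["name"]: row for row in reader}
```

All values come back as strings, so numeric fields such as `value` need an explicit conversion before use.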