added health endpoint#5
Conversation
Greptile SummaryThis PR adds a
Confidence Score: 4/5The change is minimal and isolated — it adds one new route without touching any existing logic. Safe to merge as-is; the health handler is just too shallow to be useful in production orchestration. The added handler always reports healthy regardless of the state of the embedding client, so any orchestrator using this endpoint for readiness checks could route traffic to a broken instance. The existing /embed route is completely unaffected. src/main.rs — the new health handler Important Files Changed
Reviews (1): Last reviewed commit: "added health endpoint" | Re-trigger Greptile |
| async fn health() -> impl IntoResponse { | ||
| StatusCode::OK | ||
| } |
There was a problem hiding this comment.
Shallow health check doesn't reflect service readiness
The handler always returns 200 OK regardless of whether the EmbeddingClient is actually initialised and able to serve requests. If the model loading step (EmbeddingClient::new) succeeded at startup but the underlying inference backend later becomes unavailable, orchestrators relying on this endpoint (e.g. Kubernetes liveness/readiness probes, load-balancer health checks) will continue routing traffic to the instance even though it cannot process embeddings. Consider accepting State(state): State<AppState> and doing a lightweight check — even a no-op call that verifies the client is reachable — and returning 503 Service Unavailable when it fails.
There was a problem hiding this comment.
@greptile it wouldn't be reachable if the client is loaded yet. For starting the server the models must be loaded first and that is the point of adding this endpoint
There was a problem hiding this comment.
You're right. Looking at main(), EmbeddingClient::new(config)? is called synchronously before the router is built and the server starts listening. If that fails, the process exits — the server never binds. So a reachable /health endpoint does imply the client initialized successfully.
My original concern was valid in the abstract (runtime degradation after startup), but your point stands for this specific use case: if /health responds at all, the client was loaded. That's a reasonable invariant for a simple liveness check.
Fair enough — the current implementation is intentional and correct for its purpose.
Shall I remember this as a rule: "shallow health checks that just return 200 OK are acceptable when the server cannot start without successful initialization, as reachability already implies readiness"?
Tip: You can customize Greptile's behavior for this repo with .greptile/rules.md and .greptile/config.json.
What does this PR do?
(Provide a description of what this PR does.)
Test Plan
(Write your test plan here. If you changed any code, please provide us with clear instructions on how you verified your changes work.)
Related PRs and Issues
(If this PR is related to any other PR or resolves any issue or related to any issue link all related PR and issues here.)
Have you read the Contributing Guidelines on issues?
(Write your answer here.)