Description
Steps to Reproduce
Expected Behavior
Python pool pods are restarted immediately after terminating, up to the pool limit.
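To make the expectation concrete, here is a minimal sketch of the replenishment arithmetic being described. It is not taken from the kubecoderun source; the function name `pods_to_start` and its parameters are hypothetical and only illustrate "refill immediately, up to the pool limit".

```python
def pods_to_start(pool_limit: int, ready: int, starting: int) -> int:
    """Return how many pods to create right now (hypothetical helper).

    The pool is considered full when ready + starting pods reach pool_limit,
    so the deficit is refilled in one step rather than one pod at a time.
    """
    return max(0, pool_limit - ready - starting)


# Example: a pool of 5 where two pods have just terminated and none are starting.
deficit = pods_to_start(pool_limit=5, ready=3, starting=0)

# Example: the pool is already full (or over-provisioned), so nothing is started.
no_deficit = pods_to_start(pool_limit=5, ready=4, starting=2)
```

Under this assumption, a terminated pod is replaced in the same reconciliation pass instead of over the following minutes.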
Actual Behavior
No new pool pods are started, and execution in LibreChat times out.
Pool pods are slowly replenished over the next few minutes.
The kubecoderun API logs the following when it replenishes a pod:
{"pod_name": "pool-py-a5de0bf2", "event": "Removing unhealthy pod", "logger": "src.services.kubernetes.pool", "level": "warning", "timestamp": "2026-01-23T01:36:13.925511Z", "service": "kubecoderun-api", "version": "2.1.1"}
Logs/Screenshots
{"request_id": "8fU2s9VQ", "language": "py", "code_length": 2081, "entity_id": null, "user_id": "6714ccb5c1bf470f48e646b7", "api_key_hash": "e6014c76", "event": "Code execution request","logger": "src.api.exec", "level": "info", "timestamp": "2026-01-23T01:37:48.864032Z", "service": "kubecoderun-api", "version": "2.1.1"}
{"session_id": "tRyswCKrVthInokbFLqCt", "expires_at": "2026-01-24T01:37:48.864269+00:00", "event": "Session created", "logger":"src.services.session", "level": "info", "timestamp": "2026-01-23T01:37:48.864929Z", "service": "kubecoderun-api", "version": "2.1.1"}
{"session_id": "tRyswCKrVthInokbFLqCt", "event": "Created new session", "logger": "src.services.orchestrator", "level": "info", "timestamp": "2026-01-23T01:37:48.865033Z", "service": "kubecoderun-api", "version": "2.1.1"}
{"execution_id": "Ej8tjE6N", "session_id": "tRyswCKrVthInokbFLqCt", "language": "py", "code_length": 2081, "event": "Starting code execution", "logger": "src.services.execution.runner", "level": "info", "timestamp": "2026-01-23T01:37:48.869330Z", "service": "kubecoderun-api", "version": "2.1.1"}
{"language": "py", "session_id": "tRyswCKrVthI", "event": "Failed to acquire pod from pool", "logger": "src.services.kubernetes.manager", "level": "warning", "timestamp": "2026-01-23T01:37:48.869527Z", "service": "kubecoderun-api", "version": "2.1.1"}
{"job_name": "exec-py-tryswckrvthi-d2307311", "namespace": "default", "language": "py", "session_id": "tRyswCKrVthI", "event": "Created execution job", "logger": "src.services.kubernetes.job_executor", "level": "info", "timestamp": "2026-01-23T01:37:48.897607Z", "service": "kubecoderun-api", "version": "2.1.1"}
INFO: 10.0.16.136:44294 - "GET /ready HTTP/1.1" 200 OK
{"method": "GET", "path": "/ready", "status": 200, "duration_ms": 0.96, "event": "Request processed", "logger": "src.middleware.security", "level": "info", "timestamp": "2026-01-23T01:37:50.597051Z", "service": "kubecoderun-api", "version": "2.1.1"}
INFO: 10.0.16.136:55574 - "GET /health HTTP/1.1" 200 OK
{"job_name": "exec-py-tryswckrvthi-d2307311", "pod_name": "exec-py-tryswckrvthi-d2307311-9pwff", "pod_ip": "10.244.2.7", "elapsed_seconds": 9.25, "event": "Job pod ready", "logger": "src.services.kubernetes.job_executor", "level": "info", "timestamp": "2026-01-23T01:37:58.150980Z", "service": "kubecoderun-api", "version": "2.1.1"}
{"job_name": "exec-py-tryswckrvthi-d2307311", "pod_name": "exec-py-tryswckrvthi-d2307311-9pwff", "pod_ip": "10.244.2.7", "sidecar_url": "http://10.244.2.7:8080", "event": "Job ready, starting execution", "logger": "src.services.kubernetes.job_executor", "level": "info", "timestamp": "2026-01-23T01:37:58.151140Z", "service": "kubecoderun-api", "version": "2.1.1"}
HTTP Request: POST http://10.244.2.7:8080/execute "HTTP/1.1 200 OK"
{"job_name": "exec-py-tryswckrvthi-d2307311", "exit_code": 0, "stdout_len": 506, "stderr_len": 0, "stderr_preview": "", "event": "Job execution completed", "logger": "src.services.kubernetes.job_executor", "level": "info", "timestamp": "2026-01-23T01:37:58.225580Z", "service": "kubecoderun-api", "version": "2.1.1"}
{"event": "Code execution Ej8tjE6NOEQO2q666i43N completed: status=ExecutionStatus.COMPLETED, exit_code=0, time=9356ms, source=job", "logger": "src.services.execution.runner", "level": "info", "timestamp": "2026-01-23T01:37:58.225798Z", "service": "kubecoderun-api", "version": "2.1.1"}
{"session_id": "tRyswCKrVthInokbFLqCt", "status": "completed", "pod_name": null, "has_state": false, "event": "Code execution completed", "logger": "src.services.orchestrator", "level": "info", "timestamp": "2026-01-23T01:37:58.225993Z", "service": "kubecoderun-api", "version": "2.1.1"}
{"request_id": "8fU2s9VQ", "session_id": "tRyswCKrVthInokbFLqCt", "event": "Code execution completed", "logger": "src.api.exec", "level": "info", "timestamp": "2026-01-23T01:37:58.229346Z", "service": "kubecoderun-api", "version": "2.1.1"}
INFO: 10.244.0.96:57126 - "POST /exec HTTP/1.1" 200 OK
{"method": "POST", "path": "/exec", "status": 200, "duration_ms": 9367.4, "event": "Request processed", "logger": "src.middleware.security", "level": "info", "timestamp": "2026-01-23T01:37:58.230378Z", "service": "kubecoderun-api", "version": "2.1.1"}
Additional Context
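For anyone triaging this, the log excerpt mixes JSON records with plain access-log lines. The sketch below is one way to pull out the warnings and measure how long the fallback job took to become ready. The sample lines are copied from the excerpt with unused fields trimmed; the helper names (`parse_structured`, `seconds_between`) are my own and not part of kubecoderun.

```python
import json
from datetime import datetime

# Lines copied from the excerpt in this report, trimmed to the fields used here.
LOG_LINES = [
    '{"session_id": "tRyswCKrVthI", "event": "Failed to acquire pod from pool", '
    '"level": "warning", "timestamp": "2026-01-23T01:37:48.869527Z"}',
    '{"job_name": "exec-py-tryswckrvthi-d2307311", "event": "Created execution job", '
    '"level": "info", "timestamp": "2026-01-23T01:37:48.897607Z"}',
    'INFO: 10.0.16.136:44294 - "GET /ready HTTP/1.1" 200 OK',
    '{"job_name": "exec-py-tryswckrvthi-d2307311", "event": "Job pod ready", '
    '"level": "info", "timestamp": "2026-01-23T01:37:58.150980Z"}',
]


def parse_structured(lines):
    """Yield a dict for each JSON record; skip plain access-log lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated or fused records


def parse_ts(value):
    # The timestamps end in "Z"; swap it for an explicit offset so that
    # datetime.fromisoformat accepts it on older Python versions too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def seconds_between(records, start_event, end_event):
    """Seconds from the first start_event record to the first end_event record."""
    start = next(r for r in records if r["event"] == start_event)
    end = next(r for r in records if r["event"] == end_event)
    return (parse_ts(end["timestamp"]) - parse_ts(start["timestamp"])).total_seconds()


records = list(parse_structured(LOG_LINES))
pool_misses = [r["event"] for r in records if r.get("level") == "warning"]
fallback_delay = seconds_between(records, "Created execution job", "Job pod ready")

print(pool_misses)
print(f"fallback job ready after {fallback_delay:.2f}s")
```

On these sample lines the computed delay agrees with the `elapsed_seconds` value of 9.25 reported in the "Job pod ready" record, which is the cost of missing the pool and falling back to a job.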