fix: production safety audit — 4 critical bugs across backend, server, and electron by eren-karakus0 · Pull Request #1290 · eigent-ai/eigent

eren-karakus0 · 2026-02-17T13:48:57Z

Summary

A cross-stack production safety audit uncovering 4 confirmed bugs across the Python backend, server auth, and Electron main process. Each bug is independently verified and includes test coverage where applicable.

Bug 1: Electron Child Windows Missing Context Isolation (CRITICAL)

File: electron/main/index.ts — open-win IPC handler

The main window correctly sets contextIsolation: true, but the open-win handler creates child BrowserWindow instances with nodeIntegration: true and contextIsolation: false. With context isolation disabled, any JavaScript running in a child renderer window shares the same context as Electron's built-in modules — meaning any XSS payload or injected script gains full require('child_process').exec(...) access with no sandbox barrier.

Before:

webPreferences: {
  preload,
  nodeIntegration: true,      // full Node.js access
  contextIsolation: false,    // no sandbox
}

After:

webPreferences: {
  preload,
  nodeIntegration: false,     // no direct Node.js access
  contextIsolation: true,     // isolated contexts
  sandbox: true,              // OS-level sandboxing
}

Bug 2: JWT Token Expiry Returns Wrong Error Code (HIGH)

File: server/app/component/auth.py — Auth.decode_token()

PyJWT's jwt.decode() already validates the exp claim and raises ExpiredSignatureError (a subclass of InvalidTokenError) for expired tokens. The manual payload["exp"] < datetime.now() check never executes because expired tokens don't reach that line. The except InvalidTokenError block catches both expired and invalid tokens, raising token_invalid for both cases.

This makes it impossible for clients to distinguish "token expired, please refresh" from "token is corrupted, please re-login", breaking any automatic token refresh logic.

Before:

try:
    payload = jwt.decode(token, Auth.SECRET_KEY, algorithms=["HS256"])
    id = payload["id"]
    if payload["exp"] < int(datetime.now().timestamp()):  # dead code
        raise TokenException(code.token_expired, ...)
except InvalidTokenError:  # catches ExpiredSignatureError too
    raise TokenException(code.token_invalid, ...)  # wrong code for expired

After:

try:
    payload = jwt.decode(token, Auth.SECRET_KEY, algorithms=["HS256"])
except ExpiredSignatureError:
    raise TokenException(code.token_expired, ...)   # correct code
except InvalidTokenError:
    raise TokenException(code.token_invalid, ...)   # only truly invalid tokens

Bug 3: `assert` Statements Used for Runtime Control Flow (HIGH)

File: backend/app/agent/listen_chat_agent.py — step() and astep()

Both step() and astep() use assert message is not None and assert res is not None for critical runtime checks. Python's assert statements are completely removed when running with -O (optimized mode), which is common in production deployments.

When stripped:

None message propagates into the SSE ActionDeactivateAgentData event, corrupting the frontend data stream
None response returns to callers, causing downstream AttributeError crashes

Before:

assert message is not None    # stripped with python -O
# ...
assert res is not None         # stripped with python -O
return res

After:

if message is None:
    message = ""
    logger.warning(f"Agent {self.agent_name}: message is None ...")

# ...
if res is None:
    raise RuntimeError(
        f"Agent {self.agent_name}: step() returned None without setting error_info"
    )
return res

Bug 4: `background_tasks.clear()` Doesn't Cancel Running Tasks (HIGH)

File: backend/app/controller/chat_controller.py — improve() handler

When a user sends a follow-up message after task completion, improve() calls task_lock.background_tasks.clear(). This only removes references from the Python set — it does not cancel the underlying asyncio.Task objects. Ghost tasks continue running in the event loop, holding resources, and potentially writing stale SSE events into the new conversation's stream.

Before:

if hasattr(task_lock, "background_tasks"):
    task_lock.background_tasks.clear()    # tasks keep running

After:

if hasattr(task_lock, "background_tasks"):
    for bg_task in list(task_lock.background_tasks):
        if not bg_task.done():
            bg_task.cancel()
    task_lock.background_tasks.clear()

Test Plan

Added 6 new unit tests:

Assert Guard Tests (test_listen_chat_agent.py):

test_step_handles_none_message_without_assert — verifies step() doesn't crash with AssertionError
test_step_raises_runtime_error_when_res_is_none — verifies RuntimeError instead of AssertionError
test_astep_handles_none_message_without_assert — async version of message guard
test_astep_raises_runtime_error_when_res_is_none — async version of res guard

Background Task Tests (test_chat_controller.py):

test_improve_cancels_background_tasks_when_done — verifies running tasks are cancelled, done tasks are not
test_improve_handles_missing_background_tasks_attr — verifies graceful handling without the attribute

Files Changed

File	Change
`electron/main/index.ts`	Set `nodeIntegration: false`, `contextIsolation: true`, `sandbox: true` in child windows
`server/app/component/auth.py`	Catch `ExpiredSignatureError` separately; remove dead manual exp check
`backend/app/agent/listen_chat_agent.py`	Replace 4 `assert` statements with `if`/`raise` guards
`backend/app/controller/chat_controller.py`	Cancel background tasks before clearing the set
`backend/tests/app/agent/test_listen_chat_agent.py`	+4 tests for assert replacement
`backend/tests/app/controller/test_chat_controller.py`	+2 tests for task cancellation

…, and electron 1. fix(electron): enforce context isolation in child windows The `open-win` IPC handler created child BrowserWindows with `nodeIntegration: true` and `contextIsolation: false`, while the main window correctly uses `contextIsolation: true`. Any XSS in a child window would gain full Node.js access. Fixed by setting `nodeIntegration: false`, `contextIsolation: true`, and `sandbox: true` to match the main window's security posture. 2. fix(server): separate ExpiredSignatureError from InvalidTokenError in JWT decoding `jwt.decode()` raises `ExpiredSignatureError` (a subclass of `InvalidTokenError`) for expired tokens before the manual `exp` check can execute. The `except InvalidTokenError` block catches both cases and raises `token_invalid`, making it impossible for clients to distinguish expired tokens from corrupted ones. This breaks automatic token refresh flows. Fixed by catching `ExpiredSignatureError` explicitly before `InvalidTokenError`. 3. fix(backend): replace assert statements with proper runtime guards in ListenChatAgent `assert message is not None` and `assert res is not None` in both `step()` and `astep()` are stripped when Python runs with `-O` (optimized mode). This causes silent `None` propagation into the SSE event stream and downstream consumers. Replaced with explicit `if`/`raise RuntimeError` guards that survive optimization and provide clear error messages. 4. fix(backend): cancel background tasks before clearing the set `background_tasks.clear()` in the `improve()` handler only removes references from the set — it does not cancel running asyncio.Task objects. Ghost tasks continue executing, potentially writing stale SSE events into a new conversation's stream. Fixed by iterating and calling `.cancel()` on each non-done task before clearing. Tests: - 4 new tests for assert guard replacements (step/astep, sync/async) - 2 new tests for background task cancellation behavior

bytecii

Thanks.

bytecii · 2026-02-19T10:57:53Z

backend/app/agent/listen_chat_agent.py

            )

-        assert message is not None
+        if message is None:


Why we change to warning and let the step continue?

Good point — changed to raise RuntimeError instead, keeping fail-fast behavior consistent with the res=None guard below. Fixed in latest push.

bytecii · 2026-02-19T10:58:47Z

backend/tests/app/agent/test_listen_chat_agent.py



+@pytest.mark.unit
+class TestAssertReplacedWithProperGuards:


Just use def test_xxx instead of creating class for now

Done — replaced class with plain def test_xxx functions.

Address review feedback: - message=None now raises RuntimeError instead of defaulting to empty string, keeping fail-fast behavior consistent with res=None handling - Replace test class with plain def test_xxx functions

bytecii reviewed Feb 19, 2026

View reviewed changes

fix: raise RuntimeError on None message instead of warning+continue

19386e2

Address review feedback: - message=None now raises RuntimeError instead of defaulting to empty string, keeping fail-fast behavior consistent with res=None handling - Replace test class with plain def test_xxx functions

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: production safety audit — 4 critical bugs across backend, server, and electron#1290

fix: production safety audit — 4 critical bugs across backend, server, and electron#1290
eren-karakus0 wants to merge 2 commits intoeigent-ai:mainfrom
eren-karakus0:fix/production-safety-audit-critical-bugs

eren-karakus0 commented Feb 17, 2026

Uh oh!

bytecii left a comment

Uh oh!

bytecii Feb 19, 2026

Uh oh!

eren-karakus0 Feb 19, 2026

Uh oh!

bytecii Feb 19, 2026

Uh oh!

eren-karakus0 Feb 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Comments

Conversation

eren-karakus0 commented Feb 17, 2026

Summary

Bug 1: Electron Child Windows Missing Context Isolation (CRITICAL)

Bug 2: JWT Token Expiry Returns Wrong Error Code (HIGH)

Bug 3: assert Statements Used for Runtime Control Flow (HIGH)

Bug 4: background_tasks.clear() Doesn't Cancel Running Tasks (HIGH)

Test Plan

Files Changed

Uh oh!

bytecii left a comment

Choose a reason for hiding this comment

Uh oh!

bytecii Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

eren-karakus0 Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

bytecii Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

eren-karakus0 Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Comments

Bug 3: `assert` Statements Used for Runtime Control Flow (HIGH)

Bug 4: `background_tasks.clear()` Doesn't Cancel Running Tasks (HIGH)