feat: add pull-based runner client (GitHub Actions runner style) #50

as535364 wants to merge 5 commits into Normal-OJ:main
Conversation
Add runner_client.py that operates as a pull-based runner instead of the traditional push-based Flask server. The runner polls the backend for pending judge jobs, claims them, processes them in Docker containers, and reports results back, similar to GitHub Actions self-hosted runners.

Changes:
- runner_client.py: Polling-based runner with concurrent job processing, heartbeat support, and graceful shutdown
- Dockerfile: Support SANDBOX_MODE=pull env var to start runner client

https://claude.ai/code/session_01CD3q84rDaDXMnAmQVh58qR
Boss, remember to run the formatter 😭😭
Cover poll_for_jobs, claim_job, download_code, download_testdata, send_heartbeat, report_result, and the main run loop with 18 tests using mocked HTTP responses.

https://claude.ai/code/session_01CD3q84rDaDXMnAmQVh58qR
It was Claude misbehaving, not me!
- Patch runner_client.time.sleep instead of global time.sleep
- Reset running_jobs in mock_sleep so the drain loop exits
- Simplify test_claims_and_processes_jobs mock setup

https://claude.ai/code/session_01CD3q84rDaDXMnAmQVh58qR
Pull request overview

This PR introduces a new pull-based sandbox runner client that polls a backend for pending judge jobs, claims them, executes them via the existing SubmissionRunner Docker execution, and reports results, enabling a GitHub Actions runner-style workflow alongside the existing push-based Flask server.

Changes:
- Added runner_client.py implementing polling, claiming, downloading artifacts, executing jobs concurrently, heartbeat, and shutdown handling.
- Added tests/test_runner_client.py with unit tests for the runner client's networking helpers and main-loop behavior.
- Updated Dockerfile to support SANDBOX_MODE=pull to start the new runner client instead of gunicorn.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| runner_client.py | Implements the new pull-based runner client, including job lifecycle handling and Docker-based execution. |
| tests/test_runner_client.py | Adds tests for polling/claiming/download/report helpers and basic run-loop behavior. |
| Dockerfile | Adds an environment switch to run either the legacy push server or the new pull runner. |
```python
# Prepare submission directory
if submission_dir.exists():
    shutil.rmtree(submission_dir)
submission_dir.mkdir(parents=True)
```
submission_dir is derived from submission_id (backend-controlled) and then passed to shutil.rmtree(). If submission_id contains path separators/.., this can delete outside SUBMISSION_DIR. Validate submission_id (e.g., strict allowed charset) and/or assert submission_dir.resolve() is within SUBMISSION_DIR.resolve() before deleting/creating.
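A minimal sketch of the containment check suggested above. The names `SUBMISSION_DIR` and `resolve_submission_dir` are illustrative stand-ins, not the PR's actual identifiers:

```python
import pathlib

# Stand-in for the runner's base directory (the real value comes from config)
SUBMISSION_DIR = pathlib.Path('/tmp/submissions')


def resolve_submission_dir(submission_id: str) -> pathlib.Path:
    """Reject IDs that would escape SUBMISSION_DIR via '..' or separators."""
    base = SUBMISSION_DIR.resolve()
    candidate = (base / submission_id).resolve()
    # The resolved path must be a strict child of the base directory;
    # this catches '..' traversal, absolute paths, and empty IDs.
    if candidate == base or base not in candidate.parents:
        raise ValueError(f'unsafe submission_id: {submission_id!r}')
    return candidate
```

With this in place, `shutil.rmtree()` only ever sees paths under the base directory, regardless of what the backend sends.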
runner_client.py (Outdated)
```python
# Read config to get working_dir for host paths
with open('.config/submission.json') as f:
    s_config = json.load(f)
host_base = pathlib.Path(
    s_config['working_dir']) / submission_id / 'testcase'
```
.config/submission.json is opened and parsed inside the innermost testcase loop, so it will be re-read for every case. Load this config (and compute host_base) once per job (or once per process) before iterating cases to reduce IO and simplify the control flow.
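One way to hoist the read, sketched with a hypothetical helper name; the real fix would simply move the existing `open()` call above the testcase loop:

```python
import json
import pathlib


def load_submission_config(path='.config/submission.json'):
    """Read the runner config once; callers reuse the parsed dict."""
    with open(path) as f:
        return json.load(f)


# Per job, before iterating testcases:
#     s_config = load_submission_config()
#     host_base = pathlib.Path(s_config['working_dir']) / submission_id / 'testcase'
#     for case in cases:
#         ...  # no per-case config IO
```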
runner_client.py (Outdated)
```python
import sys
import tempfile
```
sys and tempfile are imported but not used in this module. Removing unused imports will avoid lint noise and keep the dependency surface minimal.
Suggested change:
```diff
-import sys
-import tempfile
```
```dockerfile
CMD if [ "$SANDBOX_MODE" = "pull" ]; then \
        python runner_client.py; \
    else \
        gunicorn -c gunicorn.conf.py app:app; \
    fi
```
This shell-form CMD runs under /bin/sh -c, so the actual process (python/gunicorn) won't be PID 1 and may not receive SIGTERM/SIGINT properly (can break graceful shutdown and health signaling). Prefer an ENTRYPOINT script or sh -c with exec in both branches so the selected process becomes PID 1.
Suggested change:
```diff
-CMD if [ "$SANDBOX_MODE" = "pull" ]; then \
-        python runner_client.py; \
-    else \
-        gunicorn -c gunicorn.conf.py app:app; \
-    fi
+CMD ["/bin/sh", "-c", "if [ \"$SANDBOX_MODE\" = \"pull\" ]; then exec python runner_client.py; else exec gunicorn -c gunicorn.conf.py app:app; fi"]
```
```python
def process_job(self, job: JobInfo):
    """
    Process a single job: download data, compile, execute, report.
    Runs in a separate thread.
    """
```
process_job() contains most of the new runner behavior (meta parsing, path construction, compile/run execution, result shaping, cleanup), but the current tests only cover the networking helpers and mock out process_job in the run-loop test. Add tests that exercise process_job with SubmissionRunner/network calls mocked so regressions in result formatting and cleanup are caught.
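A possible shape for such a test, shown against a small stand-in class rather than the real module (the actual import paths, method names, and result dict of the PR's runner would differ):

```python
from unittest.mock import MagicMock


class FakeRunner:
    """Stand-in mirroring process_job's collaborators: download, execute, report."""

    def __init__(self):
        self.running_jobs = 1

    def download_code(self, job): ...

    def report_result(self, submission_id, result): ...

    def execute(self, job):
        return {'status': 'AC', 'stdout': 'ok'}

    def process_job(self, job):
        # Download artifacts, run the submission, report, then release the slot
        self.download_code(job)
        result = self.execute(job)
        self.report_result(job['id'], result)
        self.running_jobs -= 1


runner = FakeRunner()
runner.download_code = MagicMock()
runner.report_result = MagicMock()
runner.process_job({'id': 's1'})
# The reported result and the slot counter are the observable outcomes
runner.report_result.assert_called_once_with('s1', {'status': 'AC', 'stdout': 'ok'})
assert runner.running_jobs == 0
```

The same pattern applied to the real `process_job` (with `SubmissionRunner` and the HTTP helpers mocked) would catch regressions in result formatting and slot cleanup.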
runner_client.py (Outdated)
```python
src_dir = dest_dir / 'src'
src_dir.mkdir(exist_ok=True)
with ZipFile(io.BytesIO(resp.content)) as zf:
    zf.extractall(src_dir)
```
ZipFile.extractall() is used on backend-provided zip content without validating member paths. A zip with .. or absolute paths can write outside dest_dir/src (Zip Slip). Consider validating each ZipInfo.filename stays within src_dir before extracting (or extract to a temp dir and move safe paths).
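A hedged sketch of such a check. The PR later adds a `_safe_extractall()` helper; this version is an illustration of the idea, not the PR's code:

```python
import pathlib
from zipfile import ZipFile


def safe_extractall(zf: ZipFile, dest: pathlib.Path) -> None:
    """Extract only members whose resolved paths stay inside dest (blocks Zip Slip)."""
    dest = dest.resolve()
    for member in zf.infolist():
        target = (dest / member.filename).resolve()
        # Reject members that resolve outside dest via '..' or absolute paths
        if target != dest and dest not in target.parents:
            raise ValueError(f'unsafe zip member: {member.filename!r}')
    zf.extractall(dest)
```

Rejecting the whole archive on the first unsafe member (rather than skipping it) keeps partially-extracted submissions from being judged.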
runner_client.py (Outdated)
```python
testcase_dir = dest_dir / 'testcase'
testcase_dir.mkdir(exist_ok=True)
with ZipFile(io.BytesIO(resp.content)) as zf:
    zf.extractall(testcase_dir)
```
ZipFile.extractall() is used on backend-provided zip content without validating member paths. A malicious zip can write outside dest_dir/testcase via .. or absolute paths. Add a safe-extraction check (reject/skip unsafe members) before extracting.
…PID 1

- Remove unused imports (sys, tempfile)
- Add _safe_extractall() to prevent Zip Slip attacks on downloaded zips
- Move .config/submission.json read outside the innermost test case loop
- Use exec in Dockerfile CMD so the process becomes PID 1

https://claude.ai/code/session_01CD3q84rDaDXMnAmQVh58qR
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
```python
with open('.config/submission.json') as f:
    s_config = json.load(f)
host_base = pathlib.Path(
    s_config['working_dir']) / submission_id / 'testcase'
```
process_job() writes files under SUBMISSION_DIR, but SubmissionRunner uses .config/submission.json's working_dir to locate src (and Docker bind mounts use the host path). If SUBMISSION_DIR and working_dir diverge, compile/run will fail because the expected working_dir/<id>/src and working_dir/<id>/testcase paths won’t exist. Consider deriving the submission directory from working_dir (or validating they match early and failing fast with a clear error).
Suggested change:
```diff
 s_config['working_dir']) / submission_id / 'testcase'
+# Ensure submission_dir root matches config working_dir to avoid
+# mismatched paths between where files are written and where
+# SubmissionRunner/Docker expect them.
+config_root = pathlib.Path(s_config['working_dir'])
+submission_root = submission_dir.parent
+if submission_root != config_root:
+    logger.error(
+        'Configuration error: submission_dir root (%s) does not '
+        'match working_dir from .config/submission.json (%s). '
+        'Ensure SUBMISSION_DIR and working_dir are consistent.',
+        submission_root,
+        config_root,
+    )
+    raise RuntimeError(
+        'Runner misconfiguration: SUBMISSION_DIR and '
+        'working_dir mismatch'
+    )
```
SUBMISSION_DIR vs working_dir mismatch (runner_client.py L254)
This is by design. SUBMISSION_DIR is where the runner client writes files inside its own container. working_dir from submission.json is the Docker volume host mount path used by SubmissionRunner to map into the sandbox container. They are intentionally different paths serving different purposes.
from claude
```python
signal.signal(signal.SIGINT, on_signal)
signal.signal(signal.SIGTERM, on_signal)

while not self.shutdown:
    # Check if we can take more jobs
    with self.running_lock:
        available_slots = MAX_CONCURRENT - self.running_jobs

    if available_slots <= 0:
        time.sleep(POLL_INTERVAL)
        continue

    # Poll for jobs
    jobs = self.poll_for_jobs()
    if not jobs:
        time.sleep(POLL_INTERVAL)
        continue

    # Try to claim and process jobs
    for job_info in jobs[:available_slots]:
        job = self.claim_job(job_info['submissionId'])
        if job is None:
            continue

        with self.running_lock:
            self.running_jobs += 1

        thread = threading.Thread(
            target=self.process_job,
            args=(job, ),
            daemon=True,
        )
        thread.start()

    time.sleep(POLL_INTERVAL)

# Wait for running jobs to finish
logger.info('Waiting for running jobs to complete...')
while True:
    with self.running_lock:
        if self.running_jobs == 0:
            break
    time.sleep(1)
logger.info('Runner stopped')
```
run() unconditionally installs SIGINT/SIGTERM handlers. signal.signal() raises ValueError if called outside the main thread, and it also mutates global process state (which can interfere with pytest’s own SIGINT handling when tests call Runner.run()). Consider guarding this with a main-thread check (or a flag to disable signal handler installation for embedding/tests) and/or restoring previous handlers on exit.
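The ValueError behavior is easy to demonstrate with a minimal repro, independent of the PR's code:

```python
import signal
import threading

errors = []


def install_handler():
    # signal.signal() is only permitted in the main thread of the main
    # interpreter; calling it from any other thread raises ValueError.
    try:
        signal.signal(signal.SIGTERM, lambda signum, frame: None)
    except ValueError as exc:
        errors.append(type(exc).__name__)


t = threading.Thread(target=install_handler)
t.start()
t.join()
assert errors == ['ValueError']
```

This is why calling run() from a non-main thread (as an embedding application or a test harness might) would fail before the loop even starts.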
Suggested change:
```diff
-signal.signal(signal.SIGINT, on_signal)
-signal.signal(signal.SIGTERM, on_signal)
-while not self.shutdown:
-    # Check if we can take more jobs
-    with self.running_lock:
-        available_slots = MAX_CONCURRENT - self.running_jobs
-    if available_slots <= 0:
-        time.sleep(POLL_INTERVAL)
-        continue
-    # Poll for jobs
-    jobs = self.poll_for_jobs()
-    if not jobs:
-        time.sleep(POLL_INTERVAL)
-        continue
-    # Try to claim and process jobs
-    for job_info in jobs[:available_slots]:
-        job = self.claim_job(job_info['submissionId'])
-        if job is None:
-            continue
-        with self.running_lock:
-            self.running_jobs += 1
-        thread = threading.Thread(
-            target=self.process_job,
-            args=(job, ),
-            daemon=True,
-        )
-        thread.start()
-    time.sleep(POLL_INTERVAL)
-# Wait for running jobs to finish
-logger.info('Waiting for running jobs to complete...')
-while True:
-    with self.running_lock:
-        if self.running_jobs == 0:
-            break
-    time.sleep(1)
-logger.info('Runner stopped')
+is_main_thread = threading.current_thread() is threading.main_thread()
+old_sigint_handler = None
+old_sigterm_handler = None
+if is_main_thread:
+    # Save existing handlers and install our own for graceful shutdown
+    old_sigint_handler = signal.getsignal(signal.SIGINT)
+    old_sigterm_handler = signal.getsignal(signal.SIGTERM)
+    signal.signal(signal.SIGINT, on_signal)
+    signal.signal(signal.SIGTERM, on_signal)
+try:
+    while not self.shutdown:
+        # Check if we can take more jobs
+        with self.running_lock:
+            available_slots = MAX_CONCURRENT - self.running_jobs
+        if available_slots <= 0:
+            time.sleep(POLL_INTERVAL)
+            continue
+        # Poll for jobs
+        jobs = self.poll_for_jobs()
+        if not jobs:
+            time.sleep(POLL_INTERVAL)
+            continue
+        # Try to claim and process jobs
+        for job_info in jobs[:available_slots]:
+            job = self.claim_job(job_info['submissionId'])
+            if job is None:
+                continue
+            with self.running_lock:
+                self.running_jobs += 1
+            thread = threading.Thread(
+                target=self.process_job,
+                args=(job, ),
+                daemon=True,
+            )
+            thread.start()
+        time.sleep(POLL_INTERVAL)
+    # Wait for running jobs to finish
+    logger.info('Waiting for running jobs to complete...')
+    while True:
+        with self.running_lock:
+            if self.running_jobs == 0:
+                break
+        time.sleep(1)
+    logger.info('Runner stopped')
+finally:
+    # Restore previous signal handlers to avoid leaking global state
+    if is_main_thread and old_sigint_handler is not None:
+        signal.signal(signal.SIGINT, old_sigint_handler)
+    if is_main_thread and old_sigterm_handler is not None:
+        signal.signal(signal.SIGTERM, old_sigterm_handler)
```
Signal handler outside main thread (runner_client.py L386)
run() is only called from the if __name__ == '__main__' block, which always executes on the main thread. In tests, the loop is controlled via mocked poll_for_jobs, so the signal handlers are never exercised. Adding a main-thread guard is unnecessary complexity.
from claude
```python
runner.poll_for_jobs = MagicMock(return_value=[])


def set_shutdown(*_args):
    runner.shutdown = True


runner.poll_for_jobs.side_effect = set_shutdown
runner.run()
```
In test_shutdown_stops_loop, poll_for_jobs.side_effect sets runner.shutdown but does not return a list. That makes jobs become None, so the loop still hits time.sleep(POLL_INTERVAL) once before exiting, adding an avoidable real-time delay to the test. Consider returning [] from the side effect and/or patching runner_client.time.sleep to a no-op in this test to keep the suite fast and deterministic.
Suggested change:
```diff
-runner.poll_for_jobs = MagicMock(return_value=[])
-def set_shutdown(*_args):
-    runner.shutdown = True
-runner.poll_for_jobs.side_effect = set_shutdown
-runner.run()
+runner.poll_for_jobs = MagicMock()
+def set_shutdown(*_args):
+    runner.shutdown = True
+    return []
+runner.poll_for_jobs.side_effect = set_shutdown
+with patch.object(runner_client.time, 'sleep', return_value=None):
+    runner.run()
```
```python
submission_dir.mkdir(parents=True)

# Write meta.json
meta = Meta.parse_obj(job.meta)
```
process_job() parses job.meta as a dispatcher.meta.Meta, but Meta requires a language field while claim_job() (and the unit test) only populate meta with tasks. This will raise a validation error at runtime and prevent writing meta.json / executing the job. Consider constructing a full Meta payload (e.g., merge job.meta with {'language': job.language}) or avoid Meta.parse_obj here and write a JSON that matches what the rest of the system expects.
Suggested change:
```diff
-meta = Meta.parse_obj(job.meta)
+# Ensure Meta receives required fields (e.g., language) in addition to job.meta
+meta_payload = {}
+if isinstance(job.meta, dict):
+    meta_payload.update(job.meta)
+elif job.meta is not None:
+    # Fallback: try to parse non-dict meta as JSON string
+    try:
+        meta_payload.update(json.loads(job.meta))
+    except (TypeError, ValueError):
+        logger.warning('Unexpected meta format for job %s; using empty meta', submission_id)
+# Meta model requires language; use language from the job info if not present
+if 'language' not in meta_payload:
+    meta_payload['language'] = job.language
+meta = Meta.parse_obj(meta_payload)
```
Meta validation error (runner_client.py L207)
Not an issue. The backend's claim_job response already includes language in the meta dict (see model/runner.py L136: 'language': sub.language). Meta.parse_obj(job.meta) will parse correctly.
from claude