
feat(printer-models): PrintQueue worker with pause/resume/cancel/retry #50

Merged
strausmann merged 3 commits into main from feat/print-queue on May 11, 2026

@strausmann
Owner

Summary

PR B of split-Task 2.8. Adds the per-printer async work queue on top of PR #49's Job FSM. Brother PT/QL printers expose TCP/9100 as a single stream — there is no on-device queue — so the hub serialises jobs per printer with one asyncio worker task each.
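
A minimal sketch of the one-worker-per-printer idea described above, not the code in this PR. The function name `run_serialised`, the string job payloads, and the `None` shutdown sentinel are illustrative assumptions; the `asyncio.sleep(0)` call stands in for the real TCP/9100 write.

```python
import asyncio


async def run_serialised(jobs_by_printer: dict[str, list[str]]) -> dict[str, list[str]]:
    """Print each printer's jobs in submission order, one worker task per printer."""
    printed: dict[str, list[str]] = {pid: [] for pid in jobs_by_printer}

    async def worker(printer_id: str, queue: asyncio.Queue) -> None:
        while True:
            job = await queue.get()
            if job is None:  # hypothetical sentinel: no more work for this printer
                return
            await asyncio.sleep(0)  # stand-in for streaming the raster to the device
            printed[printer_id].append(job)

    # One FIFO queue and one worker per printer: jobs for the same printer can
    # never overlap, while different printers proceed independently.
    queues = {pid: asyncio.Queue() for pid in jobs_by_printer}
    workers = [asyncio.create_task(worker(pid, q)) for pid, q in queues.items()]
    for pid, jobs in jobs_by_printer.items():
        for job in jobs:
            queues[pid].put_nowait(job)
        queues[pid].put_nowait(None)
    await asyncio.gather(*workers)
    return printed
```

Because each printer has exactly one consumer, ordering falls out of the queue itself and no per-printer lock is needed.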

What's in this PR

backend/app/services/print_queue.py

  • PrinterWorkerState(StrEnum): ACTIVE / PAUSED, orthogonal to per-job state.
  • _PrinterLike(Protocol) — minimal contract the queue depends on: id: str + async print_image(image, *, tape_mm: int, **options: Any) -> None. tape_mm is kw-only required so any conforming plugin must accept it (mypy enforces this).
  • PrintQueue:
    • __init__(printers) — per-printer asyncio.Queue, worker-state, resume Event (initially set), jobs/workers dicts
    • start() / stop() — spawn / cancel workers
    • submit(printer_id, image, tape_mm, **options) -> uuid
    • get(job_id) / wait_for_job(job_id, timeout_s) — await job._done_event
    • cancel(job_id) / pause_job(job_id) / resume_job(job_id) — per-job control (resume_job re-enqueues at tail; docstring documents the stale-reference + qsize() invariant)
    • retry_job(job_id) -> str | None — only from FAILED; new UUID, retry_count+1, parent_job_id linked
    • pause_printer(...) / resume_printer(...) — worker-level control via asyncio.Event
    • list_queue(printer_id) — non-terminal jobs (queued + paused + printing)
    • clear_queue(printer_id) -> int — cancels queued + paused
    • _worker(printer_id): pause-aware loop that filters by job state on pop and transitions queued→printing→completed/failed; guards against job.tape_mm is None and job.image_payload is None before the printer call; re-raises CancelledError so queue.stop() shuts down cleanly
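
The `_PrinterLike` contract from the list above could look roughly like this. The body is reconstructed from the description only; `RecordingPrinter`, `send`, and the `Any` type for `image` are hypothetical stand-ins, not names from this PR.

```python
import asyncio
from typing import Any, Protocol


class _PrinterLike(Protocol):
    """Structural contract: anything with `id` and this `print_image` conforms."""

    id: str

    async def print_image(self, image: Any, *, tape_mm: int, **options: Any) -> None: ...


class RecordingPrinter:
    """Hypothetical stub that records calls instead of talking to hardware."""

    def __init__(self, printer_id: str) -> None:
        self.id = printer_id
        self.calls: list[tuple[int, dict[str, Any]]] = []

    async def print_image(self, image: Any, *, tape_mm: int, **options: Any) -> None:
        self.calls.append((tape_mm, options))


async def send(printer: _PrinterLike, image: Any, tape_mm: int, **options: Any) -> None:
    # tape_mm is keyword-only in the Protocol, so it is always passed by name.
    await printer.print_image(image, tape_mm=tape_mm, **options)
```

With the keyword-only parameter spelled out, a type checker can reject a plugin whose `print_image` lacks `tape_mm`, and a call that omits it raises `TypeError` at runtime.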

backend/tests/unit/services/test_print_queue.py — 6 asyncio tests

  • submit returns valid UUID (validated via uuid.UUID(job_id))
  • 2 jobs print serially on same printer
  • pause_job / resume_job round-trip
  • clear_queue cancels all pending
  • pause_printer blocks worker (deterministic via asyncio.sleep(0) + qsize check)
  • retry_job creates new job with parent link
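
The retry contract exercised by the last test can be sketched as follows. This is an illustration under assumptions: the `Job` dataclass, the plain-string states, and the free function taking a `jobs` dict are simplifications, and returning `None` for an unknown job ID is a guess rather than documented behaviour.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Job:
    printer_id: str
    state: str = "queued"
    retry_count: int = 0
    parent_job_id: Optional[str] = None
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def retry_job(jobs: dict[str, Job], job_id: str) -> Optional[str]:
    """Clone a FAILED job under a new UUID; return None if retry is not allowed."""
    original = jobs.get(job_id)
    if original is None or original.state != "failed":
        return None
    clone = Job(
        printer_id=original.printer_id,
        retry_count=original.retry_count + 1,
        parent_job_id=original.id,  # link back to the failed attempt
    )
    jobs[clone.id] = clone
    return clone.id
```

Creating a new job rather than resetting the failed one keeps the failed attempt, including its error, available for inspection.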

What's NOT in this PR

  • No persistence — _jobs is in-memory. TODO(phase5) comment marks the eviction concern.
  • No Job.wait_done() public wrapper — wait_for_job accesses _done_event directly with a TODO(phase5).
  • No model-specific code — generic queue infrastructure in services/.
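
For reference, a public `wait_done()` wrapper of the kind deferred above might look like this. It is a sketch only: the method name comes from the bullet, while `mark_done`, the boolean return value, and the timeout handling are assumptions.

```python
import asyncio
from typing import Optional


class Job:
    def __init__(self) -> None:
        # Private event, set once the job reaches a terminal state.
        self._done_event = asyncio.Event()

    def mark_done(self) -> None:
        self._done_event.set()

    async def wait_done(self, timeout_s: Optional[float] = None) -> bool:
        """Return True if the job finished within timeout_s, False on timeout."""
        try:
            await asyncio.wait_for(self._done_event.wait(), timeout=timeout_s)
        except asyncio.TimeoutError:
            return False
        return True


async def demo() -> tuple[bool, bool]:
    job = Job()
    first = await job.wait_done(timeout_s=0.01)  # nothing has finished the job yet
    asyncio.get_running_loop().call_later(0.01, job.mark_done)
    second = await job.wait_done(timeout_s=1.0)
    return first, second
```

A wrapper like this would let `wait_for_job` stop reaching into `_done_event` directly.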

Test plan

Review history (subagent-driven)

  1. Implementer 996451c — initial commit.
  2. Spec compliance: 1 deviation found. _PrinterLike.print_image was declared with **kwargs: Any instead of *, tape_mm: int, **options: Any (the fallback signature was used unnecessarily, and mypy did not flag it).
  3. Code quality: APPROVED_WITH_NITS — recommended tightening the Protocol, adding tape_mm guard, UUID validation, two TODO(phase5) comments, and resume_job invariant doc.
  4. Fix commit c461174 — Protocol tightened, guards added, UUID test strict, TODOs + invariant doc.
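
The guards added in the fix commit follow a common pattern: raise inside the worker's `try` so the generic handler records the failure. A minimal sketch, assuming a simplified `Job` with string states and a hypothetical `process` helper in place of the real `_worker` body:

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    tape_mm: Optional[int]
    image_payload: Optional[bytes]
    state: str = "queued"
    error_msg: Optional[str] = None


async def process(job: Job, print_image) -> None:
    """Run one job; any failure before or during printing ends in FAILED."""
    job.state = "printing"
    try:
        if job.tape_mm is None:
            raise RuntimeError("job has no tape_mm")
        if job.image_payload is None:
            raise RuntimeError("job has no image payload")
        await print_image(job.image_payload, tape_mm=job.tape_mm)
    except asyncio.CancelledError:
        # CancelledError is not an Exception subclass on Python 3.8+, so the
        # handler below would not catch it; the explicit re-raise documents intent.
        raise
    except Exception as exc:
        job.state = "failed"
        job.error_msg = str(exc)
    else:
        job.state = "completed"
```

Routing the guard through the existing exception path means a malformed job produces a readable `error_msg` instead of crashing the worker task.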

Linked plan

docs/superpowers/plans/2026-05-11-label-printer-hub.md Task 2.8 split B/B. With this PR Phase 2 logic-foundation (FSM + Queue) is complete. Real PT/QL printer plugins (Tasks 2.1 / 2.2) will plug into _PrinterLike and the registry from PR #48.

strausmann and others added 2 commits May 11, 2026 08:53
…/resume/cancel/retry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…m/payload, document queue invariants

- Fix A (I2): replace **kwargs with explicit tape_mm: int kw-only + **options
  in _PrinterLike.print_image so mypy strict enforces the signature
- Fix B (I1): guard job.tape_mm is None in _worker before print_image call;
  raises RuntimeError → worker's except Exception → FAILED with clear error_msg
- Fix C (M3): replace loose len(job_id) >= 8 assert with uuid.UUID(job_id)
  for a proper UUID round-trip validation
- Fix D (M1): document resume_job stale-reference + qsize +1 invariant
- Fix E (M2/M4): add TODO(phase5) comments on _jobs eviction and
  wait_for_job private _done_event access

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 11, 2026 09:02
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request establishes the core infrastructure for a printer job queue, enabling reliable asynchronous job processing for label printers. By implementing a dedicated worker-per-printer model, the system ensures serialized job execution and provides robust control mechanisms for job and printer states. This work completes the logic-foundation phase for the printer hub, setting the stage for upcoming model-specific printer plugins.

Highlights

  • PrintQueue Implementation: Introduced an asynchronous PrintQueue service that manages per-printer worker tasks, enabling serialized job execution for devices lacking on-device queuing.
  • Job Control Features: Added comprehensive job lifecycle management including submission, cancellation, pausing, resuming, and retrying failed jobs.
  • Printer Worker Control: Implemented printer-level pause and resume functionality using asyncio.Event to control worker loops without disrupting in-flight jobs.
  • Protocol Enforcement: Defined a _PrinterLike protocol to ensure strict type safety for printer plugins, requiring explicit tape_mm parameters.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a per-printer asynchronous work queue service and associated unit tests to manage print job serialization and lifecycle. The feedback identifies several critical issues: the current shutdown process violates the Brother specification by potentially cancelling jobs mid-print, a race condition exists in the worker loop when a printer is paused while idle, and CPU-intensive image encoding is blocking the event loop. Additionally, there are performance concerns regarding the O(N) complexity of queue operations due to unbounded job storage.

Comment thread backend/app/services/print_queue.py Outdated
Comment on lines +93 to +94
for task in self._workers.values():
    task.cancel()

high

The stop() method cancels worker tasks immediately. If a worker is currently executing printer.print_image, this will interrupt the raster stream mid-print. According to the Repository Style Guide (Priority 22), mid-print cancels are forbidden by the Brother spec. Consider implementing a graceful shutdown (e.g., using a sentinel value like None in the queue) to allow the current job to finish before the task exits.

References
  1. Mid-print cancel is forbidden by the Brother spec; flagging it is a high priority. (link)
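
One way to implement the suggested graceful shutdown is a sentinel plus a bounded wait, which is also the approach the follow-up commit in this PR describes. The sketch below is not that commit's code: `worker`, `stop`, and `demo` are hypothetical free functions, and `asyncio.sleep` stands in for `printer.print_image`.

```python
import asyncio


async def worker(queue: asyncio.Queue, printed: list[str]) -> None:
    while True:
        job = await queue.get()
        if job is None:  # sentinel: exit between jobs, never mid-print
            return
        await asyncio.sleep(0.05)  # stand-in for the raster stream to the printer
        printed.append(job)


async def stop(task: asyncio.Task, queue: asyncio.Queue, timeout_s: float = 30.0) -> None:
    """Ask the worker to exit after its current work, with a bounded wait."""
    queue.put_nowait(None)
    try:
        # wait_for cancels the task itself if the timeout elapses.
        await asyncio.wait_for(task, timeout=timeout_s)
    except asyncio.TimeoutError:
        pass


async def demo() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    printed: list[str] = []
    task = asyncio.create_task(worker(queue, printed))
    queue.put_nowait("label-1")
    await asyncio.sleep(0.01)  # the worker is now mid-print
    await stop(task, queue, timeout_s=1.0)
    return printed
```

Note that a sentinel placed at the tail also lets jobs queued ahead of it drain; whether that is desirable on shutdown is a separate design decision.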

Comment thread backend/app/services/print_queue.py Outdated
Comment on lines +246 to +249
if self._worker_states[printer_id] == PrinterWorkerState.PAUSED:
    await self._worker_resume_events[printer_id].wait()

job = await queue.get()

high

There is a race condition in the worker loop. If the worker is idle at await queue.get() and the printer is paused, the worker will not re-check the pause state after popping the next job. This allows a job submitted after the pause to begin printing immediately. The pause check should be performed (or repeated) after queue.get() returns.

Suggested change
- if self._worker_states[printer_id] == PrinterWorkerState.PAUSED:
-     await self._worker_resume_events[printer_id].wait()
- job = await queue.get()
+ job = await queue.get()
+ # Re-check pause state to handle the race where the printer was paused
+ # while the worker was idle/waiting for a job.
+ if self._worker_states[printer_id] == PrinterWorkerState.PAUSED:
+     await self._worker_resume_events[printer_id].wait()
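
The race and its fix can be reproduced in isolation. This is a reduced sketch, assuming a hypothetical `PausableWorker` that uses only the resume event (set means active) and omits the separate worker-state enum from the PR.

```python
import asyncio


class PausableWorker:
    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue()
        self.resume_event = asyncio.Event()
        self.resume_event.set()  # initially active
        self.printed: list[str] = []

    def pause(self) -> None:
        self.resume_event.clear()

    def resume(self) -> None:
        self.resume_event.set()

    async def run(self) -> None:
        while True:
            job = await self.queue.get()
            # Check AFTER the pop: a pause issued while the worker was idle in
            # queue.get() is still honoured before the job starts printing.
            await self.resume_event.wait()
            self.printed.append(job)


async def demo() -> tuple[list[str], list[str]]:
    worker = PausableWorker()
    task = asyncio.create_task(worker.run())
    await asyncio.sleep(0)  # let the worker go idle inside queue.get()
    worker.pause()
    worker.queue.put_nowait("label-1")
    await asyncio.sleep(0.01)
    while_paused = list(worker.printed)
    worker.resume()
    await asyncio.sleep(0.01)
    after_resume = list(worker.printed)
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return while_paused, after_resume
```

Moving the `wait()` call above `queue.get()` in `run` would let `label-1` print while paused, which is the behaviour the review comment flags.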

Comment thread backend/app/services/print_queue.py Outdated
if printer_id not in self._queues:
    raise KeyError(f"Unknown printer: {printer_id}")
buf = BytesIO()
image.save(buf, format="PNG")

medium

Encoding an image to PNG is a CPU-intensive operation that blocks the event loop. Per the Repository Style Guide (Rule 33), synchronous blocking operations should be avoided in async functions. Use asyncio.to_thread to offload this to a worker thread.

Suggested change
- image.save(buf, format="PNG")
+ await asyncio.to_thread(image.save, buf, format="PNG")
References
  1. Synchronous I/O (or blocking operations) in async functions should be flagged. (link)
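
A self-contained sketch of the offloading pattern, wrapped in a helper named after the `_serialize_image_to_png` function mentioned in the follow-up commit. The helper body is an assumption, and `FakeImage` is a hypothetical stand-in for a Pillow image so the example runs without Pillow installed.

```python
import asyncio
from io import BytesIO
from typing import Any


def _serialize_image_to_png(image: Any) -> bytes:
    """Blocking helper: encode an image-like object to PNG bytes."""
    buf = BytesIO()
    image.save(buf, format="PNG")
    return buf.getvalue()


async def encode_off_loop(image: Any) -> bytes:
    # Run the blocking encode in a worker thread so other tasks keep running.
    return await asyncio.to_thread(_serialize_image_to_png, image)


class FakeImage:
    """Stand-in exposing the same save(fp, format=...) call shape as a Pillow image."""

    def save(self, fp: BytesIO, format: str) -> None:
        fp.write(b"\x89PNG\r\n\x1a\n")  # PNG signature only, not a real image
```

Keeping the buffer creation inside the helper means the thread owns the `BytesIO` for its whole lifetime, and only immutable bytes cross back to the event loop.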

        raise KeyError(f"Unknown printer: {printer_id}")
    non_terminal = (JobState.QUEUED, JobState.PAUSED, JobState.PRINTING)
    return [
        j for j in self._jobs.values() if j.printer_id == printer_id and j.state in non_terminal

medium

Iterating over self._jobs.values() in list_queue (and clear_queue) results in $O(N)$ complexity, where $N$ is the total number of jobs ever submitted. As noted in the TODO on line 74, _jobs grows unbounded. This will become a performance bottleneck over time. Consider maintaining a per-printer collection of active job IDs or implementing the eviction strategy sooner.
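
The per-printer collection suggested here could be as small as the index below. It is a sketch under assumptions: `ActiveJobIndex`, its method names, and the lowercase state strings are hypothetical and not part of this PR, which only annotates the O(N) cost.

```python
TERMINAL_STATES = frozenset({"completed", "failed", "cancelled"})


class ActiveJobIndex:
    """Tracks non-terminal job IDs per printer so listing avoids scanning all jobs."""

    def __init__(self) -> None:
        self._printer_of: dict[str, str] = {}
        # Inner dicts are used as insertion-ordered sets of job IDs.
        self._active: dict[str, dict[str, None]] = {}

    def add(self, job_id: str, printer_id: str) -> None:
        self._printer_of[job_id] = printer_id
        self._active.setdefault(printer_id, {})[job_id] = None

    def on_transition(self, job_id: str, new_state: str) -> None:
        # Drop the job from the index once it can no longer appear in the queue.
        if new_state in TERMINAL_STATES:
            printer_id = self._printer_of.pop(job_id, None)
            if printer_id is not None:
                self._active[printer_id].pop(job_id, None)

    def list_queue(self, printer_id: str) -> list[str]:
        return list(self._active.get(printer_id, {}))
```

With this in place, `list_queue` and `clear_queue` would cost time proportional to the printer's active jobs rather than to every job ever submitted.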


Copilot AI left a comment

Pull request overview

Adds the backend’s per-printer asynchronous print queue/worker layer on top of the existing job lifecycle FSM, enabling serialized printing per device with job- and printer-level control (pause/resume/cancel/retry).

Changes:

  • Introduces PrintQueue with per-printer asyncio.Queue and one worker task per printer, plus job control APIs.
  • Implements printer-level pause/resume via worker state + asyncio.Event.
  • Adds unit tests covering submit/serial execution/pause+resume/clear/retry and printer pause behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| backend/app/services/print_queue.py | New per-printer queue + worker implementation with job/printer control APIs. |
| backend/tests/unit/services/test_print_queue.py | New asyncio unit tests for PrintQueue behaviors. |

Comment thread backend/app/services/print_queue.py Outdated
Comment on lines +263 to +264
image = Image.open(BytesIO(job.image_payload))
await printer.print_image(image, tape_mm=job.tape_mm, **job.options)
Comment thread backend/app/services/print_queue.py Outdated
Comment on lines +245 to +249
# Block here if the printer is paused; resume_printer() sets the event.
if self._worker_states[printer_id] == PrinterWorkerState.PAUSED:
    await self._worker_resume_events[printer_id].wait()

job = await queue.get()
Comment thread backend/app/services/print_queue.py Outdated
Comment on lines +92 to +96
async def stop(self) -> None:
    for task in self._workers.values():
        task.cancel()
    if self._workers:
        await asyncio.gather(*self._workers.values(), return_exceptions=True)
Comment on lines +218 to +231
async def clear_queue(self, printer_id: str) -> int:
    """Cancel all queued + paused jobs for a printer. Returns the count."""
    if printer_id not in self._queues:
        raise KeyError(f"Unknown printer: {printer_id}")
    cancelled = 0
    for job in self._jobs.values():
        if job.printer_id == printer_id and job.state in (
            JobState.QUEUED,
            JobState.PAUSED,
        ):
            JobStateMachine.transition(job, JobState.CANCELLED)
            cancelled += 1
    return cancelled

…age serialization

Fix A (HIGH): move pause-check to after queue.get() so a pause set while the
worker is idle at queue.get() is always honoured before the job transitions to
PRINTING. Add regression test test_queue_pause_after_idle_worker_is_respected.

Fix B (HIGH): replace cancel-only stop() with a graceful drain using a None
sentinel to wake idle workers and asyncio.wait_for() to give in-flight jobs up
to timeout_s (default 30 s) to complete before forcible cancel. Add regression
test test_queue_stop_drains_in_flight_job. Add _stopping flag so a paused
worker blocked on its resume-event exits immediately when stop() fires.

Fix C (MEDIUM): offload image.save() (submit) and Image.open() (worker) to
asyncio.to_thread() via the new module-level _serialize_image_to_png helper,
keeping the event loop unblocked for typical label payloads.

Fix D (MEDIUM): annotate list_queue() and clear_queue() with O(N) complexity
cross-reference to the existing TODO(phase5) comment on _jobs.

Update test_queue_pause_printer_blocks_worker: drop the now-stale qsize()==1
assertion (worker pops before blocking on pause); replace with a state+call-
count check that remains true under the new post-get pause loop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@strausmann strausmann merged commit dfdf6fe into main May 11, 2026
9 checks passed
github-actions Bot pushed a commit that referenced this pull request May 12, 2026
## 0.3.0 (2026-05-12)

* feat(config): pydantic-settings module with env-driven runtime configuration (#45) ([878e9e0](878e9e0)), closes [#45](#45)
* feat(integrations): AppLookupService aggregator — Phase 3 complete (#53) ([222bef4](222bef4)), closes [#53](#53)
* feat(integrations): Grocy + Spoolman lookup clients with shared NotFoundError base (#52) ([b1c9c3c](b1c9c3c)), closes [#52](#52)
* feat(integrations): LabelData schema + Snipe-IT lookup client (#51) ([3bc180f](3bc180f)), closes [#51](#51)
* feat(label-renderer): Template schema + Pillow/qrcode renderer for 1-bit label bitmaps (#54) ([fb77028](fb77028)), closes [#54](#54)
* feat(printer-models): Brother PT-Series TapeRegistry with TZe and heat-shrink specs (#47) ([7526019](7526019)), closes [#47](#47)
* feat(printer-models): Job lifecycle FSM with explicit state machine (#49) ([1a8c40e](1a8c40e)), closes [#49](#49)
* feat(printer-models): PrinterModel Protocol + ModelRegistry for plugin discovery (#48) ([2ae0e09](2ae0e09)), closes [#48](#48)
* feat(printer-models): PrintQueue worker with pause/resume/cancel/retry (#50) ([dfdf6fe](dfdf6fe)), closes [#50](#50)

[skip ci]
@strausmann strausmann deleted the feat/print-queue branch May 15, 2026 22:05