Skip to content

0008 - add job-type GPU/session schema and coverage#20

Merged
las7 merged 2 commits intomainfrom
0008-job-type-gpu-schema
Mar 16, 2026
Merged

0008 - add job-type GPU/session schema and coverage#20
las7 merged 2 commits intomainfrom
0008-job-type-gpu-schema

Conversation

@ethanbailie
Copy link
Copy Markdown
Collaborator

This PR introduces the schema and serialization support needed to represent session-capable and GPU-capable job types. It is intentionally limited to config/schema/API surface for job-type metadata, without runtime execution behavior.

What Changed

  • Added GPU config model and validation in tako_vm/config.py
    • vendor/count/device_ids rules
    • invalid combinations rejected
  • Extended job type parsing and config merge behavior in tako_vm/job_types.py
  • Updated bundled defaults in tako_vm/job_types.json
  • Exposed new fields in job type API responses in tako_vm/server/app.py
    • session_enabled
    • gpu_enabled
    • gpu_vendor
  • Included new schema fields in digest computation in tako_vm/version.py

Tests Added/Updated

  • tests/test_config.py
    • GPU config validation matrix and normalization checks
  • tests/test_job_types.py
    • nested and legacy GPU parse compatibility
    • config merge behavior and persistence boundaries
  • tests/test_version.py
    • digest changes when session/GPU fields change
    • deterministic digest for reordered gpu_device_ids
  • tests/test_api.py
    • job type endpoint includes new fields

How To Review

  1. Start with tako_vm/config.py to validate the model constraints.
  2. Review tako_vm/job_types.py for parsing/merge behavior and backward compatibility.
  3. Verify API output wiring in tako_vm/server/app.py.
  4. Confirm test coverage in:
    • tests/test_config.py
    • tests/test_job_types.py
    • tests/test_version.py

Suggested Verification

  • ruff check tako_vm tests
  • pytest tests/test_config.py tests/test_job_types.py tests/test_version.py -v

Out of Scope

  • No container runtime selection or GPU runtime enforcement yet.
  • No session lifecycle implementation yet.

@ethanbailie ethanbailie added the enhancement New feature or request label Mar 13, 2026
@ethanbailie ethanbailie marked this pull request as ready for review March 16, 2026 02:06
@ethanbailie ethanbailie requested a review from las7 as a code owner March 16, 2026 02:06
Copy link
Copy Markdown
Owner

@las7 las7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests pass, code reviewed. Config.py shows new data model for GPU workloads. Some validation is donea against names. So far this is not being used yet, however will be useful for supporting GPU workloads in the future.

@las7 las7 merged commit 326f32e into main Mar 16, 2026
8 checks passed
Copy link
Copy Markdown
Owner

@las7 las7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Notes

Overall this is clean and well-structured — schema-only changes with solid validation and test coverage. A few things worth discussing:

Questions

1. API response exposes gpu_vendor but omits gpu_count and gpu_device_ids
JobTypeResponse includes session_enabled, gpu_enabled, gpu_vendor but not gpu_count or gpu_device_ids. If a client needs to know the full GPU capabilities of a job type, they'd get an incomplete picture. Is this intentional to keep the API lean, or should those be included?

Suggestions

2. from_dict behavior change — silent key dropping
The old from_dict was cls(**data), which would raise TypeError on unknown keys. The new explicit field enumeration silently ignores unrecognized keys. This is arguably better for forward compatibility, but it could mask data issues. Consider logging a warning for unrecognized keys.

3. Spurious whitespace change in config.py
A blank line before # Validate and create config was removed — unrelated to the PR. Minor diff noise.

4. job_types.json still missing trailing newline

Coordination note

PR #21 duplicates all of this PR's schema changes but has diverged — notably, #21's validate_device_ids is missing the duplicate detection (the seen set with case-insensitive dedup) that this PR added. If #21 merges first or gets rebased carelessly, that validation could be lost. Recommend merging this PR first, then rebasing #21 to only include its runtime additions.

Looks good

  • GPU validation model is thorough (vendor normalization, mutual exclusion of count/device_ids, AMD count restriction, duplicate detection, comma rejection)
  • merge_config_job_types is clean — in-memory only, returns count
  • register / register_many with persist param — good backward-compatible API
  • Version digest correctly includes new fields with sorted(gpu_device_ids) for determinism
  • Test coverage is solid across config validation, roundtrip, merge, and digest stability

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants