@jopemachine
Member

@jopemachine jopemachine commented Jan 16, 2026

Resolves #8082 (BA-3917)

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • Installer updates including:
    • Fixtures for db schema changes
    • New mandatory config options
  • Update of end-to-end CLI integration tests in ai.backend.test
  • API server-client counterparts (e.g., manager API -> client SDK)
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation
  • Documentation
    • Contents in the docs directory
    • docstrings in public interfaces and type annotations

Summary

  • Add strawberry-based KernelV2 GraphQL types replacing legacy KernelNode
  • Introduce structured types to replace JSON scalar fields:
    • ResourceOptsGQL, ServicePortsGQL, StatusHistoryGQL, AttachedDevicesGQL, VFolderMountGQL
  • Add proper enum types: SessionTypesGQL, SessionResultGQL, MountPermissionGQL, VFolderUsageModeGQL
  • Add Relay pagination types with filtering and ordering support
  • Add KernelOrders.created_at() and KernelOrders.id() for ordering support

🤖 Generated with Claude Code


📚 Documentation preview 📚: https://sorna--8079.org.readthedocs.build/en/8079/


📚 Documentation preview 📚: https://sorna-ko--8079.org.readthedocs.build/ko/8079/

Copilot AI review requested due to automatic review settings January 16, 2026 04:49
@github-actions github-actions bot added the size:XL (500~ LoC) and comp:manager (Related to Manager component) labels Jan 16, 2026
Contributor

Copilot AI left a comment

Pull request overview

This pull request adds new strawberry-based GraphQL types for kernel management (KernelV2) to replace the legacy KernelNode.

Changes:

  • Structured GraphQL types to replace JSON scalar fields (ResourceOpts, ServicePorts, StatusHistory, AttachedDevices, VFolderMount)
  • Enum types for SessionTypes, SessionResult, MountPermission, and VFolderUsageMode
  • Relay pagination support with filtering and ordering capabilities via KernelFilterGQL and KernelOrderByGQL
  • Two new order methods in KernelOrders: created_at() and id() for ordering support

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `src/ai/backend/manager/repositories/scheduler/options.py` | Adds created_at() and id() order methods to the KernelOrders class for query ordering |
| `src/ai/backend/manager/api/gql/kernel/types.py` | Defines comprehensive KernelV2 GraphQL types with structured fields, enums, filtering, and pagination support |
| `src/ai/backend/manager/api/gql/kernel/__init__.py` | Exports the new kernel GraphQL types for use throughout the application |

case KernelOrderFieldGQL.CREATED_AT:
    return KernelOrders.created_at(ascending)
case KernelOrderFieldGQL.ID:
    return KernelOrders.id(ascending)

Copilot AI Jan 16, 2026

The to_query_order method doesn't handle all possible enum cases and has no default/fallback, which could cause a runtime error if a new enum value is added. Add a default case or raise an explicit error for unhandled values.

Suggested change
-     return KernelOrders.id(ascending)
+     return KernelOrders.id(ascending)
+ case _:
+     raise ValueError(f"Unhandled KernelOrderFieldGQL value: {self.field!r}")

Copilot uses AI. Check for mistakes.
Comment on lines 218 to 221
entries.append(
    ServicePortEntryGQL(
        name=name,
        protocol=ServicePortProtocolGQL(port_info.get("protocol", "tcp")),

Copilot AI Jan 16, 2026

Potential ValueError if the protocol string from port_info is not a valid ServicePortProtocolGQL enum value. Wrap this in a try-except or validate the value before creating the enum.

Suggested change
- entries.append(
-     ServicePortEntryGQL(
-         name=name,
-         protocol=ServicePortProtocolGQL(port_info.get("protocol", "tcp")),
+ protocol_value = port_info.get("protocol", "tcp")
+ try:
+     protocol = ServicePortProtocolGQL(protocol_value)
+ except ValueError:
+     # Fallback to the default protocol if the provided value is invalid.
+     try:
+         protocol = ServicePortProtocolGQL("tcp")
+     except ValueError:
+         # If even the default is invalid, skip this entry to avoid runtime errors.
+         continue
+ entries.append(
+     ServicePortEntryGQL(
+         name=name,
+         protocol=protocol,
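
A simpler shape for the same fallback is possible. This is a minimal sketch under stated assumptions: `Protocol` and `ServicePortEntry` are hypothetical stand-ins for `ServicePortProtocolGQL` and `ServicePortEntryGQL`, the member names are invented, and the raw data is assumed to be a mapping from service name to a `port_info` dict, which the excerpt does not confirm.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class Protocol(Enum):
    # Hypothetical stand-in for ServicePortProtocolGQL; real members may differ.
    TCP = "tcp"
    HTTP = "http"


@dataclass
class ServicePortEntry:
    # Hypothetical stand-in for ServicePortEntryGQL, reduced to two fields.
    name: str
    protocol: Protocol


def parse_service_ports(raw: dict[str, dict[str, Any]]) -> list[ServicePortEntry]:
    """Convert a name -> port_info mapping into structured entries."""
    entries: list[ServicePortEntry] = []
    for name, port_info in raw.items():
        try:
            protocol = Protocol(port_info.get("protocol", "tcp"))
        except ValueError:
            # Unknown protocol string: fall back to the default member directly.
            protocol = Protocol.TCP
        entries.append(ServicePortEntry(name=name, protocol=protocol))
    return entries
```

Referencing the default member by name (`Protocol.TCP`) cannot raise `ValueError`, so the nested `try` around the default lookup is not needed in this form.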

Comment on lines 398 to 405
return cls(
    name=data.get("name", ""),
    vfid=str(data.get("vfid", "")),
    vfsubpath=str(data.get("vfsubpath", ".")),
    host_path=str(data.get("host_path", "")),
    kernel_path=str(data.get("kernel_path", "")),
    mount_perm=MountPermissionGQL(data.get("mount_perm", "ro")),
    usage_mode=VFolderUsageModeGQL(data.get("usage_mode", "general")),

Copilot AI Jan 16, 2026

Potential ValueError if mount_perm or usage_mode values from data are not valid enum values. Add error handling or validation before creating these enum instances.

Suggested change
- return cls(
-     name=data.get("name", ""),
-     vfid=str(data.get("vfid", "")),
-     vfsubpath=str(data.get("vfsubpath", ".")),
-     host_path=str(data.get("host_path", "")),
-     kernel_path=str(data.get("kernel_path", "")),
-     mount_perm=MountPermissionGQL(data.get("mount_perm", "ro")),
-     usage_mode=VFolderUsageModeGQL(data.get("usage_mode", "general")),
+ raw_mount_perm = data.get("mount_perm", "ro")
+ if isinstance(raw_mount_perm, MountPermissionGQL):
+     mount_perm = raw_mount_perm
+ elif raw_mount_perm in {m.value for m in MountPermissionGQL}:
+     mount_perm = MountPermissionGQL(raw_mount_perm)
+ else:
+     # Fallback to default permission if an invalid value is provided.
+     mount_perm = MountPermissionGQL("ro")
+ raw_usage_mode = data.get("usage_mode", "general")
+ if isinstance(raw_usage_mode, VFolderUsageModeGQL):
+     usage_mode = raw_usage_mode
+ elif raw_usage_mode in {m.value for m in VFolderUsageModeGQL}:
+     usage_mode = VFolderUsageModeGQL(raw_usage_mode)
+ else:
+     # Fallback to default usage mode if an invalid value is provided.
+     usage_mode = VFolderUsageModeGQL("general")
+ return cls(
+     name=data.get("name", ""),
+     vfid=str(data.get("vfid", "")),
+     vfsubpath=str(data.get("vfsubpath", ".")),
+     host_path=str(data.get("host_path", "")),
+     kernel_path=str(data.get("kernel_path", "")),
+     mount_perm=mount_perm,
+     usage_mode=usage_mode,
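
The two near-identical branches in this suggestion could share one helper. This is a minimal sketch, not the PR's code: `coerce_enum` is a hypothetical name, and `MountPermission` is a stand-in for `MountPermissionGQL` with assumed member values. It relies on standard `enum` behaviour, where calling an enum class with one of its own members returns that member, so the separate `isinstance` branch is not needed.

```python
from enum import Enum
from typing import TypeVar

E = TypeVar("E", bound=Enum)


class MountPermission(Enum):
    # Hypothetical stand-in for MountPermissionGQL; real members may differ.
    READ_ONLY = "ro"
    READ_WRITE = "rw"


def coerce_enum(enum_cls: type[E], raw: object, default: E) -> E:
    """Return the member matching `raw`, or `default` if there is none."""
    try:
        # Enum lookup accepts either a raw value ("rw") or an existing member.
        return enum_cls(raw)
    except ValueError:
        return default
```

With such a helper, each field becomes a single call, for example `coerce_enum(MountPermission, data.get("mount_perm", "ro"), MountPermission.READ_ONLY)`. Whether silently falling back is preferable to surfacing the bad value is a design decision for the PR author.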

Comment on lines 276 to 279
entries.append(
    StatusHistoryEntryGQL(
        status=status,
        timestamp=datetime.fromisoformat(timestamp),

Copilot AI Jan 16, 2026

The datetime.fromisoformat() can raise ValueError if the timestamp string is not in the correct ISO format. Add error handling to gracefully handle malformed timestamps.

Suggested change
- entries.append(
-     StatusHistoryEntryGQL(
-         status=status,
-         timestamp=datetime.fromisoformat(timestamp),
+ try:
+     parsed_timestamp = datetime.fromisoformat(timestamp)
+ except ValueError:
+     # Skip entries with malformed timestamp strings
+     continue
+ entries.append(
+     StatusHistoryEntryGQL(
+         status=status,
+         timestamp=parsed_timestamp,
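
The skip-on-malformed-timestamp behaviour can be exercised on its own. This is a minimal sketch, assuming the raw status history is a mapping from status name to ISO 8601 string, which the excerpt does not confirm. `StatusHistoryEntry` and `parse_status_history` are hypothetical stand-ins for `StatusHistoryEntryGQL` and the reviewed conversion.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class StatusHistoryEntry:
    # Hypothetical stand-in for StatusHistoryEntryGQL.
    status: str
    timestamp: datetime


def parse_status_history(raw: dict[str, str]) -> list[StatusHistoryEntry]:
    """Build entries from a status -> ISO timestamp mapping, skipping bad ones."""
    entries: list[StatusHistoryEntry] = []
    for status, timestamp in raw.items():
        try:
            parsed = datetime.fromisoformat(timestamp)
        except (TypeError, ValueError):
            # ValueError: malformed string. TypeError: non-string such as None.
            continue
        entries.append(StatusHistoryEntry(status=status, timestamp=parsed))
    return entries
```

Catching `TypeError` as well covers a case the suggestion leaves open: `datetime.fromisoformat()` raises `TypeError`, not `ValueError`, when given a non-string value.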

Comment on lines 789 to 398
count: int

def __init__(self, *args, count: int, **kwargs) -> None:
    super().__init__(*args, **kwargs)
    self.count = count

Copilot AI Jan 16, 2026

The count field is defined twice - once as a class attribute and once set in __init__. This is redundant. Consider using a strawberry field with a resolver or removing the __init__ method and properly initializing the count field through the Connection initialization.

Suggested change
- count: int
- def __init__(self, *args, count: int, **kwargs) -> None:
-     super().__init__(*args, **kwargs)
-     self.count = count
+ count: int = strawberry.field(
+     description="Total number of kernels matching the filter, ignoring pagination."
+ )

@jopemachine jopemachine changed the title feat: Add KernelV2 GraphQL types with structured fields feat(BA-3917): Add KernelV2 GraphQL types with structured fields Jan 16, 2026
@jopemachine jopemachine added this to the 26.1 milestone Jan 16, 2026
@jopemachine jopemachine marked this pull request as draft January 16, 2026 05:18
@jopemachine jopemachine marked this pull request as ready for review January 16, 2026 07:26
@github-actions github-actions bot added the area:docs Documentations label Jan 16, 2026
@jopemachine jopemachine changed the title feat(BA-3917): Add KernelV2 GraphQL types with structured fields feat(BA-3917): Define KernelV2 GraphQL schema types with structured fields Jan 16, 2026
Collaborator

@HyeockJinKim HyeockJinKim left a comment

Please split this into a separate PR and write a BEP for this work.

Comment on lines 650 to 663
occupied_slots: ResourceSlotGQL = strawberry.field(
    description=dedent_strip("""
        The resource slots currently occupied by this kernel.
        Contains entries with resource types (e.g., cpu, mem, cuda.shares) and their quantities.
    """)
)
requested_slots: ResourceSlotGQL = strawberry.field(
    description=dedent_strip("""
        The resource slots originally requested for this kernel.
        May differ from occupied_slots due to scheduling adjustments.
    """)
)
occupied_shares: ResourceSlotGQL = strawberry.field(
    description="The fractional resource shares occupied by this kernel."
Collaborator

@jopemachine jopemachine modified the milestones: 26.1, 26.2 Jan 26, 2026
@jopemachine
Member Author

Please split this into a separate PR and write a BEP for this work.

This PR now reflects the contents of the BEP (see BEP-1034).

Could you please review it once more? @HyeockJinKim

@jopemachine jopemachine marked this pull request as ready for review February 3, 2026 03:26
@jopemachine jopemachine force-pushed the feat/kernel-v2-types branch 2 times, most recently from 4b96036 to 1b45b18 Compare February 3, 2026 03:41
Co-authored-by: octodog <mu001@lablup.com>
Labels

area:docs (Documentations), comp:manager (Related to Manager component), size:XL (500~ LoC)

Development

Successfully merging this pull request may close these issues.

Define KernelV2 GraphQL schema types with structured fields

4 participants