InterLink is a framework for executing Kubernetes pods on remote resources capable of managing container execution lifecycles. It consists of two main components:
- Virtual Kubelet (Virtual Node): translates Kubernetes pod execution requests into remote calls to the InterLink API server
- InterLink API Server: a modular, pluggable REST server with provider-specific plugins for different execution environments
This repository implements an InterLink plugin for Kubernetes, which allows offloading pods from a local Kubernetes cluster to a remote Kubernetes cluster.
The plugin's core responsibilities include:
- FastAPI service that receives InterLink API requests
- Kubernetes offloading logic (pods, volumes, logs, cleanup)
- Optional TCP tunnel provisioning through Helm charts (deprecated)
- `src/main.py`: ASGI import point (`app`)
- `src/app/microservice.py`: FastAPI app setup, middleware, exception handlers, router loading
- `src/app/controllers/v1/kubernetes_plugin_controller.py`: HTTP endpoints
- `src/app/services/kubernetes_plugin_service.py`: core offloading logic
- `src/app/dependencies.py`: DI wiring and Kubernetes/Helm client provisioning
- `src/app/common/config.py`: config options
- `src/app/entities/mappers.py`: InterLink <-> Kubernetes model mapping
- `src/private/config.sample.ini`: baseline runtime configuration
- `src/infr/charts/tcp-tunnel/*`: gateway/bastion Helm charts (deprecated)
- `test/infr/manifests/*`: manual test manifests
These are the code-level foundations that shape how this plugin works today.
- Runtime composition and dependency graph: `microservice.py` initializes config and logger through `dependencies.py`, loads controllers from configured API versions, registers global exception handlers, and wires optional request logging middleware from config. `dependencies.py` owns singleton construction for `Config`, logger, `KubernetesPluginService`, and third-party clients.
- Controller-to-service contract: `KubernetesPluginController` is intentionally thin and async. Endpoints delegate directly to `KubernetesPluginService`.
- Model translation boundary: InterLink-to-Kubernetes and Kubernetes-to-InterLink conversion is centralized in `entities/mappers.py`. Service code relies on these mappers instead of manual nested dict/object conversions.
- Resource scoping and traceability model: `KubernetesPluginService` scopes target namespace and object names (`_scope_ns_name`, `_scope_obj_name`) and sanitizes names for RFC1123 compliance (`_ensure_subdomain_compliance`). `_scope_metadata` injects traceability labels/annotations (`interlink.io/source.*`, `interlink.io/source.pod_uid`) that tie remote resources back to the original pod.
- Offloading lifecycle orchestration: `create_pod` follows a strict sequence: ensure namespace, create supported dependent volumes/resources, then create the remote pod. Failures trigger rollback through `delete_pod(..., rollback=True)`. Deletion is best-effort and cleanup-oriented: pod first, then tunnel resources, then scoped ConfigMaps/Secrets/PVCs.
- Volume and PVC policy: `_filter_volumes` keeps only supported volume types (`configMap`, `secret`, `emptyDir`, selected PVCs) and rewrites related references in `volumeMounts`, `env.valueFrom`, and `envFrom`. PVC offloading is opt-in via `interlink.io/remote-pvc`; PVC deletion honors `interlink.io/pvc-retention-policy`.
- Networking feature paths: Mesh support is implemented by parsing `slurm-job.vk.io/pre-exec`, extracting heredoc content, and injecting setup containers/volumes based on mesh config flags. TCP tunnel logic is deprecated but still active; install/uninstall flow must remain symmetric to avoid orphaned Helm releases and Services.
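
The scoping and traceability model above can be illustrated with a minimal sketch. It is not the plugin's actual code: the function bodies, the `offloading` prefix, the name-plus-UID scheme, and the `interlink.io/source.name` / `interlink.io/source.namespace` keys are assumptions made for illustration. Only the helper names and `interlink.io/source.pod_uid` come from the description above.

```python
import re

NAME_PREFIX = "offloading"   # hypothetical scoping prefix, not the plugin's real default
MAX_SUBDOMAIN_LEN = 253      # RFC 1123 DNS subdomain length limit


def ensure_subdomain_compliance(name: str) -> str:
    """Coerce a string into an RFC 1123 DNS subdomain (simplified rules)."""
    # Lowercase and collapse every run of disallowed characters into one dash.
    name = re.sub(r"[^a-z0-9.-]+", "-", name.lower())
    # Each dot-separated label must start and end with an alphanumeric.
    labels = [label.strip("-") for label in name.split(".")]
    name = ".".join(label for label in labels if label)
    # Truncate, then make sure the cut did not leave a trailing dash or dot.
    name = name[:MAX_SUBDOMAIN_LEN].strip("-.")
    if not name:
        raise ValueError("name has no RFC 1123 compatible characters")
    return name


def scope_ns_name(namespace: str) -> str:
    """Derive the remote namespace name from the source namespace."""
    return ensure_subdomain_compliance(f"{NAME_PREFIX}-{namespace}")


def scope_obj_name(name: str, pod_uid: str) -> str:
    """Derive a remote object name that is unique per source pod."""
    return ensure_subdomain_compliance(f"{name}-{pod_uid}")


def scope_metadata(metadata: dict, source_pod: dict) -> dict:
    """Return a copy of metadata carrying traceability annotations."""
    scoped = dict(metadata)
    annotations = dict(scoped.get("annotations") or {})
    annotations.update({
        # The two keys below are assumed instances of interlink.io/source.*
        "interlink.io/source.name": source_pod["name"],
        "interlink.io/source.namespace": source_pod["namespace"],
        "interlink.io/source.pod_uid": source_pod["uid"],
    })
    scoped["annotations"] = annotations
    return scoped
```

Routing every remote name through one sanitizer and every remote object through one metadata helper is what makes cleanup by source pod possible later, which is why new resource types should reuse the existing helpers rather than build names by hand.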
- Config precedence is: programmatic overrides -> env vars -> `config.ini`.
- When adding config keys:
  - add `Option` enum in `src/app/common/config.py`
  - wire behavior in `dependencies.py` / service layer
  - document in `src/private/config.sample.ini`
  - document in `README.md`
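
As an illustration of the precedence order, here is a minimal sketch of a layered lookup. The `Option` members, the `section.key` value format, and the environment variable naming scheme are all hypothetical; the real `Config` class in `src/app/common/config.py` may resolve keys differently.

```python
import configparser
import os
from enum import Enum


class Option(Enum):
    # Hypothetical keys in "section.key" form, for illustration only.
    LOG_LEVEL = "app.log_level"
    NAMESPACE_PREFIX = "app.namespace_prefix"


class Config:
    """Resolve options as: programmatic overrides -> env vars -> ini file."""

    def __init__(self, ini_text="", overrides=None, environ=None):
        self._parser = configparser.ConfigParser()
        self._parser.read_string(ini_text)
        self._overrides = dict(overrides or {})
        # Accept an explicit mapping so the lookup is easy to test.
        self._environ = os.environ if environ is None else environ

    def get(self, option: Option, default=None):
        # 1. Programmatic overrides win.
        if option in self._overrides:
            return self._overrides[option]
        # 2. Environment variables, e.g. "app.log_level" -> "APP_LOG_LEVEL"
        #    (assumed naming scheme).
        env_key = option.value.upper().replace(".", "_")
        if env_key in self._environ:
            return self._environ[env_key]
        # 3. Fall back to the ini file, then to the caller's default.
        section, key = option.value.split(".", 1)
        return self._parser.get(section, key, fallback=default)
```

For example, with an ini file that sets `log_level = INFO`, passing `overrides={Option.LOG_LEVEL: "DEBUG"}` yields `"DEBUG"`, while an environment entry for a different key only affects that key.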
Use this checklist when implementing or reviewing code changes.
- Place changes in the right layer: keep request/response orchestration in controllers, implementation in services, object translation in mappers, and wiring in dependencies/config.
- Preserve scoping and metadata semantics: any new namespaced or pod-related resource must use the existing scoping helpers and keep InterLink traceability labels/annotations consistent.
- Keep lifecycle operations rollback-safe: if adding new resources in create flows, add matching cleanup in delete/rollback flows and handle partial failure without breaking cleanup of remaining resources.
- Respect volume handling rules: when introducing new volume-related logic, ensure references are consistently updated across pod spec sections and cleanup behavior remains predictable.
- Follow config extension workflow: add new keys to `Option`, consume them in service/dependencies, and update both `src/private/config.sample.ini` and `README.md`.
- Keep API compatibility stable: maintain existing endpoint contracts and InterLink model shapes unless the change explicitly includes an API change.
- Logging and errors: log resource operations with resource name/namespace context, prefer structured actionable error messages, and keep error responses JSON-compatible through existing exception handling paths.
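
The rollback-safety item in the checklist can be sketched as follows. This is a simplified model, not the service's real code: `FakeCluster` is a hypothetical in-memory stand-in for the Kubernetes client, the resource kinds are reduced to ConfigMap, Secret, and Pod, and tunnel cleanup is omitted.

```python
import logging

logger = logging.getLogger("offloading-sketch")

DEPENDENT_KINDS = ("ConfigMap", "Secret")
CLEANUP_ORDER = ("Pod",) + DEPENDENT_KINDS  # pod first, then dependents


class FakeCluster:
    """Hypothetical in-memory stand-in for a Kubernetes client."""

    def __init__(self, fail_on=None):
        self.resources = set()   # holds (kind, name) tuples
        self.fail_on = fail_on   # kind whose creation should fail

    def create(self, kind, name):
        if kind == self.fail_on:
            raise RuntimeError(f"cannot create {kind}/{name}")
        self.resources.add((kind, name))

    def delete(self, kind, name):
        self.resources.remove((kind, name))  # KeyError if absent


def delete_pod(cluster, name, rollback=False):
    """Best-effort cleanup: one failed delete never blocks the others."""
    failed = []
    for kind in CLEANUP_ORDER:
        try:
            cluster.delete(kind, name)
        except KeyError:
            # Missing resources are expected when rolling back a partial create.
            level = logging.DEBUG if rollback else logging.WARNING
            logger.log(level, "%s/%s not found during cleanup", kind, name)
        except Exception:
            logger.exception("failed to delete %s/%s", kind, name)
            failed.append(kind)
    return failed


def create_pod(cluster, namespace, name):
    """Ensure namespace, create dependents, then the pod; roll back on error."""
    cluster.resources.add(("Namespace", namespace))  # idempotent "ensure"
    try:
        for kind in DEPENDENT_KINDS:
            cluster.create(kind, name)
        cluster.create("Pod", name)
    except Exception:
        delete_pod(cluster, name, rollback=True)
        raise
```

When a new resource kind is added to the create path in this sketch, it also has to be added to `CLEANUP_ORDER`; otherwise a failed create would leave it behind. The same pairing is what the checklist asks for in the real service.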
If behavior changes, update all relevant docs together:
- `README.md` (user-facing setup/feature behavior)
- `src/private/config.sample.ini` (config defaults/options)
- `test/infr/manifests/*` (example manifests)