OCPNODE-4052: Add enhancement for additional storage configuration in CRI-O #1934
Conversation
@saschagrunert: This pull request references OCPNODE-4052, which is a valid Jira issue. Warning: the referenced Jira issue has an invalid target version for the branch this PR targets: the story was expected to target the "4.22.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
FYI @ktock @GerrySeidman
/lgtm
This is awesome. Thanks for making this happen. This is particularly great for AuriStor's ACA AdditionalLayerStore. We have users who have been waiting for this capability! Again, thank you!

Note regarding "Supported Image formats and Technologies":
yuqi-zhang left a comment:
The general idea lgtm, just some comments/questions inline
> #### Additional Layer Stores Workflow
> 1. Cluster administrator identifies slow pod startup times from large images (>5GB) and installs a storage plugin (e.g., stargz-store) on target nodes via DaemonSet or MachineConfig
Curious how this could be done today. Presumably the only supported way is to use image mode to build an RHCOS image with the plugin installed?
Yeah, they could use either a DaemonSet, a MachineConfig with a systemd unit, or image mode. I clarified that in the statement.
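For the MachineConfig route, a minimal sketch could look like the following. This is illustrative only: the unit name, binary path, and mount point are assumptions, and the plugin binary is presumed to already be present on the host (e.g. via image mode):

```yaml
# Hypothetical MachineConfig running a pre-installed stargz-store binary
# as a systemd unit; binary path and mount point are assumptions.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-stargz-store
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.4.0
    systemd:
      units:
      - name: stargz-store.service
        enabled: true
        contents: |
          [Unit]
          Description=Stargz additional layer store (illustrative)
          Before=crio.service

          [Service]
          ExecStart=/usr/local/bin/stargz-store /var/lib/stargz-store/store
          Restart=always

          [Install]
          WantedBy=multi-user.target
```

The DaemonSet alternative would mount the same host path and run the plugin as a privileged pod instead of a systemd unit.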
> ```
> additionalLayerStores:
> - path: /var/lib/stargz-store
> ```
> 3. MCO generates `storage.conf`, creates a MachineConfig, and applies it to the selected pool; nodes reboot
I'm curious whether these could be drop-in files instead of modifying the main configuration file? I guess the benefit of modifying the main configuration file is that we maintain our overwrite-not-merge management of CRC objects.
(The background is that due to how the CRC->MC rendering is implemented today, we don't do a merge of configs unless they happen to touch different files/units on-disk, which is a bit of a weird behaviour detail. So if someone were to apply all 3 example CRCs to the cluster, only a subset of them would take effect, depending on how they actually render to files. Perhaps we should document it as "merge this into the main containerruntimeconfig object so you only have 1 CRC per pool still").
Support for a drop-in is on the horizon for storage.conf, but I don't think we'll get it until RHEL 10.4 or so (Podman 6 timeframe).
I added a note wrt that in the doc.
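For reference, the generated `storage.conf` change would presumably land under `[storage.options]`, along these lines. The option name matches what containers/storage and the stargz-store documentation use; the exact layout the MCO would render is an assumption:

```toml
# Illustrative sketch of the MCO-generated storage.conf entry;
# the surrounding table layout is an assumption.
[storage]
driver = "overlay"

[storage.options]
# Path derived from the ContainerRuntimeConfig additionalLayerStores entry
additionallayerstores = ["/var/lib/stargz-store/store"]
```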
> - For `additionalArtifactStores`: Update CRI-O configuration with `additional_artifact_stores` array
> 3. **Create MachineConfig**: Bundle generated configuration into a MachineConfig
> 4. **Apply to nodes**: MCO applies configuration to nodes matching the `machineConfigPoolSelector`
> 5. **Trigger reboot**: Nodes reboot to apply new storage/runtime configuration
Is a reboot required for these changes or would a reload/restart of crio be sufficient?
We could add support for reload, but we currently don't, AFAIU.
Updated to clarify that reboot is currently required, but noted that CRI-O reload/restart without reboot may be considered for future releases.
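To illustrate the artifact-store side of step 2, the generated CRI-O configuration could look roughly like this. Only the `additional_artifact_stores` key name comes from the enhancement; the file name, table placement, and example paths are assumptions:

```toml
# Hypothetical CRI-O config fragment, e.g. /etc/crio/crio.conf.d/99-artifact-stores.conf
# Table placement and paths are illustrative assumptions.
[crio.image]
additional_artifact_stores = ["/mnt/ssd-artifacts", "/var/lib/artifact-cache"]
```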
… CRI-O

This enhancement extends the ContainerRuntimeConfig API with three features for improved container storage flexibility:

- additionalLayerStores: Enable lazy image pulling via storage plugins (BYOS approach with stargz-snapshotter, nydus-storage-plugin)
- additionalImageStores: Read-only container image caches on shared or high-performance storage (NFS, SSD) for faster startup and reduced network overhead
- additionalArtifactStores: Configurable OCI artifact storage locations (SSD-backed storage, pre-populated caches, air-gapped deployments)

All features target AI/ML workload performance improvements and will ship as Tech Preview in 4.22 behind the TechPreviewNoUpgrade feature gate. Path-based configuration with graceful fallback to standard behavior on failure. Unified API design pattern across all three features.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
yuqi-zhang left a comment:
In the spirit of trying to get more enhancements merged before we start work on the API/feature, the general plan LGTM, so I'm leaving an approval from the MCO side.
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: ktock, yuqi-zhang.
@mrunalp @JoelSpeed PTAL
> pools.operator.machineconfiguration.openshift.io/worker: ""
> containerRuntimeConfig:
>   additionalImageStores:
>   - path: /mnt/nfs-images
Does this only have a path for now? If there were multiple entries in the list, what does CRI-O do? Does it balance across the stores somehow?
Yes, for now it's just a path, but in the future there may be more (distinct) configuration options. That's also one reason I decided not to merge the 3 types into one.
The underlying library does not do any deduplication, meaning we would have access to the configured store multiple times, in the order specified. I don't have a use case in mind where this would make sense, though. 🤷
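Putting that together, a full ContainerRuntimeConfig with multiple image stores would presumably have the stores consulted in the order listed. The field names follow the enhancement's examples; the object name and store paths are made up:

```yaml
# Illustrative ContainerRuntimeConfig; name and paths are assumptions.
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: additional-image-stores
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ""
  containerRuntimeConfig:
    additionalImageStores:
    # Stores are consulted in order; entries are not deduplicated.
    - path: /mnt/nfs-images
    - path: /mnt/ssd-images
```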
> // +optional
> // +listType=atomic
> // +kubebuilder:validation:MinItems=1
> // +kubebuilder:validation:MaxItems=5
Out of interest, why is this limited to 5, but the other two fields limited to 10?
Layer-level lookups are more granular than regular image/artifact accesses: when a container starts, CRI-O may need to look up multiple individual layers per image/artifact. With that in mind, there is also a higher risk of corruption and of many sequential lookups.
Besides that, the number of available plugin types for this use case is really limited. It's harder to set up, and I can't imagine a user needing more than 5 distinct additional layer stores. We only have 4 supported technologies right now.
I am not sure what I you mean by "higher risk of corruption and sequential lookups" related to multiple individual layers per image/artifact. Are you saying the ALS-processed layers are tightly bound to the container image?
TL;DR: This is not about the original comment on limits, but about the ALS itself.
Example:
- docker.io/python:3.13-alpine is processed such that all of its layers can be handled by an ALS plugin (for example via stargz conversions, or AuriStorFS Layer Volumes generated per unique Layer ID in the distributed file system)
- Container image FOO is "FROM docker.io/python:3.13-alpine", but FOO itself is not explicitly ALS-processed for the plugin
Expected behavior:
- All the layers from docker.io/python:3.13-alpine inherited by container FOO will be able to leverage the ALS plugin optimization
- The layers of FOO above docker.io/python:3.13-alpine would fall back to the default (non-ALS-optimized) processing
Namely, the ALS FUSE plugin is queried for `<root>/store/<image-ref>/<layer-digest>/diff` paths, but it is entirely up to the plugin how to interpret that path. The same content would be used across multiple .
Note: Unlike stargz, which presumably requires specially built stargz "FROM images", the AuriStor ALS works with any OCI container image manifests and layers without the need for special image builds. Additionally, its plugin is less strict: it can be configured with wildcard container image name equivalencies (i.e. github.com/your-org/* and docker.io/*), all mapped to the same namespace of Layer Volumes in the distributed file system, keyed by Layer ID.
Background: The AuriStor Additional Layer Store leverages the AuriStorFS distributed file system and its local cache manager. Unlike stargz, AuriStorFS does not require a special layer format, but it does check whether the layer content has been made available in the distributed file system (generally done in the CI/CD pipeline after the container is pushed to the registry), with an AuriStor Volume created per layer based on the Layer Digest/ID.
LGTM, will leave the labelling for other reviewers.
/lgtm
@saschagrunert: all tests passed!
> #### Single-node Deployments or MicroShift
>
> - **Single-node OpenShift (SNO)**: Supported. All three features work on SNO deployments, though the BYOS approach for layer stores may be operationally complex on single-node systems with limited resources.
> - **MicroShift**: Not supported. MicroShift does not use the Machine Config Operator, so the API is unavailable. These features are deferred for MicroShift environments.
Since MicroShift now supports multi-node deployments, should we manually configure these features in the future?
Extend the ContainerRuntimeConfig API with two features for AI/ML workload performance improvements.

Tech Preview for 4.22 behind the TechPreviewNoUpgrade feature gate.

Path-based configuration with a FUSE filesystem interface. Graceful fallback to standard behavior on failure.
PTAL @haircommander @QiWang19 @sairameshv @harche @bitoku @yuqi-zhang
API: openshift/api#2681