@kurlov
Member

@kurlov kurlov commented Jul 1, 2025

  • Adds a new Go script that generates catalog-template.yaml from a new bundles.yaml file.
  • bundles.yaml lists the operator bundle images with their versions, plus oldest_supported_version and a list of broken_versions.
    • oldest_supported_version specifies the lowest supported version. Any version or channel before oldest_supported_version is marked as deprecated.
    • broken_versions is a list of versions that should be skipped. For each broken version X.Y.Z, the script adds "skips" for all versions > X.Y.Z and < X.(Y+2).0.
  • Adds a CI check that validates that catalog-template.yaml is up to date with bundles.yaml.
  • Adds a CI job that runs the Go unit tests if the cmd folder is changed.
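
To make the broken_versions rule concrete, here is a minimal sketch in Python (not the Go code from this PR). It assumes the rule means that every version strictly between X.Y.Z and X.(Y+2).0 gets a "skips" entry naming the broken version; the helper names and the version list are hypothetical.

```python
def parse(version):
    """Turn 'X.Y.Z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def versions_skipping(broken, all_versions):
    """Return the versions that would carry a "skips" entry for `broken`:
    everything > X.Y.Z and < X.(Y+2).0 (assumed reading of the rule)."""
    major, minor, patch = parse(broken)
    upper = (major, minor + 2, 0)
    return [v for v in all_versions if (major, minor, patch) < parse(v) < upper]


# Hypothetical version list, for illustration only.
versions = ["4.0.0", "4.0.1", "4.0.2", "4.1.0", "4.1.1", "4.2.0", "4.2.1"]
print(versions_skipping("4.0.1", versions))
```

With 4.0.1 marked as broken, only the later 4.0.x and 4.1.x versions are selected; 4.2.0 and above fall outside the window.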

catalog-template.yaml changes:

  • It is now autogenerated
  • All YAML anchors are dropped
  • Adds schema: olm.bundle deprecation references for all versions < oldest_supported_version, so not only channels are deprecated
  • Adds the rhacs-3.63 channel
  • Dropped skips
  • Each channel keeps all previous versions within the same major version (e.g. the 3.66 channel has all versions <= 3.66.x)
  • The latest channel has all 3.X.X versions
  • The stable channel has all 4.X.X versions
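
The channel and deprecation rules above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the generator's actual logic: the function names and the version list are hypothetical, and a channel is assumed to be named rhacs-X.Y.

```python
def parse(version):
    """Turn 'X.Y.Z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def channel_versions(channel, all_versions):
    """Versions listed in channel 'rhacs-X.Y': same major X and minor <= Y."""
    major, minor = (int(p) for p in channel.split("-", 1)[1].split("."))
    return [v for v in all_versions
            if parse(v)[0] == major and parse(v)[1] <= minor]


def deprecated_versions(oldest_supported, all_versions):
    """Versions strictly below oldest_supported_version."""
    return [v for v in all_versions if parse(v) < parse(oldest_supported)]


# Hypothetical version list, for illustration only.
versions = ["3.65.0", "3.66.0", "3.66.1", "3.67.0", "4.0.0"]
print(channel_versions("rhacs-3.66", versions))
print(deprecated_versions("3.67.0", versions))
```

Note how the rhacs-3.66 channel stops at the major-version boundary: 4.0.0 is never part of a 3.x channel.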

Testing this PR's catalogs on OCP 4.19:

image: quay.io/rhacs-eng/stackrox-operator-index@sha256:9345ef4e16b463205ab9dcf4446c5a34babc8c497ad9cbbeae327fb44b2f356d
commit: ca7b4bd

Now not only channels but versions are also deprecated.

[Image: catalog_gen_test]

Test locally:

make generate-catalog-template
make clean && make valid-catalogs

@red-hat-konflux

This comment was marked as off-topic.

@kurlov kurlov changed the title WIP script to generate catalog-template.yaml DO NOT REVIEW YET: script to generate catalog-template.yaml Jul 1, 2025
@kurlov kurlov changed the title DO NOT REVIEW YET: script to generate catalog-template.yaml WIP: script to generate catalog-template.yaml Jul 1, 2025
@kurlov kurlov changed the title WIP: script to generate catalog-template.yaml Script to generate catalog-template.yaml Jul 9, 2025
@kurlov kurlov marked this pull request as ready for review July 10, 2025 12:23
@kurlov kurlov changed the title Script to generate catalog-template.yaml ROX-30064: Script to generate catalog-template.yaml Jul 10, 2025
@kurlov kurlov requested review from mclasmeier and porridge July 11, 2025 12:57
@msugakov msugakov self-requested a review July 11, 2025 13:46
Contributor

@msugakov msugakov left a comment

Scratched the surface a bit. Got enough comments for the first round of review.
Please expect more review rounds and more comments.

@kurlov kurlov requested a review from msugakov July 15, 2025 10:05
Contributor

@msugakov msugakov left a comment

I think I have completed the full pass now.

image: registry.redhat.io/advanced-cluster-security/rhacs-operator-bundle@sha256:f61189397263f05214c2d36b4dc0a71a924c2481a1e365b7fb3c71d8dfce6b27
- schema: olm.bundle
image: registry.redhat.io/advanced-cluster-security/rhacs-operator-bundle@sha256:b0590a2248d948f82e8a116e37a2be42f49a3edeb4a92d41416420ea604d5b34
- schema: olm.bundle
Contributor

@msugakov msugakov Dec 4, 2025

N.b.: once we finish editing the code, let's "smart"-diff the contents of master's catalog-template.yaml vs. the new one. The diff is quite big, so we should make sure the new file won't break customers.

Member Author

Agreed. I should also do a test deployment to see how OLM deals with it.

Contributor

Ok. Now it's time for this.
Tom and I have figured out that yq 'explode(.)' <filename> inlines all aliases/anchors, so it can be run on the catalog-template.yaml from master. Then feed the result, together with this PR's catalog-template.yaml, to some YAML differ, and we should see what has changed.

I'll do this myself, but I recommend you try it too.
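
One possible shape for such a "smart" diff, as a rough Python sketch: it assumes both files have already been exploded and loaded into lists of dicts (the loading step is left out, and the function names are hypothetical). Entries are grouped by their schema field and compared as sets, so pure reordering of top-level entries does not show up as a change.

```python
import json
from collections import defaultdict


def group_by_schema(entries):
    """Map schema name -> set of canonical JSON strings, ignoring entry order."""
    groups = defaultdict(set)
    for entry in entries:
        groups[entry.get("schema", "<none>")].add(json.dumps(entry, sort_keys=True))
    return groups


def smart_diff(old_entries, new_entries):
    """Report, per schema, which entries were removed or added."""
    old, new = group_by_schema(old_entries), group_by_schema(new_entries)
    report = {}
    for schema in sorted(set(old) | set(new)):
        removed = sorted(old[schema] - new[schema])
        added = sorted(new[schema] - old[schema])
        if removed or added:
            report[schema] = {"removed": removed, "added": added}
    return report


# Hypothetical entries: same bundles, different order, so no diff is reported.
old = [{"schema": "olm.bundle", "image": "a"}, {"schema": "olm.bundle", "image": "b"}]
new = [{"schema": "olm.bundle", "image": "b"}, {"schema": "olm.bundle", "image": "a"}]
print(smart_diff(old, new))
```

This only ignores the order of top-level entries; the order of items inside a list (for example a channel's entries) still counts as a difference, so olm.channel objects would likely still need a closer look.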

Contributor

@msugakov msugakov Dec 5, 2025

olm.deprecations look good.
olm.bundle-s look good too (some entries in the middle reordered but nothing added or removed compared to master).
olm.package no diff, hence good.
olm.channel-s were the hardest; I looked through them by eye and they seem good.


When looking at olm.package-s, I realized it's O(N^2) problem: if N is the total number of patches in the stable lineage, then the number of channels is O(N), i.e. linear in N. Each channel holds O(N) versions, so the file size grows as O(N) times O(N) -> O(N^2).
Each channel still having O(N) versions is a good thing for OLM, since it can efficiently figure out upgrades once a channel is selected, but the overall file grows quickly in size. I don't know whether that will become a problem, e.g. due to long load times, since these are file-based catalogs (not sqlite-based like the former ones).
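
A small worked example of that growth, as a Python sketch under simplifying assumptions: every minor release gets its own channel, every minor has the same number of patches, and channel i lists all versions of minors 0..i. The function name and the numbers are hypothetical.

```python
def total_channel_entries(minors, patches_per_minor):
    """Total entries across per-minor channels when channel i lists every
    version from minors 0..i, each minor having `patches_per_minor` patches."""
    return sum((i + 1) * patches_per_minor for i in range(minors))


# Doubling the number of minors roughly quadruples the total.
for minors in (5, 10, 20, 40):
    print(minors, total_channel_entries(minors, patches_per_minor=5))
```

The closed form is patches_per_minor * minors * (minors + 1) / 2, i.e. quadratic in the number of minors while each individual channel stays linear.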

O(N^2) grows quickly and suddenly, so I don't know how much time we have until this becomes a problem.
Today catalog-template.yaml is just 152 KB and the rendered catalogs are:

$ du -h ./catalog-csv-metadata/rhacs-operator/catalog.json ./catalog-bundle-object/rhacs-operator/catalog.json
12M     ./catalog-csv-metadata/rhacs-operator/catalog.json
24M     ./catalog-bundle-object/rhacs-operator/catalog.json

Not terribly bad but actually larger than I would expect them to be.

What's actionable for us here? Well, the "unpublishing" check in the FBC pipeline/Conforma forced us into this O(N^2) approach, which we did not use before. Previously, we could get away with unpublishing previous patches from the latest and stable channels, and that allowed us to stay at O(N). Maybe we should find ways to allow unpublishing deprecated versions from channels?

Let's cut a ticket to address it. Though, something I don't want to happen is that we cut a ticket and it dies in the backlog because nobody understands the problem since nobody was involved in the problem discussion.
Let me additionally ping @porridge here so that you both have a chance to share your thoughts on this.

Following up on Slack: https://redhat-internal.slack.com/archives/C05TS9N0S7L/p1764973355627099

Member Author

I think it's more precise to call the space complexity K*N, where K is the number of supported versions: we don't add a new entry to every version for each patch, only to the supported versions.

I wonder what catalog size has some big bundles? Just to have a picture what file size is still ok-ish for the OLM.

Note: I did a manual test with the bundle from this PR, and the bundle looks good in the OLM UI. I've updated the GIF attached to this PR. Since manual testing is fine, I am going to merge this PR.

Collaborator

@msugakov

When looking at olm.package-s, I realized it's O(N^2) problem

Did you mean

When looking at olm.channel-s, I realized it's O(N^2) problem

?

@kurlov I believe @msugakov is correct: the total size of the olm.channel objects is O(N²/2), which is the same as O(N²).

I used this guide and the following shell snippet to compute a polynomial regression of the combined size of the olm.channel rhacs-X.Y objects, in order to estimate the size in ~10 years (assuming ~4 minor releases per year). The computer said ~200KB, which is not too bad, I think.

# List the names of the per-minor channels (rhacs-X.Y).
minors=$(jq -r 'select(.schema=="olm.channel" and (.name | startswith("rhacs-"))) | .name' catalog-csv-metadata/rhacs-operator/catalog.json)
# Print each channel name followed by the byte size of its pretty-printed object.
for m in $minors; do echo -n "$m "; jq --arg m "$m" 'select(.schema=="olm.channel" and .name == $m)' catalog-csv-metadata/rhacs-operator/catalog.json | wc -c; done
[Figure: Size of olm channel objects]

FTR, I ignored 4.7-4.9 since the growth there was sub-linear, I guess because they have not lived as long as the older channels yet.
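
For readers who want to repeat the estimate without the guide, here is a minimal Python sketch of a degree-2 polynomial regression. It is a plain least-squares fit via the normal equations, not necessarily the method from the guide. The data points are synthetic and only illustrate the mechanics; real input would be the (channel index, byte size) pairs printed by the shell snippet.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x^2 via the normal equations."""
    # Power sums S[k] = sum(x^k) and right-hand sides T[k] = sum(y * x^k).
    S = [sum(x ** k for x in xs) for k in range(5)]
    T = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x4 matrix for the unknowns (a, b, c).
    m = [[S[i + j] for j in range(3)] + [T[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    a, b, c = coeffs
    return a + b * x + c * x * x


# Synthetic sizes (bytes) for channel indices 1..8, for illustration only.
xs = list(range(1, 9))
ys = [100 + 20 * x + 3 * x * x for x in xs]
coeffs = fit_quadratic(xs, ys)
print(coeffs, predict(coeffs, 48))
```

Extrapolating this far beyond the observed range is sensitive to which points are included, which is presumably why the youngest channels were left out above.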

Collaborator

I wonder what catalog size has some big bundles? Just to have a picture what file size is still ok-ish for the OLM.

I thought we would be able to find those in the ultimate index images like registry.redhat.io/openshift4/ose-operator-registry-rhel9:v4.19, but I'm not finding catalogs there.

[operator-index]$ opm migrate registry.redhat.io/redhat/redhat-operator-index:v4.21 ./catalog-migrate
INFO[0000] rendering index "registry.redhat.io/redhat/redhat-operator-index:v4.21" as file-based catalog 
INFO[0050] wrote rendered file-based catalog to "./catalog-migrate" 
[operator-index]$ ncdu catalog-migrate
24.6 MiB [#################################] /ansible-automation-platform-operator
16.1 MiB [#####################            ] /amq-broker-rhel8
 8.8 MiB [###########                      ] /rhacs-operator
 4.3 MiB [#####                            ] /cryostat-operator
 3.3 MiB [####                             ] /amq-streams
 3.1 MiB [####                             ] /amq-broker-rhel9
 2.2 MiB [##                               ] /openshift-gitops-operator
 1.9 MiB [##                               ] /tempo-product
 1.7 MiB [##                               ] /quay-operator
 1.5 MiB [#                                ] /devspaces
 1.4 MiB [#                                ] /quay-bridge-operator
 1.3 MiB [#                                ] /datagrid
 1.3 MiB [#                                ] /service-registry-operator
 1.1 MiB [#                                ] /kubevirt-hyperconverged
 1.1 MiB [#                                ] /businessautomation-operator
[...]

Contributor

@msugakov msugakov Dec 8, 2025

Oh, thanks! opm migrate did not occur to me.
It's interesting that we are the third largest. Clearly, most of the space isn't occupied by the olm.channel-s.

Here's a little study of this:

$ opm migrate registry.redhat.io/redhat/redhat-operator-index:v4.19 ./
$ cd rhacs-operator

$ cat catalog.json | python3 -c 'import sys; size=len(sys.stdin.buffer.read()); print(f"{size:,}")'
11,665,037

$ jq '.' catalog.json | python3 -c 'import sys; size=len(sys.stdin.buffer.read()); print(f"{size:,}")'
9,129,759
# 22% less after jq, but that's ok

# Packages:
$ jq 'select(.schema == "olm.package")' catalog.json | python3 -c 'import sys; size=len(sys.stdin.buffer.read()); print(f"{size:,}")'
11,520

# Channels:
$ jq 'select(.schema == "olm.channel")' catalog.json | python3 -c 'import sys; size=len(sys.stdin.buffer.read()); print(f"{size:,}")'
94,486

# Bundles:
$ jq 'select(.schema == "olm.bundle")' catalog.json | python3 -c 'import sys; size=len(sys.stdin.buffer.read()); print(f"{size:,}")'
9,023,753

# Here are counts:
$ jq '.schema' catalog.json | uniq -c
      1 "olm.package"
     24 "olm.channel"
    122 "olm.bundle"

Comparing that to the biggest ansible-automation-platform-operator:

$ jq '.schema' catalog.json | uniq -c 
      1 "olm.package"
      4 "olm.channel"
     62 "olm.bundle"

# Byte sizes:
# olm.package - 12,666
# olm.channel - 10,458
# olm.bundle - 18,442,021

amq-broker-rhel8, the second biggest:

$ jq '.schema' catalog.json | uniq -c
      1 "olm.package"
      3 "olm.channel"
     69 "olm.bundle"
# Byte sizes:
# olm.package - 11,974
# olm.channel - 8,960
# olm.bundle - 12,703,918

We can also compare that to our FBC file:

$ du -h ./catalog-bundle-object/rhacs-operator/catalog.json
24M     ./catalog-bundle-object/rhacs-operator/catalog.json

$ jq '.schema' ./catalog-bundle-object/rhacs-operator/catalog.json | uniq -c
      1 "olm.package"
     24 "olm.channel"
    122 "olm.bundle"
      1 "olm.deprecations"

# Byte sizes:
# olm.package - 11,520
# olm.channel - 94,486
# olm.bundle - 24,838,426
# olm.deprecations - 5,092
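
The per-schema breakdown can also be collected in a single pass. The Python sketch below assumes catalog.json is a stream of concatenated top-level JSON objects, as the jq select filters above suggest. The function names are hypothetical, and because Python's json.dumps formats output slightly differently from jq, the byte counts are approximate rather than identical to the figures above.

```python
import json
from collections import Counter


def iter_objects(text):
    """Yield each top-level JSON object from a concatenated-JSON stream."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it first.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def size_by_schema(text):
    """Return (counts, sizes): objects and pretty-printed bytes per schema."""
    counts, sizes = Counter(), Counter()
    for obj in iter_objects(text):
        schema = obj.get("schema", "<none>")
        counts[schema] += 1
        # +1 accounts for the newline a pretty-printer emits after each object.
        sizes[schema] += len(json.dumps(obj, indent=2).encode("utf-8")) + 1
    return counts, sizes


# Tiny hypothetical stream, for illustration only.
sample = '{"schema": "olm.package", "name": "p"}\n{"schema": "olm.bundle", "name": "a"}{"schema": "olm.bundle", "name": "b"}\n'
counts, sizes = size_by_schema(sample)
print(dict(counts), dict(sizes))
```

For a real file, the text would come from reading catalog.json in full, which is fine at the tens-of-megabytes sizes discussed here.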

Contributor

@msugakov msugakov Dec 8, 2025

Deprecations are not what dominates the size. Channels are also small (despite O(N^2)).
Most of the space is taken by bundles. I think I see why it happens.

One way or another, we're among the biggest ones, and it's clear that ACS catalogs will only get bigger over time.
We need to do something so that we don't forget about this and let things silently grow until the size is too big, everything breaks, and we're in an emergency. Certainly we can't stop releasing, nor should we.

We don't know how big is too big. Maybe nobody knows. We can start finding this out now, or we can postpone. If the former, the question is who "we" is: who would own this?
If we are to postpone, I suggest creating a ticket with this context (for later) and adding a simple check to the checks pipeline (now) that would fail as soon as either of the ./catalog-*/rhacs-operator/catalog.json files grows bigger than 40MB.
When this check fails, the affected engineer bumps the size limit a bit and flags to the Install team that it has to prioritize working on the mentioned ticket.

How does that sound?
If you agree, let us please do it as a separate PR (again, need owner).
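
To give an idea of how small such a check could be, here is a Python sketch. The 40MB limit is read as 40,000,000 bytes (an assumption; it could equally be MiB), and the function names are hypothetical rather than anything that exists in the repository.

```python
import glob
import os
import sys

LIMIT_BYTES = 40 * 1000 * 1000  # assumed reading of "40MB"


def oversized(pattern, limit=LIMIT_BYTES):
    """Return (path, size) pairs for files matching `pattern` above `limit`."""
    found = []
    for path in sorted(glob.glob(pattern)):
        size = os.path.getsize(path)
        if size > limit:
            found.append((path, size))
    return found


def check(pattern="./catalog-*/rhacs-operator/catalog.json", limit=LIMIT_BYTES):
    """Print offending files to stderr and return a process exit code."""
    bad = oversized(pattern, limit)
    for path, size in bad:
        print(f"{path}: {size:,} bytes exceeds the limit of {limit:,}", file=sys.stderr)
    return 1 if bad else 0
```

A pipeline step could then run the equivalent of sys.exit(check()) from the repository root; bumping the limit stays a one-line change that is visible in review.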

Member Author

@kurlov kurlov Dec 9, 2025

I agree that we need such a check. Created a ticket: https://issues.redhat.com/browse/ROX-32232

@kurlov kurlov requested a review from msugakov December 5, 2025 07:20
Contributor

@msugakov msugakov left a comment

This time, only thoughts on interdiff. We're converging.

kurlov and others added 2 commits December 5, 2025 13:40
Co-authored-by: Misha Sugakov <537715+msugakov@users.noreply.github.com>
@kurlov kurlov requested a review from msugakov December 5, 2025 14:55
@msugakov msugakov mentioned this pull request Dec 5, 2025
msugakov and others added 3 commits December 5, 2025 19:43
@kurlov kurlov requested a review from msugakov December 5, 2025 19:05
Contributor

@msugakov msugakov left a comment

Time has come to diff the yaml and do some final tests.

Contributor

@msugakov msugakov left a comment

From my side, this PR is good to be merged at any point.

@kurlov kurlov merged commit c1acf8e into master Dec 8, 2025
20 checks passed
@kurlov kurlov deleted the akurlov/add-catalog-generation-from-folder-structure branch December 8, 2025 10:24
@msugakov
Contributor

msugakov commented Dec 8, 2025

🚀
