Ensure task list is single element, introduce upgrade task#252

Merged
nephio-prow[bot] merged 37 commits into nephio-project:main from nokia:issue-892
Jul 30, 2025

Conversation

@mozesl-nokia
Collaborator

Closes #725

Quite a big PR, co-developed with @dgyorgy-nokia, implementing @kispaljr's suggestions in #725.

Done in this PR:

  • Reduce task list size to 1 element (init, edit, clone or the new upgrade)
  • Effectively deprecate all other non-initial task types (eval/render, patch, update)
  • Deprecate the porchctl rpkg update command, add a new porchctl rpkg upgrade command with basic testing
  • Introduce a new upgrade initial task, which performs a semi-custom 3-way merge on the input PackageRevision(Resource)s if the strategy is resource-merge

Known limitations:

  • The current 3-way-merge implementation will always choose the new upstream version when there is a conflict
  • CRDs must be part of the package itself in order for them to be picked up into the schema. This is relevant when merging associative lists. Associative list inference is enabled, but it is pretty basic and only works if the associative key is a name field.
  • Upgrade only works on published PackageRevisions (but this is artificially limited)
  • Remove reclone and replay.
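
To illustrate the "upstream wins on conflict" limitation, here is a minimal sketch of a 3-way merge over flat key/value maps. It is not the code in pkg/util/merge3: the function name mergeThreeWay and the flat map model are assumptions made for the example, whereas the real merge operates on KubeObjects and their schemas.

```go
package main

import "fmt"

// mergeThreeWay is a hypothetical, simplified 3-way merge over flat maps.
// original is the common ancestor, upstream is the new upstream version and
// local is the downstream copy. When both sides changed the same key,
// the upstream value wins, mirroring the limitation described above.
func mergeThreeWay(original, upstream, local map[string]string) map[string]string {
	keys := make(map[string]struct{})
	for _, m := range []map[string]string{original, upstream, local} {
		for k := range m {
			keys[k] = struct{}{}
		}
	}

	merged := make(map[string]string)
	for k := range keys {
		o, inO := original[k]
		u, inU := upstream[k]
		l, inL := local[k]
		upstreamChanged := inO != inU || o != u
		localChanged := inO != inL || o != l

		switch {
		case upstreamChanged:
			// Covers "only upstream changed" and the conflict case alike.
			// An upstream deletion removes the key from the result.
			if inU {
				merged[k] = u
			}
		case localChanged:
			// Only the downstream changed: keep the local edit.
			if inL {
				merged[k] = l
			}
		default:
			// Nobody changed the key: keep the original value.
			if inO {
				merged[k] = o
			}
		}
	}
	return merged
}

func main() {
	original := map[string]string{"replicas": "1", "image": "app:v1", "logLevel": "info"}
	upstream := map[string]string{"replicas": "2", "image": "app:v2", "logLevel": "info"}
	local := map[string]string{"replicas": "3", "image": "app:v1", "logLevel": "debug"}

	merged := mergeThreeWay(original, upstream, local)
	fmt.Println(merged["replicas"], merged["image"], merged["logLevel"])
}
```

In this sketch the local edit to logLevel survives because upstream left it alone, while the conflicting replicas value is taken from upstream.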

Not done in this PR:

  • Remove the deprecated task types
  • Extensively test new merging and upgrade
  • Configurability of the 3-way-merge
  • Clean up unnecessary task/mutation-related code
  • Remove deprecated test cases

Other changes:

  • Make targets for reloading specific components on the cluster (rebuild image, load image, restart deployment)
  • Support authentication for porchctl rpkg clone
  • Pass namespace correctly in the clone mutation
  • Utility test file to run a custom CLI e2e test based on folder name (very hacky, just meant for debugging). The repository-related params are no longer necessary since nephio#240 ("E2E testing: Test to scale up SMF (Instantiating a new POD with more resources)") is merged.
  • Utility function to add mutators to the Kptfile in the E2E tests, since eval is deprecated.
  • Smaller enhancements to kpt-function-sdk

@nephio-prow nephio-prow Bot requested review from henderiw and kispaljr May 14, 2025 16:40
@linux-foundation-easycla

linux-foundation-easycla Bot commented May 14, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

Comment thread pkg/util/merge3/merge3.go

func Merge(original, updated, destination fn.KubeObjects, additionalSchemas []byte) (fn.KubeObjects, error) {
if additionalSchemas != nil {
if err := openapi.AddSchema(additionalSchemas); err != nil {
Contributor

@nagygergo nagygergo May 14, 2025

Is the openapi state destroyed after every merge operation, or does this leak memory as the number of CRDs that a porch server has seen increases?

Collaborator Author

You can check the kyaml code here. It looks like it will not re-add the same schema for the same kind, but yes, I think it persists between merges.

If this is a concern, we can call openapi.ResetOpenAPI() after every merge, but I assume that also axes all the built-in schemas.

Contributor

@nagygergo nagygergo May 16, 2025

Was reading through the kyaml code + readme a bit.
The compiled-in k8s API version has stayed at 1.21; the docs say it should be bumped, but no one has done so.
Either Porch should onboard the OpenAPI definitions for the supported k8s versions here, or it should offer a way to onboard schemas that have no associated CRD (for example ValidatingAdmissionPolicies, a recent addition to the default k8s API).
I think this part of the kyaml implementation could use some patching up, i.e. openapi.ResetOpenAPI() should also reset globalschema.schemaInit so that the defaults are reloaded the next time some function runs SchemaForResourceType.

I also find it troublesome that the kyaml implementation maintains a single schemalock RWMutex, so only a single routine can execute.

My 2 cents: we probably need to update the compiled-in version from 1.21. The openapi state should indeed not be reset, because it operates on global variables and a reset can cause race conditions or panics between concurrent processes. We probably also need to refactor kyaml so that it allows creating local OpenAPI schema registries, so that Porch won't leak memory by storing more and more CRDs.
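
The "local schema registry" idea could look roughly like the sketch below. This is not kyaml's API: schemaRegistry and its methods are hypothetical names, and the point is only that state owned by a value, rather than by package-level variables, can be dropped after a merge without touching other callers.

```go
package main

import (
	"fmt"
	"sync"
)

// schemaRegistry is a hypothetical per-merge registry. kyaml's openapi
// package keeps comparable state in package-level variables instead.
type schemaRegistry struct {
	mu      sync.RWMutex
	schemas map[string][]byte // keyed by "group/version/kind"
}

func newSchemaRegistry() *schemaRegistry {
	return &schemaRegistry{schemas: make(map[string][]byte)}
}

// Add stores a schema unless one is already registered under the same key.
// It reports whether the schema was stored.
func (r *schemaRegistry) Add(gvk string, schema []byte) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.schemas[gvk]; exists {
		return false
	}
	r.schemas[gvk] = schema
	return true
}

// Lookup returns the schema registered under gvk, if any.
func (r *schemaRegistry) Lookup(gvk string) ([]byte, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	schema, ok := r.schemas[gvk]
	return schema, ok
}

// Len returns the number of registered schemas.
func (r *schemaRegistry) Len() int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.schemas)
}

func main() {
	// Each merge gets its own registry, so CRDs seen by one merge are
	// garbage collected with it and never visible to another merge.
	first := newSchemaRegistry()
	first.Add("example.com/v1/Widget", []byte(`{"type":"object"}`))

	second := newSchemaRegistry()
	_, found := second.Lookup("example.com/v1/Widget")
	fmt.Println(first.Len(), second.Len(), found)
}
```

The lock here protects only one registry instance, so concurrent merges that use separate registries would not contend with each other.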

@nagygergo
Contributor

General comment/question about CRD usage in packages.
Would it make sense to have a special annotation that tells Porch to start consuming the CRD for 3way merge, so this is an opt-in behavior. Also, when writing documentation for this, it would be important to highlight that CRDs that are only in the package for porch 3way merge should be marked with config.kubernetes.io/local-config: "true" for ConfigSync/Flux and whatever the equivalent of that is for Argo.
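
A rough sketch of what such an opt-in could look like follows. The annotation key porch.example.com/use-for-merge is invented for illustration and does not exist in Porch; only config.kubernetes.io/local-config comes from the comment above. The resource struct is a stand-in for real KubeObjects.

```go
package main

import "fmt"

const (
	// localConfigAnnotation is the annotation named in the comment above.
	localConfigAnnotation = "config.kubernetes.io/local-config"
	// mergeSchemaAnnotation is a made-up key for the proposed opt-in.
	mergeSchemaAnnotation = "porch.example.com/use-for-merge"
)

// resource is a minimal stand-in for a Kubernetes object in a package.
type resource struct {
	Kind        string
	Name        string
	Annotations map[string]string
}

// schemaSources returns the CRDs that explicitly opted in to being used
// as schema input for the 3-way merge.
func schemaSources(resources []resource) []resource {
	var selected []resource
	for _, r := range resources {
		if r.Kind == "CustomResourceDefinition" && r.Annotations[mergeSchemaAnnotation] == "true" {
			selected = append(selected, r)
		}
	}
	return selected
}

// isLocalConfig reports whether a resource is marked as local-only.
func isLocalConfig(r resource) bool {
	return r.Annotations[localConfigAnnotation] == "true"
}

func main() {
	resources := []resource{
		{Kind: "CustomResourceDefinition", Name: "widgets.example.com", Annotations: map[string]string{
			mergeSchemaAnnotation: "true",
			localConfigAnnotation: "true",
		}},
		{Kind: "CustomResourceDefinition", Name: "gadgets.example.com"},
		{Kind: "ConfigMap", Name: "settings"},
	}
	for _, crd := range schemaSources(resources) {
		fmt.Println(crd.Name, isLocalConfig(crd))
	}
}
```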

@nagygergo
Contributor

This is a question I should've asked before this went into implementation, but if porchctl rpkg upgrade is always creating a new packageRevision, is it possible that packageVariant controller that doesn't restrict on a specific version of the upstream package would create a new draft packageRevision each time it sees a change? Is that even a problem, or intended behavior?

@mozesl-nokia
Collaborator Author

Would it make sense to have a special annotation that tells Porch to start consuming the CRD for 3way merge, so this is an opt-in behavior.

Not a bad idea, imo.

...is it possible that packageVariant controller that doesn't restrict on a specific version of the upstream package would create a new draft packageRevision each time it sees a change?

If there is a change in the upstream it has to create a new PR with the upgrade task, so yes, if there is a constantly changing upstream, it will keep making new downstream drafts (but only if the previous downstream got published, since the upgrade inputs must be published).

Comment thread api/porch/v1alpha1/types.go
}

func loadResourcesFromDirectory(directoryPath string, mergeSourceAnnotation string) (fn.KubeObjects, error) {
exclusions, err := findExclusions(directoryPath)
Contributor

What resources are getting excluded?

Collaborator Author

Resources in subpackages

Contributor

Hm... It sounds like it has a lot of implications on how subpackages need to behave, and I won't have time to think it through today, will come up with something over the weekend if that's ok.

Contributor

This is a fully undocumented territory of Porch, but as I understand how subpackages work in Porch:

  1. You can create packages with subpackages. The rendering works the same as it works with KPT, with a depth-first render of all subpackages.
  2. If a package with subpackages is not merged to main, the subpackages don't appear on the API.
  3. If a package with subpackages is merged to main, then eventually subpackages will be discovered on main.
  4. For the subpackages, no separate versioned packages will be created.
  5. If a package with subpackages is changed (updated either via update, or manually), then the subpackages are re-rendered.

From this, I think No. 1 and No. 5 are important to keep.

Would it be possible for the 3-way merge to consider the subpackages as well? Are there any disadvantages to that?

Collaborator Author

Previously, this was hardcoded to false here so we decided to keep it.

IncludeSubPackages: false,

Curiously though, here it looks like we are only updating subpackages...

// Find all subpackages in local, upstream and original. They are sorted
// in increasing order based on the depth of the subpackage relative to the
// root package.
subPkgPaths, err := pkgutil.FindSubpackagesForPaths(pkg.Local, true,
	options.LocalPath, options.UpdatedPath, options.OriginPath)
if err != nil {
	return errors.E(op, types.UniquePath(options.LocalPath), err)
}
// Update each package and subpackage. Parent package is updated before
// subpackages to make sure auto-setters can work correctly.
for _, subPkgPath := range append([]string{"."}, subPkgPaths...) {
	isRootPkg := false
	if subPkgPath == "." && options.IsRoot {
		isRootPkg = true
	}
	localSubPkgPath := filepath.Join(options.LocalPath, subPkgPath)
	updatedSubPkgPath := filepath.Join(options.UpdatedPath, subPkgPath)
	originalSubPkgPath := filepath.Join(options.OriginPath, subPkgPath)
	err := u.updatePackage(subPkgPath, localSubPkgPath, updatedSubPkgPath, originalSubPkgPath, isRootPkg)
	if err != nil {
		return errors.E(op, types.UniquePath(localSubPkgPath), err)
	}
}

Of course, we can just set this exclusion to empty and see what happens if we want to include subpackages.

Contributor

I tested this locally: with the exclusions set to an empty array (exclusions := []string{}) and IncludeSubpackages: true, the subfolders seem to get the upgrade treatment all right.

image

I think this behaviour would be more aligned with what a user would expect, especially if we are planning to deprecate update with copy-merge, which was the simplest strategy for handling conflicts and which handled subfolders.

Comment thread pkg/cli/commands/rpkg/docs/docs.go Outdated
Comment thread pkg/cli/commands/rpkg/docs/docs.go Outdated
Comment thread pkg/engine/engine.go
Comment thread pkg/task/upgrade.go Outdated
Comment thread pkg/util/merge3/resource_matcher.go Outdated
Comment thread internal/kpt/util/update/resource-merge.go
Comment thread third_party/GoogleContainerTools/kpt-functions-sdk/go/fn/object.go
@kispaljr
Member

is it possible that packageVariant controller that doesn't restrict on a specific version of the upstream package would create a new draft packageRevision each time it sees a change? Is that even a problem, or intended behavior?

The upstream package revision is a mandatory field of the PackageVariant, so there is no such PV that "doesn't restrict on a specific version of the upstream package". You explicitly have to change the upstream package revision in the PV so that an upgrade happens, and that will create a new draft. This is exactly how PV worked before, and there is no change in this behavior now, with the only exception that upgrade will be rejected until a draft pkgrev is present in the downstream package.

@nagygergo
Contributor

You explicitly have to change the upstream package revision in the PV so that an upgrade happens, and that will create a new draft

Sorry, I worked from memory instead of testing recent versions. I remember that when the revision main was specified before, packageVariant was tracking it through updates to the main packageRevision. I tested it and it's not doing the same anymore, but I couldn't roll back far enough in history to find where that was dropped (or my memory is faulty).

@mozesl-nokia
Collaborator Author

/retest

@mozesl-nokia
Collaborator Author

Which of these threads can be resolved?

@nagygergo
#252 (comment)
#252 (comment)
#252 (comment)
#252 (comment)

@lapentad
#252 (comment)
#252 (comment)

Also, what else needs to be addressed in this PR? Should I try fixing the sonar issues? The duplication seems unavoidable unless I remove the update command altogether or refactor discover into a common place.

@mozesl-nokia
Collaborator Author

/retest

Collaborator

@efiacor efiacor left a comment

Big PR. Should really be 2: task list and 3-way merge.
I haven't tested the 3-way merge, but I did run the changes through our test-infra e2e and, as discussed, there are issues with infinite loops of commits on the packageRevision(Resources).

Comment thread api/porch/v1alpha1/types.go
Comment thread api/porch/v1alpha1/types.go Outdated
Comment thread api/porch/v1alpha1/types.go Outdated
downstream, err = r.copyPublished(ctx, downstream, pv, prList)
downstream, err = r.createEditDraft(ctx, downstream, pv, prList)
if err != nil {
klog.Errorf("package variant %q failed to copy %q: %s", pv.Name, oldDS.Name, err.Error())
Collaborator

Update the error msg to reflect the Task type?

Comment thread pkg/task/generictaskhandler.go
Comment thread pkg/task/upgrade.go Outdated
Comment thread test/e2e/cli/testdata/rpkg-update/config.yaml Outdated
Comment thread test/e2e/e2e_test.go Outdated
Comment thread test/e2e/update_test.go Outdated
Collaborator

@efiacor efiacor left a comment

Still quite a few unresolved comments. The github UI sometimes "collapses" review comments so you need to expand them.

Comment thread api/porch/v1alpha1/types.go
LocalPackageRevisionRef: porchapi.PackageRevisionRef{
Name: source.Name,
},
Strategy: porchapi.ResourceMerge,
Collaborator

Not sure. Does it mean if the user uses a PV it will always use ResourceMerge?

Comment thread pkg/engine/engine.go
@mozesl-nokia
Collaborator Author

Anybody else's opinion on this? #252 (comment)
@nagygergo @efiacor @Catalin-Stratulat-Ericsson

If nobody does, I'll revert it.

@kushnaidu kushnaidu self-requested a review July 18, 2025 10:18
@liamfallon
Member

@mozesl-nokia I wonder can we get a rebase on this PR. It's very nearly ready to go in now. Also if @efiacor @nagygergo and @lapentad could check the outstanding comments, that would be great.

@sonarqubecloud

Quality Gate failed

Failed conditions
72.2% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@efiacor
Collaborator

efiacor commented Jul 30, 2025

/approve
/lgtm

@nephio-prow nephio-prow Bot added the lgtm label Jul 30, 2025
@nephio-prow
Contributor

nephio-prow Bot commented Jul 30, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: efiacor, mozesl-nokia

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nephio-prow nephio-prow Bot added the approved label Jul 30, 2025
@nephio-prow nephio-prow Bot merged commit a7c09d4 into nephio-project:main Jul 30, 2025
10 checks passed
@mozesl-nokia
Collaborator Author

PS: My attempt at handling/mitigating the infinite commit loop discovered in the free5gc test can be found in these branches:
https://github.com/nokia/nephio-nephio/tree/issue-892-inprogress
https://github.com/nokia/nephio-porch/tree/issue-892-inprogress

My fix attempt focused on preventing porch from creating or pushing a commit if the file contents are unchanged.
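
The general shape of such a guard might look like the sketch below. It is not the code from the linked branches: contentsEqual and commitIfChanged are hypothetical names, and the package contents are modelled as a simple map from file path to file content.

```go
package main

import "fmt"

// contentsEqual reports whether two path -> content maps hold the same
// files with identical contents.
func contentsEqual(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for path, content := range a {
		other, ok := b[path]
		if !ok || other != content {
			return false
		}
	}
	return true
}

// commitIfChanged invokes commit only when next differs from current.
// It reports whether a commit was made.
func commitIfChanged(current, next map[string]string, commit func(map[string]string) error) (bool, error) {
	if contentsEqual(current, next) {
		// Nothing changed, so skip the commit and avoid retriggering watchers.
		return false, nil
	}
	if err := commit(next); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	commits := 0
	commit := func(map[string]string) error {
		commits++
		return nil
	}

	current := map[string]string{"deployment.yaml": "replicas: 1\n"}
	unchanged := map[string]string{"deployment.yaml": "replicas: 1\n"}
	changed := map[string]string{"deployment.yaml": "replicas: 2\n"}

	first, _ := commitIfChanged(current, unchanged, commit)
	second, _ := commitIfChanged(current, changed, commit)
	fmt.Println(first, second, commits)
}
```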

Labels

approved, lgtm

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PIP: Prevent PackageRevision objects to grow indefinitely

8 participants