Add ability to sync specific architectures.#3128
Add ability to sync specific architectures.#3128lukaszraczylo wants to merge 9 commits intoproject-zot:mainfrom
Conversation
| { | ||
| "prefix": "/repo4/**", | ||
| "tags": { | ||
| "excludeRegex": ".*-(amd64|arm64)$" |
There was a problem hiding this comment.
tag-based exclusion if platform info is in the image name?
There was a problem hiding this comment.
I'd say "meh". Platform info in the image name does not mean anything to be honest. I've seen ( and even accidentally released myself ) quite a few images tagged with one platform declared as part of the tag and image itself / binaries within being different platform.
|
@lukaszraczylo thanks for your PR. This has been a long standing ask from the community. |
|
@rchincha I'll work on the suggestions as well, and I'd PR it faster, but I'm using zotregistry for past two days :D |
If you able to test locally, you can iterate a lot quicker. |
e752310 to
0c1c99f
Compare
|
@rchincha sorted. I also have an idea re: resources usage improvement, but I'll leave it for the next PR to not scope creep too much :) |
|
|
||
| // ParsePlatform parses a platform string into a Platform struct | ||
| // The string can be in the format "os/arch" or just "arch" | ||
| func ParsePlatform(platform string) Platform { |
There was a problem hiding this comment.
Ok, instead of a nested map, it is a flat string with a '/' separator
Looking forward to that also. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3128 +/- ##
==========================================
- Coverage 90.79% 90.34% -0.46%
==========================================
Files 172 172
Lines 32385 32577 +192
==========================================
+ Hits 29405 29431 +26
- Misses 2241 2403 +162
- Partials 739 743 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@rchincha I resolved all the issues / suggestions from what I see :) If anything else - I'll pick it up tomorrow |
rchincha
left a comment
There was a problem hiding this comment.
Call this config-sync-platforms.json instead
rchincha
left a comment
There was a problem hiding this comment.
test/blackbox tests - just make one file right?
3eecc29 to
e34d4e1
Compare
|
@rchincha that's done. Also rewrote the commit messages to adhere to the rules |
andaaron
left a comment
There was a problem hiding this comment.
@eusebiu-constantin-petu-dbk , please review this PR
@rchincha when sync-ing an index, do we want an exact copy, or a new index containing just a subset of the synced manifests? The image would contain invalid references in the 1st case, and digest would change in the 2nd case. There is also a 3rd option of saving the new content with the old digest, but I don't know what the implications would be, since it doesn't respect the content addressability.
| tempStoreController storage.StoreController, // temp store controller | ||
| metaDB mTypes.MetaDB, | ||
| log log.Logger, | ||
| config ...*syncconf.RegistryConfig, // optional config for filtering |
| return err | ||
| } | ||
|
|
||
| manifestContent = updatedContent |
There was a problem hiding this comment.
Note this will break any referrer pointing to the original index digest.
Unless the same digest is kept in the filename, but a different one would be computed based on the new filename, are we doing this?
|
|
||
| for _, spec := range platformSpecs { | ||
| parts := strings.Split(spec, "/") | ||
| if len(parts) == 2 { |
There was a problem hiding this comment.
- Do we need to handle the variant here as well?
- Maybe we should have a platform method for matching against a different platform object (including partial matches in case filtering is done just for a subset of platform attributes of course)?
| } | ||
|
|
||
| // shouldIncludeArchitecture determines if an architecture should be included in the sync | ||
| // DEPRECATED: Use shouldIncludePlatform instead |
There was a problem hiding this comment.
Why commit this code if it is deprecated?
| if !service.shouldIncludePlatform(platform) { | ||
| platformDesc := platform.Architecture | ||
| if platform.OS != "" { | ||
| platformDesc = platform.OS + "/" + platform.Architecture |
There was a problem hiding this comment.
- Do we need to handle variant here?
- Maybe we should have a Platform method to return the os/arch/variant string for that instance.
|
Sorry for being late to the party :D. Why do you filter platforms in destination.go ? It will commit whatever it finds inside the synced repo, so if you filter upstream there's no need to filter again. Or am I missing something? Also there is an API in regclient to filter platforms, I'll make a PR based on this one. |
|
@lukaszraczylo more folks are hitting this and complaining. https://cloud-native.slack.com/archives/C03EGRE4QGH |
|
@rchincha I may try to sit down on it this weekend although I'm a little bit busy with family and baby on the way so no promises |
|
@lukaszraczylo pls do let us know if you are able to continue work on this PR. |
|
@rchincha, sure - I can resume the work, although I'm disappointed with the review style and last-minute submissions of the change requests, as they've already been working on the previous comments, making it a bit of a wild goose chase. |
@lukaszraczylo feedback noted. Just so you are aware, the contributors and maintainers of this project work from diverse locations/timezones and mostly contributing their free time, hence feels a little choppy at times - overall the goal is to continue delivering a high quality project. |
@rchincha let's sync on this before more effort is put in |
|
For example, upstream repo has: Assuming we want to sync/mirror only archA, we can either make:
(OR)
|
|
There appears to be a notion of "sparse index" which is option 1. We should probably implement that. We should fix any GC issues that pop up. |
|
Ok, let's go for the sparse index. We'll need to make a list of all cases where we are walking repo content and check if we handle missing references nicely. |
|
@lukaszraczylo I have merged #3503 today. |
|
@lukaszraczylo Are you still working on this? Would love to limit the data we pull from docker hub as well :) |
|
Hi, I'm a new zot user, then I did find this issue/pr |
What type of PR is this?
Feature
Which issue does this PR fix:
Adds ability to specify the images architecture for mirroring.
What does this PR do / Why do we need it:
Instead of pulling 10s of GB of images for different architectures, you can now pull only architectures you use.
If an issue # is not available please add repro steps and logs showing the issue:
Testing done on this change:
Automation added to e2e:
Will this break upgrades or downgrades?
Meh.
Does this PR introduce any user-facing change?:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.