AGENT-1416: Add default NodeDisruptionPolicy for IRI #5683
bfournie wants to merge 1 commit into openshift:main
Avoid a node reboot upon deletion of the IRI resource, as it's not necessary. Added a "None" action policy for the following, which are affected by the IRI resource deletion:
Files:
1. /etc/iri-registry - TLS certificates directory
2. /usr/local/bin/load-registry-image.sh - Registry loading script
3. /var/lib/iri-registry - Registry data directory and subdirs
Units:
1. iri-registry.service - The systemd service that runs the registry
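For illustration, the entries described above can be sketched as plain Go data. This is a minimal sketch, not the PR's code: the types `ActionType`, `Action`, `FilePolicy`, and `UnitPolicy` and the function `iriDefaultPolicies` are hypothetical local stand-ins that only mirror the shape of the `opv1` structs visible in the review diff.

```go
package main

import "fmt"

// Hypothetical stand-ins mirroring the shape of the opv1 types in the diff.
// They are NOT the real openshift/api definitions.
type ActionType string

const NoneAction ActionType = "None"

type Action struct {
	Type ActionType
}

type FilePolicy struct {
	Path    string
	Actions []Action
}

type UnitPolicy struct {
	Name    string
	Actions []Action
}

// noneActions returns a fresh single-element action list of type "None".
func noneActions() []Action {
	return []Action{{Type: NoneAction}}
}

// iriDefaultPolicies lists the files and units named in the PR description,
// each mapped to a single "None" action.
func iriDefaultPolicies() ([]FilePolicy, []UnitPolicy) {
	files := []FilePolicy{
		{Path: "/etc/iri-registry", Actions: noneActions()},
		{Path: "/usr/local/bin/load-registry-image.sh", Actions: noneActions()},
		{Path: "/var/lib/iri-registry", Actions: noneActions()},
	}
	units := []UnitPolicy{
		{Name: "iri-registry.service", Actions: noneActions()},
	}
	return files, units
}

func main() {
	files, units := iriDefaultPolicies()
	for _, f := range files {
		fmt.Printf("file %s -> %s\n", f.Path, f.Actions[0].Type)
	}
	for _, u := range units {
		fmt.Printf("unit %s -> %s\n", u.Name, u.Actions[0].Type)
	}
}
```

Keeping the entries in one function makes it easy to compare them against the `nodeDisruptionPolicyStatus` reported by the cluster.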
@bfournie: This pull request references AGENT-1416, which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/jira refresh
@bfournie: This pull request references AGENT-1416, which is a valid jira issue.
/cc @andfasano
@bfournie I think only the first point is relevant to test as part of this task.
@bfournie: The following tests failed, say
Testing confirmed that the new policies have been added. Applied a test file to /var/lib/iri-registry/ and confirmed no node reboot.
/lgtm |
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: bfournie, zaneb
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
    },
  },
  {
    Path: "/var/lib/iri-registry",
Q: since there's no part in the IRI machine configs about /var/lib/iri-registry (that part will be handled by the daemon directly), is that really required?
(Not sure if the SpecialStatusAction may be required instead?)
    Name: "iri-registry.service",
    Actions: []opv1.NodeDisruptionPolicyStatusAction{
      {
        Type: opv1.NoneStatusAction,
I'm not really sure this is going to work? I was expecting at least a ReloadStatusAction
        Type: opv1.NoneStatusAction,
      },
    },
  },
What about the IRI TLS CA cert added to the node trust store? IIRC (please @djoshy keep me honest) that may force a node reboot in any case.
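The concern above can be made concrete with a small sketch. It assumes, purely for illustration, that a changed file is matched against a policy on its own path, then on each parent directory, and that anything unmatched falls back to a reboot. The helper `resolveAction`, the string action names, and the path `/etc/example/unlisted.conf` are hypothetical and are not taken from the machine-config-operator code.

```go
package main

import (
	"fmt"
	"path"
)

// resolveAction is a hypothetical helper. It looks for a policy on the
// changed path itself, then on each parent directory, and returns "Reboot"
// when nothing matches (an assumed default, not confirmed by the PR).
func resolveAction(policies map[string]string, changed string) string {
	for p := path.Clean(changed); ; p = path.Dir(p) {
		if action, ok := policies[p]; ok {
			return action
		}
		// Stop once the root (or "." for relative paths) has been checked.
		if p == "/" || p == "." {
			return "Reboot"
		}
	}
}

func main() {
	// Paths from the PR description, each mapped to "None".
	policies := map[string]string{
		"/etc/iri-registry":                     "None",
		"/usr/local/bin/load-registry-image.sh": "None",
		"/var/lib/iri-registry":                 "None",
	}

	for _, changed := range []string{
		"/var/lib/iri-registry/test-file",
		"/etc/example/unlisted.conf", // hypothetical path with no policy
	} {
		fmt.Printf("%s -> %s\n", changed, resolveAction(policies, changed))
	}
}
```

Under these assumptions, a deletion that also touches a path outside the listed entries would still resolve to a reboot, which is the case the question about the trust store raises.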
- What I did
Added default NodeDisruptionPolicy for InternalReleaseImage
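Added a "None" action policy for the following, which are affected by the IRI resource deletion:
Files:
1. /etc/iri-registry - TLS certificates directory
2. /usr/local/bin/load-registry-image.sh - Registry loading script
3. /var/lib/iri-registry - Registry data directory and subdirs
Units:
1. iri-registry.service - The systemd service that runs the registry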
- How to verify it
Check that the new policy is added in the nodeDisruptionPolicyStatus:
$ oc get MachineConfiguration/cluster -o yaml
Check that an IRI resource exists:
$ oc get internalreleaseimage cluster -n openshift-machine-config-operator
Delete the InternalReleaseImage resource:
$ oc delete internalreleaseimage cluster -n openshift-machine-config-operator
Check logs:
$ oc logs -n openshift-machine-config-operator | grep -iE "node disruption|iri-registry"
Confirm
- Description for the changelog
Added default NodeDisruptionPolicy for InternalReleaseImage