HDDS-14764. Allow Datanode to dynamically reconfigure SCM node addresses#9863
HDDS-14764. Allow Datanode to dynamically reconfigure SCM node addresses#9863adoroszlai merged 7 commits intoapache:masterfrom
Conversation
adoroszlai
left a comment
There was a problem hiding this comment.
Thanks @ivandika3 for the patch.
| if (reconfigurationHandler != null) { | ||
| try { | ||
| reconfigurationHandler.close(); | ||
| } catch (IOException e) { | ||
| LOG.error("DatanodeReconfigurationHandler stop failed", e); | ||
| } | ||
| } |
There was a problem hiding this comment.
Nice find about reconfigurationHandler not being closed. Should we fix it in separate patch, since it affects OM and SCM, too?
nit: can simplify?
import org.apache.hadoop.hdds.utils.IOUtils;
IOUtils.close(LOG, reconfigurationHandler);There was a problem hiding this comment.
Thanks, I'll remove it for now. Let's fix it in another patch.
Raised HDDS-14766.
| reconfigurationHandler = | ||
| new ReconfigurationHandler("DN", conf, this::checkAdminPrivilege) | ||
| new DatanodeReconfigurationHandler("DN", conf, this::checkAdminPrivilege) | ||
| .registerPrefix(ConfUtils.addKeySuffixes(OZONE_SCM_ADDRESS_KEY, true, scmServiceId)) |
There was a problem hiding this comment.
Shouldn't registerPrefix be called within if (scmServiceId != null) block?
There was a problem hiding this comment.
Yes, it should. Thanks, updated.
| return addSuffix(key, keySuffix); | ||
| String suffix = addSuffix(key, keySuffix); | ||
| if (withTrailingSeparator) { | ||
| suffix = suffix.concat("."); |
There was a problem hiding this comment.
I'd rather append . in DatanodeReconfigurationHandler.registerPrefix:
- safety: ensure that
prefixends with.instead of relying on caller to add it - simplicity: new
addKeySuffixesfunction just to append.seems overkill
| @Override | ||
| public List<String> listReconfigureProperties() throws IOException { | ||
| Set<String> reconfigureSet = new TreeSet<>(super.listReconfigureProperties()); | ||
| reconfigureSet.addAll(prefixProperties); |
There was a problem hiding this comment.
TestDatanodeReconfiguration.reconfigurableProperties needs to be updated.
https://github.com/apache/ozone/actions/runs/22660446735/job/65680352472?pr=9863#step:13:5791
There was a problem hiding this comment.
Currently TestDatanodeReconfiguration.reconfigurableProperties has null scmServiceId so it should not show up in the listReconfigureProperties after the recent commits.
adoroszlai
left a comment
There was a problem hiding this comment.
Thanks @ivandika3 for updating the patch, LGTM, just a question about a possible improvement.
| if (scmServiceId != null) { | ||
| reconfigurationHandler.register(OZONE_SCM_NODES_KEY + "." + scmServiceId, | ||
| this::reconfigScmNodes); | ||
| ((DatanodeReconfigurationHandler) reconfigurationHandler) |
There was a problem hiding this comment.
Should reconfigurationHandler be declared as DatanodeReconfigurationHandler to avoid this cast? (Also need to split instantiation and register() calls.) Or even add prefix support in ReconfigurationHandler and skip introducing the new subclass?
There was a problem hiding this comment.
Thanks, since DatanodeReconfigurationHandler evolved to supporting prefix (instead of the initial implementation that's specific to DN), let's just merge it to ReconfigurationHandler
|
Thanks @ivandika3 for the patch. |
|
Thanks @adoroszlai for the review. |
What changes were proposed in this pull request?
HDDS-13890 supports reconfiguring oozone.scm.nodes."service", but not ozone.scm.address."service"."node" since reconfiguration framework requires reconfigurable properties to specified in advance.
We can support to allow fully zero datanode restarts for SCM reconfiguration.
However, this requires two reconfigurations
If both configurations are changed at the same time, the ordering of the reconfiguration is not determined and can cause unexpected behaviors.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-14764
How was this patch tested?
Update IT.
Clean CI: https://github.com/ivandika3/ozone/actions/runs/22653730808