@@ -233,11 +233,14 @@ Potential issues
233233 .. code-block:: yaml
234234
235235 mariabackup_image_full: "{{ docker_registry }}/stackhpc/rocky-source-mariadb-server:yoga-20230310T170929"
236- - When using Octavia load balancers, restarting Neutron causes load balancers
237- with floating IPs to stop processing traffic. See `LP#2042938
238- <https://bugs.launchpad.net/neutron/+bug/2042938>`__ for details. The issue
239- may be worked around after Neutron has been restarted by detaching then
240- reattaching the floating IP to the load balancer's virtual IP.
236+ - When using Octavia load balancers, restarting Neutron causes load balancers
237+ with floating IPs to stop processing traffic. See `LP#2042938
238+ <https://bugs.launchpad.net/neutron/+bug/2042938>`__ for details. The issue
239+ may be worked around after Neutron has been restarted by detaching then
240+ reattaching the floating IP to the load balancer's virtual IP.
241+
242+ - If you are using hyper-converged Ceph, please also note the potential issues
243+ in the Storage section below.
241244
242245 Full procedure for one host
243246 ---------------------------
@@ -466,6 +469,45 @@ Potential issues
466469 be identical, now that the "maintenance mode approach" is being used.
467470 It is still recommended to do the bootstrap host last.
468471
472+ - Prior to reprovisioning the bootstrap host, it can be beneficial to back up
473+ ``/etc/ceph`` and ``/var/lib/ceph``, as sometimes the keys, config, etc.
474+ stored there will not be moved/recreated correctly.
475+
476+ - When a host is taken out of maintenance, you may see errors relating to
477+ the permissions of ``/tmp/etc`` and ``/tmp/var``. These issues should be
478+ resolved in Ceph version 17.2.7. See https://github.com/ceph/ceph/pull/50736
479+ for details. In the meantime, you can work around this by running the command
480+ below. You may need to omit one or the other of ``/tmp/etc`` and ``/tmp/var``.
481+ You will likely need to run this multiple times. Run ``ceph -W cephadm`` to
482+ monitor the logs and see when permissions issues are hit.
483+
484+ .. code-block:: console
485+
486+ kayobe overcloud host command run --command "chown -R stack:stack /tmp/etc /tmp/var" -b -l storage
487+
488+
489+ - Sometimes the Ceph containers do not come up after reprovisioning. This
490+ seems to be related to ``/var/lib/ceph`` persisting through the reprovision
491+ (e.g. seen at a customer site, in a volume with software RAID). The root
492+ cause needs further investigation. When this occurs, you will need to
493+ redeploy the daemons:
494+
495+ List the daemons on the host:
496+
497+ .. code-block:: console
498+
499+ ceph orch ps <hostname>
500+
501+
502+ Redeploy the daemons, one at a time. It is recommended that you start with
503+ the crash daemon, as this will have the least impact if unexpected issues
504+ occur.
505+
506+ .. code-block:: console
507+
508+ ceph orch daemon redeploy <daemon name>
509+
510+
469511 - Commands starting with ``ceph`` are all run on the cephadm bootstrap
470512 host in a cephadm shell unless stated otherwise.
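
The redeploy steps added in this hunk (list the daemons, then redeploy them one at a time, crash daemon first) could be scripted roughly as follows. This is only a sketch under stated assumptions, not part of the change: the helper names `order_daemons` and `redeploy_in_order` and the `DRY_RUN` variable are hypothetical, and it assumes cephadm daemon names of the form `<type>.<id>` (for example `crash.<hostname>`). By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are illustrative, not from the docs.

# Read daemon names on stdin (one per line) and print them with any
# "crash.*" daemons first; the remaining names keep their input order.
order_daemons() (
    names="$(cat)"
    printf '%s\n' "$names" | grep '^crash\.' || true
    printf '%s\n' "$names" | grep -v '^crash\.' | grep -v '^$' || true
)

# For each daemon name on stdin, print the redeploy command (default), or
# run it when DRY_RUN=0. Commands run sequentially, one daemon at a time.
redeploy_in_order() (
    order_daemons | while read -r daemon; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ceph orch daemon redeploy ${daemon}"
        else
            ceph orch daemon redeploy "${daemon}"
        fi
    done
)
```

The daemon names are the ones shown by `ceph orch ps <hostname>`; pasting them into a file and running `redeploy_in_order < daemons.txt` prints the commands in the suggested order. Checking `ceph orch ps <hostname>` between redeploys, rather than running with `DRY_RUN=0` unattended, stays closer to the "one at a time" advice.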
471513