Finally found the root cause. This patch forces grenade to set up cells v2:
https://github.com/openstack-dev/grenade/commit/ecfaa7a5a235a69a0ead8b53dd4b2807093c9cfa
After this commit the LM+grenade job started to fail every time. This is because grenade forces RABBIT_HOST to be localhost on the controller node:
https://github.com/openstack-dev/grenade/blob/b7ae1b46602e34d53b5489ba9e0a7cb73d6ea418/devstack.localrc.base#L21
https://github.com/openstack-dev/grenade/blob/b7ae1b46602e34d53b5489ba9e0a7cb73d6ea418/devstack.localrc.target#L21
So when nova sets up the cells v2 deployment, it uses localhost in transport_url and saves that URL in the cell mapping. When subnode-2 wants to start the pre-live-migration steps on the newer node, it reads the cell_mapping from the database and uses the transport URL, which points to localhost. Unfortunately RabbitMQ is not running on that node, so the LM+grenade job fails every time when starting the pre-live-migration steps.
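To illustrate why a stored transport URL containing localhost breaks on a multinode setup, here is a minimal Python sketch that flags such URLs. It is not nova or grenade code: the helper name is_loopback_transport_url, the example credentials and the example addresses are all made up for illustration, and it assumes a single-host URL of the form rabbit://user:pass@host:port/.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_transport_url(transport_url: str) -> bool:
    """Return True if the URL's host only makes sense on the local node.

    Hypothetical helper for illustration; assumes a single-host URL.
    """
    host = urlsplit(transport_url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        # Covers literal loopback addresses such as 127.0.0.1 or ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal: a regular hostname other than "localhost".
        return False


# Example values are assumptions, not taken from the failing job.
stored_in_cell_mapping = "rabbit://stackrabbit:secret@localhost:5672/"
reachable_from_subnode = "rabbit://stackrabbit:secret@10.0.0.5:5672/"

for url in (stored_in_cell_mapping, reachable_from_subnode):
    print(url, "loopback" if is_loopback_transport_url(url) else "ok")
```

A URL flagged as loopback is only usable on the node where RabbitMQ actually runs; any other node that reads it from the cell mapping will try to connect to itself.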