Tempest ssh to guest intermittently fails, "GROWROOT: NOCHANGE: partition 1 is size 2078687. it cannot be grown" seen in guest console log

Bug #1843610 reported by Matt Riedemann
Affects: OpenStack-Gate
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Seen here for example:

https://00b9edcb114e0ac8e05a-b611493cf8fd4459149d00d14c03b361.ssl.cf5.rackcdn.com/670715/15/check/cinder-tempest-dsvm-lvm-lio-barbican/bb230c6/job-output.txt

2019-09-11 11:50:09.034110 | primary | sh: write error: No space left on device
2019-09-11 11:50:09.034181 | primary | Top of dropbear init script
2019-09-11 11:50:09.034251 | primary | Starting dropbear sshd: OK
2019-09-11 11:50:09.034376 | primary | GROWROOT: NOCHANGE: partition 1 is size 2078687. it cannot be grown
2019-09-11 11:50:09.034456 | primary | resize-rootfs already run per once
2019-09-11 11:50:09.034579 | primary | /run/cirros/datasource/data/user-data was not '#!' or executable
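
For triage, the growroot message can be pulled out of a job-output.txt with a short script. This is only a sketch: the `timestamp | node | message` layout is inferred from the excerpt above, and `find_growroot_nochange` is a hypothetical helper name, not part of any OpenStack tooling.

```python
import re

# Message text as it appears in the console excerpt above.
GROWROOT_RE = re.compile(
    r"GROWROOT: NOCHANGE: partition (?P<part>\d+) is size (?P<size>\d+)\."
)


def find_growroot_nochange(lines):
    """Return (partition number, reported size) for each NOCHANGE message."""
    hits = []
    for line in lines:
        # Keep only the message column; a line without " | " is used whole.
        message = line.rsplit(" | ", 1)[-1]
        match = GROWROOT_RE.search(message)
        if match:
            hits.append((int(match.group("part")), int(match.group("size"))))
    return hits


sample = [
    "2019-09-11 11:50:09.034251 | primary | Starting dropbear sshd: OK",
    "2019-09-11 11:50:09.034376 | primary | GROWROOT: NOCHANGE: "
    "partition 1 is size 2078687. it cannot be grown",
]
print(find_growroot_nochange(sample))
```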

Note that this might not be the reason for the ssh failure into the guest. We could be hitting this in successful runs as well, but we only see it on ssh failure because that's when we dump the console log. Note that the network info was retrieved:

2019-09-11 11:50:30.311189 | primary | === network info ===
2019-09-11 11:50:30.311262 | primary | if-info: lo,up,127.0.0.1,8,,
2019-09-11 11:50:30.311377 | primary | if-info: eth0,up,10.1.0.14,28,fe80::f816:3eff:fec5:b98b/64,
2019-09-11 11:50:30.311465 | primary | ip-route:default via 10.1.0.1 dev eth0
2019-09-11 11:50:30.311561 | primary | ip-route:10.1.0.0/28 dev eth0 src 10.1.0.14
2019-09-11 11:50:30.311659 | primary | ip-route:169.254.169.254 via 10.1.0.1 dev eth0
2019-09-11 11:50:30.311749 | primary | ip-route6:fe80::/64 dev eth0 metric 256
2019-09-11 11:50:30.311864 | primary | ip-route6:unreachable default dev lo metric -1 error -101
2019-09-11 11:50:30.311952 | primary | ip-route6:ff00::/8 dev eth0 metric 256
2019-09-11 11:50:30.312068 | primary | ip-route6:unreachable default dev lo metric -1 error -101
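
As a quick sanity check on the excerpt above, the guest address, the on-link route, and the default gateway can be compared with Python's `ipaddress` module. The values are copied from the log; the variable names are illustrative only.

```python
import ipaddress

# Values copied from the "network info" excerpt above.
guest = ipaddress.ip_interface("10.1.0.14/28")  # if-info: eth0
gateway = ipaddress.ip_address("10.1.0.1")      # ip-route: default via
route = ipaddress.ip_network("10.1.0.0/28")     # ip-route: on-link subnet

address_matches_route = guest.network == route
gateway_on_link = gateway in guest.network
print(address_matches_route, gateway_on_link)
```

Both checks come out true for these values, so the guest's IPv4 configuration looks internally consistent. That says nothing about whether the address was reachable from the test node.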

We should, however, attempt to get rid of that growroot error so it's not a red herring in debugging.

19 hits in 7 days, check and gate, all failures:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22GROWROOT%3A%20NOCHANGE%3A%20partition%201%20is%20size%5C%22%20AND%20message%3A%5C%22it%20cannot%20be%20grown%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d
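
The search terms are percent-encoded inside the URL fragment, which makes the link hard to read. A small sketch using only the standard library recovers them; the variable names are illustrative, and the backslash removal is a convenience for readability rather than something the dashboard requires.

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "http://logstash.openstack.org/#dashboard/file/logstash.json"
    "?query=message%3A%5C%22GROWROOT%3A%20NOCHANGE%3A%20partition%201%20is%20size%5C%22"
    "%20AND%20message%3A%5C%22it%20cannot%20be%20grown%5C%22"
    "%20AND%20tags%3A%5C%22console%5C%22&from=7d"
)

# The dashboard parameters live in the URL fragment, after the "?".
fragment = urlsplit(url).fragment
_, _, params = fragment.partition("?")
decoded = parse_qs(params)

# Drop the backslashes that escape the quotes in the encoded query.
query = decoded["query"][0].replace('\\"', '"')
print(query)
print(decoded["from"][0])
```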

Clark Boylan (cboylan) wrote :

Note we only dump console logs during failures. It is possible that this happens on successful jobs too and isn't the cause of these failures (we just don't have that data).

That said, I think fixing errors like this (the job in question should have a 1GB boot-from-volume disk) is likely to fix bugs and avoid distracting errors when debugging underlying issues.
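
To put the reported size next to the expected disk size, here is a rough calculation. It rests on two assumptions that the report does not confirm: that growroot prints the partition size in 512-byte sectors, and that the "1GB" volume is 1 GiB.

```python
SECTOR_BYTES = 512           # assumption: size is reported in 512-byte sectors
GIB = 1024 ** 3              # assumption: the "1GB" volume is 1 GiB

partition_sectors = 2078687  # from the GROWROOT message
disk_sectors = 1 * GIB // SECTOR_BYTES

partition_bytes = partition_sectors * SECTOR_BYTES
spare_sectors = disk_sectors - partition_sectors
spare_mib = round(spare_sectors * SECTOR_BYTES / 1024 ** 2, 1)

print(partition_bytes)  # 1064287744
print(disk_sectors)     # 2097152
print(spare_sectors)    # 18465
print(spare_mib)        # 9.0
```

Under those assumptions the partition already covers all but about 9 MiB of the disk, which would be consistent with growroot finding nothing to grow into. It does not by itself explain the ssh failure.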

melanie witt (melwitt) wrote :

This looks like a duplicate of:

https://bugs.launchpad.net/openstack-gate/+bug/1808010

Not sure if/what the difference is.
