Ubuntu
juju-core package

machine unit connects to apiserver but stays in agent-state: pending

Bug #1393444 reported by JuanJo Ciarlante on 2014-11-17

This bug affects 1 person

Affects             Status   Importance  Assigned to  Milestone
juju-core           Invalid  Medium      Unassigned
juju-core (Ubuntu)  Invalid  Undecided   Unassigned

Bug Description

FYI this is the same environment as in lp#1392810 (1.18->1.19->1.20),
juju version: 1.20.11-trusty-amd64

New units deployed (to LXC over maas) stay at "agent-state: pending":
http://paste.ubuntu.com/9057045/

#1 TCP connects ok to node0:17070
- at the unit:
ubuntu@juju-machine-18-lxc-5:~$ netstat -tn
tcp 0 0 x.x.x.167:57937 x.x.x.8:17070 ESTABLISHED

- at node0:
ubuntu@node0:~$ sudo netstat -tnp|grep 167
tcp6 0 3807 x.x.x.8:17070 x.x.x.167:57937 ESTABLISHED 1993/jujud

Interesting there is that node0's socket has 3807 bytes sitting in its
TCP send queue (netstat's columns are Recv-Q then Send-Q), i.e. data
written by jujud that the unit has not yet acknowledged.
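
For reference, the two numeric columns right after the protocol in `netstat -tn` output are Recv-Q and Send-Q, in that order. A minimal Python sketch for pulling them out of a line like the one above (the `SocketQueues` type and `parse_netstat_line` helper are hypothetical names made up for illustration, not part of juju or net-tools):

```python
from typing import NamedTuple


class SocketQueues(NamedTuple):
    proto: str
    recv_q: int   # bytes received by the kernel but not yet read by the local process
    send_q: int   # bytes sent by the local process but not yet acknowledged by the peer
    local: str
    remote: str
    state: str


def parse_netstat_line(line: str) -> SocketQueues:
    """Parse one TCP line of `netstat -tn` / `netstat -tnp` output.

    Column order: Proto Recv-Q Send-Q Local-Address Foreign-Address State [PID/Program]
    """
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"not a netstat TCP line: {line!r}")
    return SocketQueues(
        proto=fields[0],
        recv_q=int(fields[1]),
        send_q=int(fields[2]),
        local=fields[3],
        remote=fields[4],
        state=fields[5],
    )


if __name__ == "__main__":
    q = parse_netstat_line(
        "tcp6 0 3807 x.x.x.8:17070 x.x.x.167:57937 ESTABLISHED 1993/jujud"
    )
    print(f"recv_q={q.recv_q} send_q={q.send_q}")
```

A Send-Q that stays non-zero on the server side means the peer is not acknowledging what was sent, which points at the network path rather than at the process failing to read.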

#2 machine-0.log:
- nothing is logged when the unit connects
(i.e. on restart of jujud-machine-18-lxc-5)

- after 4~5 minutes, the connection drops and this is logged:
2014-11-17 14:28:56 ERROR juju.state.apiserver.common resource.go:102 error stopping *apiserver.pingTimeout resource: ping timeout

JuanJo Ciarlante (jjo) on 2014-11-17

tags:	added: canonical-bootstack
tags:	added: canonical-is
summary:	- machine unit connects to apiserver but doesn't deploy service
	+ machine unit connects to apiserver but stays in agent-state: pending

Curtis Hovey (sinzui) on 2014-11-17

Changed in juju-core:
status:	New → Triaged
importance:	Undecided → High
tags:	added: upgrade-juju
tags:	added: lxc
Changed in juju-core:
milestone:	none → 1.22

JuanJo Ciarlante (jjo) wrote on 2014-11-17:

/var/log/juju/machine-18-lxc-5.log: http://paste.ubuntu.com/9057287/
NOTE: the repeated log stanzas there are due to my manual restarts.

JuanJo Ciarlante (jjo) wrote on 2014-11-17:

strace on both sides (grepped for the specific sockets): http://paste.ubuntu.com/9057691/
Mind the sub-second timestamp difference.

JuanJo Ciarlante (jjo) wrote on 2014-11-19:

This deployment has 2 metal nodes hosting LXC units (machines 0
and 18). 'juju deploy cs:ubuntu --to lxc:0' works fine, while
'--to lxc:18' was consistently failing as described above.

FYI I've worked around this by removing machine 18 back to
'maas ready' and reacquiring it from juju; all new LXC
units there now behave normally.

IMO it's still worth digging into which leftover state for that
machine was triggering this issue. I've copied a juju backup
tarball to ~natefinch in case this is feasible.

Canonical Juju QA Bot (juju-qa-bot) on 2015-01-13

Changed in juju-core:
milestone:	1.22-alpha1 → 1.23

Curtis Hovey (sinzui) on 2015-01-21

Changed in juju-core:
milestone:	1.23 → none
importance:	High → Medium

JuanJo Ciarlante (jjo) wrote on 2015-01-30:

@sinzui: closing this as invalid, as I later confirmed this to be an MTU issue.

Changed in juju-core:
status:	Triaged → Invalid
Changed in juju-core (Ubuntu):
status:	New → Invalid
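
For anyone hitting similar symptoms (TCP connects fine, small packets flow, larger writes stall), a path MTU mismatch can be checked with don't-fragment pings of decreasing size. The Python sketch below only builds the commands. It assumes Linux iputils `ping` (`-M do` sets the DF bit, `-s` sets the ICMP payload size) and plain IPv4 without IP options. The candidate MTU values and both helper names are illustrative assumptions, since this report does not record the actual MTUs involved.

```python
from typing import List

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    overhead = IPV4_HEADER + ICMP_HEADER
    if mtu <= overhead:
        raise ValueError(f"MTU {mtu} is too small for an ICMP echo request")
    return mtu - overhead


def df_ping_command(host: str, mtu: int) -> List[str]:
    """Build an iputils ping command that fails if the path MTU is below `mtu`."""
    return ["ping", "-c", "1", "-M", "do", "-s", str(icmp_payload_for_mtu(mtu)), host]


if __name__ == "__main__":
    # Hypothetical candidates; x.x.x.8 is the sanitized node0 address from this report.
    for mtu in (1500, 1450, 1400):
        print(" ".join(df_ping_command("x.x.x.8", mtu)))
```

Run from the LXC unit towards node0 (and the other way around), the largest size that still gets a reply gives the usable path MTU. If it is below the interface MTU configured on either end, large segments are silently dropped.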
