lost console output early in boot

Bug #682831 reported by Scott Moser
22
This bug affects 2 people
Affects Status Importance Assigned to Milestone
cloud-init (Ubuntu)
Fix Released
Undecided
Unassigned
Lucid
Won't Fix
Undecided
Unassigned
linux-ec2 (Ubuntu)
Invalid
Undecided
Unassigned
Lucid
Invalid
Undecided
Unassigned
plymouth (Ubuntu)
Fix Released
Medium
Unassigned
Lucid
Won't Fix
Undecided
Unassigned

Bug Description

Binary package hint: plymouth

I see some lost output in the ec2 console log when I boot via pv-grub (meaning a kernel and ramdisk is loaded from inside the instance).

Some info:
 * changing 'console=hvc0' to 'console=tty0' does not seem to change behavior
 * the "lost" messages are in /var/log/boot.log
 * disabling plymouth fixes the problem (sudo sh -c 'for x in /etc/init/plymouth*; do mv ${x} ${x}.disabled; done')
 * I found this when attempting to get lucid able to use pv-grub kernels (Bug 671103)
 * We saw a similar issue in maverick, but when adding 'console=hvc0' it went away (Bug 606373)

Related bugs:
 * bug 1235231: plymouth loses output to /dev/console (such as ci-info: messages)

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: plymouth 0.8.2-2ubuntu2.2
ProcVersionSignature: User Name 2.6.32-310.20-ec2 2.6.32.24+drm33.11
Uname: Linux 2.6.32-310-ec2 i686
Architecture: i386
CurrentDmesg: [ 13.731707] eth0: no IPv6 routers present
Date: Mon Nov 29 19:37:00 2010
DefaultPlymouth: Error: command ['readlink', '/etc/alternatives/default.plymouth'] failed with exit code 1:
Ec2AMI: ami-8c0c5cc9
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-west-1b
Ec2InstanceType: t1.micro
Ec2Kernel: aki-99a0f1dc
Ec2Ramdisk: unavailable
Lspci:

Lsusb: Error: command ['lsusb'] failed with exit code 1:
ProcCmdLine: root=UUID=ce492d03-a18c-4075-871e-1631ed7e0244 ro console=hvc0
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcModules: ipv6 243034 10 - Live 0xe7a5f000
SourcePackage: plymouth
TextPlymouth: /lib/plymouth/themes/ubuntu-text/ubuntu-text.plymouth

Revision history for this message
Scott Moser (smoser) wrote :
Revision history for this message
Scott Moser (smoser) wrote :

I'm marking 'fix released' as this is not an issue on maverick (worked around by adding 'console=hvc0' under Bug 606373)

Changed in plymouth (Ubuntu):
importance: Undecided → Medium
status: New → Fix Released
Revision history for this message
Scott Moser (smoser) wrote :

I've nominated this bug for lucid.
We've got all the support into lucid for booting via pv-grub, and recent daily images have an appropriate /boot/grub/menu.lst inside and will "just work".

I've verified today that on both i386 and amd64 early console output is lost. I've also verified that the lost data is in /var/log/boot.log

Revision history for this message
Scott Moser (smoser) wrote :

attached init wrapper

Revision history for this message
Scott Moser (smoser) wrote :
Revision history for this message
Scott Moser (smoser) wrote :
Revision history for this message
Scott Moser (smoser) wrote :

just for reference here, the i386 log accompanies the following. Note, that boot.log indeed had our missing data.

$ cat /var/log/boot.log
MY_HACK: line 13: /handler: Read-only file system
[0]: 1, could not write to /handler
init.real: console-setup main process (178) terminated with status 1
[1]: wrote to /handler after 2
[1]: 149 reports i=1
[2]: 149 reports i=2
[3]: 149 reports i=3
[4]: 149 reports i=4
[5]: 149 reports i=5
[6]: 149 reports i=6
[7]: 149 reports i=7
cloud-init running: Wed, 15 Dec 2010 22:02:12 +0000. up 8.47 seconds
init.real: ureadahead-other main process (431) terminated with status 4
[8]: 149 reports i=8
 * Starting AppArmor profiles [ OK ]
[9]: 149 reports i=9
 * Starting Postfix Mail Transport Agent postfix [10]: 149 reports i=10
                                                                         [ OK ]
[11]: 149 reports i=11
landscape-client is not configured, please run landscape-config.
ubuntu@ip-10-112-50-141:~$ uname -a
Linux ip-10-112-50-141 2.6.32-311-ec2 #23-Ubuntu SMP Thu Dec 2 11:14:40 UTC 2010 i686 GNU/Linux

Revision history for this message
Scott Moser (smoser) wrote :

Well, I think I've eliminated plymouth as the source of the problem. I did:
a.) booted ami-2c738545 with pv-grub kernel aki-407d9529
us-east-1 ami-2c738545 canonical ebs/ubuntu-lucid-daily-i386-server-20101214
us-east-1 aki-407d9529 amazon pv-grub-hd0-V1.01-i386.gz
$ cat /proc/cmdline
root=LABEL=uec-rootfs ro console=hvc0
$ dpkg -S /boot/vmlinuz-$(uname -r)
linux-image-2.6.32-310-ec2: /boot/vmlinuz-2.6.32-310-ec2

b.) wrote a MARKER message to /dev/console and added the /sbin/init hack
c.) rebooted
d.) collected console
e.) installed from maverick-updates:
libdrm-nouveau1_2.4.21-1ubuntu2.1_i386.deb
libplymouth2_0.8.2-2ubuntu5.1_i386.deb
plymouth_0.8.2-2ubuntu5.1_i386.deb
plymouth-theme-ubuntu-text_0.8.2-2ubuntu5.1_i386.deb
f.) wrote a MARKER mesage to /dev/console
g.) rebooted
h.) collected console (attached here)

The maverick plymouth with lucid kernel still shows the issue.
Next thing to try swapping out is the kernel.

Revision history for this message
Scott Moser (smoser) wrote :

After doing all of the previous comment, I replaced the lucid -ec2 kernel with maverick -virtual kernel:
$ cat /proc/cmdline
root=LABEL=uec-rootfs ro console=hvc0
$ dpkg -S /boot/vmlinuz-$(uname -r)linux-image-2.6.35-23-virtual: /boot/vmlinuz-2.6.35-23-virtual

All data got to the console.

Revision history for this message
Scott Moser (smoser) wrote :

I did some more poking, and found a lot of good information. I noticed in drivers/xen/console/console.c (kernel) the following comments:
# * Modes:
# * 'xencons=off' [XC_OFF]: Console is disabled.
# * 'xencons=tty' [XC_TTY]: Console attached to '/dev/tty[0-9]+'.
# * 'xencons=ttyS' [XC_SERIAL]: Console attached to '/dev/ttyS[0-9]+'.
# * 'xencons=xvc' [XC_XVC]: Console attached to '/dev/xvc0'.
# * 'xencons=hvc' [XC_HVC]: Console attached to '/dev/hvc0'.
# * default: XC_XVC

That lead me to try some different kernel command lines:
- cmdline: console=hvc0 (this is what works in maverick)
  result: some lost console output
  /dev/hvc0 does not exist
- cmdline: xencons=ttyS console=ttyS0
  result: no console output past kernel output
  /dev/ttyS0 exists
  input output error on try to write to /dev/console
- cmdline: console=ttyS0
  result: no console output other than getty prompt showing 'tty1'
  tee: /dev/console: Input/output error
- cmdline: console=tty1
  result all data got to ec2-get-console.

I thought to try 'console=tty1' because I noticed the following in the kernel boot messages:
* [ 0.000000] Kernel command line: root=LABEL=uec-rootfs ro console=hvc0
  [ 0.000000] console [tty-1] enabled
  [ 1.505699] Xen virtual console successfully installed as tty1

* [ 0.000000] Kernel command line: root=LABEL=uec-rootfs ro xencons=ttyS console=ttyS0
  [ 0.000000] console [ttyS0] enabled
  [ 0.000000] console [ttyS0] enabled

* [ 0.000000] Kernel command line: root=LABEL=uec-rootfs ro console=ttyS0
  [ 0.000000] console [ttyS0] enabled
  [ 0.000000] console [tty-1] enabled
  [ 1.478630] Xen virtual console successfully installed as tty1

* [ 0.000000] Kernel command line: root=LABEL=uec-rootfs ro console=tty1
  [ 0.000000] console [tty1] enabled
  [ 0.000000] console [tty1] enabled
  [ 1.295756] Xen virtual console successfully installed as tty1

I've not dug into what all that means, but when I used 'console=tty1', I got all the data I was hoping for on the console, and the 'Xen virtual console' matched up with 'console='.

One other point of information is that /dev/hvc0 *never* exists in the lucid boot (well, at least after I can ssh to it). I tried having the 'myhack' script (running before /sbin/init) create a correct /dev/hvc0 node with 'mknod' but that had no effect).

It seems to me, then that the difference from plymouth's point of view between the lucid and maverick kernels that have the same (console=hvc0) command line is that in lucid, the device node for /dev/hvc0 does not get created. This could potentially cause plymouth to then be unable to replay the messages it caught from /dev/console when it used TIOCCONS.

Revision history for this message
Scott Moser (smoser) wrote :

OK. so I realize how I incorrectly read the comments in drivers/xen/console/console.c. I had assumed that you should pass 'xencons=hvc' and it would allocate one hvcX device. Instead, I tried:

xencons=hvc0 console=hvc0 and I see:
[ 0.000000] Kernel command line: root=LABEL=uec-rootfs ro xencons=hvc0 console=hvc0
[ 0.000000] console [hvc0] enabled
[ 0.836722] Xen virtual console successfully installed as hvc0

and, I get all the data.

So, I see 2 solutions to this:
a.) change update-grub-legacy-ec2 to add 'xencons=hvc0 console=hvc0' (it already does console=hvc0).
b.) change the kernel default to be hvc0. I haven't verified it, but just from the data above, it would appear that it is defaulting to 'tty1'.

Revision history for this message
Scott Moser (smoser) wrote :

at this point I would really like to get jjohansen or smb's thoughts on this.

Revision history for this message
Scott Moser (smoser) wrote :

I just commited the change to the build process at http://bazaar.launchpad.net/%7Eubuntu-on-ec2/vmbuilder/automated-ec2-builds/revision/210 .
What that does is make a change to lucid /boot/grub/menu.lst entries so that they have 'xencons=hvc0'.

A couple things to point out
a.) I'm only doing this on lucid builds
b.) we can also fix this in grub-legacy-ec2 in lucid
c.) there is no negative affect of having this on maverick kernels (which could happen on upgrade lucid->maverick).

Revision history for this message
Scott Moser (smoser) wrote :

In lucid, I see:
$ egrep "(console.*enabled|Kernel command|console.*installed|Linux ver)" lucid-console.txt
[ 0.000000] Linux version 2.6.32-310-ec2 (buildd@palmer) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #21-Ubuntu SMP Mon Nov 29 19:50:24 UTC 2010 (Ubuntu 2.6.32-310.21-ec2 2.6.32.24+drm33.11)
[ 0.000000] Kernel command line: root=LABEL=uec-rootfs ro console=hvc0
[ 0.000000] console [tty-1] enabled
[ 0.951646] Xen virtual console successfully installed as tty1
[ 0.000000] Linux version 2.6.32-310-ec2 (buildd@palmer) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #21-Ubuntu SMP Mon Nov 29 19:50:24 UTC 2010 (Ubuntu 2.6.32-310.21-ec2 2.6.32.24+drm33.11)
[ 0.000000] Kernel command line: root=LABEL=uec-rootfs ro xencons=hvc0 console=hvc0
[ 0.000000] console [hvc0] enabled
[ 1.077993] Xen virtual console successfully installed as hvc0

Revision history for this message
Scott Moser (smoser) wrote :

In maverick, I see
$ egrep "(console.*enabled|Kernel command|console.*installed|Linux ver)" maverick-console.txt
[ 0.000000] Linux version 2.6.35-23-virtual (buildd@roseapple) (gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5) ) #41-Ubuntu SMP Wed Nov 24 10:53:14 UTC 2010 (Ubuntu 2.6.35-23.41-virtual 2.6.35.7)
[ 0.000000] Kernel command line: root=LABEL=uec-rootfs ro console=hvc0
[ 0.000000] console [tty0] enabled
[ 0.000000] console [hvc0] enabled
[ 0.000000] Linux version 2.6.35-23-virtual (buildd@roseapple) (gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5) ) #41-Ubuntu SMP Wed Nov 24 10:53:14 UTC 2010 (Ubuntu 2.6.35-23.41-virtual 2.6.35.7)
[ 0.000000] Kernel command line: root=LABEL=uec-rootfs ro xencons=hvc0 console=hvc0
[ 0.000000] console [tty0] enabled
[ 0.000000] console [hvc0] enabled

Revision history for this message
Scott Moser (smoser) wrote :

Oh, one more note, it seems that
drivers/xen/console/console.c:
# * Modes:
# * 'xencons=off' [XC_OFF]: Console is disabled.
# * 'xencons=tty' [XC_TTY]: Console attached to '/dev/tty[0-9]+'.
# * 'xencons=ttyS' [XC_SERIAL]: Console attached to '/dev/ttyS[0-9]+'.
# * 'xencons=xvc' [XC_XVC]: Console attached to '/dev/xvc0'.
# * 'xencons=hvc' [XC_HVC]: Console attached to '/dev/hvc0'.
# * default: XC_XVC

'default' seems incorrect. I see:
static enum {
   XC_OFF, XC_TTY, XC_SERIAL, XC_XVC, XC_HVC
} xc_mode;
static int xc_num = -1;
...
void xencons_early_setup(void)
{
    if (is_initial_xendomain()) {
        xc_mode = XC_SERIAL;
    } else {
        xc_mode = XC_TTY;
        console_use_vt = 0;
    }
}

That is more in line with what I'm seeing, where XC_TTY would cause the default behavior for xencons to be tty1.

Revision history for this message
Scott Moser (smoser) wrote :

I just tested with the most recent lucid build, and the adding of 'xencons' param in the post-build fixups fixes this problem

# us-east-1 ami-f8b44591 canonical ebs/ubuntu-lucid-daily-i386-server-20101220
# us-east-1 aki-407d9529 amazon pv-grub-hd0-V1.01-i386.gz
$ ec2-run-instances ami-f8b44591 --kernel aki-407d9529 --instance-type t1.micro

Then, I get all console output.

Changed in cloud-init (Ubuntu):
status: New → Fix Released
Revision history for this message
Scott Moser (smoser) wrote :

There is, in my opinion, a real bug here in plymouth. However, it can be easily worked around by adding 'xencons=hvc0 console=hvc0' as I've done now in [1] (a slight fix at rev 211 also). I think this can be further fixed in legacy-grub-ec2 in lucid by adding the appropriate xencons to the default config file, making the fix in vmbuilder-uec-ec2-fixes unnessesary.

[1] http://bazaar.launchpad.net/%7Eubuntu-on-ec2/vmbuilder/automated-ec2-builds/revision/210

Revision history for this message
Alon Swartz (alonswartz) wrote :

I was tracking down this exact same bug today when I came across this bug report.
I can confirm that adding "xencons=hvc0" works around the issue on Lucid based builds, thanks Scott!

Changed in cloud-init (Ubuntu Lucid):
status: New → Triaged
Changed in plymouth (Ubuntu Lucid):
status: New → Triaged
Changed in linux-ec2 (Ubuntu Lucid):
status: New → Triaged
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-ec2 (Ubuntu):
status: New → Confirmed
Revision history for this message
Emmanuel Kasper (emmanuel-kasper) wrote :

@scott: hit by this bug today any ideas when this will be fixed in legacy-grub-ec2 ?

in the meantime. for others running an Ubuntu 10.04.* domU with the linux-image-ec2 kernel and getting no console, the proper way to fix this is to add

console=hvc0 xencons=hvc0

in /boot/grub/menu.lst in the kopt line, so it looks like

# kopt=root=UUID=ca7eba50-21d7-4756-b797-50b13a637791 ro console=hvc0 xencons=hvc0

then you should run /usr/sbin/update-grub-legacy-ec2 ( which will also be run after each kernel update ) from the package grub-legacy-ec2.

You can verify that the xen console option was properly added to your kernel entries:

grep ^kernel /boot/grub/menu.lst
kernel /boot/vmlinuz-2.6.32-340-ec2 root=UUID=ca7eba50-21d7-4756-b797-50b13a637791 ro console=hvc0 xencons=hvc0
kernel /boot/vmlinuz-2.6.32-340-ec2 root=UUID=ca7eba50-21d7-4756-b797-50b13a637791 ro console=hvc0 xencons=hvc0 single

Revision history for this message
Scott Moser (smoser) wrote :

@Emmanuel,
  I don't expect to fix this in grub-legacy-ec2. It onliy affects lucid era kernels, as far as I know/remember.

Revision history for this message
Adolfo Jayme Barrientos (fitojb) wrote :

(I'm untargetting Lucid as it has reached EOL.)

no longer affects: cloud-init (Ubuntu Lucid)
no longer affects: linux-ec2 (Ubuntu Lucid)
no longer affects: plymouth (Ubuntu Lucid)
Revision history for this message
Scott Moser (smoser) wrote :

Adolfo,
  Lucid has *not* reached end of life. The desktop did, but not the server.
  Please be aware, and if you've changed other things similarly, please put them back also.
 https://wiki.ubuntu.com/Releases

Changed in cloud-init (Ubuntu Lucid):
status: New → Triaged
Changed in linux-ec2 (Ubuntu Lucid):
status: New → Triaged
Changed in plymouth (Ubuntu Lucid):
status: New → Triaged
Scott Moser (smoser)
Changed in plymouth (Ubuntu Lucid):
status: Triaged → Won't Fix
Changed in linux-ec2 (Ubuntu):
status: Confirmed → Invalid
Changed in linux-ec2 (Ubuntu Lucid):
status: Triaged → Invalid
Changed in cloud-init (Ubuntu Lucid):
status: Triaged → Won't Fix
description: updated
Revision history for this message
Ubuntu QA Website (ubuntuqa) wrote :

This bug has been reported on the Ubuntu ISO testing tracker.

A list of all reports related to this bug can be found here:
http://iso.qa.ubuntu.com/qatracker/reports/bugs/682831

tags: added: iso-testing
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.