ply_boot_client_flush() does not read replies (plymouth stuck during/after filesystem check or error)

Bug #554737 reported by mikbini
This bug affects 124 people
Affects            Status        Importance  Assigned to     Milestone
Plymouth           Fix Released  Undecided   Unassigned
plymouth (Ubuntu)  Fix Released  High        Steve Langasek
Lucid              Fix Released  High        Steve Langasek

Bug Description

Binary package hint: plymouth

When fsck runs at boot the graphical boot hangs. The last message it shows is "Checking disk 1/1, 74%".

The first couple of times I thought the computer had hung, but then I realized that the boot continues and I am able to switch to text consoles and log in, although gdm never shows the graphical login screen.

tune2fs reports updated "Last mount time" and "Mount count", so it seems fsck completes successfully, too; also, an fsck from a rescue disk completes successfully.

This is always reproducible by forcing an fsck at boot with 'touch /forcefsck'.

I'm attaching the output of 'ps axwwwf', which I obtained from the text console while "Checking disk 1/1, 74%" was shown.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: plymouth 0.8.1-4
ProcVersionSignature: Ubuntu 2.6.32-19.28-generic 2.6.32.10+drm33.1
Uname: Linux 2.6.32-19-generic i686
NonfreeKernelModules: wl
Architecture: i386
Date: Sat Apr 3 20:02:02 2010
DefaultPlymouth: /lib/plymouth/themes/ubuntu-logo/ubuntu-logo.plymouth
InstallationMedia: Error: [Errno 13] Permission denied: '/var/log/installer/media-info'
MachineType: Apple Inc. MacBook4,1
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-19-generic root=UUID=3f7919f6-a881-4335-ab54-c8e485e3afb4 ro quiet splash
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.utf8
 SHELL=/bin/bash
ProcFB: 0 inteldrmfb
SourcePackage: plymouth
TextPlymouth: /lib/plymouth/themes/ubuntu-text/ubuntu-text.plymouth
dmi.bios.date: 02/09/08
dmi.bios.vendor: Apple Inc.
dmi.bios.version: MB41.88Z.00C1.B00.0802091535
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: Mac-F22788A9
dmi.board.vendor: Apple Inc.
dmi.board.version: PVT
dmi.chassis.asset.tag: Asset Tag#
dmi.chassis.type: 2
dmi.chassis.vendor: Apple Inc.
dmi.chassis.version: Mac-F22788A9
dmi.modalias: dmi:bvnAppleInc.:bvrMB41.88Z.00C1.B00.0802091535:bd02/09/08:svnAppleInc.:pnMacBook4,1:pvr1.0:rvnAppleInc.:rnMac-F22788A9:rvrPVT:cvnAppleInc.:ct2:cvrMac-F22788A9:
dmi.product.name: MacBook4,1
dmi.product.version: 1.0
dmi.sys.vendor: Apple Inc.

mikbini (mikbini) wrote :

Also, here is the boot log.

Steve Langasek (vorlon) wrote :

No evidence that this is the same as bug #554519. Unduping.

Hugo van der Wijst (hugwijst) wrote :

I also had this issue. When killing plymouth, gdm starts normally though.

mikbini (mikbini)
Changed in plymouth (Ubuntu):
status: New → Confirmed
Berend De Schouwer (berend-de-schouwer) wrote :

Same issue. GDM starts, but you can't see the login screen (presumably because plymouth hides gdm)

Killing '/bin/plymouth quit --retain-splash' shows the gdm login screen.

Plymouth's fsck screen never goes past seventy-something percent, even though fsck has quit. I presume that's why it doesn't quit.

Undecided: fsck quits fine or crashes.

Boot.log contains fsck lines and two startup messages. I'm a little bit surprised to see the other two messages.

Steve Langasek (vorlon) wrote :

> Killing '/bin/plymouth quit --retain-splash' shows the gdm login screen.

Are you saying that there was a running 'plymouth quit' process that you had to kill?

Hugo van der Wijst (hugwijst) wrote :

I can reproduce this by simply touching the file '/forcefsck' and rebooting.

Killing either 'plymouth quit' or 'plymouth --mode=boot --attach-to-session' shows gdm again, although killing 'plymouth --mode=boot --attach-to-session' spams the terminal with the following message: 'mountall: Plymouth command failed'.

mikbini (mikbini) wrote :

@Steve Langasek: in my case (see the ps.txt I attached to my report) there definitely is a plymouth quit process.

Steve Langasek (vorlon) wrote :

If someone can reproduce this error (plymouth quit is running indefinitely), please grab a backtrace of plymouthd with 'sudo gdb plymouthd $pid_of_plymouthd'.

Changed in plymouth (Ubuntu):
status: Confirmed → Triaged
importance: Undecided → High
mikbini (mikbini) wrote :

Here is the backtrace of plymouthd.

While getting this I also noticed that if, while the fsck is in progress and plymouth has not yet hung, I switch to vt1, wait until I see an [ Ok ] (I don't know which boot step prints this, as it's the only thing on vt1), and then switch to vt7, plymouth terminates correctly and the gdm login appears on vt7.

mikbini (mikbini) wrote :

For the sake of completeness, here is the backtrace for the "plymouth quit" process.

Steve Langasek (vorlon) wrote :

mikbini,

When this hang happens, is there a mountall process running?

mikbini (mikbini) wrote :

Yes, there is:

mountall --daemon --force-fsck

Keep in mind that I touched /forcefsck to trigger the problem.

Steve Langasek (vorlon) wrote :

Suspected as much.

Can you get a backtrace of this mountall process, too?

letstrynl (letstry-deactivatedaccount) wrote : Re: [Bug 554737] Re: Graphical bootstrap hangs on fsck

Reading symbols from /sbin/mountall...Reading symbols from /usr/lib/debug/sbin/mountall...done.
done.
Attaching to program: /sbin/mountall, process 343
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
[Thread debugging using libthread_db enabled]
0x00007f00fa5368c0 in __write_nocancel () from /lib/libpthread.so.0
(gdb) bt
#0 0x00007f00fa5368c0 in __write_nocancel () from /lib/libpthread.so.0
#1 0x00007f00fa114d8b in ply_write (fd=11, buffer=0x7f00fc14c7b0, number_of_bytes=<value optimized out>) at ply-utils.c:306
#2 0x00007f00f9f02eb5 in ply_boot_client_send_request (client=0x7f00fc096810) at ./ply-boot-client.c:407
#3 ply_boot_client_process_pending_requests (client=0x7f00fc096810) at ./ply-boot-client.c:445
#4 0x00007f00f9f030a8 in ply_boot_client_flush (client=0x7f00fc096810) at ./ply-boot-client.c:752
#5 0x00007f00fafd2ea5 in main (argc=<value optimized out>, argv=<value optimized out>) at mountall.c:3289
(gdb)

Steve Langasek (vorlon) wrote : Re: Graphical bootstrap hangs on fsck

Aha - looks like a pretty classic deadlock then, both ends in a blocking write and nobody reading.

Thanks for helping to pin this down.

Scott, you know the plymouth protocol code better than I do; can you confirm that this is what's happening here? Is this a plymouth bug, or a mountall bug for misusing the client library?

Changed in plymouth (Ubuntu Lucid):
assignee: nobody → Scott James Remnant (scott)
Steve Langasek (vorlon) wrote :

Looks to me like this is probably a bug in the ply_boot_client_flush() added in revision 1296.

323232 (323232) wrote :

Possible duplicate of bug #554079?

Scott James Remnant (Canonical) (canonical-scott) wrote :

I agree with your diagnosis.

It looks like ply_boot_client_flush() is blocking on a write() to plymouthd, which is in turn blocking on a write() back to the process calling it, because ply_boot_client_flush() only flushes pending writes (in a blocking way) without reading responses.

I'd say then that the bug is in libply-boot-client, and that ply_boot_client_flush() should instead be called in such a way that replies are dealt with.
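
For readers following along, the diagnosis above can be reproduced in miniature. The sketch below is not plymouth code: it is a small Python illustration that assumes a Unix socket pair as a stand-in for the mountall/plymouthd connection, and the names (client, daemon, fill_until_blocked, would_block) are hypothetical. Both ends write without reading until neither can write any more; reading the pending data on one side then lets the other side's write proceed.

```python
import socket


def fill_until_blocked(sock, chunk=b"x" * 4096):
    """Write to a non-blocking socket until the kernel refuses more data.

    Returns the number of bytes accepted before a write would have blocked.
    """
    sent = 0
    while True:
        try:
            sent += sock.send(chunk)
        except BlockingIOError:
            return sent


def would_block(sock):
    """Report whether a one-byte write would block right now."""
    try:
        sock.send(b"x")
        return False
    except BlockingIOError:
        return True


# Stand-ins for the two processes; non-blocking so the demo cannot hang.
client, daemon = socket.socketpair()
client.setblocking(False)
daemon.setblocking(False)

# Each side queues messages for the other and neither side reads.
client_sent = fill_until_blocked(client)
daemon_sent = fill_until_blocked(daemon)

# With blocking sockets, both processes would now sleep in write() forever.
both_stuck = would_block(client) and would_block(daemon)

# Reading the pending replies on the client side frees the daemon's write.
drained = 0
while True:
    try:
        data = client.recv(65536)
    except BlockingIOError:
        break
    if not data:
        break
    drained += len(data)

daemon_unblocked = not would_block(daemon)
print("both ends stuck:", both_stuck, "| daemon unblocked:", daemon_unblocked)
```

Non-blocking sockets are used only so that the stuck state can be observed rather than suffered; the processes in the backtraces use blocking writes, which is why they never return.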

summary: - Graphical bootstrap hangs on fsck
+ ply_boot_client_flush() does not read replies (plymouth stuck
+ during/after filesystem check or error)
Changed in plymouth (Ubuntu Lucid):
milestone: none → ubuntu-10.04
D J Eddyshaw (david-eddyshaw) wrote :

This is probably the same as bug #554079

Steve Langasek (vorlon)
Changed in plymouth (Ubuntu Lucid):
assignee: Scott James Remnant (scott) → Steve Langasek (vorlon)
NoOp (glgxg) wrote :

Perhaps 554079 should be moved/merged to this bug? Please see my comment:
https://bugs.launchpad.net/ubuntu/+source/mountall/+bug/554079/comments/53
I ssh'ed into the machine, killed plymouthd (after all disk activity stopped), and gdm comes up and I could then continue to login w/o issue.

Steve Langasek (vorlon) wrote :

The distinguishing marker for this bug is the backtraces of the plymouthd and mountall processes. If someone wants to grab that information from the submitters of bug #554079 and duplicates, that's fine; otherwise that bug can be revisited and confirmed as a duplicate after we have this one fixed.

Mathew Cairns (mat-cairns) wrote :

There is also a possible additional duplicate of this, as bug #559761. In that case, the mountall process remains after boot, using 100% CPU time. However, there are no plymouth processes running at the time. I have attached a backtrace of the mountall process to comment 8 of that bug.

Rowan (cross-fell-box) wrote :

This affected me for the first time today - fsck and plymouth froze at 93% checking disk 1 of 1 - fully updated Lucid Beta 2 - kernel 2.6.32-20.

Steve Langasek (vorlon) wrote :

Could those experiencing this issue please run 'sudo add-apt-repository ppa:vorlon/ppa' and install the new version of plymouth from there, and let us know if this fixes the problem or causes any new ones?

Hugo van der Wijst (hugwijst) wrote :

These new versions fix it for me. I don't see any new problems after a quick restart and hibernate test round.

Mathew Cairns (mat-cairns) wrote :

The problem still exists for me. However, in my case, no plymouth processes were present after booting, even though mountall failed to terminate (see comment 23 above). The only difference I have observed is that KDM is running on a different console.

Test setup: booting the 2.6.32-20 kernel with the 'nosplash' option, after setting a high mount count on an ext3 partition with tune2fs prior to rebooting.

Before upgrade:
Plymouth version: 0.8.1-4ubuntu1
"mountall: Plymouth command failed" scrolling on VT1
KDM loading on VT8

After upgrade:
Plymouth version: 0.8.2-0ubuntu1~ppa2
"mountall: Plymouth command failed" scrolling on VT1
KDM loading on VT7

Jorge Suárez de Lis (ys) wrote :

It worked for me as well, thank you!

At least it boots, but there are still some remaining issues:

* I created a file named /forcefsck, so only / should be checked, shouldn't it? But Plymouth says 1 of 1 partitions, then 1 of 2, 1 of 3 and 1 of 4. (I suppose these additional partitions are /var, /tmp and /home).
* GDM is shown when Plymouth still says 1 of 4 partitions, at 79%.

Obviously, there's still something wrong in here, but perhaps this is a separate issue.

letstrynl (letstry-deactivatedaccount) wrote :

The updated plymouth packages solved it for me.

Disk check completes normally in splash screen, after that -> tty.
As expected.

Great work, Steve.

mikbini (mikbini) wrote :

It works with two minor problems.

First, the completion percentage goes up to ~80% and then sticks there for ~5-6 seconds until gdm shows.

Second, on vt1 I see a lot (many tens to a few hundred) of "mountall: Plymouth command failed" messages.

ingo (ingo-steiner) wrote :

Still all the same here:

With the latest Lucid amd64 updates, things did not change in any way: the boot process hangs at 70%
(initiated by 'touch /forcefsck').

ingo (ingo-steiner) wrote :

Checked again:

fsck is definitely not finished: file /forcefsck is not erased!

So it is not just killing plymouth, start-up procedure is faulty. See also this bug report:
https://bugs.launchpad.net/bugs/538810
and
https://bugs.launchpad.net/bugs/554079

SAL-e (sal-electronics) wrote :

It works for me too, but I also noticed some small problems.
1. I created the /forcefsck file in order to force the disk check during the next boot.
2. Rebooted the laptop.
3. The disk check completed too fast to see the count reach 100%. I saw about 90%+ or something like that.
4. gdm started on VT7.
5. I logged into Gnome and after some time logged out. During the logout process I saw many errors across the screen, but eventually gdm started, this time on VT8.
6. Switching back to VT7 revealed the errors that I saw: "mountall: Plymouth command failed".
7. I logged into Gnome again and it works, but this time it is running on VT8.

Can I do something more to help?

323232 (323232) wrote :

Thanks for the efforts!
Did a 'sudo touch /forcefsck' and restarted without the splash option; everything seemed to work and /forcefsck was removed.

Did a second test with different results:
Did a 'sudo touch /forcefsck' and restarted with splash.
GDM is shown when Plymouth still says 1 of 2 partitions, at 70%. Logged in and rebooted with splash.
fsck was done again; GDM is shown when Plymouth still says 1 of 2 partitions, at 70%. Logged in and rebooted without splash.
Did see fsck messages (some messages and the black screen before gdm); rebooted with splash.
GDM is shown when Plymouth still says 1 of 2 partitions, at 70%. Logged in: /forcefsck was still there; removed it and rebooted with splash.
Saw a line blink for a second saying that the disks were checked, without the percentage.

So the freezes stopped, but some things are still not as they should be.
Hope this helps.

Jorge Suárez de Lis (ys) wrote :

ingo, I also noticed that the file /forcefsck was not deleted right after starting the session, but it was gone after a while (less than a minute). Perhaps the file is deleted at some later stage (some boot process that's still running in the background after the session has already been started?). Can you wait a minute and check again whether the file is deleted?

ingo (ingo-steiner) wrote :

And it is Ubuntu-specific:

Debian Squeeze performs fsck correctly and removes /forcefsck
Also happens on Lucid with grub-legacy instead of grub2, no matter whether ext3 or ext4 filesystem!

323232 (323232) wrote :

Yes, directly after login the /forcefsck file is there, and it vanished a minute later...

ingo (ingo-steiner) wrote :

Ah, so if you can't login /forcefsck is not removed - that's really odd.

So to get rid of all that trouble, I have renamed '/sbin/plymouthd' and all is fine here as well. No more hangs with black screen. To see the boot-messages I already reverted to good old grub-legacy.

Unfortunately there is no easy/convenient way to un-install plymouth - so this is considered a dirty hack, but it solves all my problems!

Philip Muškovac (yofel) wrote :

As for why /forcefsck is not removed:

that file is not removed by fsck, instead the mountall init script contains:

post-stop script
    rm -f /forcefsck 2>dev/null || true
end script

which will delete the file after all checks have been done and all filesystems are mounted. So unless mountall stops fine, /forcefsck will not be removed. And as the moment when this happens is only defined as "after mountall finishes", this might be after login too.

Steve Langasek (vorlon) wrote :

Mathew Cairns,

mountall running and chewing CPU after plymouth has exited is not the same bug.

Jorge Suárez,

mountall interprets /forcefsck as "force check of all filesystems", so this is expected behavior.

Changed in plymouth (Ubuntu Lucid):
status: Triaged → Fix Committed
ingo (ingo-steiner) wrote :

If you have fixed that issue, please also correct the dependencies of the 'plymouth' package.

Trying to remove it with apt-get will de-install almost the whole system - which definitely is not required!

ingo (ingo-steiner) wrote :

Steve Langasek, you said:
"mountall interprets /forcefsck as "force check of all filesystems", so this is expected behavior."

In that case there is another bug for mountall to be reported:
/forcefsck should respect the figure in column 6 of /etc/fstab and not just check everything.
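
To make the "column 6" reference concrete: the sixth field of an /etc/fstab entry is the fsck pass number, where 0 (or a missing field) means the filesystem is not checked and non-zero values give the checking order. The Python sketch below only illustrates that rule; it is not how mountall works, and the function name and the sample entries (including the UUIDs) are made up for the example.

```python
def fsck_candidates(fstab_text):
    """Return (mount_point, pass_number) pairs whose sixth field is non-zero."""
    result = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # Fields: device, mount point, type, options, dump, pass (column 6).
        # A missing sixth field is treated as 0, i.e. "do not check".
        passno = int(fields[5]) if len(fields) >= 6 else 0
        if passno > 0:
            result.append((fields[1], passno))
    # Lower pass numbers are checked first.
    return sorted(result, key=lambda item: item[1])


# Hypothetical fstab contents, for illustration only.
sample = """\
# <file system> <mount point> <type> <options> <dump> <pass>
UUID=aaaa / ext4 errors=remount-ro 0 1
UUID=bbbb /home ext4 defaults 0 2
UUID=cccc none swap sw 0 0
/dev/sdb1 /media/data vfat defaults 0 0
"""

candidates = fsck_candidates(sample)
print(candidates)
```

Under this reading, a forced check would cover / and /home in the sample but skip the swap and vfat entries.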

volRot (info-rothert) wrote :

Steve Langasek, you said:
"mountall interprets /forcefsck as "force check of all filesystems", so this is expected behavior."

In that case there is another bug for mountall to be reported:
/forcefsck should respect the figure in column 6 of /etc/fstab and not just check everything.
is correct!

kulight (kulight) wrote :

I'm still getting this bug.

Rowan (cross-fell-box) wrote :

Downloaded new plymouth from the vorlon ppa and did the "sudo touch /forcefsck" test.
Plymouth gets to "1 of 2 partitions, at 72%", then the screen blacks out for 5 seconds, then the GDM login appears. I logged in and checked /: the forcefsck file is gone.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package plymouth - 0.8.2-1

---------------
plymouth (0.8.2-1) lucid; urgency=low

  [ Steve Langasek ]
  * New upstream release.
  * debian/plymouth.plymouth-stop.upstart: trust the new kdm to stop plymouth
    for us, too. LP: #540177.
  * plymouth needs to depend on initramfs-tools to avoid calling
    update-initramfs when the stack isn't configured yet and rendering the
    system unbootable with a broken initramfs. LP: #358654.
  * src/client/ply-boot-client.c: ply_boot_client_flush() needs to process
    the incoming queue before the outgoing, otherwise we get a deadlock.
    LP: #554737.
  * src/plugins/splash/script/script-lib-image.c: call script_obj_is_null()
    to check for the presence of the alpha argument, otherwise all our labels
    are rendered invisibly...
  * src/plugins/renderers/drm/plugin.c: temporarily disable the drm backend
    for nouveau, until we can get to the bottom of the DRM lockup in bug
    #539655. This is not as pretty, but it boots to the desktop without
    hanging regardless of how many displays are used, and that takes
    precedence. LP: #533135.

  [ Scott James Remnant ]
  * Implement a Window.GetBitsPerPixel() function in the script library,
    and the necessary support in ply_pixel_display, ply_renderer and in
    the vga16fb and frame-buffer renderers to make that information
    available. This will return "0" on drm and x11 where the answer is
    "lots". LP: #558352.
  * Modify vga16fb to only fallback to reducing colors if it's overflowed
    the 16-color palette already, so if you stick to the same 16 colors,
    you can get exact matches.
  * Use alternate 16-color images when we have only 4bpp. LP: #551013.

  [ Alberto Milone ]
  * ubuntu-logo: accept a format string from mountall for the disk check
    progress, so that localization is possible. LP: #553954.
 -- Steve Langasek <email address hidden> Thu, 15 Apr 2010 00:55:46 +0000
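
The ply_boot_client_flush() entry in the changelog above describes an ordering: process the incoming queue before the outgoing one. The Python sketch below illustrates that ordering only. It is not the actual patch to src/client/ply-boot-client.c; flush_reading_replies and echo_peer are hypothetical names, and the peer thread is a stand-in daemon that answers every chunk with a blocking write.

```python
import select
import socket
import threading


def flush_reading_replies(sock, outgoing):
    """Send all of `outgoing`, draining pending replies before each write."""
    sock.setblocking(False)
    pending = memoryview(outgoing)
    replies = bytearray()
    while pending:
        readable, writable, _ = select.select([sock], [sock], [])
        if readable:
            # Incoming first: this is what lets the peer's write() return.
            data = sock.recv(65536)
            if not data:
                raise ConnectionError("peer closed the connection")
            replies += data
        elif writable:
            try:
                sent = sock.send(pending[:65536])
            except BlockingIOError:
                continue
            pending = pending[sent:]
    return replies


def echo_peer(sock, total):
    """Hypothetical daemon: replies to every chunk with a blocking write."""
    seen = 0
    while seen < total:
        data = sock.recv(65536)
        if not data:
            break
        seen += len(data)
        sock.sendall(data)


client, daemon = socket.socketpair()
payload = b"p" * (1024 * 1024)  # larger than the socket buffers
peer = threading.Thread(target=echo_peer, args=(daemon, len(payload)), daemon=True)
peer.start()

replies = flush_reading_replies(client, payload)

# Collect the replies still in flight after the last request went out.
client.setblocking(True)
while len(replies) < len(payload):
    chunk = client.recv(65536)
    if not chunk:
        break
    replies += chunk
peer.join(timeout=10)
print("flushed", len(payload), "bytes, received", len(replies), "reply bytes")
```

Replacing flush_reading_replies() with a plain blocking sendall() of the same payload would leave both threads stuck in a write, which is the state shown in the backtraces earlier in this report.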

Changed in plymouth (Ubuntu Lucid):
status: Fix Committed → Fix Released
Jorge Suárez de Lis (ys) wrote :

With the PPA version I'm always getting the message: Your drives are being checked for errors, this may take some time. Press C to cancel all checks currently in progress.

But no checks seem to be performed: there is no information about the progress and the system boots very fast. I've restarted the system three times and I always get this message. Screenshot attached.

Sorry if this is a separate issue.

ingo (ingo-steiner) wrote :

The root cause of all that trouble is plymouth!

See here: https://bugs.launchpad.net/ubuntu/+source/mountall/+bug/556372 how to get rid of it.

Savvas Radevic (medigeek) wrote : Re: [Bug 554737] Re: ply_boot_client_flush() does not read replies (plymouth stuck during/after filesystem check or error)

> But no checks seems to be performed: there is no information about the
> progress and the system boots very fast.

I don't think it's a bug -- probably there is nothing to check and the
drives are OK? :)

Chandru (chandru-in-deactivatedaccount) wrote :

Savvas, I face the same behavior too. If there is nothing to check, the message should not be displayed at all. If there is some check happening and the drives are found to be OK, why is the check being performed on each boot in the first place?

Changed in plymouth (Ubuntu Lucid):
status: Fix Released → Confirmed
Chandru (chandru-in-deactivatedaccount) wrote :

Marking it as confirmed, on the assumption that the fix for this bug has caused the new behavior. If one of the maintainers confirms that it is a totally independent regression, I'm ready to raise a separate bug.

Savvas Radevic (medigeek) wrote :

> Marking it as a confirmed with the assumption that a fix to this bug has
> caused the new behavior.

The new package is a new upstream release - I think it's better to
leave it as "Fix released" and report a new bug report against the new
package version (include a link to this bug and mention a possible
regression?). I'm not a maintainer though!

kulight (kulight) wrote : Re: [Bug 554737] Re: ply_boot_client_flush() does not read replies (plymouth stuck during/after filesystem check or error)

If the new package does not fix the problem, there is no reason to mark
it as fixed.

And it seems it doesn't fix the issue.


Steve Langasek (vorlon) wrote :

This bug is fixed. Your request to not show the message when no drives need to be checked is a separate issue that you should file as a report against the mountall package.

Changed in plymouth (Ubuntu Lucid):
status: Confirmed → Fix Released
Mike Guo (hylinux) wrote :

This fix does not work for me.
I still get "mountall: Plymouth command failed"
and no splash is shown.

The plymouth version is 0.8.2-2.

and here is the boot.log

Steve Langasek (vorlon) wrote :

Mike,

that issue is not related to this bug.

Barry Drake (b-drake) wrote :

I have the identical problem on a Dell Inspiron Mini 10v running Lucid with latest updates. Currently I'm getting round the problem using GRUB_CMDLINE_LINUX="noapic, nolapic, noacpi" in /etc/default/grub followed by sudo update-grub. I don't really understand why or how this is working or what consequences it might have: but it seems to work!

Steve Langasek (vorlon) wrote :

Barry,

If that grub setting works for you, then you're seeing a kernel bug that's unrelated to this bug.

Barry Drake (b-drake) wrote : Re: [Bug 554737] Re: ply_boot_client_flush() does not read replies (plymouth stuck during/after filesystem check or error)

Steve Langasek wrote:
> If that grub setting works for you, then you're seeing a kernel bug
> that's unrelated to this bug.
>
Oh .... any thoughts on whether or not I ought to file a bug report on
this? I'm happy to put time in doing tests and reporting, but I'm a bit
lost on kernel and other low-level stuff. I'm not yet certain that the
grub setting has got rid of the problem, but fsck has now run twice as
part of the boot sequence and completed OK.

Thanks for your reply.

Barry


Steve Langasek (vorlon) wrote :

On Tue, Apr 20, 2010 at 11:31:21AM -0000, Barry Drake wrote:
> Steve Langasek wrote:
> > If that grub setting works for you, then you're seeing a kernel bug
> > that's unrelated to this bug.

> Oh .... any thoughts on whether or not I ought to file a bug report on
> this? I'm happy to put time in doing tests and reporting, but I'm a bit
> lost on kernel and other low-level stuff. I'm not yet certain that the
> grub setting has got rid of the problem, but fsck has now run twice as
> part of the boot sequence and completed OK.

Yes, I would encourage you to file a bug report about this.

Cheers,

sebastienserre (sebastien-serre) wrote :

I attached a picture of my error.

Barry Drake (b-drake) wrote :

I did some more testing this morning. I think my trial of "noapic nolapic noacpi" might have been co-incidental with your last fix. I've taken them out of grub and the problem has not shown. I created /forcefsck three times and re-booted since with no sign of the earlier problem. (Dell Inspiron Mini 10v running Lucid - Plymouth 0.8.2-2)

Loïc Minier (lool) wrote :

FTR, didn't reproduce the high CPU issue in the last couple of days anymore

chef (adotei) wrote :

Hi, I think I have a similar bug to report. Please let me know if this bug report is supposed to be elsewhere.

I updated Lucid with the latest updates today, 26/04/10. Before that, I was running a vanilla 2.6.34-rc5 kernel without any custom patches. This worked well until the upgrade. Now I get an error message and it defaults at boot to the command shell. I have attached a picture to show the error messages obtained.

If I switch to another vanilla kernel of 2.6.33.2, it boots alright. Hope this is satisfactory for someone to pin point what is causing the error. If more information is needed, let me know.

Steve Langasek (vorlon) wrote :

chef,

You're seeing bug #570289, a regression that seems to have been introduced in the latest upload. Investigation is ongoing.

Barry Drake (b-drake) wrote :

On Tue, 2010-04-27 at 05:40 +0000, Steve Langasek wrote:
> You're seeing bug #570289, a regression that seems to have been
> introduced in the latest upload. Investigation is ongoing.

By way of information. Dell Inspiron Mini 10v. Lucid updates yesterday
and today.

After updating yesterday, seeing that Plymouth packages were included in
the update, I forced fsck on boot. I've done the same a few minutes ago
as there was a further Plymouth update today. On both occasions, fsck
completed OK and finished the boot process.

letstrynl (letstry-deactivatedaccount) wrote :

Hey Steve,

After applying last updates, mountall hangs again.
When killing it, terminal becomes unstable.

I start up without quiet and splash in grub kernel commandline.
Otherwise system hangs up to 5 minutes on graphical screen.
Filesystem check goes up to 71%, then counts up very slowly.

Case of bad programming, imho.
It shouldn't say it is checking disks while it is actually waiting for something else (probably a timeout to mountall?).
I suggest the code is changed to reflect the real situation (waiting for ....).

Linux srv-1 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC 2010 x86_64 GNU/Linux

libplymouth-dev 0.8.2-2ubuntu2
libplymouth2 0.8.2-2ubuntu2
mountall 2.14
plymouth 0.8.2-2ubuntu2
plymouth-label 0.8.2-2ubuntu2
plymouth-theme-text 0.8.2-2ubuntu2
plymouth-theme-ubuntu-logo 0.8.2-2ubuntu2
plymouth-theme-ubuntu-text 0.8.2-2ubuntu2

Martin Erik Werner (arand) wrote :

I am currently seeing this on a virtualbox instance. After issuing a forced fsck it stalls at 70%, creeps slowly up to 74% where it seemingly stops indefinitely.

Martin Erik Werner (arand) wrote :

...Booting with quiet and splash removed allows normal start.

Install is current as of now.

Steve Langasek (vorlon) wrote :

> I start up without quiet and splash in grub kernel commandline.
> Otherwise system hangs up to 5 minutes on graphical screen.
> Filesystem check goes up to 71%, then counts up very slowly.

That doesn't sound like this bug, and the behavior you describe is not reproducible here. It's possible that fsck is running longer than it should due to a backed-up queue of messages being sent to plymouth, but this would not be a hang, only a slow-down. If you can confirm that plymouth hangs *indefinitely* when a fsck is run at boot (say, for more than 15 minutes with no progress), please open a new bug report about this.

letstrynl (letstry-deactivatedaccount) wrote :

How do you explain the following then?

When I remove 'quiet splash' from the grub kernel commandline, the system boots up quickly.
No hangs on fsck there.

Otherwise behavior as said. I'll investigate.

letstrynl (letstry-deactivatedaccount) wrote :

I just waited it out -> 5:30 minutes before having a reachable server.

mountall hasn't finished yet:
root 315 1 2 10:30 ? 00:00:07 mountall --daemon

Info for you to reproduce:

/etc/default/cryptdisks:
CRYPTDISKS_ENABLE=No
CRYPTDISKS_MOUNT=""
CRYPTDISKS_CHECK=blkid
CRYPTDISKS_PRECHECK=

/etc/crypttab:
secure /dev/sda6 none cipher=aes-cbc-essiv:sha256

bootvolume is ext4 with auto check ( tune2fs -c 1 ) at startup
encrypted volume is xfs (so, no fsck) and is not present in /etc/fstab

I'm curious if you can reproduce it now...

D J Eddyshaw (david-eddyshaw) wrote :

Same problem here since the last Plymouth update; fsck gets to the fateful 70% mark and then slows to a crawl (though not hanging).

The machine is a Dell Mini 9 with a tiny 32G drive so there is no possibility that fsck is simply taking the "right" time to finish.

letstrynl (letstry-deactivatedaccount) wrote :

Ok, I purged cryptsetup to see if it caused the problems. It doesn't.

I find it *very strange* that you cannot reproduce the error, as it seems quite simple to reproduce on this side.

With 'quiet splash' on the grub kernel commandline, and 1 ext4 partition to check at startup I get this problem.
Without 'quiet splash' startup is quick.
In both cases 'mountall --daemon' is still active after boot; it *should* finish after starting up, shouldn't it?
As I said earlier, when I kill 'mountall' manually, the terminal I/O is weird.

So, it doesn't wait indefinitely, but it takes *much* longer to start up, and I also have the feeling that my mounted filesystems aren't really stable (scp and rsync disconnect errors). A few days back this was OK, now it is not.

Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

Oh, and just to be complete:

This happens for me only on 10.04 64-bit server, not on 64-bit desktop.

Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

I've had it with plymouth.

Why use this crap anyway on server dists?
We had a good and working server dist before it.
And who needs fancy splash screens on a server anyway?

Come on Canonical, drop the whole stuff for server.
It's not ready for release yet. And you guys know it.

Debian did the right thing by pushing it back to testing.

For server development I'm now switching to Debian.
I'll check back in a few months.

Revision history for this message
ingo (ingo-steiner) wrote :

> Why use this crap anyway on server dists?

It does not even belong in a desktop release, especially one with LTS. This widely untested stuff, there just to be eye-catching, belongs in a development version in early alpha stage. Most of the problems which still persist today (release date of Lucid) can easily be solved by just uninstalling plymouth. Canonical should postpone the release date to avoid damage to its reputation.

Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

I totally agree, Ingo.
As said before I see the need for hiding all kinds of stuff from the average user, but there are alternatives for that.
And for desktop startup a lot can be done, just by using VESA modes.

As for servers: server dists should become less dependent on hardware.
While M$ is making its servers bigger and bigger (and more complicated, as that's where the money is), Open Source servers should focus on becoming smaller and less complicated.

But no, now we have plymouth, which won't work as it should anyway on most hardware (Nvidia and ATI drivers still don't support it, AFAIK), and things become more complicated. Bad choice.

And all that for a shiny startup that nobody needs (on server hardware)?

Where's Canonical ? Steve?

Revision history for this message
Steve Langasek (vorlon) wrote :

letstrynl,

The issues you are describing are *not* this bug. Please file a new bug report for the problems you're seeing.

> Why use this crap anyway on server dists?
> We had a good and working server dist before it.
> And who needs fancy splash screens on a server anyway?

The graphical splash screen is not even *included* by default in Ubuntu 10.04 LTS Server, and no splash is used by default when installing 10.04 LTS Server. For upgrades, the existing 'splash' argument on the kernel commandline is retained by default. There's more information about this in the release notes at <https://wiki.ubuntu.com/LucidLynx/ReleaseNotes#Changes%20in%20boot-time%20output%20on%20Ubuntu%20Server>.

> Come on Canonical, drop the whole stuff for server.
> It's not ready for release yet. And you guys know it.

On the contrary, plymouth is *essential* to have in this release and all of our testing indicates that it is ready. If mountall is still running after your system has come up, that points to a problem in your fstab, and has nothing to do with plymouth.

Revision history for this message
letstrynl (letstry-deactivatedaccount) wrote :

... my fstab is as fresh as a just-born baby, same one as I installed at install time.

We'll see again in some time about plymouth.
I surely hope for Ubuntu that you're right and I'm wrong.

Time will tell.

Revision history for this message
tekstr1der (tekstr1der) wrote :

I see this bug is becoming quite confusing and convoluted. Sorry to add to the noise. I am simply wondering if the behavior described in comments #73,74 has been reported in a separate bug that I could follow. I am experiencing the same behavior where the filesystem check no longer hangs, but slows to a long (4-5min) crawl before finally completing and displaying the gdm. This is on a x64 with ext4. Prior to all this plymouth business, fsck's were so fast, I barely noticed when they took place.

Revision history for this message
mikbini (mikbini) wrote :

I just submitted bug #571707.

Revision history for this message
D J Eddyshaw (david-eddyshaw) wrote :

Plymouth is *not* ready. My Dell Mini 9 is essentially unusable (takes over 40 mins to fsck) unless I disable the quiet option at boot.
This is a serious unfixed problem to leave in the final release and is likely to stop unsophisticated new users dead.

Revision history for this message
Barry Drake (b-drake) wrote :

On Thu, 2010-04-29 at 04:50 +0000, Steve Langasek wrote:
> That doesn't sound like this bug, and the behavior you describe is not
> reproducible here. It's possible that fsck is running longer than it
> should due to a backed-up queue of messages being sent to plymouth, but
> this would not be a hang, only a slow-down.

In my case, I think there may be something like you mention. I get the
same behaviour BUT fsck does complete, but takes over 20 mins and that
is with a tiny 8gig SSD

Revision history for this message
mikbini (mikbini) wrote :

Barry, your problem looks like bug #571707.

In my opinion this bug (554737) was solved by Steve's PPA and the "slow disk check" is a different problem, so I'm setting this bug's state to "Fix Released".

Changed in plymouth:
status: New → Fix Released