Ubuntu
ldm package

nbd+squashfs errors when rebooting ltsp thin clients

Bug #457702 reported by Veli-Matti Lintu on 2009-10-21

This bug affects 4 people

Affects          Status         Importance   Assigned to       Milestone
ldm (Ubuntu)     Invalid        Medium       Stéphane Graber
ltsp (Ubuntu)    Fix Released   Undecided    Unassigned

Bug Description

Testing done on ltsp image built on Karmic from the newest packages available in the repos.

Server: Karmic amd64
Thin client: Karmic i386
Image built with command: ltsp-build-client --arch=i386

After building a fresh image with newest Karmic packages, the ltsp thin clients do not reboot properly, but throw nbd+squashfs related errors on the console. After the errors are shown, the thin client is frozen and it does not reboot automatically. Shutdown works properly with no error messages. The same error happens on every boot and this has happened on multiple thin clients.

Attached is the logfile for the entire session, in which the thin client boots from the network and reboot is selected from the ldm menu. The logfile was captured from the serial console.

The shutdown part gives out these messages:

init: hal main process (856) terminated with status 1
init: cron main process (2247) killed by TERM signal
init: tty1 main process (3341) killed by TERM signal
init: Disconnected from system bus
init: rsyslog-kmsg main process (444) killed by TERM signal
* Asking all remaining processes to terminate... init: hwclock-save main process (3406) killed by TERM signal
[ 57.931304] nbd0: Receive control failed (result -4)
[ 57.955127] nbd0: Attempted send on closed socket
[ 57.959994] end_request: I/O error, dev nbd0, sector 416250
[ 57.965709] SQUASHFS error: squashfs_read_data failed to read block 0xcb3f6c9
[ 57.972983] SQUASHFS error: Unable to read metadata cache entry [cb3f6c9]
[ 57.979941] SQUASHFS error: Unable to read directory block [cb3f6c9:193c]
[ 57.986973] SQUASHFS error: Unable to read metadata cache entry [cb3f6c9]
[ 57.993887] SQUASHFS error: Unable to read directory block [cb3f6c9:193c]
[ 58.000885] SQUASHFS error: Unable to read metadata cache entry [cb3f6c9]
...

Veli-Matti Lintu (vmlintu) wrote on 2009-10-21:

console output from booting the client and selecting reboot from ldm menu (27.3 KiB, text/plain)

Veli-Matti Lintu (vmlintu) wrote on 2009-10-21:

boot+shutdown.log (25.8 KiB, text/plain)

Another log file with boot + shutdown. This works with no problems.

Veli-Matti Lintu (vmlintu) wrote on 2009-10-28:

The problem seems to be related to the "reboot -p" call in gtkgreet/greeter.c of ldm. Changing that to "reboot -fp" fixes the problem.

affects:

ubuntu → ldm (Ubuntu)

Veli-Matti Lintu (vmlintu) wrote on 2009-10-29:

ldm_2.0.48-0ubuntu1-reboot-fix.diff (587 bytes, text/plain)

This patch fixed the problem. It also adds -f to poweroff, just in case.

Andrew Rigney (ubuntultspadmin) wrote on 2009-11-04:

How do you apply this patch?

torabian (torabian02) wrote on 2009-11-15:

Yes please clarify, how do you apply this patch?

Jakob Unterwurzacher (jakobunt) wrote on 2009-11-19:

You can get the updated ldm package here (it includes the patch): https://launchpad.net/~opinsys/+archive/ppa/+sourcepub/836591/+listing-archive-extra

Jakob Unterwurzacher (jakobunt) wrote on 2009-11-19:

Conversation on #ltsp about this problem: http://www.nubae.com/logs/ltsp20091028_pg1.html (starting at 07:17)

Stéphane Graber (stgraber) wrote on 2009-11-19:

Fix has been applied upstream.
Lucid will contain the fix the next time I upload; I'll also apply it to my PPA.
If anyone feels like preparing an SRU, please feel free to do so.

Changed in ltsp (Ubuntu):
status:	New → Invalid
Changed in ldm (Ubuntu):
status:	New → Fix Committed
importance:	Undecided → Medium
assignee:	nobody → Stéphane Graber (stgraber)

Stéphane Graber (stgraber) on 2009-11-20

Changed in ldm (Ubuntu):
status:	Fix Committed → Fix Released

müzso (bit2) wrote on 2010-01-18:

#10

There's another call to "reboot": after you log out of your GNOME session, ldm checks whether there's a new version of the NBD image on the server. If there is, a reboot is issued.
In the source of the "ldm" package this is in the file "rc.d/I01-nbd-checkupdate" at the very end of the script.

The package available at https://launchpad.net/~opinsys/+archive/ppa/+sourcepub/836591/+listing-archive-extra does not yet include the fix for the above problem (it contains the fix only for the regular reboots through the GUI), but the trunk version of the ldm package (available at http://bazaar.launchpad.net/~ltsp-upstream/ltsp/ldm-trunk) already has this fix, so probably the next release version of ldm will contain it too.

Nikolaus Rath (nikratio) wrote on 2010-02-12:

#11

The above fix calls reboot with the -f option to force a hard reboot without properly shutting down the system.

This is not always a good idea. The problem still exists when some other application (or the user) happens to issue a normal shutdown command.

Moreover, doing a hard reboot is not always a good idea. I have a couple of fat clients that I'd really rather shut down properly (so that e.g. the NFS mounted home directories are correctly flushed and umounted).

The real problem that has to be corrected here is that at some point during the shutdown the network interface is deactivated. The NBD then becomes unavailable and the entire system freezes, since it can no longer access the root file system.

There is actually a check in /etc/init.d/networking that should prevent the network from being disabled when something is mounted from /dev/nbd, but either it does not work in this case or there is another script that shuts down the network.
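
For illustration, a check of this kind could look roughly like the sketch below. This is not the code from /etc/init.d/networking; the function name nbd_mounted and its optional mount-table argument are hypothetical. It reads a table in /proc/mounts format and succeeds if any entry is mounted from a /dev/nbd* device.

```shell
# Hypothetical helper, not taken from any Ubuntu init script.
# Succeeds (exit status 0) if any filesystem listed in the given mount
# table (default: /proc/mounts) is mounted from a /dev/nbd* device.
nbd_mounted() {
    mounts_file="${1:-/proc/mounts}"
    # Each line is: device mountpoint fstype options dump pass
    while read -r device rest; do
        case "$device" in
            /dev/nbd*) return 0 ;;
        esac
    done < "$mounts_file"
    return 1
}
```

A shutdown script could call nbd_mounted and skip taking the interface down when it succeeds.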

Changed in ldm (Ubuntu):
status:	Fix Released → Confirmed
Changed in ltsp (Ubuntu):
status:	Invalid → Confirmed
Changed in ldm (Ubuntu):
status:	Confirmed → Invalid

Nikolaus Rath (nikratio) wrote on 2010-02-12:

#12

debdiff for ltsp (993 bytes, text/plain)

And it turns out that the problem is that /etc/init.d/sendsigs kills nbd-client. Attached is a debdiff that should fix the problem once and for all. I would also suggest reverting the change in LDM; it is no longer necessary.
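
The debdiff itself is not reproduced in this thread. As a rough sketch of the general idea only, and assuming the sendsigs script on the client skips PIDs listed in files under an omit directory (on Ubuntu releases of that era this is believed to be /lib/init/rw/sendsigs.omit.d, but check your own /etc/init.d/sendsigs), a process could be protected by recording its PIDs there. The function name protect_from_sendsigs and its arguments are hypothetical.

```shell
# Hypothetical helper, not the contents of the attached debdiff.
# Records PIDs in an omit directory so that a sendsigs-style script
# that honours such files can skip them.
# Usage: protect_from_sendsigs OMIT_DIR NAME PID...
protect_from_sendsigs() {
    # Require a directory, a file name, and at least one PID.
    [ "$#" -ge 3 ] || return 1
    omit_dir="$1"
    name="$2"
    shift 2
    mkdir -p "$omit_dir" || return 1
    # One PID per line; this layout is an assumption of the sketch.
    printf '%s\n' "$@" > "$omit_dir/$name"
}
```

On a client this might be called as `protect_from_sendsigs /lib/init/rw/sendsigs.omit.d nbd-client $(pidof nbd-client)`, with the directory path again being an assumption.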

Please don't be too harsh on the debdiff, it's my very first one.

tags:	added: patch
Changed in ltsp (Ubuntu):
assignee:	nobody → Ubuntu Sponsors for main (ubuntu-main-sponsors)

Nikolaus Rath (nikratio) wrote on 2010-02-12:

#13

...and here is a branch for merging in Lucid. This patch also protects nbd-proxy: http://bazaar.launchpad.net/~nikratio/ltsp/ubuntu.bug457702

Colin Watson (cjwatson) wrote on 2010-03-23:

#14

Subscribed ubuntu-sponsors, unassigned ubuntu-main-sponsors (best not to use assignment for this).

Changed in ltsp (Ubuntu):
assignee:	Ubuntu Sponsors for main (ubuntu-main-sponsors) → nobody

Stéphane Graber (stgraber) wrote on 2010-03-23:

#15

Please note that in Lucid, an upstart script exists for both reboot and shutdown that triggers a forced shutdown/reboot and bypasses the regular shutdown sequence.

These two scripts issue a "sync" and then reboot or shut down, depending on the runlevel.
/etc/init.d/sendsigs will then never be called at shutdown or reboot, which makes updating the process blacklist useless in this case.

The new LTSP, including updated upstart jobs and other upstream fixes, will be uploaded later today.
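
The behaviour described in this comment can be sketched roughly as below. This is not the upstart job shipped with LTSP; the function name and the mapping of runlevel 0 to "poweroff -f" and runlevel 6 to "reboot -f" are assumptions made for illustration. The function only prints the command it would run, so it is safe to try.

```shell
# Hypothetical helper, not the LTSP upstart job.
# Prints the forced command a shutdown job might run for a given
# runlevel (0 = halt, 6 = reboot). Running such a command bypasses the
# normal shutdown sequence, so /etc/init.d/sendsigs is never reached.
forced_shutdown_command() {
    case "$1" in
        0) echo "sync; poweroff -f" ;;
        6) echo "sync; reboot -f" ;;
        *) echo "unsupported runlevel: $1" >&2; return 1 ;;
    esac
}
```

The "sync" comes first because a forced reboot or poweroff skips the usual unmount steps.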

Stéphane Graber (stgraber) on 2010-11-23

Changed in ltsp (Ubuntu):
status:	Confirmed → Fix Released

K. O. (h-admin-mi-fh-offenburg-de) wrote on 2010-11-30:

#16

Hey guys,

we are currently facing the same problem, or at least something related.

We use an Xubuntu 10.04.1 server (amd64) with i386 clients, LTSP 5.2.1ubuntu9, and a separate (Windows) DHCP server.
If the LTSP server goes down, whether through shutdown, reboot, hard shutdown, etc., the clients wait for around 1 minute (usually less) and then start giving the above-mentioned errors ...

First ...
[ 57.955127] nbd0: Attempted send on closed socket
[ 57.959994] end_request: I/O error, dev nbd0, sector 416250

and then ...

[ 57.965709] SQUASHFS error: squashfs_read_data failed to read block 0xcb3f6c9
[ 57.972983] SQUASHFS error: Unable to read metadata cache entry [cb3f6c9]
[ 57.979941] SQUASHFS error: Unable to read directory block [cb3f6c9:193c]
... to infinity ...

The IDs etc. are slightly different, but the rest is identical. The client continues like that until I shut it down manually (pulling the plug) or try to log in with SSH. The authentication/login works, but you never get a prompt, just sometimes an "Input/Output Error". The client then hangs completely and also stops giving the above errors. Sometimes it hangs earlier and SSH is not reachable.

We first saw this with a hardware server that takes about 4 minutes to boot (lots of RAM to count). With a virtual server (VMware), a reboot takes just seconds and the clients reboot automatically when the server is back up. If I pause the boot, the clients behave the same way as with the hardware server.

As stated above, the bug is fixed, and we use a much newer version, so I was really confused that the fix is not implemented in our .../initramfs/scripts/ltsp_nbd file. But even if I add the fix manually, the problem remains.

Is there any solution? We already have an older server with Ubuntu 8.04 (32-bit) and 15 clients running without such problems.

Stian Hill (stian-axachi) wrote on 2012-11-30:

#17

Hi, this is strange. I have Ubuntu 12.04 Precise with a default apt-get install ltsp-server, followed by ltsp-build-client. This creates an amd64 folder. I use a Windows DHCP server. I installed Likewise5 to get AD integration. It all works. The only problem is that when the client has been on for a while, it gets the error listed here...

I would have thought that this fix was implemented in the newer versions?

K. O. (h-admin-mi-fh-offenburg-de), did you find out anything?
