Landscape client loses registration too easily in unexpected shutdowns/reboots

Bug #788605 reported by Eric Williams
This bug affects 3 people
Affects: Landscape Client
Status: Fix Released
Importance: Medium
Assigned to: Thomas Herve

Bug Description

## Issue

* Rebooting a Landscape Dedicated Server (LDS) client after successful registration results in
  another registration request being sent

* LDS client appears multiple times in the "Pending Computers" list, or in "Computers" if accepted

## Environment

* Landscape Dedicated Server

* Lucid 10.04 (client machines)

* Registration requires approval (no password set)

## Diagnostic Steps

1. Install Landscape Server

2. Create 3 new client machines

3. Register clients using "landscape-config ..."

landscape-config --computer-title "$(hostname)" \
        --account-name standalone \
        --url https://lds1.local/message-system \
        --ping-url http://lds1.local/ping \
        -k landscape_server_ca.crt --silent \
        --script-users=ALL \
        --ssl-public-key=/etc/landscape/landscape_server_ca.crt

4. Create an archive of /var/lib/landscape/

5. Approve registrations

6. Reboot client using script from landscape server

7. If new profiles show up in "Pending", create new archive of
   /var/lib/landscape
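Steps 4 and 7 take snapshots of /var/lib/landscape so that the client state before and after a reboot can be compared. A minimal sketch of how such a snapshot could be scripted follows. The helper name `snapshot_dir`, the label argument, and the archive naming scheme are assumptions made for illustration; none of them come from landscape-client.

```python
import tarfile
import time
from pathlib import Path


def snapshot_dir(source, dest_dir, label):
    """Archive `source` into dest_dir as <label>-<timestamp>.tar.gz.

    Hypothetical helper for illustration; not part of landscape-client.
    """
    source = Path(source)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    archive = dest_dir / f"{label}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store members under the directory's own name so that two
        # snapshots can be unpacked side by side and compared.
        tar.add(source, arcname=source.name)
    return archive


# Example call (reading /var/lib/landscape normally requires root;
# the destination directory is an arbitrary choice):
# snapshot_dir("/var/lib/landscape", "/root/snapshots", "after-registration")
```

Taking one snapshot per step with a distinct label (for example "after-registration" and "after-reboot") keeps the archives easy to tell apart when attaching them to a report.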

Tags: verified
Eric Williams (eric-canonical) wrote :

Example of multiple profiles

Eric Williams (eric-canonical) wrote :

Another example of multiple profiles

Eric Williams (eric-canonical) wrote :

/var/lib/landscape from client after successful registration.

Eric Williams (eric-canonical) wrote :

/var/lib/landscape from client after reboot, when the duplicate profile appeared in LDS.

Eric Williams (eric-canonical) wrote :

/var/lib/landscape from client after the next reboot, when yet another new profile for it appeared.

Changed in landscape:
milestone: none → backlog
tags: added: lds
Tom Ellis (tellis) wrote :

I've been trying to reproduce this on EC2, without success so far.

I'm wondering whether you are using DHCP in this case and whether there is a hostname or IP address change at all. If so, perhaps that is what is causing it.

Eric Williams (eric-canonical) wrote : Re: [Bug 788605] Re: LDS client appears multiple times in "Pending Computers" list or "Computers" after reboots

On Wed, 2011-06-08 at 10:46 +0000, Tom Ellis wrote:
> I've been trying to reproduce this on ec2 without success yet.
>
> I'm wondering if you are using dhcp in this case and there is a hostname
> and IP address change at all? if so, perhaps that's what is causing it.
>

There is definitely no IP address change. Hostname change may or may
not happen. At the customer site where this occurred, the machines had
a different hostname during the PXE boot than afterwards. Registration
was being done by a first-boot script.

In my virtual machines, there is no DNS, just dnsmasq. Hostname is set
manually before the registration. IP address doesn't change.

Eric

Tom Ellis (tellis) wrote : Re: LDS client appears multiple times in "Pending Computers" list or "Computers" after reboots

OK, this is most certainly a bug. I managed to reproduce it.

If I reboot a client system either by running 'sudo reboot' on the CLI or via 'reboot this computer' in the Landscape web UI, the problem does not occur.

If I reboot the system with 'reboot -fn' in a stored script, I see pending computers after the reboot: the clients re-request registration.

Changed in landscape:
status: New → Confirmed
Tom Ellis (tellis) wrote :

Right, if we do a forced reboot I get this message in broker.log straight after the reboot:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/landscape/broker/store.py", line 299, in _reprocess_holding
    message = bpickle.loads(self._get_content(old_filename))
  File "/usr/lib/python2.7/dist-packages/landscape/lib/bpickle.py", line 45, in loads
    raise ValueError, "Can't load empty string"
ValueError: Can't load empty string
2011-06-08 12:08:35,804 ERROR [MainThread] Can't load empty string
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/landscape/broker/store.py", line 299, in _reprocess_holding
    message = bpickle.loads(self._get_content(old_filename))
  File "/usr/lib/python2.7/dist-packages/landscape/lib/bpickle.py", line 45, in loads
    raise ValueError, "Can't load empty string"
ValueError: Can't load empty string
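The traceback shows the loader being handed an empty string, which suggests that a file it read had zero length after the forced reboot. A quick way to check a client for this condition is to list the zero-length files under /var/lib/landscape. The sketch below is illustrative only: `find_empty_files` is a hypothetical helper, and an empty file is not necessarily a corrupted one, so the output is a list of candidates rather than a verdict.

```python
import os


def find_empty_files(root):
    """Return sorted paths of zero-length regular files under `root`.

    Hypothetical diagnostic helper; not part of landscape-client.
    """
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only regular files are of interest; skip symlinks.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)


# Example call (normally requires root on a client machine):
# for path in find_empty_files("/var/lib/landscape"):
#     print(path)
```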

Looks like a pickle corruption issue. Soon after, we get a re-registration request:
2011-06-08 12:08:35,804 INFO [MainThread] Accepted types changed: +test +register +register-cloud-vm
2011-06-08 12:08:35,806 INFO [MainThread] Queueing message to register with account 'standalone' without a password.
2011-06-08 12:08:35,809 INFO [MainThread] Starting message exchange with https://ec2-46-51-138-192.eu-west-1.compute.amazonaws.com/message-system.
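Files that end up empty after 'reboot -fn' are consistent with data that was written but never flushed to disk before the machine went down. A common general-purpose guard against this is to write to a temporary file, fsync it, and then rename it over the target, so that a crash leaves either the old content or the new content. The sketch below illustrates that pattern under POSIX assumptions. It is not taken from landscape-client, and the names `atomic_write` and `read_or_none` are hypothetical.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to `path` so that a crash leaves old or new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same
    # filesystem) as the target for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push file data to disk
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        # Remove the temporary file if the rename never happened.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself survives a power loss.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def read_or_none(path):
    """Return the file content, or None if it is missing or empty."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return None
    return data or None
```

A reader that treats an empty file as "no data" rather than raising, as `read_or_none` does here, is a complementary measure: it lets the caller decide what to do with a damaged entry instead of failing on it.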

affects: landscape → landscape-client
Changed in landscape-client:
milestone: backlog → none
milestone: none → backlog
summary: - LDS client appears multiple times in "Pending Computers" list or
- "Computers" after reboots
+ Landscape client loses registration too easily in unexpected
+ shutdowns/reboots
Changed in landscape-client:
milestone: backlog → 11.06.2
visibility: private → public
Changed in landscape-client:
milestone: 11.06.2 → backlog
Tom Ellis (tellis) wrote :

Can we increase the severity of this bug?

We're seeing this at another customer, who has 200 laptops registered to hosted and will be increasing that number significantly. They get around 10 of these re-registrations per week, and this will increase as they raise the number of registered systems.

Andreas Hasenack (ahasenack) wrote :

Tom, are "forced reboots" or random crashes so common? Note comment #8 above.

Tom Ellis (tellis) wrote :

I would say yes. With laptops, users generally do silly things like hard powering off systems or letting the battery die.

Changed in landscape-client:
importance: Undecided → Medium
tags: removed: lds
Andreas Hasenack (ahasenack) wrote :

This is being addressed in bug #809210 for the moment, which has a branch up for review. Don't mark either one as a duplicate just yet, please.

Tom Ellis (tellis) wrote :

jpds checked the fixes from #809210 and they solved the issue for him.

Thanks Andreas!

Thomas Herve (therve)
Changed in landscape-client:
status: Confirmed → Fix Released
milestone: backlog → 11.08.1
assignee: nobody → Thomas Herve (therve)
tags: added: verified