udev enumeration should use /sys/bus not /sys/devices

Bug #6367 reported by Edward Mendelson on 2006-01-02
This bug affects 18 people
Affects                  Importance  Assigned to
Ubuntu                   Medium      Unassigned
grub-installer (Ubuntu)  Medium      Unassigned
udev (Ubuntu)            High        Scott James Remnant (Canonical)
Dapper                   Undecided   Scott James Remnant (Canonical)

Bug Description

Dapper Flight 2 installs successfully in an UNdocked ThinkPad T42, and installs itself on the internal hard disk, which it sees as hda.

However, when you then dock the same ThinkPad in the ThinkPad Dock II (the one with an additional bay for a floppy or optical drive, etc.), Ubuntu won't boot, because it now sees the same harddisk as hde, not hda.

A similar problem occurs when installing to the ThinkPad in the dock II: the installer sees the hard disk as hda, but when Ubuntu starts up, it sees the disk as hde, and can't complete the installation.

I think this didn't happen in Breezy (though my Breezy setup was different), but it's a major annoyance in Dapper.

(This is a duplicate of 5913, because Malone won't let me add a distribution name to that; I'll mark 5193 as a duplicate.)

Edward Mendelson (emendelson) wrote :

Sorry - that should say 5913 both times in the final paragraph, and the duplicate has now been correctly marked.

RobotTwo (ubuntu-2robots) wrote :

Greetings. I've got a Thinkpad T42p, and I am seeing very similar things. Anecdotally, if the system is booted with the lid down in the dock, it still seems to find the hard drive as hda, but if the lid is up, it sees it as hde. I haven't done this enough times to say with 100% certainty, though.

Also, whether or not CPU-Frequency scaling works seems to be related to whether it sees the disk as hda or hde.

I set up a LaptopTestingTeam page at https://wiki.ubuntu.com/LaptopTestingTeam/ThinkpadT42p#preview

With my T42, the enumeration problem *always* occurs when booting in the dock, even with the lid down.

I'm very, very glad to see the new wiki page about this. This gives some hope that Dapper will be usable with the T42 (which is obviously an important piece of hardware for enterprise users...)

RobotTwo (ubuntu-2robots) wrote :

Some additional info -- when it detects the drive as /dev/hde, /dev/hda doesn't exist at all, and vice versa. Someone had asked on the Wiki whether "/dev/hda mirrored the SATA drives". It doesn't, and as far as I know, there is no SATA on the T42p (only ATA), but I could be wrong.

Also, don't know if this matters, but I don't have any disk drives, floppies, or cdroms on my docking station.

It's possible that this is a dock/no dock issue as I have been running all day today on the dock, and it has picked up the disk as /dev/hde all day.

RobotTwo (ubuntu-2robots) wrote :

This is a pretty critical bug, as a beginner would have no idea how to boot his/her system!

RobotTwo (ubuntu-2robots) wrote :

Don't know if it's related, but when the disk is detected as /dev/hde, the kernel gives the following messages as its first output:

[4294669.201000] PCI: Cannot allocate resource region 7 of bridge 0000:02:03.0
[4294669.201000] PCI: Cannot allocate resource region 8 of bridge 0000:02:03.0
[4294669.201000] PCI: Cannot allocate resource region 9 of bridge 0000:02:03.0

Paul Sladen (sladen) wrote :

So the dock is an additional PCI bus and is getting scanned. Even if nothing is attached to the IDE channels, the controller still exists...

Can you attach the output of 'lspci' both with and without the docking station attached?

roh (roh) wrote :

the same happens with the ThinkPad X20 and the full dock.
siimage gets loaded before piix and so the hdd is hde and the dvdrom gets hda
for a generic solution https://wiki.ubuntu.com/ProbeForRootFilesystem
seems to be the right way (rootfs-by-uuid)

Ech (ech1965) wrote :

I had the same problem on a desktop :

My hardware:
- Pentium IV mainboard ( Fujitsu-Siemens N60D)

- Primary master 20GB Hard disk
- Secondary master DVD rom

- An IDE PCI card (no-name, with a Silicon Image 680 chipset, Medley capable)
with 2x 160GB hard disks in master position on the two channels...

In Breezy, I had the following setup:
- hda 20GB
- hdc DVD
- hde 160GB
- hdf 160GB

On the Dapper install CD:

hda 160GB
hdc 160GB
hde 20GB
hdf DVD

- After installing Dapper, my PC refused to boot (problem with grub, ...)

I had to remove the PCI card and wire the disk in slave position.

Regards
Etienne

Ben Collins (ben-collins) wrote :

Here's the quick summary:

Kernel: There's no way for the kernel to make any exceptions here. The drivers are loaded in a specific order that the kernel has no control over. It just does what it is told.

Udev: Udev shouldn't have to worry about the order of modules. Maybe the user could blacklist the module for the docking station IDE, but that's a hack.

The real fix here is probe-root-fs. Using the UUID alleviates all the problems here. However, this is currently only implemented for removable devices (there was a long discussion about this on #ubuntu-devel).

I'm punting this over to grub-installer (which handles when to use UUID and when to use device path), since it will need to handle the details of this.

This will probably NOT be fixed for dapper.

Changed in linux-source-2.6.15:
status: Unconfirmed → Confirmed

Edward Mendelson (emendelson) wrote :

Is there any hope of getting this fixed during the six-week delay? If not, this bug certainly seems like a problem for the idea of Dapper as enterprise-ready. The ThinkPad T4x series is sold mostly to large corporations, who also buy docks for it. If those corporations can't use Dapper, then it's a fairly serious problem in general.

Matt Zimmerman (mdz) wrote :

Surely there is a simpler stop-gap solution for Dapper; migrating to UUID-based mounting is just too intrusive.

If nothing else, we could special-case these particular devices to be scanned in a particular order.

Ben Collins (ben-collins) wrote :

Don't count dapper out yet. There was a proposal on IRC (#ubuntu-devel), where the initramfs would retain the UUID of the root device. If the device in question failed to appear or mount (/dev/hda in this case), then it would fall back to looking for the correct UUID on some other device (/dev/hde in this case).
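
As a rough illustration of the fallback proposed above, here is a minimal Python sketch. It is not the initramfs code; `resolve_root` and its arguments are hypothetical names, and the device nodes and UUID are only example values. It also assumes the configured node is checked against the remembered UUID rather than merely for existence, which goes slightly beyond what was proposed on IRC.

```python
from typing import Mapping, Optional


def resolve_root(configured: str, wanted_uuid: str,
                 present: Mapping[str, str]) -> Optional[str]:
    """Pick the device node to mount as the root filesystem.

    `present` maps each device node that exists on this boot to the
    UUID of the filesystem found on it (hypothetical input format).
    """
    # First choice: the configured node, if it exists and carries the
    # filesystem we remembered.
    if present.get(configured) == wanted_uuid:
        return configured
    # Fallback: look for the remembered UUID on any other node.
    for node in sorted(present):
        if present[node] == wanted_uuid:
            return node
    # Nothing matched; the caller would have to drop to a rescue shell.
    return None


ROOT_UUID = "77dac907-836f-4455-a7ed-807bfb2137ce"  # example value only
undocked = {"/dev/hda4": ROOT_UUID}
docked = {"/dev/hde4": ROOT_UUID}

print(resolve_root("/dev/hda4", ROOT_UUID, undocked))  # /dev/hda4
print(resolve_root("/dev/hda4", ROOT_UUID, docked))    # /dev/hde4
```

Checking the UUID on the configured node as well would also cover the case where /dev/hda still exists but is now a different disk.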

Not sure how soon this will happen, but I have high hopes that it will end up in dapper.

Marten Klencke (mklencke) wrote :

I probably have the same problem here on my desktop and haven't figured out yet how to work around it correctly.

I have an onboard IDE controller and a PCI one for extra devices. On all my Linux systems (including Ubuntu, Slackware, Fedora) so far, the onboard one is the first, so the drives become hda-d and the ones of the PCI IDE card become hde-h.

Dapper is the only system that gets it wrong. hdg, on which I wanted to install dapper, is suddenly hdc in the Dapper installer. I tried to go for it, installing grub on hdg1 (or hdc1 according to Dapper) (which is being chainloaded from LILO on hda), but unfortunately that didn't work out. LILO successfully loaded Grub, but Grub wasn't able to boot Dapper.

On my Slackware system again, I checked the IDE order and it was still correct. Then I chrooted into hdg1 and executed 'mount' without argument to see what's up. This told me that in the chrooted environment too, the drive order is incorrect:

/dev/hdc1 on / type reiserfs (rw)
/dev/hdg on /media/cdrom0 type iso9660 (ro)

Out of the chroot environment, hdc is the cdrom and hdg1 is the Dapper partition.

I'm puzzled here, and I've never experienced this before... shouldn't the BIOS drive order be used? According to lspci (on Slackware), my onboard IDE comes first:

00:09.0 IDE interface: nVidia Corporation nForce2 IDE (rev a2)
00:1e.0 PCI bridge: nVidia Corporation nForce2 AGP (rev c1)
01:07.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 04)
01:07.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 01)
01:08.0 RAID bus controller: CMD Technology Inc PCI0680 (rev 02)

Finally: if the Dapper installer thinks that hde is hda, and Grub gets installed to the first hard drive for booting, this will be incorrect. The BIOS will try to start with what Dapper thinks is hde, which will then not have a boot loader. Confusing :-)

Sorry for me not pinning this down, I'm not sure where the problem is exactly. But I know that the previous versions of Ubuntu, as well as other distributions, are fine. If you need any more info, please ask!

Paul Sladen (sladen) wrote :

This all finally explains my problem I was having with ide-cs flash-memory in the PCMCIA slot. This was getting enumerated first and causing havoc (and an oops) on unsuspend/unhibernate when the drives were getting scanned in a different order.

I think some effort needs putting in to ensure probing happens strictly in PCI order.

In Bug #40993 I have outlined similar problems which I have been having with a system which has two active IDE controllers. These problems surfaced with the installation (from scratch and from a dist-upgrade) from 5.10 to 6.06.

I will not repeat the details of my findings here. Suffice it to say that the order of the controllers (and consequently the naming of the disks attached to the two controllers) changes between installation and first boot. My only recourse has been to connect all disks to the same controller.

On a possibly related note, the RAID partitions defined on the four disks also change names. The device called /dev/md1 during installation becomes /dev/md0 at first boot.

Thanks for your attention

  Gisli

Paul Sladen (sladen) wrote :

I noticed these updates:

linux-source-2.6.15 (2.6.15-21.31) dapper; urgency=low

  * ide-acpi: Replaces our old IDE ACPI support code with new and shiney
    stuff that should work better on some hardware.

...is this related in any manner?

Matt Zimmerman (mdz) wrote :

Nothing to do with grub-installer; the most likely place to deal with this will be initramfs-tools.

That is, unless the kernel update has fixed it? Can anyone confirm?

Changed in grub-installer:
assignee: nobody → adconrad

Does the latest beta contain the kernel update? I will check it tonight then (otherwise I don't even get past install successfully).

By the way, this bug is not specifically related to a docking station. On my desktop computer, it occurs because I have an extra IDE controller on a PCI card.

udev already "tries" to enumerate storage devices in bus order. Assuming that a custom kernel is not being used, the following order is used:

 1) Storage controllers (IDE and SCSI), one at a time, in bus order
 2) Bridges, docking stations, input devices, serial devices, intelligent devices, together.
 3) ide-generic (for old-fashioned ISA controllers)

This should mean that the internal IDE driver is loaded before the one in the docking station (assuming they are different controller chips, which I doubt).
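
The intended ordering described above can be modelled roughly as follows. This is only a Python sketch of the three phases, not udev's code: `load_phases` is a hypothetical name, and lumping every non-storage device into the second phase is a simplification. The only PCI detail relied on is that base class 0x01 means "mass storage controller".

```python
STORAGE_BASE_CLASS = 0x01  # PCI base class for mass storage controllers


def load_phases(devices):
    """Group (pci_address, pci_class) pairs into driver-loading phases.

    pci_class is the 24-bit class code, e.g. 0x01018A for an IDE
    controller. Returns a list of phases; each phase is a list of
    things that are handled together.
    """
    in_bus_order = sorted(devices)  # fixed-width hex addresses sort correctly
    storage = [a for a, c in in_bus_order if (c >> 16) == STORAGE_BASE_CLASS]
    others = [a for a, c in in_bus_order if (c >> 16) != STORAGE_BASE_CLASS]
    phases = [[addr] for addr in storage]  # phase 1: one controller at a time
    phases.append(others)                  # phase 2: everything else together
    phases.append(["ide-generic"])         # phase 3: legacy ISA fallback
    return phases


# Example input: a PCI bridge plus two IDE controllers (illustrative values).
DEVICES = [
    ("0000:00:1e.0", 0x060400),  # PCI-to-PCI bridge
    ("0000:00:1f.1", 0x01018A),  # IDE controller on bus 00
    ("0000:09:01.0", 0x01018F),  # IDE controller behind the bridge
]

for phase in load_phases(DEVICES):
    print(phase)
```

Under this model the controller on bus 00 is always handled before the one behind the bridge, which is the outcome the description above expects.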

I've never got confirmation of this, because nobody with this configuration has had the patience to help provide the necessary information, but I strongly suspect that the Laptop itself is the problem and that when the docking station is plugged in, the ports are all changed *anyway* and that driver order makes no difference.

Changed in initramfs-tools:
assignee: adconrad → keybuk

If someone with the IBM Laptop and docking station could do the following for us, that'd be most useful:

Boot with the docking station attached; you will most likely need to edit the kernel command line to say boot=/dev/hdeX rather than boot=/dev/hdaX. You can do this from the GRUB boot menu.

The system should boot normally.

Attach /var/log/udev from that boot to this bug report.

Go back and boot without the docking station, and attach /var/log/udev from *that* boot to this bug report.

Changed in udev:
status: Confirmed → Needs Info

As requested, here is the udev that results from booting an up-to-date Dapper on an undocked ThinkPad T42. The next attachment will be the docked udev.

Here's the udev when booting with the ThinkPad docked. As Scott said, it required changing the root= part of the kernel line from =/dev/hda4 to =/dev/hde4 (not the boot= as Scott said in the message, but root=, which I assume was meant).

By the way (probably a separate bug) Dapper does NOT boot normally with this line changed. The trackpoint (mouse equivalent button) does not work at all when Dapper boots on a docked machine. The keyboard works, but the trackpoint does not function at all. I was able to e-mail the udev file to myself by maneuvering with the keyboard.

The trackpoint probably doesn't work due to the change in the input devices that occurs when you plug your docking station in. Make sure your xorg.conf only refers to /dev/input/mice and/or /dev/psaux and NOT to any device explicitly by /dev/input/mouse*, event*, etc.

Ok, this is definitely not a udev problem!

Your laptop's internal IDE controller is at 0000:00:1f.1

UEVENT[1146688845.538269] add@/devices/pci0000:00/0000:00:1f.1
ACTION=add
DEVPATH=/devices/pci0000:00/0000:00:1f.1
SUBSYSTEM=pci
SEQNUM=2263
PHYSDEVBUS=pci
PHYSDEVDRIVER=PIIX_IDE
PCI_CLASS=1018A
PCI_ID=8086:24CA
PCI_SUBSYS_ID=1014:052D
PCI_SLOT_NAME=0000:00:1f.1
MODALIAS=pci:v00008086d000024CAsv00001014sd0000052Dbc01sc01i8a

Your docking station bridge is at 0000:00:1e.0

UEVENT[1146692267.872892] add@/devices/pci0000:00/0000:00:1e.0
ACTION=add
DEVPATH=/devices/pci0000:00/0000:00:1e.0
SUBSYSTEM=pci
SEQNUM=2306
PHYSDEVBUS=pci
PCI_CLASS=60400
PCI_ID=8086:2448
PCI_SUBSYS_ID=0000:0000
PCI_SLOT_NAME=0000:00:1e.0
MODALIAS=pci:v00008086d00002448sv00000000sd00000000bc06sc04i00

Which means the IDE controller inside the docking station is logically *before* the one inside your laptop.

UEVENT[1146692267.873048] add@/devices/pci0000:00/0000:00:1e.0/0000:02:03.0/0000:09:01.0
ACTION=add
DEVPATH=/devices/pci0000:00/0000:00:1e.0/0000:02:03.0/0000:09:01.0
SUBSYSTEM=pci
SEQNUM=2312
PHYSDEVBUS=pci
PHYSDEVDRIVER=CMD64x_IDE
PCI_CLASS=1018F
PCI_ID=1095:0648
PCI_SUBSYS_ID=1095:0648
PCI_SLOT_NAME=0000:09:01.0
MODALIAS=pci:v00001095d00000648sv00001095sd00000648bc01sc01i8f

I'm open to suggestions for a "fix" for this; the only one I can think of is probe-for-root-fs.

You can mount your root filesystem using

    root=UUID=77dac907-836f-4455-a7ed-807bfb2137ce

And that will work whether your docking station is connected or not, because it uses a characteristic of the drive to locate it, rather than an enumerated location.

This has always been our long term plan anyway, for everybody in every situation -- as it makes us resilient to any hardware change.

Given the topology of the docked T42, the only other solution would be to blacklist the docking station IDE controller. That controller is a common one, so blacklisting it can't be a solution in the distribution, not to mention that people may actually want to boot off one!

I've booted the Dapper beta2 livecd and found the same problem. As before, lspci shows the IDE addon card later in bus order:

0000:00:08.0 PCI bridge: nVidia Corporation nForce2 External PCI Bridge (rev a3)
0000:00:09.0 IDE interface: nVidia Corporation nForce2 IDE (rev a2)
0000:00:1e.0 PCI bridge: nVidia Corporation nForce2 AGP (rev c1)
0000:01:07.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 04)
0000:01:07.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 01)
0000:01:08.0 RAID bus controller: Silicon Image, Inc. PCI0680 Ultra ATA-133 Host Controller (rev 02)

However, I think I've found the problem. Notice that the external PCI bridge comes before the nForce onboard IDE controller. In the udev log, it can be seen that the enumeration has a slightly different scheme, placing the addon card before the onboard controller:

PHYSDEVPATH=/devices/pci0000:00/0000:00:08.0/0000:01:08.0/ide0/0.0

The extra nesting seems to be the trouble? 0000:01:08.0 should come after 0000:00:09.0, but because of the 0000:00:08.0 it doesn't. Might this be the place to fix it?

Silly question, but how does MS Windows cope with this? Does it get the order of the drives from the BIOS, and if so, is the BIOS cunningly aware of the whole docking situation?

I don't know much about the Windows boot loader, but I do know that it's easily confused -- so it's very likely it just asks the BIOS which it thinks is the master disk.

Of course, Linux is "clever" so it doesn't bother asking the BIOS for nuffin' <g>

Marten Klencke (mklencke) wrote :

Also, I've never run into the problem with other Linux distributions or previous Ubuntu versions. They don't use udev this way for hardware detection?

They probably don't support booting from drives on docking stations, or rely on an initrd or initramfs tailored for the system rather than one that can boot anything.

This new behaviour is pretty specific to post-2.6.15 kernels, and is because we now iterate the entire system tree rather than only supporting tiny parts of it.

I am the reporter of bug 40993, wherein the order of two pairs of
fixed disks attached pairwise to two separate internal IDE controllers
changes between installation and first boot. First boot consequently
fails. This occurs with 6.06 beta, but not with previous versions of
Ubuntu. My only solution to the unpredictable order in disk numbering
has been to attach all disks to the same controller which is an
unhappy situation for the performance of my RAID.

Is it possible that bug 40993 and this bug are related and if so, how
does that affect the discussion here?

Regards

  Gisli

It seems then that Kernel 2.6.15 and greater are inconsistent with the BIOS and GRUB?

What I tried to do is install Ubuntu onto a hard disk connected to my addon IDE controller. Ubuntu sees this as hda.

GRUB then needs to be installed onto, according to Ubuntu, hde (which is the boot device according to the BIOS and should be hda). Upon booting then, GRUB fails because it tries to find its data on a different drive/partition than what Ubuntu told it (hda according to Ubuntu, but to GRUB it's not the first drive).

So it's really painful to install Ubuntu if there is a disk connected to a second controller, because then the BIOS, GRUB and Ubuntu Linux don't agree upon numbering.

After doing a little bit of research, this problem is actually entirely unique to Ubuntu.

We find devices by walking the /sys/class, /sys/block and /sys/devices trees.

Upstream udev has settled on, instead, walking /sys/class, /sys/block and /sys/bus

The change from /devices to /bus means that rather than seeing the IDE controller as a child of the PCI Bridge, and thus logically before the internal IDE controller, it would see the IDE controller as a member of the PCI bus that appears later.
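
The difference between the two walks can be sketched with the DEVPATHs from the docked T42 log earlier in this report. This is a simplified Python model, not udevplug's code: the function names are made up for illustration, and it assumes both walks visit entries in name order, which a real directory scan does not guarantee.

```python
# DEVPATHs taken from the docked ThinkPad T42 udev log in this report.
DEVPATHS = [
    "/devices/pci0000:00/0000:00:1f.1",
    "/devices/pci0000:00/0000:00:1e.0",
    "/devices/pci0000:00/0000:00:1e.0/0000:02:03.0",
    "/devices/pci0000:00/0000:00:1e.0/0000:02:03.0/0000:09:01.0",
]
IDE_CONTROLLERS = {"0000:00:1f.1", "0000:09:01.0"}


def tree_walk_order(devpaths):
    """Order seen when recursing through /sys/devices depth-first.

    Sorting by path components is equivalent to a depth-first walk
    that visits children in name order.
    """
    ordered = sorted(devpaths, key=lambda p: p.split("/"))
    return [p.rsplit("/", 1)[-1] for p in ordered]


def flat_bus_order(devpaths):
    """Order seen when listing the flat /sys/bus/pci/devices directory."""
    return sorted(p.rsplit("/", 1)[-1] for p in devpaths)


def ide_only(order):
    """Keep only the IDE controllers, preserving the given order."""
    return [addr for addr in order if addr in IDE_CONTROLLERS]


print(ide_only(tree_walk_order(DEVPATHS)))  # ['0000:09:01.0', '0000:00:1f.1']
print(ide_only(flat_bus_order(DEVPATHS)))   # ['0000:00:1f.1', '0000:09:01.0']
```

In the tree walk the dock's controller is reached while descending into the bridge at 0000:00:1e.0, before the walk ever gets to 0000:00:1f.1; in the flat listing only the device's own address matters.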

I'm not sure only a couple of weeks away from release is the right time to make such a major change to our hardware enumeration though -- we'll get this fix in edgy anyway when we resync with the upstream udev and kernel versions.

Subscribed mdz to get his opinion on the last message.

I did a comparison of the techniques on my AMD64, and the "upstream" method actually drops a few sysfs objects compared to our own; however, all of these are simply "logical bus objects", e.g.

-/sys/devices/pci0000:00

"The PCI bus"

-/sys/devices/pci0000:00/0000:00:01.1/i2c-0
-/sys/devices/pci0000:00/0000:00:01.1/i2c-1

i2c bus points, with no devices under them

 /sys/devices/pci0000:00/0000:00:06.0
-/sys/devices/pci0000:00/0000:00:06.0/ide0
 /sys/devices/pci0000:00/0000:00:06.0/ide0/0.0
 /sys/devices/pci0000:00/0000:00:06.0/ide0/0.1

Here the interim "ide0" is omitted, but not the actual devices under it.

 /sys/devices/pci0000:00/0000:00:08.0
-/sys/devices/pci0000:00/0000:00:08.0/host2
-/sys/devices/pci0000:00/0000:00:08.0/host2/target2:0:0
 /sys/devices/pci0000:00/0000:00:08.0/host2/target2:0:0/2:0:0:0

And here the USB host and target are lost, but the actual device is kept.

And so on (you get the idea).

Comically, the list generated happens to be the EXACT same list of devices that do nothing when you touch their uevent files anyway, ie. you don't get anything out of the kernel, not even an empty event.

Handy way of fixing *that* bug too (udevplug -s doesn't work, because it waits for those events)

Changed in udev:
status: Needs Info → Confirmed

koftinoff (jeffk) wrote :

I was forwarded to this bug from bug 36667. I encounter this problem on my desktop. I have an additional PCI IDE controller, and as of kernel 2.6.15 it seems my internal drive hda is now hde. Here is my lspci output:

0000:00:00.0 Host bridge: Intel Corporation 82845 845 (Brookdale) Chipset Host Bridge (rev 11)
0000:00:01.0 PCI bridge: Intel Corporation 82845 845 (Brookdale) Chipset AGP Bridge (rev 11)
0000:00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 01)
0000:00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 01)
0000:00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 01)
0000:00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 01)
0000:00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 81)
0000:00:1f.0 ISA bridge: Intel Corporation 82801DB/DBL (ICH4/ICH4-L) LPC Interface Bridge (rev 01)
0000:00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller (rev 01)
0000:00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 01)
0000:00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
0000:01:00.0 VGA compatible controller: nVidia Corporation NV17 [GeForce4 MX 440] (rev a3)
0000:02:01.0 SCSI storage controller: LSI Logic / Symbios Logic 53c810 (rev 12)
0000:02:08.0 Ethernet controller: Intel Corporation 82801DB PRO/100 VE (LOM) Ethernet Controller (rev 81)
0000:02:0c.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 11)
0000:02:0c.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 11)
0000:02:0e.0 RAID bus controller: Silicon Image, Inc. SiI 0649 Ultra ATA/100 PCI to ATA Host Controller (rev 02)

FYI, before I upgraded to dapper, /proc/ide/ looked like:

drivers
hda -> ide0/hda
hdb -> ide0/hdb
hdc -> ide1/hdc
ide0
ide1

After booting the new kernel, /proc/ide looks like:

cmd64x
drivers
hdc -> ide1/hdc
hdd -> ide1/hdd
hde -> ide2/hde
hdf -> ide2/hdf
hdg -> ide3/hdg
ide1
ide2
ide3

www.jdkoftinoff.com

Ted_Smith (tedsmith28) wrote :

I have experienced this too on a Desktop machine. I have 4 x 160GB drives connected to a Silicon Image PCI IDE Adaptor plus 1 x 120GB drive connected to IDE on the mobo with Windows XP and Ubuntu installed.

It was fine with Breezy, with hda2 being seen on IDE1. Dapper (official June 1st release) can't handle it and thinks the partition is hde2.

Very disappointed. As a newbie to Linux this has wasted hours of my time over several days. You can read about my newbish experiences here : http://www.ubuntuforums.org/showthread.php?p=1086430#post1086430

Very disappointed. Back to Breezy for me then....

Majkeli (public-deflicted) wrote :

Same issue here. I have a Silicon Image PCI IDE Adaptor with two drives on it. Unable to update to Dapper.

Edward Mendelson (emendelson) wrote :

Just a general comment:

Over at ubuntuforums.org, hundreds of people are reporting problems installing Dapper because of this bug - which was first reported on 2 January 2006 - five months before release. First the report was rejected altogether; then it was confirmed, but, judging from the comments above, there seems to have been a conscious decision not to fix it.

My guess is that this decision will make more people give up on Ubuntu than anything else could have done.

I'd like to echo Edward's comments. I found the discussion on the forums and since it went on for 11 pages I assumed that this was not a known issue. If it was a known issue then shouldn't it have been listed in the Release Notes somewhere? I couldn't find anything.
So, now that we're here, is there a workaround that can be posted somewhere? I saw the note about using UUID in menu.lst, but I have to edit /etc/fstab as well. Can something similar be used there?

therunnyman (therunnyman) wrote :

The workarounds are dirty, but available over at ubuntuforums.org, under the "problems?" sticky.

This bug was reported approximately 1,000,000 times - bug nos. 28614 and 44261 leap to mind - and confirmed approximately 10,000,000 times. And it is a bug: systematic misrecognition of hard disks? You know how hard it is to build a RAID in the first place without having to work with this arbitrary drive enumeration?

Please, for the love of your users, fix this. There's all kinds of information on what to correct and how to correct it over at ubuntuforums.org, under the 'installation and upgrade help' section. I sincerely hope we don't have to wait for edgy for accurate drive enumeration...as an earlier poster said, this is a perfect way to send a hunk of users over to another distro.

While this bug may have been reported very early, it took a very long time to obtain enough information from the users affected to actually pinpoint what was causing the bug.

It is now clear what the bug is.

Unfortunately while fixing it may fix it for you, we have absolutely no way of knowing that it will not break somebody else's machine. That's why this fix will not go into dapper.

As has already been stated in this bug report, edgy will carry the fix from the day it opens (which has not happened yet).

At that point we will know what is broken by fixing this for you -- if nothing is broken, it is very likely that dapper will also be able to carry the fix.

Edward Mendelson (emendelson) wrote :

I'm ready to try it as soon as the fix is ready!

By the way, I tried to make clear on the forums that you (James) were extremely helpful and patient in guiding me toward a workaround - and have also reported the workaround on the forum (in summary form) for others to use. Here's the thread:

http://www.ubuntuforums.org/showthread.php?t=190496

therunnyman (therunnyman) wrote :

I think it's safe to say everyone appreciates the work you devs do, and you've given us not only a wonderful OS, but also a wonderful community built around an excellent spirit. Please bear that in mind as I rave:

How exactly does the dev team feel saying, "Dapper's great, unless you want a RAID consisting of more than two disks."?

It's a single udev file that needs rewriting, and maybe three lines of code in the installer, we all know that. It's not going to break anyone's system to have devices mount in a reasonable order. If you're afraid of breaking machines, I'm positive someone will loan us the space for another .iso.

I want Ubuntu to succeed. That is why I rave.

Your argument is flawed.

We changed the order from breezy to dapper (by accident) and it broke some people's machines.

Why are you so convinced that changing it again will not break other people's?

Rejecting assignment to grub-installer

Changed in grub-installer:
status: Unconfirmed → Rejected

koftinoff (jeffk) wrote :

The problem is that the current order is actually broken.

If you are concerned about breaking other people's machines, make it have an option so the user can select between /sys/bus or /sys/devices.

Currently on my desktop dapper system, every time a new kernel is installed, I have to manually modify grub's menu.lst because it erroneously set root=/dev/hda5 when it only works when it is set to /dev/hde5.. and my fstab and other entries are all manually changed to hde5...

jeffk

It's not broken, it's just proven to provide incorrect results in a certain set of circumstances.

We have not yet proved that ordering it the other way will not be equally "broken" for other people.

This is not a difficult concept to understand.

koftinoff (jeffk) wrote :

What _would_ prove it to you?

I will do this if you wish.

The current state _is_ broken - /dev/hda needs to be my first internal ATA controller, not my external one. Plugging in external drives/laptops into base stations should not break booting like it currently does.

And if you are concerned that fixing it would break existing installs that rely on the incorrect ordering, then add an option to it.

This is not a difficult concept to understand.

If you concentrate harder, you can understand that the system does not even install properly if it iterates wrong because ubuntu dapper install and boot are internally inconsistent. Therefore broken.

--jeffk++

Rewriting udevplug to enumerate devices differently, and doing wide-spread testing of that; which is what I'm currently in the process of doing for edgy.

koftinoff (jeffk) wrote :

Well, what I don't understand is:

* I boot the ubuntu i386 installer on my machine
* Partitioner says that my 200 meg disk drive is /dev/hda
* I install to /dev/hda
* the system fails to boot
* I boot from a live cd and mount /dev/hda
* I change /boot/grub/menu.lst and /etc/fstab to say "hde" instead of "hda"
* the system boots

So, the problem really is in the fact that the install disks enumerate differently compared to the normal system.

As far as disk drives go, this is horribly broken. Changing the disk drive enumeration (and only the disk drive enumeration) will only have the effect of fixing the broken systems. Any systems that are working will not break. (heh, except mine since I had to manually patch hde into it)

Either the install system needs to change or something else does.

jeff

Interestingly, upstream udev produces the same apparent bug that we have ... so any other distro is broken too.

therunnyman (therunnyman) wrote :

I didn't mean to start a war, Scott James Remnant. My words came from the fact we're working with Linux here, something we've traditionally used on servers and RAID arrays. I don't like the thought of a world stuffed with Windows servers, you know?

I wonder, could you help me with something? I've been working toward writing a udev file that would enumerate at least disks properly, if not most devices - this post-install. The language of udev is a little beyond me, though. Could you help me write this file, or point me in the right direction? I believe if we could do at least this, a lot of people would appreciate it.

runny

I'm not sure what you mean; udev is written in C, which is a very common programming language (it's what most of Linux is written in).

therunnyman (therunnyman) wrote :

Yes, udev is written in C, and I'll assume you knew I know that. There's no need for any further hostilities; I'd prefer it if there were no further hostilities between us. We're working toward a larger end, I hope.

The culprits, among udev, are two files in the /etc/udev/rules.d/ directory: 00-init.rules and 65-persistent-disks.rules. What I'm striving to do is rewrite the latter such that, at the very least, disks are enumerated properly. The latter is the one folks are complaining about. If the latter can be rewritten, we'll have taken a large step toward fixing a purty nasty bug.

runny

No, those have nothing to do with it.

The culprit is the scan_block_tree() and recurse_tree() functions in udevplug.c

A half-fixed version is in the edgy source archive at the moment. It uses the symlinks in /sys/bus/pci/devices, but does not yet sort them, so the order is effectively random.
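
For illustration, sorting the entries of /sys/bus/pci/devices into bus order could look something like the Python sketch below. This is not the udevplug fix itself (udevplug is written in C); `pci_sort_key` and `sorted_pci_devices` are hypothetical names, and the sketch assumes every entry has the usual domain:bus:slot.function form.

```python
import re

# A PCI address as it appears in sysfs: domain:bus:slot.function, in hex.
_PCI_ADDR = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def pci_sort_key(address):
    """Turn a PCI address string into a numeric tuple suitable for sorting."""
    match = _PCI_ADDR.match(address)
    if match is None:
        raise ValueError(f"not a PCI address: {address!r}")
    domain, bus, slot, function = (int(part, 16) for part in match.groups())
    return (domain, bus, slot, function)


def sorted_pci_devices(names):
    """Return directory entries (in whatever order they were read) in bus order."""
    return sorted(names, key=pci_sort_key)


# Entries in an arbitrary order, as an unsorted directory listing might give.
unsorted_names = ["0000:09:01.0", "0000:00:1f.1", "0000:02:03.0", "0000:00:1e.0"]
print(sorted_pci_devices(unsorted_names))
```

On a real system the names would come from something like `os.listdir("/sys/bus/pci/devices")`, whose order is not guaranteed.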

If I'm being hostile, it's only because of the attitude of people such as yourself on this bug. For example this bug has been reported exactly twice, not "1,000,000 times", and it took four months before a user was willing to perform the necessary tests to help us locate the bug. It certainly has not been confirmed "10,000,000 times" (your words, again) because I doubt we have that many users!

As you can see from the log, I've been perfectly fluffy until the point people started throwing their toys out of their pram and threatening things if the bug was not fixed now, Now, NOW! WAAAH BIKKIT!

I've reiterated many times that until we have performed mass-scale testing of the fix for this bug, there is no way to know whether it will break other people or not -- the chances are that it will. We were only a few weeks from release when this bug was finally identified as a problem in udevplug, and the fix identified -- that is not a time you fundamentally change the way hardware enumeration works.

I've also reiterated that the fix has been around for some time now, and that it would go into edgy when it opened (it's already partially there, awaiting builds) -- and that if we have no adverse effects in edgy, it could be backported to either dapper-updates or at least dapper-proposed or dapper-backports.

Gisli Ottarsson (gisli) wrote :

At the risk of fanning the flames, can I ask which, specifically, are the two reports of this problem? I've been under the impression that my report in Bug #40993 is pertinent to the issue discussed above. Can this be confirmed?

I would like to restate my willingness to experiment for a fix -- to the extent that I am qualified.

The bug has not been a showstopper for me, but until it is fixed my four disk RAID is flogging a single IDE controller.

Thanks for your efforts. Everybody please stay cool.

  Gisli

You misunderstand the group of people that need to test the fix.

We don't need the people who are affected by this bug to test it, we need everyone who isn't affected by it to test it to make sure nothing else breaks.

I recently applied a large load of security updates (mostly Gnome and KDE stuff) that included a kernel update to 2.6.15-25.43.
After the reboot to load the new kernel, my IDE drive has moved back to /dev/hda even while docked. Has anybody else seen this? Does anybody know if this was intentional? I'm guessing not, based on some of the comments made to this bug.
There is a changelog entry that might point to the source of the fix (intentional for this bug or not); it reads:
  * Sync ide changes for better flash driver handling (flash isn't removable).

Not sure how something like that got rolled into a SECURITY fix, but there it is.

Mike

Björn Janßen (b-janssen) wrote :

I have only minor stuff to add.

First, it also affects IBM Thinkpads T2x with UltraDock II stations (Type 2631). I have a half-baked workaround: set PCI IRQ handling in the BIOS to AUTO (that's not the default for T2xs) and pass the acpi=force and irqpoll arguments to the kernel (irqpoll is optional; I found it working with and without, but without irqpoll the sound will fail occasionally). Now a 2.6.15 kernel will always assign the primary HD to hde. Ugly, but reliable. Oh, and you lose USB connectivity this way.

Second, the bug propagated to Debian Etch and probably SID.

Björn Janßen (b-janssen) wrote :

Two things I forgot:

The T2xs I have had available for testing were T20s, T21s and T23s, all with at least 256 MB RAM and varying HDs, but all with a DVD drive in the laptop's Ultrabay. I also had two Type 2631 docking stations available. All showed the same behaviour when running a 2.6.15 kernel. Stock Debian and Ubuntu 2.6.12 kernels were running fine. A 2.6.16 kernel from ZenWalk GNU/Linux did not show the same problems but panicked on booting in the dock.

The other, more important thing is that enabling/disabling the PCI Power Management feature in the BIOS led to erratic behaviour even with the above-mentioned workaround (really unpredictable; the previous issue was predictably switching between hda and hde in alternating turns).

Björn Janßen (b-janssen) wrote :

OK, one more and hopefully last comment: forget the above-mentioned workaround. If you remove the laptop from the dock, the kernel finds hda again.

So, I'm trying the UUID solution, but I encountered two roadblocks:
1) how to make GRUB or LILO use UUIDs? (rewriting the initrd.img should work, no?)
2) how to assign UUIDs to removable media drives like CD-ROM drives?

Please point me to any area that is more suited for these questions.

Regards

Steffen Torp (steffen-ubuntu) wrote :

This is a really serious bug. I just inserted a memory stick into my Ubuntu Thinkpad, and it wouldn't mount properly (it was detected as hde1 and not automatically mounted as a removable device as before). I then attempted a restart with the stick inserted, and oops, the computer won't boot, complaining that there is no operating system on the disk. It seems that after the Dapper upgrade there has been some serious messing around with what constitutes the default HD.

Needless to say, this procedure has worked flawlessly from Warty through Hoary and Breezy, on the exact same hardware.

Ted_Smith (tedsmith28) wrote :

I have written a HOW TO for people suffering from this problem while the bug is being amended. It's titled "Fix 'ALERT! : dev/XYZ does not exist' after upgrade to Dapper Drake". I hope it may also help the developers with fixing it.

http://www.ubuntuforums.org/showthread.php?t=197956

Thanks

Ted

The devs already know exactly how to fix it - the title of the bug
report is the answer. In fact they knew how to fix it before Dapper
was released. They simply didn't think it was worth fixing in Dapper
- which was the worst mistake the Ubuntu devs ever made. It's not a
mystery that needs to be solved - it's a fix that they knew but
didn't want to put in at the last moment because they were afraid the
fix might break other things. (It turns out that the fix doesn't
break anything else, but that was their decision.)

The least bad workaround is to use UUIDs for the boot identifier in
grub, as explained on that bug report. But that's not really good
enough for all situations. The only real fix is not to install Dapper
and instead wait for Edgy, which reportedly had the fix built-in from
day 1. I'm not bothering with Dapper because, to me, it's definitely
not worth the hassle.

Best,

Edward

On 4 Jul 2006, at 10:34 AM, GIZMO wrote:

> I have written a HOW TO for people suffering from this problem
> while the
> bug is ammended. It's titled "Fix 'ALERT! : dev/XYZ does not exist'
> after upgrade to Dapper Drake". I hope it may be able to help the
> developers with fixing it, perhaps?
>
> http://www.ubuntuforums.org/showthread.php?t=197956
>
> Thanks
>
> Ted

On Tue, Jul 04, 2006 at 03:17:43PM -0000, Edward Mendelson wrote:
> The devs already know exactly how to fix it - the title of the bug
> report is the answer. In fact they knew how to fix it before Dapper
> was released. They simply didn't think it was worth fixing in Dapper

That is not true; the bug report already explains the reasons why this
change wasn't made for Dapper.

Please take further discussion to ubuntu-devel; this is not the place for
it.

--
 - mdz

Apologies. My memory misled me, and you are absolutely correct.

Paul Sladen (sladen) wrote :

For those still wondering: the workaround for this issue involves specifying the partition to boot from by ID, rather than by its transient location:

  1. Boot a desktop/LiveCD.
  2. Run sudo /sbin/dumpe2fs /dev/sdX | grep UUID
  3. Tell 'grub' or 'lilo' to boot with that ID:

     linux ... root=UUID=d4c51a2a-d93f-4dc1-8717-9c3cdb4d41ce

and also adjust this in:

  /boot/grub/menu.lst
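
The menu.lst adjustment can be sketched as follows. This is a minimal illustration in Python, assuming GRUB legacy "kernel" lines that carry a root=/dev/... argument; the function name use_uuid_root is hypothetical, and the UUID must be the one reported for your own root partition in step 2.

```python
import re

# Matches the root=... argument on a GRUB legacy "kernel" line.
ROOT_ARG = re.compile(r"\broot=\S+")

def use_uuid_root(menu_lst_text, uuid):
    """Hypothetical helper: return menu.lst text with root= on active
    kernel lines replaced by root=UUID=<uuid>."""
    out = []
    for line in menu_lst_text.splitlines(keepends=True):
        # Only active kernel lines are touched; comment lines and the
        # GRUB "root (hdX,Y)" directive are left unchanged.
        if line.lstrip().startswith("kernel"):
            line = ROOT_ARG.sub(lambda m: "root=UUID=" + uuid, line)
        out.append(line)
    return "".join(out)
```

The sketch deliberately leaves commented lines alone. If your menu.lst is regenerated automatically from a commented defaults line, that line probably needs the same change made by hand, otherwise the edit may be lost on the next kernel update.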

ubuntu_demon (ubuntu-demon) wrote :

Paul thank you very much for this workaround!

There are a lot of people on the forums having problems with this bug.

I've asked them to test this workaround by you (Paul Sladen) and report (meaningful) results either here or in this thread :
http://ubuntuforums.org/showthread.php?p=1257154

jwmislan (jwmislan) wrote :

to boot from by ID, rather than the transient location:

  1. Boot a desktop/LiveCD.
  2. Run sudo /sbin/dumpe2fs /dev/sdX | grep UUID
  3. tell 'grub' or 'lilo' to boot with that ID:

How can I obtain the UUID of /dev/sdX for reiserfs?

Thanks
JWM
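
One filesystem-independent way to look this up is sketched below in Python. It assumes a system whose udev populates /dev/disk/by-uuid with one symlink per filesystem UUID, which may not hold on every installation; the function name uuid_for_device is hypothetical.

```python
import os

def uuid_for_device(device, by_uuid_dir="/dev/disk/by-uuid"):
    """Hypothetical helper: return the UUID whose symlink in by_uuid_dir
    resolves to the given device node, or None if there is no match."""
    target = os.path.realpath(device)
    for name in sorted(os.listdir(by_uuid_dir)):
        link = os.path.join(by_uuid_dir, name)
        # Each entry is expected to be a symlink such as ../../sda1.
        if os.path.realpath(link) == target:
            return name
    return None
```

For example, uuid_for_device("/dev/sda1") would return the UUID string to put after root=UUID=, regardless of whether the partition holds ext3 or reiserfs.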

A friend of mine has such a config that totally prevented him from booting Dapper from the HDD. He is therefore still in Breezy.
I just asked him to test the Edgy Knot1 livecd and report to me whether the partitioning part of the installation process was possible, or at least whether he can find his HDD in /dev.

ANSWER : NO !!! Seems as broken as Dapper on that point !
-> Gparted would hang forever without finding any HDD
-> no /dev/sd* and no /dev/hd* at all

Is there any other test to do ?

On Sat, Jul 22, 2006 at 06:12:18PM -0000, Paul RIVIER wrote:
> A friend of mine has such a config that totally prevented him from booting Dapper from the HDD. He is therefore still in Breezy.
> I just asked him to test the Edgy Knot1 livecd and report me if the partitioning part of the installation process was possible, or at least if he can find his HDD in /dev.
>
> ANSWER : NO !!! Seems as broken as Dapper on that point !
> -> Gparted would hang forever without finding any HDD
> -> no /dev/sd* and no /dev/hd* at all
>
> Is there any other test to do ?

No. It will be documented in the release notes when this change is included
in a Knot CD.

--
 - mdz

Matt Zimmerman (mdz) wrote :

Scott, now that the uuid mounting changes are in, can we go ahead and throw this switch?

This has been in edgy since the new udev was uploaded ages ago

Marking as fixed (for edgy)

Changed in udev:
status: Confirmed → Fix Released

Marking as open in dapper -- in case we decide to backport the fix to -updates

My current opinion is "NO", this broke several things in edgy (soundcard ordering, SATA vs. IDE ordering, network card ordering, etc.)

Fortunately we had other things in place in edgy to fix those before they broke

Changed in udev:
assignee: nobody → keybuk
status: Unconfirmed → Confirmed

ubuntu_demon (ubuntu-demon) wrote :

Just curious: what are the current chances that this will get fixed for the next point release (6.06.2)?

On Fri, Aug 11, 2006 at 07:59:07PM -0000, ubuntu_demon wrote:
> just curious : How are the current chances that this will get fixed for
> the next Point Release (6.06.2) ?

As explained earlier in this bug, this is an incompatible change which could
cause working systems to fail to boot. As such, it is not appropriate for
backporting to a stable release.

There is new infrastructure in Edgy which makes it robust against this type
of change, allowing the root filesystem to be found regardless of the device
name, so this problem will be solved for the future.

--
 - mdz

ubuntu_demon (ubuntu-demon) wrote :

>As explained earlier in this bug, this is an incompatible change which could
>cause working systems to fail to boot. As such, it is not appropriate for
>backporting to a stable release.
>There is new infrastructure in Edgy which makes it robust against this type
>of change, allowing the root filesystem to be found regardless of the device
>name, so this problem will be solved for the future.
>
>--
> - mdz

Thank you for the quick answer.

Yeah I read it was solved for Edgy. Great work guys!

I will report to the forum users that there is no chance that this will ever get fixed for Dapper. I will wait a couple of days before announcing to be sure.

Officially marking as Rejected for dapper.

Here is the reasoning:

 - We applied the "fix", as planned, to the udev in Edgy Eft.

 - While the fix corrected the problem described here, it caused new problems. Because it was a fundamental change to device ordering, it changed the order of other devices where more than one existed in the system -- including other disks

 - For Edgy this was not a problem, as the upgrade makes other changes that make device order unimportant anyway

 - These changes are too invasive for backporting to dapper

 - This bug has a known workaround, which can be applied by the minority of users affected by it

 - That workaround is preferable to causing far more working systems to cease functioning.

Changed in udev:
status: Confirmed → Rejected

ubuntu_demon (ubuntu-demon) wrote :

Scott thanks for this information. I will post this in the forum threads about this bug.

(For your information : I added this also to my blog http://ubuntudemon.wordpress.com/2006/08/17/mounting-root-filesystems-bug-wont-be-fixed-for-dapper-fixed-in-edgy)

ubuntu_demon (ubuntu-demon) wrote :

Scott : I will try to collect meaningful forum feedback on your last suggested workaround here :
http://ubuntuforums.org/showthread.php?p=1257154

Changed in grub-installer (Ubuntu):
status: Invalid → New
Changed in ubuntu:
status: Invalid → New
Changed in udev (Ubuntu Dapper):
status: Invalid → New

Colin Watson (cjwatson) wrote :

Danilo, please don't reopen bugs that have been closed for over a decade without so much as an explanation of why. (In nearly all such cases, opening a new bug would be far more sensible.)

Changed in ubuntu:
status: New → Invalid
Changed in grub-installer (Ubuntu):
status: New → Invalid
Changed in udev (Ubuntu Dapper):
status: New → Invalid