Lucid reads file size wrong

Bug #538165 reported by Erick Brunzell
This bug affects 6 people
Affects            Status   Importance  Assigned to  Milestone
Ubuntu CD Images   New      Undecided   Unassigned
nautilus (Ubuntu)  Invalid  Undecided   Unassigned

Bug Description

Binary package hint: nautilus

I can't be certain that Nautilus is the culprit here but I noticed in Lucid on 03/10/2010 that my zsynced "lucid-desktop-i386.iso" showed a size of 718MB although the daily build web page showed it as 685MB. I thought perhaps zsync messed up (although it never has before and the md5sum matched) so I downloaded a whole new one to my Downloads folder. Still showed oversize but I thought it was just a fluke.

Today I decided to zsync the new image and it showed up oversize again, the daily build web page said 676MB but it shows up in Nautilus as 708.3MB. So this time I booted into Jaunty (same machine multi-booted) and the same image was seen in Jaunty as being the correct size.

Now, I had one more potential "cause" to rule out. My Downloads and Documents folders are shared by multiple distros via symlink. So to rule out the symlinking being a part of the problem I first copied the image to the Desktop and it still showed up too large, so just to be sure I downloaded a whole new iso to the Desktop and still the same.

The actual image size is 675.5MB according to either Jaunty or Hardy, but in Lucid it shows up as 708.3MB.

One reason I doubt Nautilus being the sole culprit is that the image shows up too large in Brasero also, but I'm stumped.

A bit of info:

lance@lance-desktop:~$ lsb_release -rd
Description: Ubuntu lucid (development branch)
Release: 10.04
lance@lance-desktop:~$ apt-cache policy nautilus
nautilus:
  Installed: 1:2.29.92.1-0ubuntu1
  Candidate: 1:2.29.92.1-0ubuntu1
  Version table:
 *** 1:2.29.92.1-0ubuntu1 0
        500 http://archive.ubuntu.com lucid/main Packages
        100 /var/lib/dpkg/status

Sorry, I couldn't use Apport due to this:

https://bugs.launchpad.net/ubuntu/+source/apport/+bug/538097

I'll be glad to provide any additional info requested.

Tags: units-policy
Erick Brunzell (lbsolost) wrote :

I just double checked this running the 03/12/2010 Live CD and it also reads the isos as being oversize so it's not just a fluke with my installed Lucid.

Changed in nautilus (Ubuntu):
status: New → Confirmed
Chris Coulson (chrisccoulson) wrote :

Thank you for your bug report. However, this is due to recent changes to standardise the units that are used on the desktop (see https://wiki.ubuntu.com/UnitsPolicy for details).

Your issue is that the web page is using the SI prefix for presenting a base 2 number, which is incorrect (718 * (10^6/2^20) = 685)
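
The arithmetic above can be checked with a minimal sketch. The helper name `mb_to_mib` is illustrative only and is not part of Nautilus or any Ubuntu tool; the two input values are the sizes Nautilus displayed in the bug description.

```python
# Illustrative helper (hypothetical name, not from Nautilus).
MB = 10**6    # SI megabyte: 1,000,000 bytes
MIB = 2**20   # IEC mebibyte: 1,048,576 bytes


def mb_to_mib(size_mb: float) -> float:
    """Convert a size given in decimal megabytes to binary mebibytes."""
    return size_mb * MB / MIB


# Sizes shown by Nautilus in Lucid (decimal MB), taken from the report.
for shown_mb in (718, 708.3):
    print(f"{shown_mb} MB = {mb_to_mib(shown_mb):.1f} MiB")
```

Rounded, the results correspond to the 685 and 675.5 figures that the daily build page and the older releases reported.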

Changed in nautilus (Ubuntu):
status: Confirmed → Invalid
Erick Brunzell (lbsolost) wrote :

Well I'd just say that for Nautilus to now see that image as 708+MB makes one think that it will not fit on a 700MB CD. Also before reading this I installed Thunar and I see it still reads the image as 675.5MB.

So I'm a bit puzzled?

William Shand (williamshand14) wrote :

I can verify this as well. I have various files that should be 30MB smaller than Nautilus shows.

Stephan Muhs (stephan-dinoco) wrote :

It is flabbergasting that the marketing spin (started in late 80s, I believe) to make hard drives look bigger by redefining the kilobyte as 1000 bytes has now hit Ubuntu as well. Every filesystem I have ever worked with did stick to the old definition of kb (1024 bytes) and MB (1024*1024 MB) and all applications and all users have come to rely on it.

Introducing change for change's sake is helping nobody - and just marking this as "Invalid" is downright arrogant in my book. If you need to introduce insanity in Lucid's filesystem, at least make it configurable for those of us who do not want to lose consistency with the rest of the world.

Stephan Muhs (stephan-dinoco) wrote :

Sorry for the typo: instead of "MB (1024*1024 MB)" it should of course read "MB (1024*1024 bytes)". Where is the edit function?

Endolith (endolith) wrote :

It's not marketing spin. It's the international standard system of units that has been used for hard drives, networking speeds, processor speeds, DVDs, Blu-Ray, and many other things since the beginning of time. kilo- = 1,000, mega- = 1,000,000, ...

Also, it's more user-friendly and just makes a lot more sense than the Windows way of doing things, where a drive that holds 300,000,000,000 bytes is displayed as "279.4 GB" in one place and "286,102 MB" in another. It's nonsensical.
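
As a rough check of the two figures quoted above, the following sketch divides the same byte count by powers of two, which is presumably how the binary-based displays arrive at them. The variable names are illustrative assumptions.

```python
DRIVE_BYTES = 300_000_000_000  # the drive size used in the comment above

# Binary interpretation: divide by powers of 1024 but label the result "GB"/"MB".
as_gib = DRIVE_BYTES / 2**30
as_mib = DRIVE_BYTES / 2**20

# Decimal (SI) interpretation: divide by powers of 1000.
as_gb = DRIVE_BYTES / 10**9

print(f"binary:  {as_gib:.1f} 'GB' or {as_mib:,.0f} 'MB'")
print(f"decimal: {as_gb:.0f} GB")
```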

This is a good change.

tags: added: units-policy
Erick Brunzell (lbsolost) wrote :

I just want to clarify a couple of things here for those who may not follow the discussion on the forums:

http://ubuntuforums.org/showthread.php?t=1428022

I'm just an end user with no actual computer training and I respect your opinions as developers with true technical knowledge in this field. I hope you can respect my opinion as well. As I'm the only subscriber to this thread ATM I assure you I have no intention of debating this ad nauseam. Now that I understand what's up I'll adjust to it, but I just want to point out how this affects the common end user such as myself.

Point #1: CDs are marketed and labeled as 700MB. I've never bought discs that would make me aware that, "Your CD is actually 734 MB = 700 MiB." BTW that quote is taken from Endolith's post here:

http://ubuntuforums.org/showpost.php?p=8957863&postcount=9

Point #2: Currently the devs that put up the Ubuntu, Xubuntu, and Kubuntu daily builds show the size incorrectly according to Chris Coulson: "Your issue is that the web page is using the SI prefix for presenting a base 2 number, which is incorrect (718 * (10^6/2^20) = 685)". That comes from post #2 of this report. So they must all change to using your new "correct" method?

Point #3: If the daily builds, the iso testing builds, and the release announcements themselves all adopt this standard, as described above, won't most end users, such as myself, think they would then need to use a DVD or Flash Drive? How would that work? I'd really like an answer to that one!

Point #4: At this time I'm multi-booting Win XP, Hardy, Jaunty, Karmic, Lucid, Mint Gloria, Fedora 12, and Debian Squeeze. I even tried the current Lucid Thunar and it still follows the old behavior. Only Lucid's Nautilus employs this new policy, is this really a good move given points #1, #2, and #3 above?

In conclusion, as an "uneducated" end user, I must say I don't see how this can work without getting the disc manufacturers to clarify the difference between MB and MiB, do you really think you can bring that about?

I guess we'll all need to get re-educated to keep using Ubuntu.

Chris Coulson (chrisccoulson) wrote :

Indeed, it seems that cdimages.ubuntu.com currently violates the units policy. I've opened another task for ubuntu-cdimage, but I'm not entirely sure that this project also deals with the hosting of the images, so it might not be the correct project

Chris Coulson (chrisccoulson) wrote :

Also, remember that hard disks are sold in base-10 SI units, so it doesn't make sense for Nautilus to use base-2 units (which it was doing in previous releases). Users were confused with base-2 units, with comments such as "Nautilus only shows 298GB available on my nice new shiny 320GB hard-disk. What's using the rest of the space?" being quite common.

Erick Brunzell (lbsolost) wrote :

Who is going to address the relabeling of marketed CDs so people will understand that, "Your CD is actually 734 MB = 700 MiB"?

Colin Watson (cjwatson) wrote : Re: [Bug 538165] Re: Lucid reads file size wrong

Apache's mod_autoindex deals with the directory listing, and I don't
think we can control its units display.

Erick is right to point out that this is a flaw in the units policy,
though. When looking at the units policy in the Technical Board, we
took account of the units in which various things are traditionally
labelled; that's why disks are specified as decimal (since, like it or
not - and I don't - that's how the things are labelled) and RAM as
binary. There are always going to be different uses for file sizes, but
their units seemed to follow most naturally from disks.

In this case, though, CDs have always, unlike hard disks, been labelled
in binary megabytes, and my opinion is that it would be against the
spirit of the new units policy (even if not its letter) to quote CD
image sizes in decimal megabytes. This is my personal opinion, and I'm
not speaking for the rest of the Technical Board here, but I believe I
could make a good case for an exception. We should not blindly rush to
apply a (very!) new policy to an edge case that wasn't considered when
the policy was written.

Perhaps the best approach is to note the discrepancy in a footnote on
the web pages. Yes, it's a shame that we have to talk about this
gobbledegook at all, but with CDs and hard disks being labelled in
different units, the only thing that we can possibly do is move the
confusion around. As Chris observes, there was confusion before the new
units policy, just in a different place.

Erick Brunzell (lbsolost) wrote :

Remember also that we're talking about an LTS release!

IMHO no huge change should occur without thought of its long-term implications.

Endolith (endolith) wrote :

"In this case, though, CDs have always, unlike hard disks, been labelled in binary megabytes, and my opinion is that it would be against the spirit of the new units policy (even if not its letter) to quote CD image sizes in decimal megabytes."

Yes. DVDs are measured in decimal, though, and that was causing even more confusion for people who couldn't fit 4.4 "GB" on a 4.7 GB disk.

No matter what is done, the software needs a well-designed interface that shows the user what they need to see in a given context. I think Brasero should show both measurements side-by-side, for instance, or show the disk size and image size in the same units near each other. I don't care if you write "700 MiB" or "734 MB", but please don't continue using the erroneous "700 MB" with no qualifier.
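
A minimal sketch of the side-by-side display suggested above, together with a "does it fit" check. This is not Brasero's code; the function names are hypothetical, the capacities are the nominal figures quoted in this thread (700 MiB for a CD, 4.7 GB for a DVD), and the image byte count is only approximated from the 708.3 MB that Nautilus displayed.

```python
# Nominal media capacities as quoted in this thread (assumed, not measured).
CD_CAPACITY = 700 * 2**20       # "700 MB" CD: 700 MiB = 734,003,200 bytes
DVD_CAPACITY = 4_700_000_000    # "4.7 GB" DVD: 4.7 * 10^9 bytes


def both_units(num_bytes: int) -> str:
    """Show a size in decimal MB and binary MiB next to each other."""
    return f"{num_bytes / 10**6:.1f} MB ({num_bytes / 2**20:.1f} MiB)"


def fits(image_bytes: int, capacity_bytes: int) -> bool:
    """Compare raw byte counts, so the unit convention cannot mislead."""
    return image_bytes <= capacity_bytes


image_bytes = 708_300_000  # approximation of the ISO from the bug description

print("CD capacity:", both_units(CD_CAPACITY))
print("ISO image:  ", both_units(image_bytes))
print("ISO fits on CD:", fits(image_bytes, CD_CAPACITY))
print("4.4 GiB fits on DVD:", fits(int(4.4 * 2**30), DVD_CAPACITY))
```

Comparing byte counts directly sidesteps the question of which prefix to print; the labels then only matter for display.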

And yes, only half-fixing the problem and leaving a bunch of apps conflicting with each other in an LTS release is not good. (But not really the end of the world, either, since the status quo is already inconsistent and confusing.)

Endolith (endolith) wrote :

"since, like it or not - and I don't - that's how the things are labelled"

And for the record, to present another point of view, I *like* the way hard drives are labeled. Decimal is what people are used to working with in everyday life. We don't use 1024 for anything in real life.

Writing disk and file sizes in base 2 multiples is needlessly confusing and serves no purpose that I know of. It doesn't simplify anything or fit any natural sizing of the numbers. There's nothing about hard drives or files that lends them to binary measurements. The only thing that naturally comes in powers of two is memory, and we have "KiB" and "MiB" for that. It has never made any sense to me why people think it preferable to label a drive that holds close to 100,000,000,000 bytes as "93 GB". It's not logical or user-friendly, and the only rationale I've heard for continuing to do it that way is that Windows does it that way.

Erick Brunzell (lbsolost) wrote :

"I think Brasero should show both measurements side-by-side"

Can that be done within the constraints of Beta 1?

Beta 1 is due in just a few days.

I'd add that I've been iso testing since Jaunty and bugs encountered in iso testing can slow the whole process even if they're not truly bugs, as is the case here.

A decision must be made very soon!

Colin Watson (cjwatson) wrote :

Endolith: If you're going to argue that memory is naturally in powers of
two, then so are hard drives since they have 512-byte sectors. In fact,
the industry is switching over to 4KiB physical sectors, and I hear that
some modern disks even like partitions to be aligned on MiB boundaries.
I'm sure that will cause some confusion for those counting in decimal
and looking closely at their partitions, but there's not much to be done
about it.
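
The alignment point can be illustrated with a small sketch. The helper name and the two example start sectors (63 and 2048) are assumptions chosen for illustration; they do not come from this report.

```python
SECTOR = 512          # logical sector size in bytes
KIB_4 = 4 * 2**10     # 4 KiB physical sector
MIB = 2**20           # 1 MiB boundary


def is_aligned(offset_bytes: int, boundary_bytes: int) -> bool:
    """True when a byte offset falls exactly on the given boundary."""
    return offset_bytes % boundary_bytes == 0


# Hypothetical partition start positions, given in 512-byte sectors.
for start_sector in (63, 2048):
    offset = start_sector * SECTOR
    print(
        f"sector {start_sector}: offset {offset} bytes, "
        f"4 KiB aligned: {is_aligned(offset, KIB_4)}, "
        f"MiB aligned: {is_aligned(offset, MIB)}"
    )
```

Offsets that are round in binary units are rarely round in decimal ones, which is the confusion anticipated above.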

Erick: I don't see a tearing urgency to change the cdimage site for beta
1. If there's a bit of confusion that's different from the confusion
that was there before, so be it.

Benjamin Drung (bdrung) wrote :

Erick, you are not the only subscriber. There are a couple of people following this bug. For example all members (117 at the moment) of Ubuntu Desktop Bugs get notifications.

Point #1: Yes, this is a real problem. Hard disk and DVD sizes do not confuse users any more, but CDs will now.

Point #2: Yes, that's the preferred solution. Base-2 -> IEC prefixes and base-10 -> SI prefixes. No SI prefixes for base-2.

Point #3: When an ISO is labeled as a CD image, I would assume that it can be burned on a CD. Brasero will show xy MB remaining when I try to burn it, so I wouldn't be worried. IMO at least a notice should be added, or the sizes should be displayed in both versions (base-10 and base-2) on the websites.

Point #4: We are at the beginning in converting all applications to follow the units policy. Someone has to start following the given standards (SI, IEC). Thunar needs to get fixed before the release. I hope that Ubuntu will have impact on the other distributions (primarily Debian) and that they will use this policy, too. If you prefer base-2 units, you can use the newly created base-2 PPA [1].

I doubt that we can educate the disc manufacturers.

[1] https://launchpad.net/~bdrung/+archive/base-2

Endolith (endolith) wrote :

The total capacity of an optical or magnetic disk isn't based on the size of its sectors. It's based on fitting the maximum number of bits in the surface area of a given circle. They aren't going to throw away storage space just to make it an even multiple of the nearest power of 2.

    sudo fdisk -l

    Disk /dev/sda: 160.0 GB, 160041885696 bytes

How is it beneficial to display this disk to the user as "149 GB", or "152,627 MB"? What advantage is there to this convention? How does this make things more intuitive or convenient for the user?
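
For comparison, here is a minimal sketch of a size formatter that can produce either convention from the byte count in the fdisk output above. The function `format_size` and its `binary` flag are hypothetical and do not reflect how Nautilus or fdisk format sizes internally.

```python
def format_size(num_bytes: int, binary: bool = False) -> str:
    """Format a byte count with SI (base-1000) or IEC (base-1024) prefixes."""
    step = 1024 if binary else 1000
    units = (
        ["B", "KiB", "MiB", "GiB", "TiB"]
        if binary
        else ["B", "kB", "MB", "GB", "TB"]
    )
    value = float(num_bytes)
    for unit in units[:-1]:
        if value < step:
            return f"{value:.1f} {unit}"
        value /= step
    # Anything larger than the last step stays in the largest unit.
    return f"{value:.1f} {units[-1]}"


disk_bytes = 160_041_885_696  # from the fdisk output above

print(format_size(disk_bytes))               # SI prefixes
print(format_size(disk_bytes, binary=True))  # IEC prefixes
```

Labeling the binary result "GiB" rather than "GB" is what the IEC prefixes are for; the ambiguity arises only when the binary number carries the SI label.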

"160.0 GB" makes a lot more sense to me. Likewise "500 GB" for a 500107862016 bytes disk, and "8.0 GB" for 8040480256 bytes.

Asif Youssuff (yoasif) wrote :

I'm not sure I really understand why the units policy is so strange -- file sizes have always been reported on pretty much any OS I can think of as base-2 even when using SI units (wrongly).

To now change to base-10 just to use the SI units correctly seems wrong -- why not simply update the MB to MiB, etc.? That way, everything, from RAM sizes to file sizes, will use the same familiar general sizes, and the units are updated to be correct and, more importantly, educational.

So what if hard drive manufacturers use base-10 units? No OS ever has, and to change it to avoid confusion is solving the wrong problem, imo. We should fix the units and presentation, not create even more inconsistency -- I love that command line tools get an exemption in the units policy, and now CD media may too.

If we are aiming for less confusion, creating even more inconsistency makes no sense. Why not do something more simple and more logical -- pick the binary measurements that have been in use for decades (hence the cli exemption), and update the SI units to IEC units /everywhere/.

Just my two cents, even though I doubt we will see any change on the units policy, even though it has clearly not been thought through to be logical.

Also, I am disappointed to see that Mac OS X has gone to SI units for reporting hard drive sizes as well, although I suspect from their KB article that they still report file sizes using base-2 and using SI units. More inconsistency, yet the units policy actually references this wrongheaded approach.

Doug McMahon (mc3man) wrote :

As it stands the web pages would only present possible confusion to those who are already using karmic; a footnote would only need to address that (your download will be reported as larger in MB if using karmic already).

For all others (except possibly Mac OS) it would appear as 'normal' (less than 700MB).

Erick Brunzell (lbsolost) wrote :

I just wanted to thank Benjamin Drung for the PPA although I've not used it yet. Also just BTW my comment about being the only "subscriber" was accurate ATM and was actually meant to address what appeared to be Endolith's response to Stephan Muhs even though he hadn't bothered to subscribe.

After reading all of the responses here and at the forums I'd have to conclude that a policy change was needed. While the new policy may create some confusion the old "policy" (maybe should say "practice") had clearly been flawed for ages. For DVD users and flash drive users the old "policy" was undoubtedly flawed.

Now the challenge is to minimize the confusion as much as possible for the reasons I've pointed out previously. That's really beyond my capabilities, but I do recommend keeping the explanation as brief as possible. I would however recommend at least a brief mention in the Beta 1 release notes.

Kudos to all for taking on this challenge and my sincere apologies if I ruffled any feathers.

Neal McBurnett (nealmcb) wrote :

Marked as a duplicate of bug 538783 though they are a bit different.
See also bug 369525.

I think the new Units Policy is an important step forward to deal with the inherent ambiguities and contradictions of using the same unit prefix for different units. But it was approved too late to deal with the tricky corner cases and work with upstreams for Lucid.

Endolith (endolith) wrote :

Asif Youssuff, I disagree with you wholeheartedly. I think that changing SI units to IEC units everywhere would be wrong-headed, illogical, and would create more confusion, and I am happy that Apple took the initiative to finally work towards fixing this. Many technically-oriented users have very strong opinions against this, but I don't think they're objective or logical. It's just a matter of perspective:

* If you were taught in an 80s computer course that KB = 1024 B, you will think that this is the only true definition. You will think of Windows file sizes, memory sizes, and 5.25 inch floppy disk sizes as "correct", and everything else as wrong. The evil hard drive, flash drive, and DVD manufacturers are all in on a conspiracy to defraud you, Apple is in league with the devil for using marketroid measurements for file sizes, etc.

* If, on the other hand, you are a scientist or engineer or European, and were taught the metric system first, you will think that it is the only true definition. Hard drive, DVD, processor speed, networking rate, and other measurements are correct and logical, and the K = 1024 definition is a weird aberration created by early computer marketing types so that they could say 32 + 32 = 64 instead of 65 when talking with laymen. (See Donald Morrison 1968)

* If, on the other other hand, you're an average computer user without a technical background, you don't know or care about the discrepancy between the two units, but you'll occasionally run into problems like trying to fit 4.7 GiB of data on a 4.7 GB DVD.

oddasee (stamour-robert) wrote :

The file size should be BYTE COUNT period
                         1000 Bytes = 1000 Bytes
                         1 GB = 1e9 Bytes

A byte and a gigabyte differ by a gig. You are counting bytes, not powers of two.
