wireshark does not have large file (> 2GiB) support

Bug #190233 reported by James Troup
This bug affects 1 person
Affects             Status        Importance  Assigned to  Milestone
Wireshark           Fix Released  Medium
wireshark (Ubuntu)  Fix Released  Wishlist    Unassigned

Bug Description

Binary package hint: wireshark

wireshark apparently does not have LFS support; I just tried to open a 7GB file and got a 'value too large for data type' dialog

Revision history for this message
In , Gtamisier (gtamisier) wrote :

Build information:
Version 0.10.13 (C) 1998-2005 Gerald Combs <email address hidden>

Compiled with GTK+ 2.4.14, with GLib 2.4.7, with WinPcap (version unknown),
with libz 1.2.3, with libpcre 6.3, with Net-SNMP 5.2.1.2, with ADNS.

Running with WinPcap version 3.1 (packet.dll version 3, 1, 0, 27), based on libpcap version 0.9[.x] on Windows XP Service Pack 2, build 2600.

Ethereal is Open Source Software released under the GNU General Public License.

Check the man page and http://www.ethereal.com for more information.
--

I get the error message << An error occurred while reading from the
file "C:\Sample.trace": Arg list too long. >> when doing the following:

I open a very large trace file (2 GB) with a filter, then save the result to a
new trace file. Before the end, Ethereal stops writing the resulting trace to
the disk, with the error message.

Revision history for this message
In , Gtamisier (gtamisier) wrote :

Created attachment 154
Error message

Revision history for this message
In , ulfl (ulf-lamping) wrote :

This may be a misleading error message (unsure, but doesn't seem right).

Is this simply because your hard disk is full (or your quota is reached)?
Did you change the file type in the save dialog beforehand? (probably not, just asking)

Any other ideas what might have caused this?

I've tried to reproduce this problem, but didn't have any luck. I was only
getting the expected error messages "... there is no space left ...".

Revision history for this message
In , Guy Harris (guyharris) wrote :

Googling for "E2BIG", which is probably the errno value corresponding to "Arg list too long", on
msdn.microsoft.com, revealed only some exec and spawn calls, i.e. calls where E2BIG is appropriate (it's
only *supposed* to be returned for argument lists to a new executable image being too large).

Perhaps there's a code path where errno isn't getting set but we're detecting an error, and errno was
previously set to E2BIG by some other call (not necessarily a system call).

Revision history for this message
In , Gtamisier (gtamisier) wrote :

No, my hard drive is not full, there is a lot of free space. Quotas are not
enabled.

I have not changed the file type before in the save dialog.

The problem arises when the resulting trace file is very large. I always have
this problem when the trace file is larger than roughly 2.5 GB.

Revision history for this message
In , Guy Harris (guyharris) wrote :

Greater than almost 2 1/2 GB, or greater than 2 147 483 647 bytes? 2 1/2 GB is an unusual number, but 2
147 483 647 bytes is 2^31-1 bytes - I'd expect that if file-size problems showed up, they'd show up in files >
2^31-1 bytes or 2^32-1 bytes.

(If it's 2^31-1 bytes, somebody at Microsoft needs to be taught the difference, in UN*Xland, between E2BIG
and EFBIG, which would be the *appropriate* error for a file too large for a particular API, if you're using
UN*X errnos for errors in that API.)

Revision history for this message
In , Gtamisier (gtamisier) wrote :

You are very likely right. I said 2.5 GB because all my trace files were larger
than that. It is not necessarily the limit.

Revision history for this message
In , ulfl (ulf-lamping) wrote :

Can you handle files of that size in general, i.e. are you sure you're not using
a file system that can't handle files of that size? I don't remember exactly,
but FAT32 can't handle files larger than 2 GB (or was it 4 GB), and neither can
some old Samba versions.

It sounds like some value or function in the code can't handle files larger
than 2 GB, e.g. a plain int that can't hold offsets beyond 2 GB. This will be
hard to find in the code :-(

Just for the record, are you using libpcap files or a different file format?

Revision history for this message
In , Gtamisier (gtamisier) wrote :

The file size limit of FAT32 is 4 GB, not 2 GB. Nevertheless, I was using an
NTFS file system, which doesn't have such a limit.

Yes, I was using a libpcap file.

Revision history for this message
In , Guy Harris (guyharris) wrote :

Ethereal currently uses longs for file offsets; that doesn't work for files >2GB on ILP32 platforms such
as 32-bit UN*Xes and Win32 (or LLP64 platforms such as Win64).

I don't know whether any of the Microsoft file APIs require an application to explicitly declare its
willingness to handle files >2GB; I think the Large File Summit recommendations for UN*X might call for
APIs of that sort, so that apps that think file offsets are 32 bits (e.g., by using off_t on systems where
off_t is 32 bits) get errors rather than, for example, getting file offsets truncated to 32 bits by lseek()
calls.

On Windows, the current versions of the MSVC++ library have _fseeki64() and _ftelli64() calls that use
__int64 values as file offsets; if those are present in the versions of MSVC++ we use (we won't be using
any that impose redistribution restrictions on binaries built with them, as Ethereal, being GPLed, can't
have limits on redistributability imposed), then we could look at using those on Win32, and arrange to
have file offsets be __int64.

On 4.4BSD-derived UN*Xes ({Free,Net,Open,Dragonfly}BSD, Darwin/OS X), off_t is 64 bits, and fseeko()/
ftello() are available; on those systems, we could look at using fseeko() and ftello(), and have file offsets
be off_t's.

I'm not sure what other UN*Xes use fseeko()/ftello(). ANSI C has fsetpos()/fgetpos(), but it makes no
guarantee that an fpos_t is an arithmetic type, and to avoid constantly fetching file positions (because
they tend to involve system calls), we assume that, after reading N bytes, the file position advances by
N bytes.

Whether adding 64-bit support of that type to Wiretap and the apps using it will clear up this
*particular* problem is unknown.

Revision history for this message
In , ulfl (ulf-lamping) wrote :

slightly change the summary to reflect the problem better

Revision history for this message
In , ulfl (ulf-lamping) wrote :

Some time ago I fixed the Wireshark code to use 64-bit integers in all places where a file offset is used; in theory, this would fix the problem.

However, we use zlib to read capture files (it transparently uncompresses gzipped files for us). Unfortunately, zlib only supports (unsigned?) long as the file offset, which leaves us with the described problem on 32-bit platforms.

So unless we drop gzip support (by not using zlib), or find a way to circumvent zlib's 32-bit offset limitation, we're stuck with this problem.

There were some thoughts about replacing zlib with our own implementation, as zlib isn't particularly fast at random access either. But I don't have enough knowledge of this topic to do it myself.

Revision history for this message
In , Daniel Black (daniel-black) wrote :

For seekable gzip, see sgzip, part of http://sourceforge.net/project/showfiles.php?group_id=100803.
As far as I know the PyFlag guys are still using it.

Revision history for this message
In , Guy Harris (guyharris) wrote :

sgzip's sgzlib.c says

   sgzip files are files based on the gzip compression
   library which are also quickly seekable and therefore may be used
   for applications where seeking is important.

and sgzlib.h says

   The sgzip file format basically relies on breaking the file into
   blocks, each of these blocks is compressed seperately (which
   results in a slight decrease of compression for small blocks). The
   file then stores these blocks in the compressed file. Just before
   each compressed block, an offset is stored to the next block in the
   file.

I.e., it is *NOT* a library that provides random access to gzipped files, it's a library that defines its own file format, designed to better support random access than gzip format does. That might be useful - although bzip2 format *also* should support random access, and is, I suspect, more commonly used - but it doesn't provide a way to do efficient random access to .gz files.

Revision history for this message
Michael B. Trausch (mtrausch) wrote :

This bug also affects mergecap which is part of this package... I just tried to create a merged file from a bunch of 100MB files, and the output stopped at precisely 2 GiB with the error message "mergecap: Error writing to outfile: Less data was written than was requested".

Marking bug as confirmed.

Changed in wireshark:
status: New → Confirmed
Changed in wireshark:
status: Unknown → Confirmed
Revision history for this message
Matt Davey (mcdavey) wrote :

Handy mergecap workaround: send output to stdout using '-w - ' and redirect, optionally passing through 'gzip -c'. This will let you write an arbitrarily large mergecap file, as long as you have input files less than 2GB.

That said, it would be great to have LARGEFILE support for wireshark and friends...

Revision history for this message
In , Bugzilla-admin-z (bugzilla-admin-z) wrote :

Consolidate the 0.10.x release versions.

Kees Cook (kees)
Changed in wireshark (Ubuntu):
importance: Undecided → Wishlist
Changed in wireshark:
importance: Unknown → Medium
Revision history for this message
In , Guy Harris (guyharris) wrote :

On the trunk, we're no longer using zlib's I/O routines, and have made changes that should support large files on Windows (we no longer support Windows 9x/Me, so we use APIs that are NT-only) as well as on UN*Xes that include large file support.

Changed in wireshark:
status: Confirmed → Fix Released
Revision history for this message
Evan Huus (eapache) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. However, I am closing it because the bug has been fixed in the latest development version of Ubuntu - Oneiric Ocelot.

If you need a fix for the bug in previous versions of Ubuntu, please follow the instructions for "How to request new packages" at https://help.ubuntu.com/community/UbuntuBackports#request-new-packages

Changed in wireshark (Ubuntu):
status: Confirmed → Fix Released