hardy regression: reading from a urllib2 file descriptor happens byte-at-a-time

Bug #214183 reported by James Troup
Affects: python2.5 (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: python2.5

Hi,

When reading from a urllib2 file descriptor, python2.5 in hardy will
read the data a byte at a time, regardless of how much you ask for.
python2.4 will read the data in 8K chunks.

This has enough of a performance impact that it increases download
time for a large file over a gigabit LAN from 10 seconds to 34
minutes. (!)

Trivial/obvious example code:

    import urllib2

    f = urllib2.urlopen("http://launchpadlibrarian.net/13214672/nexuiz-data_2.4.orig.tar.gz")
    while 1:
        chunk = f.read(8192)
        if not chunk:
            break

... and then strace it to see the recv()'s chugging along, one byte at
a time.
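The effect is easy to reproduce without strace or a network. Below is a minimal sketch in modern Python (the bug itself is against python2.5, but the mechanism is the same): a hypothetical `CountingRaw` stream stands in for the socket and counts raw reads, which correspond to recv() syscalls. Reading it one byte at a time hits the raw stream once per byte; wrapping it in `io.BufferedReader` with an 8 KiB buffer, analogous to python2.4's buffered socket file object, reduces that to a handful of calls.

```python
import io

class CountingRaw(io.RawIOBase):
    """Fake socket-like raw stream that counts readinto() calls,
    a stand-in for recv() syscalls. (Illustrative only.)"""
    def __init__(self, data):
        self._buf = io.BytesIO(data)
        self.calls = 0
    def readable(self):
        return True
    def readinto(self, b):
        self.calls += 1
        chunk = self._buf.read(len(b))
        b[:len(chunk)] = chunk
        return len(chunk)

data = b"x" * 64 * 1024  # 64 KiB payload

# Unbuffered: every 1-byte read reaches the raw stream,
# mirroring the byte-at-a-time recv() behaviour of the bug.
unbuffered = CountingRaw(data)
while unbuffered.read(1):
    pass

# Buffered: the reader fetches 8 KiB from the raw stream at a
# time, so even 1-byte reads cost almost no extra syscalls.
counting = CountingRaw(data)
buffered = io.BufferedReader(counting, buffer_size=8192)
while buffered.read(1):
    pass
```

After running this, `unbuffered.calls` is on the order of 65,000 while `counting.calls` is in the single digits, which is the difference between 10 seconds and 34 minutes over a fast link.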

--
James

Revision history for this message
Launchpad Janitor (janitor) wrote:

This bug was fixed in the package python2.5 - 2.5.2-2ubuntu3

---------------
python2.5 (2.5.2-2ubuntu3) hardy; urgency=low

  * Fix urllib2 file descriptor happens byte-at-a-time, reverting
    a fix for excessively large memory allocations when calling .read()
    on a socket object wrapped with makefile(). LP: #214183.

 -- Matthias Klose <email address hidden> Tue, 08 Apr 2008 23:27:23 +0200
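The changelog notes a trade-off: the reverted upstream change had been meant to avoid huge allocations when calling .read() with no size on a socket file object. The usual way to get bounded memory without triggering either problem is to pass an explicit chunk size and loop. A minimal, self-contained sketch (a local `io.BytesIO` stands in for the urllib2 response, and the helper name `download` is illustrative, not part of any library):

```python
import io

def download(resp, out, chunk_size=8192):
    """Copy a file-like response to `out` in bounded chunks:
    memory use stays near chunk_size, and each read() maps to
    at most one underlying recv() of up to chunk_size bytes."""
    total = 0
    while True:
        chunk = resp.read(chunk_size)
        if not chunk:
            break
        out.write(chunk)
        total += len(chunk)
    return total

# Stand-in for urllib2.urlopen(...) so the sketch is runnable offline.
resp = io.BytesIO(b"a" * 100000)
out = io.BytesIO()
n = download(resp, out)
```

The same loop works unchanged against a real urllib2/urllib.request response object.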

Changed in python2.5:
status: New → Fix Released