Remove zlib double-dependency

Bug #566923 reported by Martin Packman
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Medium
Martin Packman

Bug Description

Bazaar on windows currently needs two copies of zlib, one in the python dll (or as zlib.pyd for Python 2.4) and zlib1.dll separately.

This is a little silly, though there are reasonable explanations:
<https://lists.ubuntu.com/archives/bazaar/2009q2/056268.html>
<https://lists.ubuntu.com/archives/bazaar/2009q2/057449.html>

The only reason for the second copy is bzrlib._chk_map_pyx needing a crc32 function. As that's about 12 lines of C, seems overkill to ship a whole library for it.

Related branches

Revision history for this message
Martin Packman (gz) wrote :

Couple of solutions in the branches linked, also possible if neither of those is acceptable, one of John's original solutions:

# "pulling that bit of code into this file" - crc32 is not a complicated function to include in the source.
# "remove the optimized _search_key functions" - are they still a bottleneck?

Revision history for this message
Alexander Belchenko (bialix) wrote : Re: [Bug 566923] [NEW] Remove zlib double-dependency

Martin [gz] пишет:
> Public bug reported:
>
> Bazaar on windows currently needs two copies of zlib, one in the python
> dll (or as zlib.pyd for Python 2.4) and zlib1.dll separately.
>
> This is a little silly, though there are reasonable explanations:
> <https://lists.ubuntu.com/archives/bazaar/2009q2/056268.html>
> <https://lists.ubuntu.com/archives/bazaar/2009q2/057449.html>
>
> The only reason for the second copy is bzrlib._chk_map_pyx needing a
> crc32 function. As that's about 12 lines of C, seems overkill to ship a
> whole library for it.

I'd better have this function in some core C-extension.

--

All the dude wanted was his rug back

Revision history for this message
Robert Collins (lifeless) wrote :

It is 12 lines of C, but really - how big is the library - < 100Kb ? really
not an issue, given that we don't have to maintain its code or worry about
it.

Revision history for this message
Martin Packman (gz) wrote :

You should worry about it because paging in even a small library has measurable cold startup cost.

I worry about it because if I forget and just type `setup.py install` the build fails.

Have to instead remember to do `setup.py build_ext --allow-python-fallback install` which is annoying.

I chose that over `setup.py build_ext -I C:\Dev\zlib-1.2.4 -L C:\Dev\zlib-1.2.4 install` because then at least I don't have to worry about putting a library somewhere it can interfere with other programs. By general standards zlib is marvellously stable, but there's just enough incompatibility between the many different projects that use it to cause occasional pain.

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 566923] Re: Remove zlib double-dependency

So, I'm extremely happy for anyone on a given platform to drive and decide
what that platform does; if its best to embed the code for windows - thats
fine, lets do that. As long as Alexander and you come to some agreement :P.

For unix though, I'm not at all convinced yet.

Revision history for this message
Martin Packman (gz) wrote :

This isn't such a critical optimisation that it's worth doing weird build things, even only on one platform. Profile results for `bzr --lsprof log -v bzr.dev > nul` which John suggests should be pretty hot support this.

The current state of affairs:

   CallCount Total(ms) Inline(ms) module:lineno(function)
           1 50.0559 0.0000 bzrlib.commands:653(run_argv_aliases)
       75311 0.1693 0.1693 <bzrlib._chk_map_pyx._search_key_255>

Likewise without _chk_map_pyx.pyd installed:

           1 65.5518 0.0001 bzrlib.commands:653(run_argv_aliases)
       75311 1.6820 0.8413 bzrlib._chk_map_py:48(_search_key_255)
      +75311 0.4698 0.3000 +bzrlib._chk_map_py:28(_crc32)
      +75311 0.1468 0.1468 +<struct.pack>
      +75311 0.1321 0.1321 +<method 'replace' of 'str' objects>
      +75311 0.0920 0.0920 +<method 'join' of 'str' objects>

With the -pyapi branch:

           1 49.7405 0.0000 bzrlib.commands:653(run_argv_aliases)
       75311 0.2162 0.2162 <bzrlib._chk_map_pyx._search_key_255>

So, there's win in a compiled version of the function, but even the adventurous fix for this bug doesn't hurt performance enough to complain about.

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin [gz] wrote:
> This isn't such a critical optimisation that it's worth doing weird
> build things, even only on one platform. Profile results for `bzr
> --lsprof log -v bzr.dev > nul` which John suggests should be pretty hot
> support this.
>
> The current state of affairs:
>
> CallCount Total(ms) Inline(ms) module:lineno(function)
> 1 50.0559 0.0000 bzrlib.commands:653(run_argv_aliases)
> 75311 0.1693 0.1693 <bzrlib._chk_map_pyx._search_key_255>
>
> Likewise without _chk_map_pyx.pyd installed:
>
> 1 65.5518 0.0001 bzrlib.commands:653(run_argv_aliases)
> 75311 1.6820 0.8413 bzrlib._chk_map_py:48(_search_key_255)
> +75311 0.4698 0.3000 +bzrlib._chk_map_py:28(_crc32)
> +75311 0.1468 0.1468 +<struct.pack>
> +75311 0.1321 0.1321 +<method 'replace' of 'str' objects>
> +75311 0.0920 0.0920 +<method 'join' of 'str' objects>
>
> With the -pyapi branch:
>
> 1 49.7405 0.0000 bzrlib.commands:653(run_argv_aliases)
> 75311 0.2162 0.2162 <bzrlib._chk_map_pyx._search_key_255>
>
> So, there's win in a compiled version of the function, but even the
> adventurous fix for this bug doesn't hurt performance enough to complain
> about.
>

Care to run 'bzr log -v' without --lsprof? It often significantly
penalizes non-compiled code (moreso than compiled code).

Generally, though, I'm thinking your right. Importing crc32 is probably
fine. We can probably tweak it a bit to avoid some object lookup calls.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkvPU2UACgkQJdeBCYSNAAPN+gCfaYL3gEW2QuYNIAlOlVXtUXMq
h5UAn0EgxRrA4tagkojxkQSW8A/R78oD
=vSkv
-----END PGP SIGNATURE-----

Revision history for this message
Martin Packman (gz) wrote :

Yeah, compiled vs. pure python wasn't quite as much as an order of magnitude win in microbenchmark. Timings without profile overhead

Without _chk_map_pyx.pyd installed:

Elapsed Time: 0:00:31.640
Process Time: 0:00:31.359
System Calls: 377941
Context Switches: 61628
Page Faults: 439021
Bytes Read: 22641894
Bytes Written: 2369744
Bytes Other: 298218

With (note, contains other more significant optimisations as well):

Elapsed Time: 0:00:27.093
Process Time: 0:00:26.765
System Calls: 357177
Context Switches: 55174
Page Faults: 432277
Bytes Read: 22303724
Bytes Written: 12323866
Bytes Other: 292632

Revision history for this message
Martin Packman (gz) wrote :

Note, there is some downside to the current code even on linux, as just demonstrated on IRC. People like being able to use setup.py scripts, and don't always have all the right -devel packages installed even for ubiquitous things like zlib:
<http://irclogs.ubuntu.com/2010/04/21/%23bzr.html#t20:32>

Revision history for this message
Vincent Ladeuil (vila) wrote :

@gz: Feel free to set status and importance, especially when you start working on a bug.

Changed in bzr:
assignee: nobody → Martin [gz] (gz)
status: New → In Progress
importance: Undecided → Medium
Revision history for this message
Martin Packman (gz) wrote :

See bug 570287 for another example of a user getting confused by this when trying to install bazaar.

Revision history for this message
Matt Nordhoff (mnordhoff) wrote :

Since the associated branch landed, I'm marking this fixed. Correct me if I'm wrong!

Changed in bzr:
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.