Reducing peak memory for commit

Bug #566940 reported by John A Meinel
Affects: Bazaar
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

While working on refactoring PackCollection, I came across some inefficiencies in how we handle bytes.

A while back, I worked on decreasing the peak memory consumption for commit, and I think I managed to actually get it down to a single copy of the uncompressed content of a file.

However, it appears that we currently (as of bzr.dev 5165) have 1 uncompressed copy and 3 compressed copies. For content that doesn't compress well, this can be significant (for incompressible content, this is a 4x peak memory requirement).

See also bug #109114

I can reproduce this with:

 dd if=/dev/urandom of=1mb bs=1024 count=1024
 echo "start of text" > 80mb
 for i in `seq 80`; do cat 1mb >> 80mb; done

 bzr init .
 bzr add 80mb
 bzr commit -Dmemory -m "80mb file"

Note that 1MB of random data is too big for zlib to do anything with, so the compressed size is still ~80MB.
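
As a standalone sanity check (not part of the repro above), zlib really does gain nothing on this kind of data; the repeats are ~1MB apart, far outside deflate's 32KB window, and random data itself does not shrink:

import os
import zlib

block = os.urandom(1024 * 1024)   # 1MB of random bytes
raw = block * 8                   # a scaled-down stand-in for the 80mb file
compressed = zlib.compress(raw)
# The two sizes come out nearly identical; the compressed form is actually
# marginally larger because of the deflate framing overhead.
print len(raw), len(compressed)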

Peak memory for the commit is ~360MB.

Dumping the memory during the 'write to pack file' steps shows four ~80MB strings:

1: 'B83911719\n\ngcb1z\n83911695\n83886099\nx\x9c\x00\x05@\xfa\xbff...
2: 'gcb1z\n83911695\n83886099\nx\x9c\x00\x05@\xfa\xbff\x8e\x80\x80(...
3: 'x\x9c\x00\x05@\xfa\xbff\x8e\x80\x80(start of text\n\xc9\x92\xe1...
4: 'start of text\n\xc9\x92\xe1\xa6\xac[\x85\xf6_9\xad\xbb\x8c\xc7...

4: appears to be the raw content of the file, stored in
 GroupCompressBlock._content_chunks = ['f', '\x8e\x80\x80', <4>]

3: appears to be
 GroupCompressBlock._z_content = <3>

 Which should be the zlib compressed content.

So it seems that GroupCompressBlock holds on to both _content_chunks and
_z_content simultaneously. Also note that:
 GroupCompressBlock._create_z_content_from_chunks is:

    def _create_z_content_from_chunks(self):
        compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION)
        compressed_chunks = map(compressor.compress,
                                self._content_chunks)
        compressed_chunks.append(compressor.flush())
        self._z_content = ''.join(compressed_chunks)
        self._z_content_length = len(self._z_content)

The second-to-last line (the ''.join) potentially allocates 2x the compressed size in memory. I think it is reasonable to 'consume' the chunks as we do the
compression. Perhaps something like:

  compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION)
  compressed_chunks = []
  # Consume the source chunks as we compress, so each uncompressed chunk
  # can be released before the final join of the compressed data.
  chunks = self._content_chunks
  self._content_chunks = None
  chunks.reverse()
  while chunks:
    chunk = chunks.pop()
    compressed_chunks.append(compressor.compress(chunk))
  chunk = None
  compressed_chunks.append(compressor.flush())
  self._z_content = ''.join(compressed_chunks)
  self._z_content_length = len(self._z_content)

I wonder if collapsing the compressed content is worthwhile at all, or
whether we should be using self._z_content_chunks as the interface instead.
Note that 95% of the time when reading content we have it as a single
string, and the code is written so that we *can* decompress a freshly
compressed block (in case someone writes it and then asks to read it
again).

2: appears to be a group-compress byte-stream header, followed by the
   _z_content. This appears to be generated in
   GroupCompressBlock.to_bytes():

    def to_bytes(self):
        """Encode the information into a byte stream."""
        self._create_z_content()
        if _USE_LZMA:
            header = self.GCB_LZ_HEADER
        else:
            header = self.GCB_HEADER
        chunks = [header,
                  '%d\n%d\n' % (self._z_content_length,
                                self._content_length),
                  self._z_content,
                 ]
        return ''.join(chunks)

I think it is part of the design that to_bytes() does not cause
GroupCompressBlock to stop referencing the data (i.e. it does not set
_z_content = None).

We have another ''.join(chunks) here, which is a 2x memory requirement,
though 'chunks' goes away pretty quickly.
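
If we went the chunked route mentioned above, a hypothetical to_chunks() variant (sketch only; the method name and a chunk-accepting caller are assumptions, not current bzrlib API) could hand the pieces straight to the pack layer without joining them:

    def to_chunks(self):
        """Encode the information as a list of byte chunks (sketch)."""
        self._create_z_content()
        if _USE_LZMA:
            header = self.GCB_LZ_HEADER
        else:
            header = self.GCB_HEADER
        # Returning a list lets the caller stream it with writelines()
        # instead of materialising one large joined string.
        return [header,
                '%d\n%d\n' % (self._z_content_length, self._content_length),
                self._z_content,
               ]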

1: appears to be the group-compress byte stream prefixed with the Pack
record header. This is generated in:

 bzrlib.pack.ContainerWriter.add_bytes_record
which calls:
 bzrlib.pack.ContainerSerialiser.bytes_record(bytes, names)
which writes a size-of-record header followed by the 'name' fields, and
has this big XXX:

         # XXX: This causes a memory copy of bytes in size, but is usually
         # faster than two write calls (12 vs 13 seconds to output a gig of
         # 1k records.) - results may differ on significantly larger records
         # like .iso's but as they should be rare in any case and thus not
         # likely to be the common case. The biggest issue is causing extreme
         # memory pressure in that case. One possibly improvement here is to
         # check the size of the content before deciding to join here vs call
         # write twice.
         return ''.join(byte_sections)

Note that ContainerSerialiser.write_func is actually a closure in
NewPack/GCPack that looks like:

        def _write_data(bytes, flush=False, _buffer=self._buffer,
            _write=self.write_stream.write, _update=self._hash.update):
            _buffer[0].append(bytes)
            _buffer[1] += len(bytes)
            # buffer cap
            if _buffer[1] > self._cache_limit or flush:
                bytes = ''.join(_buffer[0])
                _write(bytes)
                _update(bytes)
                _buffer[:] = [[], 0]

So it, once again, can cause a buffer to get joined. When I was doing
commit, self._cache_limit was always 0, so it never did, though.

My feeling is that a good way to handle this would be to change the
Transport.open_write_stream() api, such that the returned object must
have a 'writelines()' attribute (rather than just .write()).

And then to go through all these stacks and change them to use 'chunks'
rather than 'bytes'.

It should mean that in a lot of places where we currently do:

buffer = []
buffer.append('my new header')
buffer.append(passed_in_content)
return ''.join(buffer)

we can just do:

buffer = []
buffer.append('my new header')
buffer.extend(passed_in_chunks)
return buffer

And then the final call is to file.writelines() (which is effectively a
'chunked' interface; it does not add '\n' itself. The original idea was
that x.writelines(y.readlines()) would work, but it still works for our
purposes.)
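
As a rough illustration (a hypothetical helper, not the existing _write_data closure; the names are made up), a buffered flush built on writelines() avoids the join entirely:

def flush_chunks(write_stream, hash_update, buffered_chunks):
    # writelines() writes each chunk in order and does not add '\n',
    # so we never build one large joined string just to call write().
    write_stream.writelines(buffered_chunks)
    for chunk in buffered_chunks:
        hash_update(chunk)
    # Empty the list in place so callers sharing it see it cleared.
    del buffered_chunks[:]

The hash update is happy to consume the chunks one at a time, so nothing in this path ever needs the fully joined bytes.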

Note there are a couple more *possible* memory duplications that we are
'lucky' are not triggered right now.

_DirectPackAccess.add_raw_records:
        for key, size in key_sizes:
            p_offset, p_length = self._container_writer.add_bytes_record(
                raw_data[offset:offset+size], [])
            offset += size
            result.append((self._write_index, p_offset, p_length))

This takes the raw data buffer, and slices it back up into little
buffers to get written to the output. It doesn't affect us today because:

 a) When we commit we create a single GCB per text being committed, so
    we always only add 1 record at a time.
 b) (raw_data[x:y] is raw_data) when x == 0 and y >= len(raw_data)
    (PyString doesn't chop itself up if you slice it to full size; see
    the quick check below)
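
A quick interpreter check of that full-slice behaviour (CPython str objects are immutable, so a full slice hands back the very same object):

>>> s = 'start of text\n' * 1024
>>> s[0:len(s)] is s   # full slice: no copy, the same PyString comes back
True
>>> s[0:1024] is s     # partial slice: a new string gets allocated
False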

GCPack/NewPack _write_data
Buffers if self._cache_limit is >0, and uses ''.join() before writing
the buffered data out. Doesn't affect us today because:

 a) _cache_limit is always 0 during commit.

If/when we get commit working via an 'insert_record_stream' sort of api,
both of these will be an issue.

There is also an extra issue during 'bzr pack' (and autopack, etc),
which is that the GroupCompress code uses a data structure that is
pretty 'loose' with memory, assuming that it won't exist for very long.
It does especially badly if the content being compressed is repetitive,
because hash groups are always given N free spaces; repetitive data causes
one group to overflow over and over while the other groups stay empty.

I don't have any sort of ETA on when we could fix this, or a great idea
of how much effort it is. One problem is that it affects a lot of
layers, which causes API friction/churn.

I do have one quick patch which should bring this down to 3x the
compressed size, vs 1x raw + 3x compressed.

Revision history for this message
John A Meinel (jameinel) wrote :

Note:

It is very difficult to get rid of the memory allocated by the fulltext itself; that will require a bunch more refactoring to get working.

Specifically the stack is:

Repository.CommitBuilder.record_entry_contents() reads the 'text' from disk, and passes it to self._add_text_to_weave.

_add_text_to_weave hands off to Repository._add_text

_add_text wraps it in a FulltextContentFactory and hands that record off to _insert_record_stream.

Note that none of these functions use the text content again. So we could be done with it as soon as we push it into the compressed form.

The only thing that comes to mind is some sort of ConsumableText object that just wraps a str until someone indicates that they are done with the memory. (The cheap form is a list where you pop out the string when you know you are done with it.)
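
A minimal sketch of the 'cheap form' (a hypothetical class, not proposed bzrlib API), where the producer hands over its only reference and the consumer drops it once the text has been compressed:

class ConsumableText(object):
    """Wrap a str so the consumer can release the only reference."""

    def __init__(self, text):
        self._chunks = [text]

    def consume(self):
        # Hand the string out and forget it; once the caller is done,
        # nothing else references the fulltext and it can be freed.
        return self._chunks.pop()

The producer would build ConsumableText(file.read()) and not keep its own local; the compressor calls consume() and lets the string go as soon as it has been fed to zlib.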

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 566940] Re: Reducing peak memory for commit

I'd like to delete add_text completely, as previously mentioned - moving to
just a single way to add content to a repo, which will permit commit to a
stacked repo, and make it easier to be confident we've squelched all these
copy bugs.

Revision history for this message
John A Meinel (jameinel) wrote :

Robert Collins wrote:
> I'd like to delete add_text completely, as previously mentioned - moving to
> just a single way to add content to a repo, which will permit commit to a
> stacked repo, and make it easier to be confident we've squelched all these
> copy bugs.
>

Sure, but you would retain the stack of FulltextContentFactory being
passed to another function.

And you can't pass an object to another function without holding a
reference, and holding a reference in the outer frame means it won't be
released in the inner one. _add_text was just a convenience/incremental
improvement over add_lines().

We'd still need some way to pass a string to a function and be able to
tell it: "I don't need this anymore, you can do what you want with it."

Also, the current code never deletes 'text', which means that:

text = file.read()

will hold two files' worth of memory while the new one is read, before
the old one is released. Doing:

text = None
text = file.read()

or some variant on that would get around it.

John
=:->

Revision history for this message
Aaron Bentley (abentley) wrote :

On 04/19/2010 08:27 PM, John A Meinel wrote:
> Robert Collins wrote:
>> I'd like to delete add_text completely, as previously mentioned - moving to
>> just a single way to add content to a repo, which will permit commit to a
>> stacked repo, and make it easier to be confident we've squelched all these
>> copy bugs.
>
>
> Sure, but you would retain the stack of FulltextContentFactory being
> passed to another function.
>
> And you can't pass an object to another one without a reference, and
> having a reference in the outer means it won't be released in the inner.

One way to do that is to pass in an iterable. If you pass in an
iterator of chunks, the outside doesn't have to hold a reference to the
actual text, and the inside can discard the chunks when it's done with them.
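
A minimal sketch of that idea (hypothetical helper functions, not bzrlib API): the producer builds a generator and drops its own reference, so only the consumer ever holds a chunk at a time:

import zlib

def iter_chunks(path, chunk_size=64 * 1024):
    # Yield the file a chunk at a time; once the consumer moves on,
    # nothing else references the chunk, so it can be freed.
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

def compress_stream(chunk_iter):
    # The consumer discards each chunk as soon as it has been compressed.
    compressor = zlib.compressobj()
    out = []
    for chunk in chunk_iter:
        out.append(compressor.compress(chunk))
    out.append(compressor.flush())
    return ''.join(out)

compress_stream(iter_chunks('80mb')) never holds more than one 64KB chunk of the raw file at a time (plus the growing compressed output).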

Aaron

Revision history for this message
John A Meinel (jameinel) wrote :

We reduced the data by 1 compressed text, but we still hold the fulltext + compressed text at the same time.

Revision history for this message
Vincent Ladeuil (vila) wrote :

The related partial fix has landed and is now part of 2.0.6 and newer releases (2.1.3, 2.2.1, 2.3b1).

Revision history for this message
Vincent Ladeuil (vila) wrote :

@jam: For clarity's sake, can you file a new bug with the pending issues and close this one?

Revision history for this message
John A Meinel (jameinel) wrote :

This bug was spun off into bug #684301 about our internals creating duplicated compressed content, and bug #684316 about commit holding onto the fulltext longer than we have to. (easier to handle 2 compressed content if we've already released the 1 fulltext).

Revision history for this message
John A Meinel (jameinel) wrote :

The first round of this has been implemented and landed; look to the other bugs for more work to be done.

Changed in bzr:
status: Confirmed → Fix Released
Revision history for this message
Martin Pool (mbp) wrote :

See follow-on bug #890085.
