pt-table-checksum: Empty tables cause "undefined value as an ARRAY" errors

Bug #987393 reported by Ben Hencke
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Percona Toolkit (moved to https://jira.percona.com/projects/PT) | Fix Released | Medium | Daniel Nichter |
2.0 | Fix Released | Medium | Daniel Nichter |
2.1 | Fix Released | Medium | Daniel Nichter |

Bug Description

Using pt-table-checksum 2.1.1
If a database contains some empty tables, they cause the following error:
04-23T09:27:01 Error checksumming table db.emptytable: Can't use an undefined value as an ARRAY reference at /usr/bin/pt-table-checksum line 6530.

tags: added: crash pt-table-checksum
Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

I cannot reproduce this problem with 2.1.1 using 1 empty table, 3 empty tables, or a mix of tables with data and empty tables.

Ben, can you tell me more about how you're able to reproduce this? I.e. SHOW CREATE TABLE for your empty table, if there are any other tables in the db, the command line to run the tool, etc.

Daniel Nichter (daniel-nichter) wrote :

PTDEBUG output would be helpful, too.

Ben Hencke (brainstar) wrote :

Ah, I think this is related to --chunk-size-limit=0
For other, unrelated reasons we had to adjust the chunking.

With PTDEBUG=1 the errors occur on a different line.

Attaching example db (very simple) and stderr from these tests:

# export PTDEBUG=0
# pt-table-checksum -u testuser -ptestpass --databases=test --chunk-size-limit=0 2> stderr_noptdebug
# export PTDEBUG=1
# pt-table-checksum -u testuser -ptestpass --databases=test --chunk-size-limit=0 2> stderr_ptdebug

Daniel Nichter (daniel-nichter) wrote :

Thanks for the extra info. I've been able to reproduce this. Now I'm finding the bug...

tags: added: chunking
Daniel Nichter (daniel-nichter) wrote :

This bug was complex. First, a code comment that explains --chunk-size-limit:

   # --chunk-size-limit has two purposes. The 1st, as documented, is
   # to prevent oversized chunks when the chunk index is not unique.
   # The 2nd is to determine if the table can be processed in one chunk
   # (WHERE 1=1 instead of nibbling). This creates a problem when
   # the user does --chunk-size-limit=0 to disable the 1st, documented
   # purpose because, apparently, they're using non-unique indexes and
   # they don't care about potentially large chunks. But disabling the
   # 1st purpose adversely affects the 2nd purpose because 0 * the chunk size
   # will always be zero, so tables will only be single-chunked if EXPLAIN
   # says there are 0 rows, but sometimes EXPLAIN says there is 1 row
   # even when the table is empty. This wouldn't matter except that nibbling
   # an empty table doesn't currently work because there are no boundaries,
   # so no checksum is written for the empty table. To fix this and
   # preserve the two purposes of this option, usages of the 2nd purpose
   # do || 1 so the limit is never 0 and empty tables are single-chunked.

So in essence the fix is simply to use --chunk-size-limit || 1 wherever the limit is used to determine if a table can be done in one chunk. Previously, a checksum for an empty table may or may not have been written, depending on whether MySQL EXPLAIN reported 0 rows or 1 row. Now checksums are always written for empty tables, and empty tables are always done in one chunk.
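The effect of the || 1 can be sketched in isolation. This is a minimal illustration, not the tool's code: can_single_chunk() and its fixed flag are hypothetical, and the comparison rows <= chunk size * limit is an assumption based on the code comment above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from pt-table-checksum): decide whether a table
# can be checksummed in one chunk, given EXPLAIN's row estimate.
sub can_single_chunk {
   my (%args) = @_;
   my ($rows, $chunk_size, $limit, $fixed)
      = @args{qw(rows chunk_size limit fixed)};
   # With the fix, a limit of 0 is treated as 1 for this purpose only.
   $limit = $limit || 1 if $fixed;
   return $rows <= $chunk_size * $limit ? 1 : 0;
}

# EXPLAIN may estimate 1 row even though the table is empty.
my $before = can_single_chunk(rows => 1, chunk_size => 1000, limit => 0, fixed => 0);
my $after  = can_single_chunk(rows => 1, chunk_size => 1000, limit => 0, fixed => 1);
print "before fix: $before, after fix: $after\n";
```

With --chunk-size-limit=0 and an estimate of 1 row, the unfixed check refuses to single-chunk the table (1 <= 0 is false), so the empty table is nibbled; with || 1 the check passes.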

This was the root of the crash in combination with a Perl bug:

#!/usr/bin/perl
use strict;
use Data::Dumper;

sub foo {
   my (@vals) = @_;
   print Dumper(\@vals);
}

my $var = undef;

# Error: Can't use an undefined value as an ARRAY reference
# my $a = [ @{$var} ];
# print Dumper($a);

# Works, prints []
foo(@{$var});

@{$var} either results in [] (empty arrayref) or an error depending on the context (when using strict; else, it results in [] in any context). @{$var} should probably always result in an error, but it doesn't (all the way through Perl 5.15 iirc what Brian tested). The code still uses this construct but warns:
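The two behaviors can be observed side by side by trapping the error with eval. This is a small sketch; count_args() is a hypothetical helper used only for illustration. A common explanation, offered here as an assumption rather than something established in this report, is that Perl treats sub arguments as a modifiable context and autovivifies the undefined value instead of dying.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns how many arguments it received.
sub count_args { return scalar @_ }

# Case 1: dereferencing undef while building an anonymous array.
my $var;
my $copy = eval { [ @{$var} ] };
my $rvalue_error = $@;

# Case 2: dereferencing undef directly in a sub's argument list.
my $other;
my $n = eval { count_args(@{$other}) };
my $call_error = $@;

printf "array constructor died: %s\n", $rvalue_error ? 'yes' : 'no';
printf "sub call died: %s, args received: %s\n",
   $call_error ? 'yes' : 'no', defined $n ? $n : 'n/a';
```

The first case fails with "Can't use an undefined value as an ARRAY reference", and the second completes with zero arguments, matching the behavior described above.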

   # XXX This call and others like it are relying on a Perl oddity.
   # See https://bugs.launchpad.net/percona-toolkit/+bug/987393

But this problem came up not because we're exploiting this oddity, but because of the aforementioned bug that caused empty tables to be chunked, which resulted in boundary values being undef, since an empty table doesn't have boundaries. In particular, in OobNibbleIterator::_next_boundaries():

         $self->{upper} = $self->boundaries()->{first_lower};

first_lower was undef, which caused

         if ( !$nibble_iter->one_nibble() ) {
            my $expl = explain_statement(
               tbl => $tbl,
               sth => $sth->{explain_nibble},
               vals => [ @{$boundary->{lower}}, @{$boundary->{upper}} ],
            );

in pt-table-checksum to hit the Perl oddity because $boundary->{upper} was undef.
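For illustration only, an undefined boundary can be dereferenced safely by substituting an empty arrayref. This is a generic Perl idiom, not the change made for this bug (the fix described above keeps empty tables from being nibbled in the first place), and the $boundary hash here is a hypothetical stand-in for the real structure.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical boundary for an empty table: both ends are undefined.
my $boundary = { lower => undef, upper => undef };

# "|| []" supplies an empty arrayref when a boundary is undef, so the
# dereference inside the array constructor cannot die under strict.
my $vals = [ @{ $boundary->{lower} || [] }, @{ $boundary->{upper} || [] } ];

print scalar(@$vals), " boundary values\n";
```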

So this bug was a combination of empty table + dual-purpose --chunk-size-limit + --chunk-size-limit=0 + Perl bug/oddity + OobNibbleIterator.

Please let me ...


Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-522
