pt-table-checksum should force replica table charset to utf8

Bug #1485195 reported by Jaime Crespo
This bug affects 1 person
Affects: Percona Toolkit (moved to https://jira.percona.com/projects/PT)
Status: Fix Released
Importance: Medium
Assigned to: Carlos Salguero

Bug Description

Background information:

https://groups.google.com/forum/#!topic/percona-discussion/uOSGA6P6BIU

I am suggesting to change:

  CREATE TABLE checksums (
     db CHAR(64) NOT NULL,
     tbl CHAR(64) NOT NULL,
     chunk INT NOT NULL,
     chunk_time FLOAT NULL,
     chunk_index VARCHAR(200) NULL,
     lower_boundary TEXT NULL,
     upper_boundary TEXT NULL,
     this_crc CHAR(40) NOT NULL,
     this_cnt INT NOT NULL,
     master_crc CHAR(40) NULL,
     master_cnt INT NULL,
     ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
     PRIMARY KEY (db, tbl, chunk),
     INDEX ts_db_tbl (ts, db, tbl)
  ) ENGINE=InnoDB;

to

  CREATE TABLE checksums (
     db CHAR(64) NOT NULL,
     tbl CHAR(64) NOT NULL,
     chunk INT NOT NULL,
     chunk_time FLOAT NULL,
     chunk_index VARCHAR(200) NULL,
     lower_boundary TEXT NULL,
     upper_boundary TEXT NULL,
     this_crc CHAR(40) NOT NULL,
     this_cnt INT NOT NULL,
     master_crc CHAR(40) NULL,
     master_cnt INT NULL,
     ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
     PRIMARY KEY (db, tbl, chunk),
     INDEX ts_db_tbl (ts, db, tbl)
  ) ENGINE=InnoDB charset=utf8;

or at least

  CREATE TABLE checksums (
     db CHAR(64) NOT NULL charset utf8,
     tbl CHAR(64) NOT NULL charset utf8,
     chunk INT NOT NULL,
     chunk_time FLOAT NULL,
     chunk_index VARCHAR(200) NULL,
     lower_boundary TEXT NULL,
     upper_boundary TEXT NULL,
     this_crc CHAR(40) NOT NULL,
     this_cnt INT NOT NULL,
     master_crc CHAR(40) NULL,
     master_cnt INT NULL,
     ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
     PRIMARY KEY (db, tbl, chunk),
     INDEX ts_db_tbl (ts, db, tbl)
  ) ENGINE=InnoDB;

This is needed because db and table names are always utf8 (character_set_system). If the charset is left at the default, then on systems whose default charset is binary, CHAR(64) gets converted into BINARY(64). Checksums then fail because the fetch function returns db and table names with their padding still attached ('my_table_name\0\0\0\0\0\0'): unlike the trailing spaces of a CHAR value, the padding of a BINARY value is not stripped by default.
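
To illustrate the mismatch, here is a minimal Python sketch that only simulates the padding rules; it does not talk to MySQL. The helper names (store_char, fetch_char, store_binary, fetch_binary) are hypothetical and exist only for this example. The assumption is the default SQL mode, where CHAR values lose their trailing spaces on retrieval and BINARY values keep their 0x00 padding.

```python
# Simulation of fixed-width column padding; not real MySQL client code.
WIDTH = 64  # width of the db and tbl columns in the checksums table


def store_char(value: str, width: int = WIDTH) -> str:
    """CHAR(width): value is right-padded with spaces when stored."""
    return value.ljust(width, " ")


def fetch_char(stored: str) -> str:
    """CHAR: trailing spaces are stripped on retrieval (default SQL mode)."""
    return stored.rstrip(" ")


def store_binary(value: str, width: int = WIDTH) -> bytes:
    """BINARY(width): value is right-padded with 0x00 bytes when stored."""
    return value.encode("utf-8").ljust(width, b"\x00")


def fetch_binary(stored: bytes) -> bytes:
    """BINARY: the value comes back as stored, padding included."""
    return stored


name = "my_table_name"
char_result = fetch_char(store_char(name))
binary_result = fetch_binary(store_binary(name))

# The CHAR round trip gives the original name back.
print(char_result == name)
# The BINARY round trip does not: the value is 64 bytes long and ends in \0.
print(binary_result == name.encode("utf-8"), len(binary_result))
```

In this simulation a comparison against the plain table name succeeds for the CHAR column and fails for the BINARY one, which is the symptom described above.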

Surprisingly, the CREATE TABLE statements that get applied are taken directly from the ones documented in the same file. Changing the embedded man page at the beginning of the file fixed the issue for me, but I would like the change to be applied upstream. It is a very trivial change; just make sure with your CI that nothing else breaks.

Changed in percona-toolkit:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Frank Cizmich (frank-cizmich)
milestone: none → 2.3.1
Frank Cizmich (frank-cizmich) wrote :

Hello Jaime,
This is almost a duplicate of https://bugs.launchpad.net/percona-toolkit/+bug/925781
Since 2.2.14 you can change the "lower_boundary" and "upper_boundary" columns to BLOB using the --binary-index option, which should fix the problem.
I agree that changing the default charset of the table to the database default or to utf8 should also be considered for the next release.

Jaime Crespo (jynus) wrote :

No, this is a different issue (although similar in spirit). lower_boundary and upper_boundary can be BLOBs, and that doesn't affect me. The problem is that the table and db names get padded with null characters because those columns are turned into binary strings. db and tbl should always be utf8, as the names are going to be utf8 if mysql >= 5.5.

I opened a pull request at https://github.com/percona/percona-toolkit/pull/48. You may want to solve it differently, but that is what works for me.

Changed in percona-toolkit:
milestone: 2.2.17 → 2.2.18
Changed in percona-toolkit:
milestone: 2.2.18 → 2.2.19
Changed in percona-toolkit:
assignee: Frank Cizmich (frank-cizmich) → Carlos Salguero (carlos-salguero)
Changed in percona-toolkit:
status: Triaged → In Progress
Changed in percona-toolkit:
status: In Progress → Fix Committed
Changed in percona-toolkit:
status: Fix Committed → Fix Released
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PT-690
