Percona Server moved to https://jira.percona.com/projects/PS

Innodb produces false positive duplicate keys errors in Percona server 5.5

Bug #1096324 reported by Nickolay Ihalainen on 2013-01-05

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Percona Server moved to https://jira.percona.com/projects/PS	Incomplete	Undecided	Unassigned

Bug Description

Hello,

The error could be reproduced on a slave server, within STATEMENT binary log mode after upgrade to Percona-Server-5.5.28-29.2 from Percona Server 5.1.

CREATE TABLE `patient_pdups` (
`pat_id` int(10) unsigned NOT NULL DEFAULT '0',
`pat_id_match` int(10) unsigned NOT NULL DEFAULT '0',
`rating` smallint(5) unsigned NOT NULL DEFAULT '0',
UNIQUE KEY `idx1` (`pat_id`,`pat_id_match`),
UNIQUE KEY `idx2` (`pat_id_match`,`pat_id`),
KEY `idx3` (`rating`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

The query causes duplicate key error:
INSERT INTO patient_pdups (pat_id,pat_id_match,rating) VALUES ('1864','1983','9'),('1983','1864','9');

During fault both `idx1` & `idx2` produces duplicate key error in a random manner.
I have checked execution path for idx2 and it's very common for duplicate key: have a backtrace at /usr/src/debug/Percona-Server-5.5.28-rel29.2/Percona-Server-5.5.28-rel29.2/storage/innobase/row/row0ins.c:1759
But I don't understand why it produces duplicate key error, because SELECT * WHERE (pat_id='1864' AND pat_id_match='1983') OR (pat_id='1983' AND pat_id_match='1864') returns no rows.

At the same time there is no row with such pat_id & pat_id_match fields combination as in insert query.

Before the error appearrs I can always see strange entries in mysql error log:

InnoDB: Warning: hash index ref_count (182) is not zero after fil_discard_tablespace().
index: "idx1" table: "xx_xxxxx_xxxx/patient_pdups"
InnoDB: Warning: hash index ref_count (87) is not zero after fil_discard_tablespace().
index: "idx2" table: "xx_xxxxx_xxxx/patient_pdups"
InnoDB: Warning: hash index ref_count (97) is not zero after fil_discard_tablespace().
index: "idx3" table: "xx_xxxxx_xxxx/patient_pdups"

After reboot the same statement executes correctly. Additional inserts producing normal duplicate key error messages, but only with `idx1`.

The slave server is crated from xtrabackup dump which is created on another 5.1 slave, transaction logs correctly applied, mysql_upgrade executed.
During the test there is only replication log, no other queries executed. No other 5.5 binaries tested yet.

Revision history for this message

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2013-01-08:

I tested with a 5.1 master (5.1.66) and 5.5 slave (5.5.28-29.2)
with config in
https://gist.github.com/d58f40f14f5530765cee

However, I have not been able to reproduce it yet

The values were inserted like

INSERT INTO patient_pdups (pat_id,pat_id_match,rating) VALUES ('18,'19','11'),('19','18','31'); in a loop.

However, it needs to be verified if this can be rproduced with a xtrabackup dump from 5.1 mysql_upgraded to 5.5 slave.

Regarding,

""
InnoDB: Warning: hash index ref_count (182) is not zero after fil_discard_tablespace().
index: "idx1" table: "xx_xxxxx_xxxx/patient_pdups"
InnoDB: Warning: hash index ref_count (87) is not zero after fil_discard_tablespace().
index: "idx2" table: "xx_xxxxx_xxxx/patient_pdups"
InnoDB: Warning: hash index ref_count (97) is not zero after fil_discard_tablespace().
index: "idx3" table: "xx_xxxxx_xxxx/patient_pdups"

it looks like the index may be inconsistent here, since the ref_count refers to the number of blocks in the index with search index built which remain after a discard_tablespace.

Revision history for this message

Dave Juntgen (djuntgen) wrote on 2013-01-09:

mysqld.err Edit (53.9 KiB, text/plain)

This error has occurred again from a different snapshot.

1) snapshot take for another 5.1.66-rel14.1-log slave.
2) extracted: xbstream + qpress
3) apply log: innobackupex --apply-log --use-memory 15G .
4) moved: innobackupex --move-back --defaults-file=/etc/my.cnf .
5) mysqld start
6) mysql_update
7) Change master from xtrabackup_slave_info
8) start slave.
9) Ran for about 12 hours, had an error regarding SQL, skipped.
10) crash.

See attached mysqld.err for details.

Revision history for this message

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2013-01-09:

@David,

It looks from the error log that you have

a) A corrupted secondary index -- idx3. (Other indexes need to be
checked)

    - So suggest running a CHECK TABLE for
      wc_miehr_mhpc`.`patient_pdups to see if any other index
      other than idx3 are corrupted

b) 130109 8:16:43 InnoDB: Error: trying to access tablespace
601060 page no. 1,
InnoDB: but the tablespace does not exist or is just being
dropped.

also suggests a corruption.

So,

1) Is the source -- 5.1.66 host -- fine? Does it have any related
errors? You can run 'check table' there as well.

2) Also, provide the innobackupex (with its command line) logs and apply-log output.

Revision history for this message

Dave Juntgen (djuntgen) wrote on 2013-01-09:

check table from slave that took snapshot:

See attache innodb backup.

Revision history for this message

Dave Juntgen (djuntgen) wrote on 2013-01-09:

db3-backup.log Edit (13.2 MiB, text/plain)

Attached is the backup log from slave.

I do not have logs for apply-log at the wcdb-slave2 (one with issues)

Revision history for this message

Dave Juntgen (djuntgen) wrote on 2013-01-09:

check table from wcdb-slave2:

Revision history for this message

Roel Van de Paar (roel11) wrote on 2013-02-06:

See bug 1116447

Revision history for this message

Valerii Kravchuk (valerii-kravchuk) wrote on 2014-06-17:

As https://bugs.launchpad.net/percona-server/+bug/1116447 seems fixed since 5.5.29, I wonder if anybody noted similar errors in versions newer than that?