Percona XtraBackup moved to https://jira.percona.com/projects/PXB

innobackupex doesn't copy xtrabackup_logfile if 'UNLOCK TABLES' times out at end of script

Bug #405604 reported by Brad Fino on 2009-07-28

This bug report is a duplicate of: Bug #408803: innodbbackupex reaches Servers wait_timeout. Edit Remove

This bug affects 6 people

	Status	Importance	Assigned to	Milestone
Percona XtraBackup moved to https://jira.percona.com/projects/PXB	Triaged	High	Valentine Gostev	Percona XtraBackup moved to https://jira.percona.com/projects/PXB 1.9.0
1.6	Triaged	High	Valentine Gostev	Percona XtraBackup moved to https://jira.percona.com/projects/PXB 1.6.4
2.0	Triaged	High	Valentine Gostev	Percona XtraBackup moved to https://jira.percona.com/projects/PXB 1.9.0

Bug Description

After it flushes tables with read lock during the the "innobackupex: Finished backing up .frm, .MRG, .MYD, .MYI, .TRG, .TRN, .ARM, .ARZ and .opt files" phase, UNLOCK TABLES will sometimes timeout due to whatever $process may be occurring within my application at the time (maybe an ALTER, massive traffic, etc).

So, I end up with:

xtrabackup: Transaction log of lsn (574 2452056146) to (574 3374977342) was copied.
innobackupex: Error: mysql child process has died: ERROR 2006 (HY000) at line 19: MySQL server has gone away
while waiting for reply to MySQL request: 'UNLOCK TABLES;' at /usr/bin/innobackupex-1.5.1 line 452.

innobackupex dies waiting for unlock tables to complete, but it doesn't copy the xtrabackup_logfile over to the target server.
It just leaves it behind in /tmp on the source server. If i manually copy it over then everything's fine, and recovery completes as normal.

Tags:

Aleksandr Kuzminsky (akuzminsky) on 2010-03-22

Changed in percona-xtrabackup:
importance:	Undecided → Medium
milestone:	none → release-1.2

Aleksandr Kuzminsky (akuzminsky) on 2010-03-22

Changed in percona-xtrabackup:
assignee:	nobody → Aleksandr Kuzminsky (akuzminsky)

Aleksandr Kuzminsky (akuzminsky) on 2010-07-06

Changed in percona-xtrabackup:
milestone:	1.2 → 1.3.1

Stewart Smith (stewart) on 2011-05-20

Changed in percona-xtrabackup:
milestone:	1.3.1 → none

Revision history for this message

Stewart Smith (stewart) wrote on 2011-09-20:

Is this still reproducible in current xtrabackup? there have been a lot of bug fixes since this bug was filed. Please try to reproduce with a more recent release.

thanks for your report, and sorry it has stagnated for so long.

Changed in percona-xtrabackup:
status:	New → Incomplete

Revision history for this message

Kenny Gryp (gryp) wrote on 2011-10-27:

------
111026 02:37:53 innobackupex: Finished backing
up .frm, .MRG, .MYD, .MYI, .TRG, .TRN, .ARM, .ARZ, .CSV, .CSM and .opt
files

innobackupex: Resuming ibbackup

xtrabackup: The latest check point (for incremental): '191499091912'
log scanned up to (191499360626)
xtrabackup: Stopping log copying thread.......
xtrabackup: Transaction log of lsn (191487786897) to (191499360626) was
copied.
innobackupex: Error: mysql child process has died: ERROR 2006 (HY000) at
line 13: MySQL server has gone away
while waiting for reply to MySQL request: 'UNLOCK TABLES;'
at /usr/bin/innobackupex line 336.
------

===> so wait_timeout reached before the copying of the FRM files was done.

===> There is some mechanism that checks if the connection should be pinged, but it's not really intelligent:
#
# mysql_keep_alive subroutine tries to keep connection to the mysqld database
# server alive by sending a dummy query when the connection has been idle
# for the specified time.
#
sub mysql_keep_alive {
    if ((time() - $mysql_last_access_time) > $mysql_keep_alive_timeout) {
        # too long idle, send a dummy query
        mysql_check();
    }
}

===> default setting is not that good:
# time in seconds after which a dummy query is sent to mysql server
# in order to keep the database connection alive
my $mysql_keep_alive_timeout = 1800;

===> In this case, wait_timeout was set to 100, which is much lower than that default value.

===> HOWEVER, changing $mysql_keep_alive_timeout is not enough because it does run mysql_keep_alive every time a file has been copied so when the file is big and will take longer than 100 seconds, it will fail

===> a (dirty) way that it can be fixed is by applying the patch below and expecting a file not to take longer than 57600 seconds

Revision history for this message

Kenny Gryp (gryp) wrote on 2011-10-27:

innobackupex.405604.patch Edit (624 bytes, text/plain)

Changed in percona-xtrabackup:
status:	Incomplete → New

Stewart Smith (stewart) on 2011-10-27

Changed in percona-xtrabackup:
importance:	Medium → High
assignee:	Aleksandr Kuzminsky (akuzminsky) → nobody

Valentine Gostev (longbow) on 2011-11-01

tags:

added: innobackupex

Revision history for this message

Stewart Smith (stewart) wrote on 2011-11-01:

We should probably ignore the result of UNLOCK TABLES as it failing/timing out really isn't anything that affects us (tables would be unlocked when we disconnect from server anyway)

Revision history for this message

Alexey Kopytov (akopytov) wrote on 2011-11-11:

It looks like the real problem was that the client connection timed out before innobackupex issued UNLOCK TABLES, but rather when copying non-InnoDB diles. It's just that the error message is misleading in such a case.

So it is essentially a duplicate of bug #408803 and will be addressed by the same patch.

Report a bug

This report contains Public information

Everyone can see this information.

Duplicate of bug #408803 Remove

You are

Subscribing...

Edit bug mail

Other bug subscribers

Patches

innobackupex.405604.patch Edit

Add patch

Remote bug watches

Bug watches keep track of this bug in other bug trackers.