LOAD DATA INFILE not replicating properly in cluster

Bug #1206129 reported by Przemek
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
Status tracked in 5.6
5.5
Fix Released
High
Teemu Ollakka
5.6
Fix Released
High
Teemu Ollakka
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Status tracked in 5.6
5.5
Fix Released
Undecided
Unassigned
5.6
Fix Released
Undecided
Unassigned
codership-maria
New
Undecided
Unassigned

Bug Description

When I start large LOAD DATA INFILE, if I understand correctly, Galera does implicit commits on every 10k rows, correct?
However these implicit commits are not seem to be replicated to other nodes. The problem is during the LOAD is in progress, I can see increasing row count on the same node where I run the LOAD, but all other nodes show 0 rows.
Also, when I interrupt the LOAD command with CTRL+C, the writing node has already tens of thousands rows present in the table, while all other nodes have this table empty.
No errors in error log, wsrep status shows all the nodes are "Synced", cluster size is OK. Until I write to the table again of course.

How to reproduce:
- start large LOAD DATA INFILE on one node to InnoDB table, after first batch is committed (select count(*) showing rows) - abort the statement.
- check the table contents on other nodes

| version_comment | Percona XtraDB Cluster (GPL), wsrep_23.7.5.r3880 |
| wsrep_provider_version | 2.6(r152) |

Changed in codership-mysql:
assignee: nobody → Seppo Jaakola (seppo-jaakola)
importance: Undecided → Medium
status: New → In Progress
milestone: none → 5.5.32-23.7.6
Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

Turns out that there is a regression bug, LOAD DATA transaction does not commit anymore at 10K row insert intervals. I will push a fix for this and create a regression test for it.
Also, as this changes the logic of LOAD DATA processing, will add a configuration variable to control whether such LOAD DATA transaction splitting is wanted or not: wsrep_load_data_splitting=ON|OFF

Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :
Changed in codership-mysql:
status: In Progress → Fix Committed
Changed in codership-mysql:
status: Fix Committed → Fix Released
Revision history for this message
borice (borice) wrote :

Which version has this fix been released in? The current latest available release is:
MySQL with wsrep patches: 5.5.33-24.8
Galera: 24.2.7

Is this included in the above?

Thanks.

Revision history for this message
borice (borice) wrote :

Nevermind. As part of the Release Notes of 5.5.33-24.8 I found the following note:

"A port of 5.5 branch to wsrep API v24. As such it retains all the features and fixes from 23.7.6, but also adds some new, such as ability to use both 2.x and 3.x branches of Galera."

This seems to indicate that the fix is included in 24.8 since it was included in 23.7.6.

Revision history for this message
borice (borice) wrote :

Well... I guess that Release Notes is wrong...
After adding the option to /etc/mysql/my.cnf MySQL no longer wants to start and I get the following error in /var/log/mysql/error.log:

...snip...
[ERROR] /usr/sbin/mysqld: unknown variable 'wsrep_load_data_splitting=ON'

What gives?

Revision history for this message
borice (borice) wrote :

So, I installed the 23.7.6 version and that error went away, however I cannot confirm that this bug is actually fixed. I have a very large LOAD DATA INFILE and I'm monitoring the nodes in the cluster with SHOW TABLE STATUS and only on the node where the LOAD command was issued the number of rows is increasing, on the other nodes stays at 0...
The node where LOAD was issued has already loaded over 2 million rows... of if Galera would commit every 10K rows, I should've seen that... but I'm not.

Perhaps related, I found someone else complaining of something similar:
https://mariadb.atlassian.net/browse/MDEV-5146

Any ideas?

Revision history for this message
Alex Yurchenko (ayurchen) wrote :

Yes, it looks like the fix didn't make it to 24.8 :( And it looks like regression again.

Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

@borice: I tested LOAD DATA splitting in 23.7.6. release (https://launchpad.net/codership-mysql/5.5/5.5.33-23.7.6), and it seems to work for me. Tried just with a very simple table and 100K rows,

Here is a simple test for LOAD DATA splitting;

mysql> CREATE TABLE t (i int); /* or any other table structure */
mysql> INSERT INTO t VALUES (1);
mysql> INSERT INTO t SELECT * FROM t
/* repeat this as many times as it takes to get > 10K rows in the table */
mysql> SELECT COUNT(*) FROM t;
mysql> SELECT * FROM t INTO OUTFILE '/tmp/t';
mysql> TRUNCATE t;
mysql> LOAD DATA INFILE '/tmp/t' INTO TABLE t;
mysql> SELECT COUNT(*) FROM t;

You could try is something like this works in your cluster. If this test passes and your table still does not split LOAD DATA session correctly, there is something else in your use case, and requires further troubleshooting.

Note that table status is not always up to date. Did you try to see the row count with: SELECT COUNT(*) FROM ....; ?

Current wsrep-5.5.23 development head has broken LOAD DATA splitting, but the fix is rather simple.

Revision history for this message
Seppo Jaakola (seppo-jaakola) wrote :

@borice: if you enable wsrep debugging:

mysql> SET GLOBAL wsrep_debug=1;

...you should see a log message in master node error log, for each 10K insert batch, like:

131025 10:13:10 [Note] WSREP: forced trx split for LOAD: load data infile '/tmp/t' into table t

And, better disable debug logging after inspection:

mysql> SET GLOBAL wsrep_debug=0;

Revision history for this message
Teemu Ollakka (teemu-ollakka) wrote :

Reopening to fix regression in current branch head and to merge to 5.5 and 5.6.

Revision history for this message
Teemu Ollakka (teemu-ollakka) wrote :
Revision history for this message
Teemu Ollakka (teemu-ollakka) wrote :
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1405

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.