Wrong result with subqery in select list and subquery cache=off in maria-5.3

Bug #777691 reported by Philip Stoev
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
High
Timour Katchaounov

Bug Description

Not repeatable in maria-5.2 Repeatable in maria-5.3, maria-5.3-mwl89. The SUM part of following query:

SELECT t1.f1, ( SELECT SUM( f10 ) FROM t3 WHERE f10 > t1.f10 ) FROM t1 JOIN t2 ON t2.f2 > 0;

returns 2 different results for 2 rows where t1.f1 is the same, hence the SUM should also be the same.

Subquery cache appears to mask the bug by caching the first value and returning it twice.

Test case:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 ( f10 int NOT NULL) ;
INSERT IGNORE INTO t1 VALUES (104),(119);

DROP TABLE IF EXISTS t2;
CREATE TABLE t2 ( f2 int NOT NULL ) ;
INSERT IGNORE INTO t2 VALUES (8),(231);

DROP TABLE IF EXISTS t3;
CREATE TABLE t3 (f10 int NOT NULL ) ;
INSERT IGNORE INTO t3 VALUES (112);

SET SESSION optimizer_switch='subquery_cache=off';
SELECT t1.f10, ( SELECT SUM( f10 ) FROM t3 WHERE f10 > t1.f10 ) FROM t1 JOIN t2 ON t2.f2 ;

In maria-5.2:

f10 ( SELECT SUM( f10 ) FROM t3 WHERE f10 > t1.f10 )
104 112
119 NULL
104 112
119 NULL

in maria-5.3:

f10 ( SELECT SUM( f10 ) FROM t3 WHERE f10 > t1.f10 )
104 112
119 NULL
104 NULL <--------- notice NULL here
119 NULL

Explain in maria-5.3:

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY t1 ALL NULL NULL NULL NULL 2
1 PRIMARY t2 ALL NULL NULL NULL NULL 2 Using where; Using join buffer (flat, BNL join)
2 DEPENDENT SUBQUERY t3 system NULL NULL NULL NULL 1

Changed in maria:
milestone: none → 5.3
summary: - Wrong result with subqery in select list and subquery cache=off
+ Wrong result with subqery in select list and subquery cache=off
summary: - Wrong result with subqery in select list and subquery cache=off
+ Wrong result with subqery in select list and subquery cache=off in
+ maria-5.3
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

Another test case:

SET @@optimizer_switch='materialization=off,subquery_cache=off,semijoin=off';

CREATE TABLE t1 ( f11 varchar(32)) ;
INSERT IGNORE INTO t1 VALUES ('x'),('b');

CREATE TABLE t2 ( f2 int, f10 varchar(32)) ;
INSERT IGNORE INTO t2 VALUES (1,'x');

SELECT ( SELECT MAX( f2 ) FROM t2 WHERE t2.f10 <> t1.f11 ) FROM t1;

Changed in maria:
assignee: nobody → Timour Katchaounov (timour)
Changed in maria:
status: New → Confirmed
importance: Undecided → High
Changed in maria:
status: Confirmed → In Progress
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote :

Still repeatable. Another test case:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (f2 varchar(32)) ;
INSERT IGNORE INTO t1 VALUES (NULL), ('w');

DROP TABLE IF EXISTS t3;
CREATE TABLE t3 ( f1 int, f2 varchar(32)) ;
INSERT IGNORE INTO t3 VALUES (1,'x');

SELECT (
 SELECT COUNT( t3.f1 )
 FROM t3
  WHERE t3.f2 != t1.f2
) FROM t1;

Revision history for this message
Timour Katchaounov (timour) wrote :

Test case that shows that the problem is in the order
of data - if a NULL result comes first, then we get wrong
result. Probably the recorded NULL result is not being
reset properly for the next row.

SET @@optimizer_switch='materialization=off,subquery_cache=off';

CREATE TABLE t1 ( f11 varchar(32)) ;
INSERT INTO t1 VALUES ('b'),('x');

CREATE TABLE t2 ( f2 int, f10 varchar(32)) ;
INSERT INTO t2 VALUES (1,'x');

SELECT t1.f11, ( SELECT MAX( f2 ) FROM t2 WHERE t2.f10 = t1.f11 ) FROM t1;

Revision history for this message
Timour Katchaounov (timour) wrote :

Analysis;

The cause of the wrong result is that after the subquery execution
detects that it must generate a NULL row for an aggregate function,
the function do_select() calls:
        while ((table= li++))
          mark_as_null_row(table->table);
This marks all rows in the table as complete NULL rows. When evaluating
the field t2.f10 for the second row, all bits of Field::null_ptr[0] are set by
the previous call to mark_as_null_row(), and the the call to Field::is_null()
returns true, resulting in a NULL for the MAX function.

The problem lines above were added by the fix for bug lp:613029.
It turns out that if this fix is removed, both the current bug, and
lp:613029 are fixed. Therefore the fix is to remove the patch for bug
lp:lp:613029, however, I have to understand what other patch have
fixed bug lp:613029.

Revision history for this message
Timour Katchaounov (timour) wrote :

Expanded test case:

CREATE TABLE t1 ( f1 varchar(32)) ;
INSERT INTO t1 VALUES ('b'),('x'),('c'),('x');

CREATE TABLE t2 ( f2 int, f3 varchar(32)) ;
INSERT INTO t2 VALUES (1,'x');

SET @@optimizer_switch='materialization=off,in_to_exists=on,subquery_cache=off';

EXPLAIN SELECT t1.f1, ( SELECT MAX( f2 ) FROM t2 WHERE t2.f3 = t1.f1 ) as max_f2 FROM t1;
SELECT t1.f1, ( SELECT MAX( f2 ) FROM t2 WHERE t2.f3 = t1.f1 ) as max_f2 FROM t1;

INSERT INTO t2 VALUES (2,'y');
EXPLAIN SELECT t1.f1, ( SELECT MAX( f2 ) FROM t2 WHERE t2.f3 = t1.f1 ) as max_f2 FROM t1;
SELECT t1.f1, ( SELECT MAX( f2 ) FROM t2 WHERE t2.f3 = t1.f1 ) as max_f2 FROM t1;

drop table t1, t2;

Changed in maria:
status: In Progress → Fix Committed
Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.