My SQL Dump

MySQL musings by a self professed MySQL Geek

Previous Entry Add to Memories Share Next Entry
Using a DELETE w/ a JOIN and LIMITing the number of rows deleted.
swanhart
I've come up with some interesting workarounds for missing features using @session variables and I'd like to share one with you today: DELETE ... JOIN ... LIMIT N;

okay, so we've got two tables, one with many duplicates and one with no duplicates.
We want to join the tables using 'id', but only want to delete N duplicates from t1.
MySQL's DELETE does not include LIMIT support when JOIN is used, so we need to work around
that using some fancy footwork:


mysql> select * from t1;
+------+------+
| id | abc |
+------+------+
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
+------+------+
8 rows in set (0.00 sec)

mysql> select * from t2;
+----+------+
| id | xyz |
+----+------+
| 1 | 1 |
+----+------+
1 row in set (0.00 sec)

mysql> set @limit=2; set @rowCount=0;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)

mysql> DELETE t1.*
-> FROM t1
-> JOIN t2 on (t1.id = t2.id)
-> WHERE t2.id = 1
-> AND CONCAT(t1.id, t1.abc, @rowCount +1)
-> = CONCAT(t1.id, t1.abc,
-> CASE @rowCount
-> WHEN @LIMIT THEN @rowCount:=NULL
-> ELSE @rowCount := @rowCount+1
-> END
-> );
Query OK, 2 rows affected (0.00 sec)

mysql> select * from t1;
+------+------+
| id | abc |
+------+------+
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
| 1 | 1 |
+------+------+
6 rows in set (0.00 sec)

how it works:
When MySQL first evaluates the expression, @rowCount is zero. The CONCAT in the where clause concatenates all the values given together, so at first we get CONCAT(1, 1, 1) = CONCAT(1, 1, 1). The next time through, we get CONCAT(1, 1, 2) = CONCAT(1, 1, 2). At the third execution we get CONCAT(1, 1, 3) = CONCAT(1, 1, NULL), which does not evaluate true. At this time @rowCount is now NULL, so subsequent scanned rows evaluate to
CONCAT(1, 1, NULL) = CONCAT(1, 1, NULL), which can never evaluate true (NULL != NULL).

So, what interesting ways have you used @session variables to come up with solutions to intractable problems?

Here's another way. I put a bunch of rows into the table, then:

mysql> set @limit := 2;
Query OK, 0 rows affected (0.00 sec)

mysql> delete t1.* from t1 join t2 using(id) where t2.id = 1 and (@limit := @limit -1) >= 0;
Query OK, 2 rows affected (0.00 sec)

Now the fun part: the query actually runs through the whole table and doesn't abort after 2 rows, as you can see here:

mysql> select @limit;
+--------+
| @limit |
+--------+
| -29 |
+--------+
1 row in set (0.00 sec)

On the other hand, this is roughly equivalent to a join:

mysql> delete from t1 where exists(select * from t2 where t2.id=t1.id) limit 2;
Query OK, 2 rows affected (0.06 sec)

Actually in terms of handler operations I'm pretty sure this is going to be more efficient; I think a delete with a JOIN creates a temp table. I'd have to check that though.

I think the EXISTS clause have to FTS the entire t1, whereas when using a join NL will scan t1 for rows, at least when that makes sense to the optimizer. It will also have to execute one query per row in t1, which could mean a lot of b-tree dips into t2 which could be inefficient.

Unless I'm completely off my mark that is, which this early in the morning certainly is possible.



Edited at 2008-05-14 03:59 pm (UTC)

Wonder if this will stop scanning rows to reduce the i/o footprint, assuming t2.id is not nullable?

SET @limit:=2;
DELETE t1.*
FROM t1
JOIN t2 using (id)
WHERE t2.id = CASE
WHEN (@limit := @limit -1) >= -1 THEN 1
ELSE NULL
END


I predict it won't. But you should benchmark and/or profile with mk-profiler. I think the EXISTS() subquery might hold some surprises for you :-)

My favorite.

(Anonymous)

2008-05-14 03:10 pm (UTC)

Session vars can be handy. I'm sure this may be old hat to some, but I discovered recently when I needed to update a row, but also get the previous value of the column I was updating with out using an explicit lock of any kind.
TABLE foo
+--------+----------+
| foo_id | foo_val |
+--------+----------+
| 1 | 213 |
+--------+----------+

UPDATE foo set foo_val = 2937 WHERE foo_id =1 and (@old_foo := foo_val);

Query OK, 1 row affected (0.03 sec)
Rows matched: 1 Changed: 1 Warnings: 0

select @old_foo;
+----------+
| @old_foo |
+----------+
| 213 |
+----------+



Be careful, because if foo_val is null, your UPDATE won't work . I used the same type of example in the MySQL manual comments
and Paul Decowski noticed that. His solution was to use OR (1=1), but the MySQL optimizer now optimizes that away, so Jon Meridith suggested the following, which should work on all versions of MySQL and for all column values:

update some_table
set col = col + 1
where key = 'some_key_value'
and ((@value := col) IS NULL OR (@value := col) IS NOT NULL)

Re: My favorite.

(Anonymous)

2009-04-09 02:24 pm (UTC)

THANKS MATE TO HAVE SHARED THIS YOU MAY HAVE SAVED MY TIME A LOT!! :)

You are viewing swanhart