Discussion: Delete query question
Afficher un message
Vieux 05/09/2007, 20h29   #6
Baron Schwartz
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Delete query question

IN() subqueries in MySQL are badly optimized. It's usually better to
use a JOIN, even though it's non-standard:

DELETE geno_260k.*
FROM geno_260k
INNER JOIN (
SELECT ident FROM geno_260k
WHERE a1=0
GROUP BY ident HAVING count(*)>250000
) AS der USING(ident);

Try profiling this and see if it's faster. It probably will be on any
reasonably large data set, as long as the table has an index on ident.

Note I changed the COUNT(a1) to COUNT(*) for efficiency. Counting a
column counts the number of values (e.g. non-null). Counting * just
counts the number of rows and can be faster. COUNT(*) is what you want
to use 99% of the time.

Regards
Baron

Olaf Stein wrote:
> Thanks,
>
> This seems to work but that IN seems to be really slow...
>
>
> On 9/5/07 9:41 AM, "Justin" <mysql@b0rker.com> wrote:
>
>> try
>>
>> SELECT * from geno_260k WHERE ident IN (SELECT ident FROM geno_260k WHERE
>> a1=0
>> GROUP BY ident HAVING count(a1)>250000);
>>
>> This will give you what you're deleting first.. then if that is good. do
>>
>>
>> DELETE FROM geno_260k WHERE ident IN (SELECT ident FROM geno_260k WHERE
>> a1=0
>> GROUP BY ident HAVING count(a1)>250000);
>>
>> (note the change in case is just my way of seeing things.. it's not
>> necessary that I know of)
>>
>>
>> ----- Original Message -----
>> From: "Olaf Stein" <steino@ccri.net>
>> To: "MySql" <mysql@lists.mysql.com>
>> Sent: Wednesday, September 05, 2007 9:35 AM
>> Subject: Delete query question
>>
>>
>>> Hey all
>>>
>>> I am stuck here (thinking wise) and need some ideas:
>>>
>>> I have this table:
>>>
>>> CREATE TABLE `geno_260k` (
>>> `genotype_id` int(10) unsigned NOT NULL auto_increment,
>>> `ident` int(10) unsigned NOT NULL,
>>> `marker_id` int(10) unsigned NOT NULL,
>>> `a1` tinyint(3) unsigned NOT NULL,
>>> `a2` tinyint(3) unsigned NOT NULL default '0',
>>> PRIMARY KEY (`genotype_id`),
>>> KEY `ident` (`ident`),
>>> KEY `marker_id` (`marker_id`),
>>> CONSTRAINT `geno_260k_ibfk_2` FOREIGN KEY (`marker_id`) REFERENCES
>>> `markers` (`marker_id`),
>>> CONSTRAINT `geno_260k_ibfk_1` FOREIGN KEY (`ident`) REFERENCES
>>> `individual` (`ident`)
>>> ) ENGINE=InnoDB AUTO_INCREMENT=91318273 DEFAULT CHARSET=utf8
>>>
>>>
>>> And with the following query I get 159 ident's back:
>>>
>>> select ident from geno_260k where a1=0 group by ident having
>>> count(a1)>250000;
>>>
>>> I want to delete all records containing those idents (about 260000 per
>>> ident
>>> so 159*260000).
>>> So I thought
>>>
>>> delete from geno_260k where ident=(select ident from geno_260k where a1=0
>>> group by ident having count(a1)>250000);
>>>
>>> But mysql can not select and delete from the same table.
>>>
>>> Any ideas?
>>>
>>> Thanks
>>> Olaf
>>>
>>>
>>>
>>> --
>>> MySQL General Mailing List
>>> For list archives: http://lists.mysql.com/mysql
>>> To unsubscribe: http://lists.mysql.com/mysql?unsub=mysql@b0rker.com
>>>
>>>

>
>
>
>
>
> -------------------------
> Olaf Stein
> DBA
> Center for Quantitative and Computational Biology
> Columbus Children's Research Institute
> 700 Children's Drive
> phone: 1-614-355-5685
> cell: 1-614-843-0432
> email: steino@ccri.net
>
>


--
Baron Schwartz
Xaprb LLC
http://www.xaprb.com/
  Réponse avec citation
 
Page generated in 0,07944 seconds with 9 queries