#1

There's no "using distinct", but there is "not exists", and in fact no rows are returned. Slow query log reports "#Query_time: 94 Lock_time: 0 Rows_sent: 0 Rows_examined: 370220"

EXPLAIN:

id  select_type  table  type   possible_keys  key      key_len  ref       rows    Extra
1   SIMPLE       t1     index  NULL           PRIMARY  150      NULL      338451  Using index
1   SIMPLE       t2     ref    word           word     150      t2.field  4       Using where; Using index; Not exists

These are two search tables (hence the large key_len, i believe): one with ~400K rows, one row per search term; the other with ~4M rows, relating search terms to content.

Perhaps I could optimize by doing a count(distinct) on each table and only running the expensive query if the counts don't match?

Would I see any benefit by making these InnoDB tables?

Thanks for your help with this!

Baron Schwartz wrote:
> Hi,
>
> That is the right way, but if you show us the exact output of EXPLAIN we can
> help more. In particular, does it say "Using distinct/not exists" in Extra?
>
> Russell Uman wrote:
>>
>> howdy.
>>
>> i'm trying to find items in one table that don't exist in another.
>> i'm using a left join with a where clause to do it:
>>
>> SELECT t1.field, t2.field FROM table1 t1 LEFT JOIN table2 t2 ON
>> t1.word = t2.word WHERE t2.word IS NULL;
>>
>> both tables are quite large and the query is quite slow.
>>
>> the field column is indexed in both tables, and explain shows the
>> indexes being used.
>>
>> is there a better way to construct this kind of query?

--
russell uman
firebus ((((d-_-b))))
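For readers who want to try the anti-join pattern discussed here, the following is a minimal sketch, not the poster's actual setup. It uses Python's built-in sqlite3 module as a stand-in for MySQL, keeps the table names `table1` and `table2` from the post, and invents a `content_id` column and three sample words purely for illustration. It shows that the LEFT JOIN ... IS NULL form and a NOT EXISTS subquery return the same rows; it says nothing about how either performs on MySQL.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables in the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (word TEXT PRIMARY KEY);            -- one row per search term
    CREATE TABLE table2 (word TEXT, content_id INTEGER);    -- content_id is hypothetical
    CREATE INDEX table2_word ON table2 (word);
    INSERT INTO table1 VALUES ('apple'), ('banana'), ('cherry');
    INSERT INTO table2 VALUES ('apple', 1), ('apple', 2), ('cherry', 3);
""")

# Anti-join written as in the original post: keep rows with no match in table2.
left_join = conn.execute("""
    SELECT t1.word
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.word = t2.word
    WHERE t2.word IS NULL
""").fetchall()

# The same question phrased with a correlated NOT EXISTS subquery.
not_exists = conn.execute("""
    SELECT t1.word
    FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.word = t1.word)
""").fetchall()

print(left_join)   # [('banana',)]
print(not_exists)  # [('banana',)]
```

Both forms express "terms in table1 with no row in table2"; which one a given MySQL version executes faster is something EXPLAIN on the real tables would have to show.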

#2

I don't think it will be any better to count distinct values. I think the query is just slow because the index lookups are slow. Is the 'word' column really 150 bytes? That's probably the culprit.

How slow is this, by the way? 370k rows in one table, verifying the non-existence of index records in a 4M-row table with 150-byte index values... what does "slow" mean for your application? How big is the index for the 4M-row table (use SHOW TABLE STATUS)?

#3

Baron Schwartz wrote:
> I don't think it will be any better to count distinct values. I think
> the query is just slow because the index lookups are slow. Is the
> 'word' column really 150 bytes?

huh. it's a varchar(50) on table1 and a varchar(50) on table2. i wonder why explain is reporting 150 as key_len?

> That's probably the culprit. How slow
> is this, by the way?

this is also interesting. as you can see in the slow query log reported before, it took 94 seconds. i'd say i see between 15 and 90 seconds in the slow query log for this normally.

however, i just ran the query now, at a time when the application is not heavily loaded, and it finished quickly - less than a second.

another run a few minutes later took around 3 seconds. so there seems to be some interaction with load.

> 370k rows in one table, verifying the
> non-existence of index records in a 4M-row table with 150-byte index
> values... what does "slow" mean for your application? How big is the
> index for the 4M-row table (use SHOW TABLE STATUS)?

the larger table has a 95M index. the smaller has a 5M index. key_buffer is set to 2G, and when i look at top mysql never actually gets above 1.5G, so i'm under the impression that all the indexes are in memory.

it's a search table, so it does get a lot of inserts, but slow log never reports any lock time.

is there anything else i can investigate?

#4

Russell Uman wrote:
>
> Baron Schwartz wrote:
>> I don't think it will be any better to count distinct values. I think
>> the query is just slow because the index lookups are slow. Is the
>> 'word' column really 150 bytes?
>
> huh. it's a varchar(50) on table1 and a varchar(50) on table2. i wonder
> why explain is reporting 150 as key_len?

utf8?

>> That's probably the culprit. How slow is this, by the way?
>
> this is also interesting. as you can see in the slow query log reported
> before, it took 94 seconds. i'd say i see between 15 and 90 seconds in
> the slow query log for this normally.
>
> however, i just ran the query now, at a time when the application is not
> heavily loaded, and it finished quickly - less than a second.
>
> another run a few minutes later took around 3 seconds. so there seems to
> be some interaction with load.
>
>> 370k rows in one table, verifying the non-existence of index records
>> in a 4M-row table with 150-byte index values... what does "slow" mean
>> for your application? How big is the index for the 4M-row table (use
>> SHOW TABLE STATUS)?
>
> the larger table has a 95M index. the smaller has a 5M index. key_buffer
> is set to 2G, and when i look at top mysql never actually gets above
> 1.5G, so i'm under the impression that all the indexes are in memory.
>
> it's a search table, so it does get a lot of inserts, but slow log never
> reports any lock time.
>
> is there anything else i can investigate?

Do you need utf8? :-)

Check your cache hits. I can't remember if you said, but is it an InnoDB table? I'm guessing MyISAM since you have a 2G key buffer. Check key_read_requests and key_reads for the query (mysql-query-profiler is a handy way to do this).

Baron
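The "utf8?" hint can be checked with simple arithmetic: MySQL's utf8 character set reserves up to 3 bytes per character in an index, so a 50-character column can need 50 × 3 = 150 bytes, which matches the key_len in the EXPLAIN output. The sketch below is only an illustration of that calculation. The function name and the lookup table are made up for this example, and it deliberately ignores the extra length-prefix and NULL-flag bytes that some MySQL versions add to key_len, since the value reported in this thread is exactly 150.

```python
# Worst-case bytes per character reserved in an index, by MySQL character set.
# "utf8" here is MySQL's 3-byte utf8; latin1 and utf8mb4 are listed for comparison.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}


def index_key_bytes(char_length, charset):
    """Return the worst-case character-data bytes for an index on VARCHAR(char_length)."""
    return char_length * MAX_BYTES_PER_CHAR[charset]


print(index_key_bytes(50, "utf8"))    # 150, the key_len reported by EXPLAIN
print(index_key_bytes(50, "latin1"))  # 50, what a single-byte charset would need
```

So each index lookup compares keys that may be three times as wide as they would be in a single-byte character set, which is why the character set came up as a suspect.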

#5

>> huh. it's a varchar(50) on table1 and a varchar(50) on table2. i
>> wonder why explain is reporting 150 as key_len?
>
> utf8?

yes. that does make sense.

>> is there anything else i can investigate?
>
> Do you need utf8? :-)

yes. it's an internationalized application.

> Check your cache hits. I can't remember if you said, but is it an
> InnoDB table? I'm guessing MyISAM since you have a 2G key buffer.

yes. we do have some tables as innodb - those that get many many inserts and don't require any count(*) queries which as i understand it are slow in innodb - if there's some reason that this kind of query would be faster under innodb i'm happy to give it a try...

> Check
> key_read_requests and key_reads for the query (mysql-query-profiler is a
> handy way to do this).

awesome. i will look into it.
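As a rough illustration of the cache-hit check suggested above: the MyISAM key cache miss ratio is commonly taken as Key_reads divided by Key_read_requests, so one way to look at a single query is to read both counters from SHOW STATUS before and after running it and divide the differences. The sketch below assumes that approach. The counter values are invented for the example, not measurements from this thread, and because these counters are server-wide, the difference on a busy server also includes other sessions' activity.

```python
def counter_delta(before, after, name):
    """Return how much a status counter grew between two snapshots."""
    return after[name] - before[name]


def key_cache_miss_ratio(before, after):
    """Fraction of key read requests that had to go to disk between two snapshots."""
    requests = counter_delta(before, after, "Key_read_requests")
    reads = counter_delta(before, after, "Key_reads")
    if requests == 0:
        return 0.0  # no index reads happened, so there is nothing to miss
    return reads / requests


# Hypothetical SHOW STATUS snapshots taken just before and just after the query.
before = {"Key_read_requests": 1_000_000, "Key_reads": 2_000}
after = {"Key_read_requests": 1_400_000, "Key_reads": 22_000}

ratio = key_cache_miss_ratio(before, after)
print(f"{ratio:.1%}")  # 5.0%
```

A ratio near zero would support the impression that the indexes are served from memory; a noticeably higher one under load would point at the key cache rather than at the query shape.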