|
|
|
|
||||||
![]() |
|
|
LinkBack | Outils de la discussion |
|
|
#1 |
|
Messages: n/a
Hébergeur: |
Hi, im having troubles with one query, hope someone can .
on this table: messages: id_from int(10) id_to int(10) text varchar(1000) time with index on id_form and id_to i want to get messages sent from one user to another and vice versa order by time. let say we want to search messages between user 1 and 2: i'm doing: SELECT * FROM messages WHERE id_from in (1,2) AND id_to in (1,2) ORDER BY time but that gives me a filesort in all rows matching the where clause, and thats not good. I tried by indexing id_from, id_to, time but thats not working of course. any would be really appreciate. |
|
|
|
#2 |
|
Messages: n/a
Hébergeur: |
On Sat, Mar 15, 2008 at 2:42 PM, Nacho Garcia <nachoab@gmail.com> wrote:
> Hi, im having troubles with one query, hope someone can . > > on this table: > > messages: > id_from int(10) > id_to int(10) > text varchar(1000) > time > > with index on id_form and id_to > > i want to get messages sent from one user to another and vice versa order by > time. > let say we want to search messages between user 1 and 2: > i'm doing: > > SELECT * > FROM messages > WHERE id_from in (1,2) > AND id_to in (1,2) > ORDER BY time > > but that gives me a filesort in all rows matching the where clause, and > thats not good. I tried by indexing id_from, id_to, time but thats not > working of course. > > any would be really appreciate. > Short answer: The filesort is not necessary not your problem. Add a composite key on id_from,id_to. ALTER TABLE `messages` ADD INDEX ( `id_from` , `id_to` ) Long answer http://dev.mysql.com/doc/refman/5.0/...imization.html "In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following: .... The key used to fetch the rows is not the same as the one used in the ORDER BY: SELECT * FROM t1 WHERE key2=constant ORDER BY key1;" As a quick example using your table I will generate a bunch of data to play with: DROP TABLES IF EXISTS messages,integers,user_ids; CREATE TABLE integers(i int NOT NULL PRIMARY KEY); INSERT INTO integers(i) VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9); CREATE TABLE user_ids(`id` int(10)) SELECT t.i * 10 + u.i as id FROM integers AS u, integers as t WHERE (t.i * 10 + u.i) < 100; CREATE TABLE `messages` ( `id_from` int(10) NOT NULL, `id_to` int(10) NOT NULL, `text` varchar(1000) NOT NULL, `time` datetime NOT NULL ); INSERT INTO messages(id_from, id_to,text,time) SELECT t1.id, t2.id, '',NOW() FROM user_ids t1, user_ids t2; INSERT INTO messages(id_from, id_to,text,time) SELECT t1.id, t2.id, '',DATE_SUB(NOW( ), INTERVAL 1 DAY) FROM user_ids t1, user_ids t2; INSERT INTO messages(id_from, id_to,text,time) SELECT t1.id, t2.id, '',DATE_SUB(NOW( ), INTERVAL 2 DAY) FROM user_ids t1, user_ids t2; INSERT INTO messages(id_from, id_to,text,time) SELECT t1.id, t2.id, '',DATE_SUB(NOW( ), INTERVAL 3 DAY) FROM user_ids t1, user_ids t2; INSERT INTO messages(id_from, id_to,text,time) SELECT t1.id, t2.id, '',DATE_SUB(NOW( ), INTERVAL 5 DAY) FROM user_ids t1, user_ids t2; INSERT INTO messages(id_from, id_to,text,time) SELECT t1.id, t2.id, '',DATE_SUB(NOW( ), INTERVAL 6 DAY) FROM user_ids t1, user_ids t2; INSERT INTO messages(id_from, id_to,text,time) SELECT t1.id, t2.id, '',DATE_SUB(NOW( ), INTERVAL 7 DAY) FROM user_ids t1, user_ids t2; This creates 70k rows. Without an index your query takes ~.25 sec on a rather slow box and examines 70k rows. With an index on id_from your query takes ~.01 sec and examines 1.4k rows. With a composite index on id_from,id_to your query takes ~.002 sec and examines 28 rows, and has a filesort. Still pretty fast... -- Rob Wultsch |
|
|
|
#3 |
|
Messages: n/a
Hébergeur: |
I would suggest u use
SELECT * FROM messages WHERE id_from between 1 and 2 AND id_to between 1 and 2 ORDER BY time Hope this s. Velen ----- Original Message ----- From: "Nacho Garcia" <nachoab@gmail.com> To: <mysql@lists.mysql.com> Sent: Sunday, March 16, 2008 1:42 AM Subject: how to use index with order by here > Hi, im having troubles with one query, hope someone can . > > on this table: > > messages: > id_from int(10) > id_to int(10) > text varchar(1000) > time > > with index on id_form and id_to > > i want to get messages sent from one user to another and vice versa order by > time. > let say we want to search messages between user 1 and 2: > i'm doing: > > SELECT * > FROM messages > WHERE id_from in (1,2) > AND id_to in (1,2) > ORDER BY time > > but that gives me a filesort in all rows matching the where clause, and > thats not good. I tried by indexing id_from, id_to, time but thats not > working of course. > > any would be really appreciate. > |
|
|
|
#4 |
|
Messages: n/a
Hébergeur: |
On Sun, Mar 16, 2008 at 10:36 AM, Velen <velen@biz-mu.com> wrote:
> I would suggest u use > > SELECT * > FROM messages > WHERE id_from between 1 and 2 > AND id_to between 1 and 2 > ORDER BY time That would only be applicable if he mandated that his users only send messages to other users with user_id's +/- 1 of their own, or they will break their message search app. I bet that would go over well. Even if this were not an issue changing from the IN to BETWEEN does not performance. In fact performance is worse. With the index I had suggested above and using the sample data I created his original query examines 28 rows, while yours examines 713, and takes several (4 to 6) times longer to retrieve the rows on 5.0.51. -- Rob Wultsch |
|
|
|
#5 |
|
Messages: n/a
Hébergeur: |
Rob Wultsch schrieb:
> On Sat, Mar 15, 2008 at 2:42 PM, Nacho Garcia <nachoab@gmail.com> wrote: >> Hi, im having troubles with one query, hope someone can . >> >> on this table: >> >> messages: >> id_from int(10) >> id_to int(10) >> text varchar(1000) >> time >> >> with index on id_form and id_to >> >> i want to get messages sent from one user to another and vice versa order by >> time. >> let say we want to search messages between user 1 and 2: >> i'm doing: >> >> SELECT * >> FROM messages >> WHERE id_from in (1,2) >> AND id_to in (1,2) >> ORDER BY time >> >> but that gives me a filesort in all rows matching the where clause, and >> thats not good. I tried by indexing id_from, id_to, time but thats not >> working of course. >> >> any would be really appreciate. >> > > Short answer: The filesort is not necessary not your problem. Add a > composite key on id_from,id_to. > ALTER TABLE `messages` ADD INDEX ( `id_from` , `id_to` ) this will not prevent filesort, because the results still needs to be sorted for ORDER BY time, or? i would try INDEX(`id_from`, `id_to`, `time`) but i am not sure if this will , cause of this two IN() -- Sebastian |
|
|
|
#6 |
|
Messages: n/a
Hébergeur: |
I love when this happens. I woke in the middle of the night with an idea for
you. It now occurs to me that the query you want is dead simple. It just took me a while to see: SELECT * FROM messages WHERE id_from = 1 AND id_to = 2 UNION SELECT * FROM messages WHERE id_from = 2 AND id_to = 1 ORDER BY create_time Assuming an index on id_from (or id_from, id_to), it will be used. This will be very quick. hth, Arthur On 3/16/08, Rob Wultsch <wultsch@gmail.com> wrote: > > On Sun, Mar 16, 2008 at 10:36 AM, Velen <velen@biz-mu.com> wrote: > > I would suggest u use > > > > SELECT * > > FROM messages > > WHERE id_from between 1 and 2 > > AND id_to between 1 and 2 > > ORDER BY time > > > That would only be applicable if he mandated that his users only send > messages to other users with user_id's +/- 1 of their own, or they > will break their message search app. > > I bet that would go over well. > > Even if this were not an issue changing from the IN to BETWEEN does > not performance. In fact performance is worse. With the index I > had suggested above and using the sample data I created his original > query examines 28 rows, while yours examines 713, and takes several (4 > to 6) times longer to retrieve the rows on 5.0.51. > > -- > > Rob Wultsch > > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: > http://lists.mysql.com/mysql?unsub=f...tful@gmail.com > > |
|
|
|
#7 |
|
Messages: n/a
Hébergeur: |
On Mon, Mar 17, 2008 at 4:36 AM, Arthur Fuller <fuller.artful@gmail.com> wrote:
> I love when this happens. I woke in the middle of the night with an idea for > you. It now occurs to me that the query you want is dead simple. It just > took me a while to see: > > SELECT * > FROM messages > > WHERE id_from = 1 AND id_to = 2 > UNION > SELECT * > FROM messages > > WHERE id_from = 2 AND id_to = 1 > ORDER BY time > > Assuming an index on id_from (or id_from, id_to), it will be used. This will > be very quick. I bench'ed the union before sending in my original response. For the generic data set I created as an example his original query is faster (not by much) and simpler. If it were me writing the query I would use a union, probably. IN and OR never end well ![]() >this will not prevent filesort, because the results still needs to be sorted for ORDER BY time, or? Yes. Adding the extra column to the index will not result in losing the filesort. The filesort will not be any sort of a problem unless the result is large. If you do a sort by the left most column of the index (in my suggestion id_from ) then you will not have a filesort, but it is not much faster. I suggest benching it. Create the table and populate them with my queries above. If you want a much larger dataset you can something like the following instead of my last 7 queries... INSERT INTO messages( id_from, id_to, text, time ) SELECT t1.id, t2.id, '', DATE_SUB( NOW( ) , INTERVAL t.i *10 + u.i DAY ) FROM user_ids t1, user_ids t2, integers AS u, integers AS t WHERE ( t.i *10 + u.i) <20; and replace the 20 with whatever. (Full disclosure: I ripped off Baron Schwartz integer table from http://www.oreilly.com/pub/a/mysql/2...rank-data.html ) -- Rob Wultsch |
|
|
|
#8 |
|
Messages: n/a
Hébergeur: |
Rob Wultsch schrieb:
> On Mon, Mar 17, 2008 at 4:36 AM, Arthur Fuller <fuller.artful@gmail.com> wrote: >> I love when this happens. I woke in the middle of the night with an idea for >> you. It now occurs to me that the query you want is dead simple. It just >> took me a while to see: >> >> SELECT * >> FROM messages >> >> WHERE id_from = 1 AND id_to = 2 >> UNION >> SELECT * >> FROM messages >> >> WHERE id_from = 2 AND id_to = 1 >> ORDER BY time >> >> Assuming an index on id_from (or id_from, id_to), it will be used. This will >> be very quick. > > I bench'ed the union before sending in my original response. For the > generic data set I created as an example his original query is faster > (not by much) and simpler. If it were me writing the query I would use > a union, probably. IN and OR never end well ![]() > >> this will not prevent filesort, because the results still needs to be > sorted for ORDER BY time, or? > > Yes. Adding the extra column to the index will not result in losing > the filesort. hu? ... i am pretty sure i had a similar problem, i solved this by adding the column with the order to the index, which 'solved' the filesort (except the ORDER is in reverse) > The filesort will not be any sort of a problem unless > the result is large. yes. of course, size matters, if your whole DB is small enough you will not even get any performance impacts without any index ... ;-) but it requires more resources, or? -- Sebastian Mendel |
|
![]() |
| Outils de la discussion | |
|
|