PHWinfo banniere

Titres
PORTAIL ANNUAIRE ARTICLES COMPARATEUR HÉBERGEURS DEVIS FORUMS RÉDUCTEUR D'URL
Précédent   PHWinfo > Autres forums > Forum Programmation & Conception > mysql.general > Big SELECT: ordering results by where matches are found
S'inscrire FAQ Membres Recherche Messages du jour Marquer les forums comme lus
Big SELECT: ordering results by where matches are found

Réponse
 
LinkBack Outils de la discussion
Vieux 10/09/2007, 15h42   #1 (permalink)
Chris Sansom
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Big SELECT: ordering results by where matches are found

I'm sure there must be an accepted technique for this, but it's
something I haven't tried before, so if anyone can point me in the
right direction I'd be grateful.

I'm writing a search facility for a site where the data is stored in
several tables - let's say 5 for this example - and I want to order
my results according to where (if anywhere) matches are found. So...

Let's say I have tables 'speakers', 'topics', 'speakers_topics',
'articles', 'other'.
'speakers' is a table of speakers, with id, name and some text fields.
'topics' is a list of topics they address
'speakers_topics' relates the above two by pairs of id numbers
'articles' and 'other' are further tables of text data with possibly
more than one row for some speakers, identified by id.

I want to search the data in the following order:
name from 'speakers'
topics
text data from 'speakers'
text data from 'articles' and 'other'
....and order the results according to where in that hierarchy a match is found.

So, if the user's search term matches one speaker's name field,
another's topic and someone else's text data, that's the order in
which the results should be ordered. Also, if the same person is
matched from, say, both name and text fields (which is very likely,
as their name will almost certainly appear in some of the text), the
name should take precedence in the ordering.

To complicate matters further, I'd like if possible to extend this to
an and/or situation. If the user enters two or more words, any
results that match all the words should be ordered above those that
match only some of the words.

I can probably do this relatively easily with a series of separate
queries (I'm doing all this from PHP, by the way), but that strikes
me as inefficient. Can it all be done in one big query, perhaps with
subqueries?

--
Cheers... Chris
Highway 57 Web Development -- http://highway57.co.uk/

Revolution: an abrupt change in the form of misgovernment.
-- Ambrose Bierce
  Réponse avec citation
Vieux 10/09/2007, 16h01   #2 (permalink)
Baron Schwartz
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Big SELECT: ordering results by where matches are found

Chris Sansom wrote:
> I'm sure there must be an accepted technique for this, but it's
> something I haven't tried before, so if anyone can point me in the right
> direction I'd be grateful.
>
> I'm writing a search facility for a site where the data is stored in
> several tables - let's say 5 for this example - and I want to order my
> results according to where (if anywhere) matches are found. So...
>
> Let's say I have tables 'speakers', 'topics', 'speakers_topics',
> 'articles', 'other'.
> 'speakers' is a table of speakers, with id, name and some text fields.
> 'topics' is a list of topics they address
> 'speakers_topics' relates the above two by pairs of id numbers
> 'articles' and 'other' are further tables of text data with possibly
> more than one row for some speakers, identified by id.
>
> I want to search the data in the following order:
> name from 'speakers'
> topics
> text data from 'speakers'
> text data from 'articles' and 'other'
> ...and order the results according to where in that hierarchy a match is
> found.
>
> So, if the user's search term matches one speaker's name field,
> another's topic and someone else's text data, that's the order in which
> the results should be ordered. Also, if the same person is matched from,
> say, both name and text fields (which is very likely, as their name will
> almost certainly appear in some of the text), the name should take
> precedence in the ordering.
>
> To complicate matters further, I'd like if possible to extend this to an
> and/or situation. If the user enters two or more words, any results that
> match all the words should be ordered above those that match only some
> of the words.
>
> I can probably do this relatively easily with a series of separate
> queries (I'm doing all this from PHP, by the way), but that strikes me
> as inefficient. Can it all be done in one big query, perhaps with
> subqueries?


I've built similar systems with a series of UNION queries. Each UNION
has a column for "relevance", which can be a sum of CASE statements,
such as

IF(<name matches>, 1, 0) + IF(<text matches>, 1, 0) AS relevance...

The entire UNION can then be ordered by relevance. You could also just
add in an arbitrary number in each UNION, to get the effect of ordering
by where in the hierarchy the match is found.

Baron
  Réponse avec citation
Vieux 10/09/2007, 16h21   #3 (permalink)
Chris Sansom
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Big SELECT: ordering results by where matches are found

At 11:01 -0400 10/9/07, Baron Schwartz wrote:
>I've built similar systems with a series of UNION queries. Each UNION has a column for "relevance", which can be a sum of CASE statements, such as
>
>IF(<name matches>, 1, 0) + IF(<text matches>, 1, 0) AS relevance...
>
>The entire UNION can then be ordered by relevance. You could also just add in an arbitrary number in each UNION, to get the effect of ordering by where in the hierarchy the match is found.


Oo-er. This sounds marvellous, and I /think/ I see what you're getting at, but it's a bit beyond anything I've done before - never used UNION for instance. Can you perhaps go into a little more detail?

--
Cheers... Chris
Highway 57 Web Development -- http://highway57.co.uk/

The nice thing about standards is that there are so
many of them to choose from.
-- Andrew S. Tanenbaum
  Réponse avec citation
Vieux 10/09/2007, 17h11   #4 (permalink)
Chris Sansom
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Big SELECT: ordering results by where matches are found

At 11:01 -0400 10/9/07, Baron Schwartz wrote:
>The entire UNION can then be ordered by relevance. You could also
>just add in an arbitrary number in each UNION, to get the effect of
>ordering by where in the hierarchy the match is found.


Actually, your pointing me towards UNION may have done the trick. I
read up on it on the MySQL docs site and I've ended up with this,
which actually covers more tables and fields than in my original post:

-----------

select distinct tb.speaker_id, tb.fore, tb.sur, tb.division from
(
(
select 1 as relevance, speaker_id, fore, sur, division
from speakers
where fore like '%education%' or sur like '%education%')
union
(
select 2 as relevance, s.speaker_id, fore, sur, division
from speakers s, speakers_topics st, topics t
where st.speaker_id = s.speaker_id and t.topic_id = st.topic_id and
topic like '%education%'
)
union
(
select 3 as relevance, speaker_id, fore, sur, division
from speakers where match (strap, shortbio, longbio) against ('education')
)
union
(
select 4 as relevance, s.speaker_id, fore, sur, division
from speakers s, articles a
where s.speaker_id = a.speaker_id and match (title, article) against
('education')
)
union
(
select 5 as relevance, s.speaker_id, fore, sur, division
from speakers s, other o
where s.speaker_id = o.speaker_id and match (title, article) against
('education')
)
union
(
select 6 as relevance, speaker_id, fore, sur, division
from speakers, books
where speaker_id = author and match (title, description) against ('education')
)
order by relevance, division, sur, fore
) as tb

-----------

First, I did it without the outer select, and I got speakers repeated
if they were matched in more than one block. One of the comments on
the MySQL docs site suggested the 'wrapper', which I did initially
like this:

select distinct speaker_id, fore, sur, division from... with nothing
after the final ')'. This gave me an error to the effect that derived
tables must always have an alias. What the hey, let's just try it
like this (the above)... and to my astonishment it worked!

So before I sign off on this thread, can you see any way I could improve this?

Naturally, I haven't yet incorporated the treatment of more than one
search term, but I'll try and work that out for myself. :-)

--
Cheers... Chris
Highway 57 Web Development -- http://highway57.co.uk/

Justice is incidental to law and order.
-- J. Edgar Hoover
  Réponse avec citation
Vieux 10/09/2007, 18h34   #5 (permalink)
Baron Schwartz
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Big SELECT: ordering results by where matches are found



Chris Sansom wrote:
> At 11:01 -0400 10/9/07, Baron Schwartz wrote:
>> The entire UNION can then be ordered by relevance. You could also
>> just add in an arbitrary number in each UNION, to get the effect of
>> ordering by where in the hierarchy the match is found.

>
> Actually, your pointing me towards UNION may have done the trick. I read
> up on it on the MySQL docs site and I've ended up with this, which
> actually covers more tables and fields than in my original post:
>
> -----------
>
> select distinct tb.speaker_id, tb.fore, tb.sur, tb.division from
> (
> (
> select 1 as relevance, speaker_id, fore, sur, division
> from speakers
> where fore like '%education%' or sur like '%education%')
> union
> (
> select 2 as relevance, s.speaker_id, fore, sur, division
> from speakers s, speakers_topics st, topics t
> where st.speaker_id = s.speaker_id and t.topic_id = st.topic_id and
> topic like '%education%'
> )
> union
> (
> select 3 as relevance, speaker_id, fore, sur, division
> from speakers where match (strap, shortbio, longbio) against ('education')
> )
> union
> (
> select 4 as relevance, s.speaker_id, fore, sur, division
> from speakers s, articles a
> where s.speaker_id = a.speaker_id and match (title, article) against
> ('education')
> )
> union
> (
> select 5 as relevance, s.speaker_id, fore, sur, division
> from speakers s, other o
> where s.speaker_id = o.speaker_id and match (title, article) against
> ('education')
> )
> union
> (
> select 6 as relevance, speaker_id, fore, sur, division
> from speakers, books
> where speaker_id = author and match (title, description) against
> ('education')
> )
> order by relevance, division, sur, fore
> ) as tb
>
> -----------
>
> First, I did it without the outer select, and I got speakers repeated if
> they were matched in more than one block. One of the comments on the
> MySQL docs site suggested the 'wrapper', which I did initially like this:
>
> select distinct speaker_id, fore, sur, division from... with nothing
> after the final ')'. This gave me an error to the effect that derived
> tables must always have an alias. What the hey, let's just try it like
> this (the above)... and to my astonishment it worked!
>
> So before I sign off on this thread, can you see any way I could improve
> this?
>
> Naturally, I haven't yet incorporated the treatment of more than one
> search term, but I'll try and work that out for myself. :-)


Looks like you've found the solution you need. The only other
suggestion I have is to use UNION ALL if you don't need to eliminate
duplicate rows in the UNION, because there's some overhead for checking
for them.

Baron
  Réponse avec citation
Vieux 11/09/2007, 10h18   #6 (permalink)
Chris Sansom
Aucun Avatar
 
Messages: n/a
Hébergeur:
Par défaut Re: Big SELECT: ordering results by where matches are found

At 13:34 -0400 10/9/07, Baron Schwartz wrote:
>Looks like you've found the solution you need. The only other
>suggestion I have is to use UNION ALL if you don't need to eliminate
>duplicate rows in the UNION, because there's some overhead for
>checking for them.


Hi Baron

Thanks for this, and I did try it, but the difference in time taken
to execute the query was negligible (I tested it multiple times) - it
was around 0.02 seconds whichever way I did it, and when I used
EXPLAIN, the results were identical except for one detail:

The number of rows in the first row of the EXPLAIN result was lower
with plain UNION than if I used UNION ALL. As far as I can tell from
my relatively limited experience with all this, the first row refers
to my outer 'wrapper' select from the derived table (the table in the
first row is given as '<derived2>' and the Extra column shows 'Using
temporary'). For a given query, with UNION ALL that has 45 rows, with
UNION it's 31. So I guess I'll stick to plain UNION.

As far as my desire to cope with multiple search terms is concerned,
I realise now that fulltext handles that anyway! So I've changed the
few non-numeric fields that weren't indexed that way (fore, sur and
topic) to fulltext and bingo! Not only that, but it all happens fully
FOUR TIMES as quickly!

So many thanks, Baron - mainly due to you, yesterday was a very good
MySQL day for me. It's not often I get two 'lightbulb moments' on the
same day!

--
Cheers... Chris
Highway 57 Web Development -- http://highway57.co.uk/

Good people will do good things, and bad people will do bad things.
But for good people to do bad things - that takes religion.
-- Steven Weinberg, physicist and Nobel Laureate
  Réponse avec citation
Réponse


Outils de la discussion

Règles de messages
Vous ne pouvez pas créer de nouvelles discussions
Vous ne pouvez pas envoyer des réponses
Vous ne pouvez pas envoyer des pièces jointes
Vous ne pouvez pas modifier vos messages

Les balises BB sont activées : oui
Les smileys sont activés : oui
La balise [IMG] est activée : oui
Le code HTML peut être employé : non
Trackbacks are oui
Pingbacks are oui
Refbacks are oui


Fuseau horaire GMT +1. Il est actuellement 11h26.


Édité par : vBulletin® version 3.7.2
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Friendly URLs by vBSEO 3.2.0 RC5 Tous droits réservés.
Version française #16 par l'association vBulletin francophone
PHWinfo est un site Éducation Sans Frontières
Ad Management by RedTyger
©Tous droits réservés par les parties respectives
Page generated in 0,16280 seconds with 14 queries