[nycphp-talk] Need About creating search
Rob Marscher
rmarscher at beaffinitive.com
Tue Nov 27 16:37:39 EST 2007
On Nov 27, 2007, at 11:21 AM, Mitch Pirtle wrote:
> http://www.sphinxsearch.com/
>
> Basically it is the fastest search tool I have found anywhere on any
> platform. Wicked fast, and pretty decent interfaces for a variety of
> languages as well. Using it on some major projects now with millions
> of content entries submitted by users - this is a great environment
> for huge social sites.
+1. We're using it here: http://www.heynielsen.com/search/ (sorry for
the promo... but it seemed that people wanted to see it in action)
What I love about it is the way it uses MySQL as the source for the
index. All you need to do is set up the sphinx.conf file and set a
cron job to periodically rebuild the indexes. No other programming is
required to create the indexes. They already have a PHP API class,
which is included in the server download... so searching it from PHP is
simple too. Documentation on the API could use some help though... I
had to get the details by reading the source... maybe I should
contribute.
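For anyone who hasn't seen one, a minimal sphinx.conf along those lines might look roughly like the sketch below. The credentials, the "documents" table, its columns, and the paths are all placeholders (assumptions, not our real setup), and the attribute directive name differs between Sphinx releases, so check the docs for the version you have.

```
# sphinx.conf (sketch; every name and path here is a placeholder)
source mainSource
{
    type     = mysql
    sql_host = localhost
    sql_user = sphinx
    sql_pass = secret
    sql_db   = mydb

    # the first column must be the unique integer document id
    sql_query = SELECT id, category, title, body FROM documents

    # exposes "category" as a filterable attribute;
    # older releases use sql_group_column instead
    sql_attr_uint = category
}

index mainIndex
{
    source = mainSource
    path   = /var/data/sphinx/mainIndex
}
```

The cron entry is then just a call to the indexer. The binary and config locations below are assumptions too:

```
# crontab: rebuild hourly; --rotate signals the running searchd to pick up the new index
0 * * * * /usr/local/bin/indexer --config /usr/local/etc/sphinx.conf --rotate mainIndex
```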
You can search multiple indexes in a single Query() call by separating
the index names with a space. That's not documented as far as I know. I
discovered it in the sphinx forums. You can also just search every index
you have available in one query. Here's some code that I use. I have a
main index - "mainIndex" - that I reindex once an hour (it indexes over
100,000 records in a couple of seconds and then sends a SIGHUP so that
the search server reloads the index). I also have a "delta" index
that contains only the new entries since the last time the main index
was reindexed. I rebuild the delta every couple of minutes and the
operation takes under a second. They talk about how to do this in the
documentation. I also created stemmed and soundex indexes... so if no
results were found in the regular index, it tries those other indexes next:
$spx = new SphinxClient();
$spx->SetServer($host, $port);
$spx->SetWeights(array(100, 1));
$spx->SetLimits(0, 250);
$spx->SetMatchMode(SPH_MATCH_ALL);
$spx->SetFilter('category', array($category));
$spx->SetSortMode(SPH_SORT_RELEVANCE);
// Query() returns false on failure, and 'matches' may be absent when
// nothing was found, so test with empty() rather than count()
$_rs = $spx->Query($search, 'mainIndex mainIndexDelta');
if (empty($_rs['matches'])) {
    // give another try with the soundex index <- love this!! :)
    $_rs = $spx->Query($search, 'mainIndexSoundex mainIndexSoundexDelta');
}
if (empty($_rs['matches'])) {
    // still no results? how about stemming
    $_rs = $spx->Query($search, 'mainIndexStemmed mainIndexStemmedDelta');
}
if (empty($_rs['matches'])) {
    // still no results? how about a different match mode
    $spx->SetMatchMode(SPH_MATCH_ANY);
    $spx->SetLimits(0, 20);
    $_rs = $spx->Query($search, 'mainIndex mainIndexDelta');
}
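For reference, the main/delta split and the stemmed and soundex variants queried above could be declared along these lines. This is only a sketch modeled on the delta-index pattern in the Sphinx documentation: the sph_counter helper table, the "documents" table, the source names, and the paths are assumptions, and the matching "...Delta" indexes for the stemmed and soundex variants would follow the same pattern.

```
source mainSource
{
    # ... connection settings ...
    # remember the highest id covered by the main index
    sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents
    sql_query     = SELECT id, category, title, body FROM documents \
                    WHERE id <= (SELECT max_doc_id FROM sph_counter WHERE counter_id = 1)
}

source deltaSource : mainSource
{
    # override the inherited pre-query so a delta run does not move the counter
    sql_query_pre = SET NAMES utf8
    sql_query     = SELECT id, category, title, body FROM documents \
                    WHERE id > (SELECT max_doc_id FROM sph_counter WHERE counter_id = 1)
}

index mainIndex
{
    source = mainSource
    path   = /var/data/sphinx/mainIndex
}

index mainIndexDelta : mainIndex
{
    source = deltaSource
    path   = /var/data/sphinx/mainIndexDelta
}

index mainIndexStemmed : mainIndex
{
    path       = /var/data/sphinx/mainIndexStemmed
    morphology = stem_en
}

index mainIndexSoundex : mainIndex
{
    path       = /var/data/sphinx/mainIndexSoundex
    morphology = soundex
}
```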
The search only returns the primary keys for the matched records. You
then have to do a separate MySQL query to get any extra details... but
you'd be surprised how fast looking rows up in MySQL by those primary
keys is. You don't need any filtering beyond the primary-key lookup
itself because your search has already been narrowed.
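A rough sketch of that second step, assuming PDO, a hypothetical "documents" table with "id" and "title" columns, and the default result format where 'matches' is keyed by document id. The helper name fetchMatchedRows is made up for illustration. It re-sorts the rows in PHP because an IN (...) lookup does not preserve the relevance order that Sphinx returned.

```php
<?php
// Hypothetical helper: load the full rows for the ids Sphinx matched,
// keeping Sphinx's ranking order. Table and column names are assumptions.
function fetchMatchedRows(PDO $db, array $matches)
{
    // In the default result format the match array is keyed by document id.
    $ids = array_map('intval', array_keys($matches));
    if (count($ids) == 0) {
        return array();
    }

    // One placeholder per id; the only WHERE condition is the key lookup.
    $placeholders = implode(',', array_fill(0, count($ids), '?'));
    $stmt = $db->prepare(
        "SELECT id, title FROM documents WHERE id IN ($placeholders)"
    );
    $stmt->execute($ids);

    // Index the fetched rows by id so they can be re-ordered below.
    $byId = array();
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $byId[(int) $row['id']] = $row;
    }

    // Walk the ids in Sphinx's order; skip any that vanished from the table.
    $rows = array();
    foreach ($ids as $id) {
        if (isset($byId[$id])) {
            $rows[] = $byId[$id];
        }
    }
    return $rows;
}
```

Usage would be something like $rows = fetchMatchedRows($db, $_rs['matches']); after checking that the search actually returned matches.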
-Rob