anybody know of GOOD search engine?

From: Fred Cisin <cisin_at_xenosoft.com>
Date: Mon Mar 29 17:26:26 2004

On Mon, 29 Mar 2004, Jules Richardson wrote:
> Similarly, Google trying to be "helpful" can be a real pain, when it
> goes and tries to be clever about finding results (returning not only
> matches to what you searched for, but other things it considers close).
> "Stemming" they seem to call it. Trouble is there seems to be no way of
> turning it off...

"Stemming" is the canonical term in IR ("Information Retrieval") for the
process of stripping off prefixes, suffixes, plurals, etc. to attempt to
find a match for the base word, not the specific string of characters.
For example, should "diagrams" be considered a match for "diagram", and
should "viruses" and "virii" both match "virus"?
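
To make the idea concrete, here is a rough sketch in Python. The rules
and the names (naive_stem, stem_match, IRREGULAR) are made up for this
illustration. It is a toy suffix stripper, not the Porter algorithm and
certainly not whatever Google actually runs.

```python
# Toy stemmer: a few hand-picked rules, for illustration only.
# IRREGULAR is a hypothetical lookup table for forms no rule can handle.
IRREGULAR = {"virii": "virus"}

def naive_stem(word):
    """Reduce a word to a crude base form."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    # "viruses" -> "virus": drop the "es" after a "us" ending
    if w.endswith("uses") and len(w) > 4:
        return w[:-2]
    # "diagrams" -> "diagram": drop a plain plural "s",
    # but leave words like "class" and "virus" alone
    if w.endswith("s") and not w.endswith(("ss", "us")) and len(w) > 3:
        return w[:-1]
    return w

def stem_match(query, term):
    """True if both words reduce to the same base form."""
    return naive_stem(query) == naive_stem(term)
```

With these rules, stem_match("diagram", "diagrams") and
stem_match("virus", "virii") both come out true. Real stemmers carry far
more rules and still guess wrong often enough to annoy people.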


Google seems to have a unique use of "+". 'course, as George Morrow said,
"Standards are wonderful; everyone should have one of their own."
In many search systems, "A+B" means presence of A AND presence of B.
(In some digital electronics texts, "AB" means A AND B, and "A+B"
means A OR B.)
In Google, "+" seems to mean turn off the stemming, and reject any pages
that do not have that EXACT search term present. Therefore, "A+B" would
mean an exact match for B and a "loose" match for A.
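
The behaviour described above can be modelled in a few lines of Python.
This is only a sketch of what the "+" appears to do from the outside.
The function names (loose, matches) and the one-rule "stemmer" are
assumptions for the example, and "exact" is taken here to mean "no
stemming", ignoring case.

```python
import re

def loose(word):
    # Stand-in for a real stemmer: lowercase and drop one trailing "s".
    w = word.lower()
    return w[:-1] if w.endswith("s") and len(w) > 3 else w

def matches(query, page_text):
    """Terms prefixed with "+" must appear unstemmed; others match loosely."""
    words = re.findall(r"\w+", page_text)
    exact_words = {w.lower() for w in words}
    loose_words = {loose(w) for w in words}
    # Each query token is an optional "+" followed by a word,
    # so both "A+B" and "A +B" parse the same way.
    for plus, term in re.findall(r"(\+?)(\w+)", query):
        if plus:
            if term.lower() not in exact_words:
                return False
        elif loose(term) not in loose_words:
            return False
    return True
```

Against a page reading "Wiring diagrams and schematics", the query
"diagram+schematics" matches (loose on A, exact on B), while
"diagram+schematic" is rejected because the exact string "schematic"
never appears on the page.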

In many search systems, "next" is a "stopword" - a word that is ignored
(such as A, AN, THE, ...) because it is presumed not to help the search
process. In a system that has no options for case sensitivity, how do you
search for "NeXT"?
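
A small Python sketch shows why the two "features" collide. The stopword
list and the query_terms function are hypothetical, and the
case_sensitive switch is exactly the option that such systems tend not
to offer.

```python
# Hypothetical stopword list; real systems use much longer ones.
STOPWORDS = {"a", "an", "the", "next"}

def query_terms(query, case_sensitive=False):
    """Return the query words that survive stopword removal."""
    kept = []
    for tok in query.split():
        # Without case sensitivity, everything is folded to lowercase
        # before the stopword check, so "NeXT" becomes "next".
        key = tok if case_sensitive else tok.lower()
        if key in STOPWORDS:
            continue
        kept.append(key)
    return kept
```

With case folding, query_terms("the NeXT cube") keeps only "cube", and
the one word you cared about is gone. With case_sensitive=True the same
query keeps "NeXT" and "cube", because "NeXT" as typed is not in the
lowercase stopword list.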

> A search engine that just returns what you ask for from the web would be
> nice - no indexing of news, mailing lists etc, no ads, and no trying to
> be intelligent by stripping out words, modifying words, randomly
> inserting or removing punctuation etc.

It would be GREAT to have a system that let you control such
"features". But how many people would actually learn how to use it?
What percentage of the users are looking for something more
involved than "Britney Spears naked"?


--
Grumpy Ol' Fred     		cisin_at_xenosoft.com
suggested readings:
Frakes "Information Retrieval Data Structures and Algorithms"
Salton "Automated Text Processing"