Last I heard, Jarrod is using the mnoGoSearch engine for this site. It's open source code (as in free) written in C for UNIX, released under the GNU GPL, and it runs on top of a SQL database. There are a number of features a webmaster may choose to use or not use, and I'm not sure what Jarrod has chosen. Server performance is usually the key consideration. Based on the results returned and the options presented in the advanced search screen, I have some idea of what he's done. Perhaps he'll choose to respond to this post to clarify some points. Here, in gory detail, are some updated notes I have on the current search facility.
The mnoGoSearch engine periodically (as in daily or weekly or whenever Jarrod has nothing better to do) makes a keyword index of new documents in most sections of the brickboard web site, including the technical forums and recently even the Opinions forum. The Archives forum and 700/900 feature section were fully indexed some time ago. Up until recently, it seemed like only the past few months of posts in the technical forums were indexed, but the index seems to go all the way back now.
Depending on how busy the system is, how complex the query is, and how many results per page you request, the search may take a very long time or possibly even time out before finding all or any results. If there is a timeout limit, it seems to be based on server task load, to keep the search feature from overloading general use of the brickboard system. Multiple people can search at the same time (the code is re-entrant), and each will have their own timeout.
Results are returned, one page at a time, on a high score basis depending on how much of the search expression can be matched, the number of keywords found and the count of the keywords in each post (it's a complex scoring formula using vector tables I don't readily understand). The score is shown to the right of the title. For a complex query, especially where common words or wildcards or Boolean expressions are used, this can take a very long time. Even when the brickboard server isn't busy it may take several minutes to get a reply, so try not to lose patience. Unfortunately it seems impossible to distinguish a long wait from a timeout.
For a basic query (in the box at the bottom of all pages), all forums are searched and the keywords are matched (in the description, title and body of each post) using an ALL type of logical search and returned in order based on the score which is an attempt at relevance.
For an advanced search (click on HELP/SEARCH or use the "more search options" link at the bottom of the page), you can specify which forum to search and whether to use ALL(AND), ANY(OR), BOOLEAN (logical expression) or FULL PHRASE (adjacent text) type of searching. You can do Whole Word matching or partial word matching (Beginning, Ending or Substring), but this applies to all keywords supplied so you can't designate a mixed wildcard search. The simplest search is ALL or ANY using Whole Words. The most complex and most powerful is a Boolean search.
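To make the matching options a bit more concrete, here's a rough sketch in Python of how I picture Whole Word versus partial word matching combined with ALL/ANY. This is only my mental model, not the engine's actual code: the function names `term_matches` and `post_matches` are made up for illustration, and I'm assuming a post is simply split into words on blanks.

```python
def term_matches(keyword, word, mode="whole"):
    """Return True if one word from a post satisfies one keyword under the given mode."""
    keyword, word = keyword.lower(), word.lower()  # case doesn't matter
    if mode == "whole":
        return word == keyword
    if mode == "beginning":
        return word.startswith(keyword)
    if mode == "ending":
        return word.endswith(keyword)
    if mode == "substring":
        return keyword in word
    raise ValueError(f"unknown mode: {mode}")


def post_matches(keywords, post_text, mode="whole", logic="all"):
    """ALL needs every keyword to hit some word; ANY needs at least one hit."""
    words = post_text.split()  # assumption: words are separated by blanks only
    hits = [any(term_matches(k, w, mode) for w in words) for k in keywords]
    return all(hits) if logic == "all" else any(hits)


post = "Replaced both timing belts on the B234F last weekend"
print(post_matches(["timing", "belt"], post, mode="whole"))      # False: only "belts" is there
print(post_matches(["timing", "belt"], post, mode="beginning"))  # True: "belts" begins with "belt"
print(post_matches(["b234"], post, mode="beginning"))            # True: "B234F" begins with "b234"
```

Note that the one `mode` applies to every keyword in the list, which is the same restriction the search form has: no mixing whole and partial matching in one query.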
Even with multiple keywords, partial keywords (wildcards), Boolean expressions and searching only selected forums, it is often difficult to find relevant posts or to avoid getting too many matches. In any search, if you use too few keywords, you'll get too many extraneous results. Use too many keywords or a complex expression and you may wait a long time to get too few or even no results. BTW there is a limit of ten keywords in any search expression. The trick is to use the fewest fairly distinctive terms that identify the topic of interest, but realize that plural and singular forms are treated as different words and that different people will use different terminology for the same thing.
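If you want to check a query against the ten keyword limit before submitting it, something like the following Python sketch would do. It's my own helper, not part of the search engine: the names `keywords_in` and `within_limit` are hypothetical, and I'm assuming that every term counts toward the limit (including repeats) and that a hyphenated term like t-belt counts as one keyword. It also handles the Boolean operators and parentheses described further down by treating them as separators.

```python
import re

MAX_KEYWORDS = 10  # the limit mentioned above


def keywords_in(expression):
    """Split on Boolean operators, parentheses and blanks; what is left are the keywords."""
    return [term for term in re.split(r"[&|~()\s]+", expression) if term]


def within_limit(expression):
    """True if the expression uses no more than MAX_KEYWORDS keywords."""
    return len(keywords_in(expression)) <= MAX_KEYWORDS


print(keywords_in("(timing & belt) | t-belt"))  # ['timing', 'belt', 't-belt']
print(within_limit("(timing & belt) | t-belt"))  # True
```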
You can use the following operators in a Boolean expression (this is undocumented on the brickboard):
Boolean Search Expressions and Operators:
term1 ___________contains keyword term1 (single word only, spaces are ignored)
term1 & term2 ___contains keywords term1 AND term2
term1 | term2 ___contains keywords term1 OR term2
~term1 __________does NOT contain keyword term1
(expression1) ___grouped expression of terms and operators
~(expression1) __expression evaluates as false
Note: Only a single keyword term is allowed between Boolean operators (blanks between words and everywhere else in Boolean expressions are ignored). Upper/lowercase doesn't matter for matching. Through the pull-down menu you can specify that all terms are either whole words or partial words (beginning, ending, substring), but you can't mix whole and partial matches, nor mix positions for the partial word matching. Also, you can't specify word phrases in Boolean searches. The evaluation precedence is: keyword matching, then grouped expressions, then NOT, then AND, then OR, with operators of equal precedence evaluated left to right. Grouped expressions may also be nested. NOT is what is called a unary operator and can appear at the beginning of any term or expression regardless of other operators.
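For anyone who wants to see how that precedence plays out, here's a small Python sketch that evaluates a Boolean expression against the text of one post. It is my own toy evaluator, not the engine's code: the names `tokenize` and `evaluate` are made up, matching is Whole Word only, and I'm assuming a post is just split into words on blanks. It strips blanks out of the expression first, then applies grouping, NOT, AND and OR in that order.

```python
import re


def tokenize(expression):
    """Drop all blanks, then split into operators, parentheses and keywords."""
    squeezed = re.sub(r"\s+", "", expression.lower())
    return re.findall(r"[&|~()]|[^&|~()]+", squeezed)


def evaluate(expression, post_text):
    """Return True if the post satisfies the Boolean expression (whole words only)."""
    words = set(post_text.lower().split())  # assumption: blank-separated words
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():  # lowest precedence
        result = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()
            result = result or rhs
        return result

    def parse_and():  # binds tighter than OR
        result = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            result = result and rhs
        return result

    def parse_not():  # unary NOT, groups and plain keywords
        if peek() == "~":
            take()
            return not parse_not()
        if peek() == "(":
            take()
            result = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return result
        if peek() is None or peek() in ("&", "|", ")"):
            raise ValueError("keyword expected")
        return take() in words

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result


post = "new timing belt on my b234f"
print(evaluate("belt | hose & clamp", post))    # True: read as belt | (hose & clamp)
print(evaluate("(belt | hose) & clamp", post))  # False: the parentheses change the grouping
print(evaluate("~(timing | hose)", post))       # False: the post contains "timing"
```

The first two lines of output are the thing to remember: without parentheses an AND grabs its neighbors before any OR gets a look, so group your OR alternatives explicitly.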
As a complex example, looking for the phrases "timing belt" or "timing belts" or "T-belt" or "T-belts" for "B234" or "B234F" or "16 valve" or "16-valve" engines can make searching difficult. The closest you can come for a Boolean search using Whole Words is the following (the outer parentheses around the belt terms are needed because AND is evaluated before OR):
((timing & (belt|belts))|(t-belt|t-belts)) & (b234|b234F|16&valve|16-valve)
Here's the same thing using a partial Beginning word search:
((timing & belt)|(t-belt)) & (b234|16&valve|16-valve)
There is a treasure chest of information stored on the brickboard (now up to 10,000 visits per day since starting 7 years ago). I've often gone searching for posts that I knew I'd seen using words that I knew were almost certainly in the post, but have come up dry. I have no good explanation for this other than a timeout or that the index was out of date at the time. In my past experience, the expected results are often not returned or a great many are returned in an order I don't find particularly useful, but with a little perseverance I can eventually find what I want to know. Of course, at that price (basically free), I certainly can't expect something like Google.
--
Dave -not to be confused with a real expert, just goofing around at this