So after about 12 hours of load testing and final preparations, I’m proud to say that the 2003 version of the Mintz Levin site is now up and running. The site uses web services to publish information from the staging site to the live site, uses isapi_rewrite to provide search engine safe URLs, and is probably the latest available revision of the MINDSEYE CMS framework we’ve been developing internally for the last couple of months.
If you read my previous posts about full text searching and the speed (or lack thereof) of Verity, you’ll understand why today we actually ripped out the Verity searching and replaced it with the Full Text Searching engine available in SQL Server. A few things led to that decision.
First, I couldn’t get anything faster than 250ms per Verity search on an (admittedly slow) single processor 500MHz Pentium III. Since the site wide search page allows you to search up to 9 collections (and filter down within those collections for a more fine tuned search), the search page was taking upwards of 2 seconds to execute, which caused the machine to get overwhelmed under anything more than a couple of simultaneous users. Performance Monitor would show the processor pegged at 100%, and cfstat would show requests being queued and the average request time rapidly growing. After replacing the engine, we got the request time down to (in some cases) 16ms and CF didn’t spike the processor. In fact, we loaded the box up to 150 concurrent users (using Apache JMeter, which is a great tool for doing cheap load testing, check it out sometime) with no issues (mind you, it’s a single processor 500MHz machine). Not bad.
Second, SQL Server Full Text searching gave us more flexibility in searching; if we wanted to order the results by datecreated, or by anything other than label or relevance, we simply added an ‘order by’ to the SQL query.
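To make the ‘order by’ point concrete, here is a minimal sketch of what such a query might look like. It is not the code from the site: the datasource name, the `articles` table and its `id` column are hypothetical, and the table is assumed to have a full text index already in place. CONTAINSTABLE returns a KEY and a RANK column, so the join gives you relevance alongside ordinary columns that you can sort on.

```cfm
<!--- Hypothetical names: the "mintz_cms" datasource and the articles table. --->
<cfset dsn = "mintz_cms">
<cfset searchTerm = "merger">

<cfquery name="searchResults" datasource="#dsn#">
    SELECT a.id, a.label, a.datecreated, ft.[RANK] AS relevance
    FROM articles a
    INNER JOIN CONTAINSTABLE(articles, *,
        <cfqueryparam value="#searchTerm#" cfsqltype="cf_sql_varchar">) AS ft
        ON a.id = ft.[KEY]
    <!--- Swap this line for ft.[RANK] DESC or a.label to change the sort. --->
    ORDER BY a.datecreated DESC
</cfquery>

<cfoutput query="searchResults">
    #label# (#DateFormat(datecreated, "yyyy-mm-dd")#, rank #relevance#)<br>
</cfoutput>
```

Changing the sort order is then a one line edit to the SQL rather than post-processing of a Verity result set.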
Finally, because SQL Server Full Text searches are issued through a cfquery, you can add the cachedwithin attribute to the query to keep the results cached in RAM. Verity doesn’t have a cachedwithin attribute; the only way to cache the results would be to stick the resulting query into application or server scope, which by itself isn’t all that bad, but just requires more coding than simply adding an attribute to a tag.
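As a rough illustration of the caching point, the sketch below adds cachedwithin to a full text query. Again the datasource, table and column names are hypothetical, and the ten minute window is an arbitrary choice. One caveat to flag: as far as I know, combining cfqueryparam with cachedwithin is only supported from ColdFusion 8 onward, so on older servers you would have to pass the search term some other way.

```cfm
<!--- Hypothetical names: the "mintz_cms" datasource and the articles table. --->
<cfset dsn = "mintz_cms">
<cfset searchTerm = "merger">

<!--- Identical queries issued within 10 minutes are served from memory. --->
<cfquery name="searchResults" datasource="#dsn#"
         cachedwithin="#CreateTimeSpan(0, 0, 10, 0)#">
    SELECT a.id, a.label, ft.[RANK] AS relevance
    FROM articles a
    INNER JOIN CONTAINSTABLE(articles, *,
        <!--- Assumption: cfqueryparam plus cachedwithin needs ColdFusion 8 or later. --->
        <cfqueryparam value="#searchTerm#" cfsqltype="cf_sql_varchar">) AS ft
        ON a.id = ft.[KEY]
    ORDER BY ft.[RANK] DESC
</cfquery>
```

The trade-off is freshness: newly published content will not show up in a cached search until the time span expires.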
The only time I might use Verity (i.e. where SQL Server Full Text Searching would not work) would be where I needed to index the file system (i.e. HTML files, PDF files, Word files) or where I needed to spider a site. However, even here, if you have the time to write extra code, it might make sense to use a freely available spider or text extraction tool to read documents on the file system or spider the site, and then store the resulting information in SQL Server as text. SQL Server Full Text Searching is obviously tied to an MS/Windows environment, and I haven’t yet looked at how MySQL handles full text searching, although I know that Maia has done some work with it on nehra.com. I’d be interested in hearing about anyone else’s experience with full text searching using MS SQL, MySQL or Verity. If you’re reading this post and are having issues with Verity and you don’t have access to MySQL or SQL Server with full text indexing, you might look at writing a CFX that wraps Lucene (an idea that Joe commented on).
Building Search Applications for the Web Using Microsoft SQL Server 2000 Full-Text Search