Category Archives: Lucene

How MoreLikeThis Works in Lucene

We created a ‘related items’ feature way back in Clearspace 1.0 (I mocked some of it out here just to prove that it worked) which shows related content based on the document, thread or blog post that you’re currently viewing. … Continue reading

Posted in Clearspace, Lucene, Software Development | 22 Comments

Using Lucene and MoreLikeThis to show Related Content

If you read this blog, you probably paid a smidgen of attention to the Web 2.0 Conference held last week in San Francisco. Sphere was one of the companies that presented and they launched a product called the “Sphere It … Continue reading

Posted in Blogs, Content Management, Lucene, Syndication | 4 Comments

Nutch, Yahoo!, and Hadoop

It’s been a while since I mentioned anything about Lucene, my favorite Java-based open source indexing and search library (which I built the karakoram spider / search application around). Doug Cutting, who created Lucene and who has spent the last … Continue reading

Posted in J2EE, Lucene, Open Source, Software Development | 1 Comment

Links: 8-12-2005

Behind the Scenes of the SourceForge.net Search System – SF.Net Engineering search on sourceforge.net, powered by Lucene. (categories: lucene search sourceforge)

Posted in Daily Links, Lucene | Leave a comment

Doug Cutting interview on TSS

The guys at TSS did a pretty in-depth interview with Doug Cutting (who wrote Lucene and is now involved in the Nutch project). You can read the transcript here; there’s video there somewhere, I just couldn’t find it.

Posted in J2EE, Lucene, Software Development | Leave a comment

Conflicting mindsets of C# vs. Java: Part II

You all read the ‘Conflicting mindsets of C# vs. Java’ weblog post, right? And you all noticed that the guys running the Lucene.NET project on SourceForge closed up shop, took all their toys and went on home, right? I’m … Continue reading

Posted in .NET, J2EE, Lucene, Open Source, Software Development | 13 Comments

Extracting Text From MS Word

Someone on the Lucene User list wanted to know if it was possible to search MS Word documents using Lucene. The normal response is to go and take a look at the Jakarta POI project (new blog by the way). … Continue reading

Posted in J2EE, Lucene, Python, Software Development | 11 Comments

Doug Cutting has a blog

Doug Cutting, the creator of Lucene, started a blog a couple of months ago. Looks like he’s looking for some people for high-profile Lucene work too.

Posted in Lucene | Leave a comment

jSearch 1.1

Spent some time this last week making some minor updates to the jSearch codebase. Along the way I decided to get creative and rename it. So jSearch is now called karakoram, after the mountain range that straddles Pakistan and China. … Continue reading

Posted in J2EE, Lucene | 5 Comments

Lucene Hit Highlighter

A couple months ago I was looking to use a library that Mark Harwood wrote called the Lucene Hit Highlighter. His hosting company closed his account so the library wasn’t available at the time (he was kind enough to send … Continue reading

Posted in Lucene | 6 Comments