Aaron Johnson Now with 50% less caffeine!

Posts Tagged Lucene

How MoreLikeThis Works in Lucene

We created a ‘related items’ feature way back in Clearspace 1.0 (I mocked some of it out here just to prove that it worked) which shows related content based on the document, thread or blog post that you’re currently viewing. It was built using the MoreLikeThis class, a contribution made to the Lucene project by [...]


Using Lucene and MoreLikeThis to show Related Content

If you read this blog, you probably paid a smidgen of attention to the Web 2.0 Conference held last week in San Francisco. Sphere was one of the companies that presented, and they launched a product called the “Sphere It Contextual Widget for blogs”, which is a JavaScript widget you can add to your blog or [...]


Nutch, Yahoo!, and Hadoop

It’s been a while since I mentioned anything about Lucene, my favorite Java-based open source indexing and search library (which I built the karakoram spider / search application around). Doug Cutting, who created Lucene and who has spent the last couple of years working on Nutch, was recently hired by Yahoo!. I just have a couple [...]


Links: 8-12-2005

Behind the Scenes of the SourceForge.net Search System - SF.Net Engineering: search on sourceforge.net, powered by Lucene. (categories: lucene search sourceforge)


Doug Cutting interview on TSS

The guys at TSS did a pretty in-depth interview with Doug Cutting (who wrote Lucene and is now involved in the Nutch project). You can read the transcript here; there’s video there somewhere, I just couldn’t find it.


Conflicting mindsets of C# vs. Java: Part II

You all read the ‘Conflicting mindsets of C# vs. Java’ weblog post, right? And you all noticed that the guys running the Lucene.NET project on sourceforge closed up shop, took all their toys and went on home, right? I’m gonna go out on a limb and say that they’re related.
The way I [...]


Extracting Text From MS Word

Someone on the Lucene User list wanted to know if it was possible to search MS Word documents using Lucene. The normal response is to go and take a look at the Jakarta POI project (new blog by the way). Ryan Ackley submitted his website (textmining.org) along with a plug for his TextMining.org Word Text [...]


Posted
11 July 2004 @ 10pm

Tagged
Lucene

Doug Cutting has a blog

Doug Cutting, the creator of Lucene, started a blog a couple of months ago. Looks like he’s looking for some people for high-profile Lucene work too.


jSearch 1.1

Spent some time this last week making some minor updates to the jSearch codebase. Along the way I decided to get creative and rename it. So jSearch is now called karakoram, after the mountain range that straddles Pakistan and China. The updates include:

modified JSP templates to use Struts <html:img /> and <html:image [...]


Posted
22 June 2004 @ 8pm

Tagged
Lucene

Lucene Hit Highlighter

A couple of months ago I was looking to use a library that Mark Harwood wrote called the Lucene Hit Highlighter. His hosting company closed his account so the library wasn’t available at the time (he was kind enough to send me the library via email), but for anyone who cares, you can visit the [...]

