All posts by ajohnson

full text search w/ lucene

Last week at work was slow. I installed Jabber, put the finishing touches on footjoy.com, and spent all day Friday on my Linux desktop because I had to FTP about 3GB of data over a VPN connection, which completely tied up my Windows desktop machine. So on Friday I installed and began working with Lucene, an open source full text search API written entirely in Java. By sheer coincidence, Lucene was written by the same guy who wrote the V-Twin search capability in the Mac, which I mentioned yesterday (and found out about by reading Interface Culture, weird!). By the end of this coming week I hope to have a functional search for this site using Lucene. But for now, links:

Lucene Tutorial

Javaworld Article on using Lucene

Lucene FAQ Home Page

Lucene Mailing List Archive

Finding Out About: A Cognitive Perspective on Search Engine Technology and the WWW (With CD-ROM)
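For a feel of what a full text search library does under the hood, here is a toy inverted index in Python. To be clear, this is not Lucene's API and has none of its analysis or ranking features; the function names and the sample posts are made up for illustration. It only shows the basic idea of mapping each term to the documents that contain it.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(hits)


# Hypothetical sample data standing in for the posts on a site.
posts = {
    "lucene": "full text search w/ lucene",
    "jabber": "installed Jabber on my Linux server",
    "xml": "importing large xml files with SAX",
}
index = build_index(posts)
print(search(index, "Lucene search"))
```

A real search engine adds stemming, stop words, relevance scoring and an on-disk index on top of this, which is the part a library like Lucene saves you from writing.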

infinity imagined

Finished Interface Culture by Steven Johnson today (it’s 93 outside and humid, which means it’s reading weather). I’ll leave it up to the reviewers on Amazon’s site to give you more information about the book…

I like books, fiction and nonfiction alike, but especially books that make you think… think about things to create, think about things as you’ve never thought about them before. Interface Culture was one of those books. His central premise (I think!) is that interface design is an art form, just like a Dickens novel or a Renaissance painting, and because it is an art form, it has social and cultural impacts, some of which we can see with the naked eye, some of which we can discover and some that can only be seen in hindsight.

A second theme I found was the idea that emergent technologies, things like personal agents and Apple’s V-Twin search technology, while brilliant, most often end up being applied in areas never imagined by their creators. For instance, Thomas Edison created the phonograph in 1877. But get this: he thought the phonograph would be used mainly for recording phone conversations. These applications were explained as exaptations, which is my official word of the day. 🙂

Finally, though not an official theme, I found numerous mentions of the idea that some, if not all, radical and sometimes breakthrough inventions are initially rejected by popular and mainstream culture. The Mac, with its icons and graphical user interface, was seen as simple and labeled as cartoonish… it was not seen as a “serious business application”. Soon, the icons, trash bin and menu system took over the entire business world, and every computer we use today uses the same metaphors that the original Mac did in the early 1980s. Just goes to show that maybe the heated debate about technologies like Flash as an interface device or wireless devices might be the tip of an amazing iceberg… who knows?

jabber

I installed Jabber on my Linux server @ work yesterday. Took about 15 minutes to set up the server side, ‘nuther 15 to get a Linux client up and running and 15 to get a Windows client running and connected. Amazingly easy to do and I think Jabber could be very useful in a small office/department environment, if not an entire enterprise. Internally we use IM and email almost exclusively to communicate, even though we don’t have any cubes and sometimes you’re sitting right next to the person you’re talking w/. Anyway, here’s a fun article on using Jabber and bots.

book crossing

A friend sent me a link to bookcrossing.com… interesting concept. From their email:

“The website encourages people to Read, Register, and then Release their books “into the wild” and then track where they go and the lives they touch. Great concept… share your books and follow their progress forever.”

importing large (45MB) xml files

I mentioned that I had to import a large weather file as part of the FJ project… it *works* using simple VBScript and MSXML but it turns out that it kills the server. Couple other options I found:

a) probably the best way to do it would be to use SAX instead of DOM. Unfortunately MSXML doesn’t support SAX via VBScript, only C++ and Visual Basic. Applicable article here on MSDN re: extracting data from a large document.

b) import the data directly into SQL Server using SQL Server Bulk Load functionality, which is the way I’m heading right now… How to? Here.

Great article here on using SAX 2.0 and Java to process large XML documents.
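To show why SAX helps with a big file, here is a rough sketch using Python's built-in xml.sax module (Python only because it keeps the example short; the same pattern applies to SAX in Java). The parser hands the handler one element at a time, so memory use stays flat no matter how large the document is, whereas DOM loads the whole tree first. The element and attribute names (`reading`, `temp`) are assumptions for the demo, not the layout of the actual weather file.

```python
import io
import xml.sax


class ReadingHandler(xml.sax.ContentHandler):
    """Counts <reading> elements and tracks the highest temp attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.max_temp = None

    def startElement(self, name, attrs):
        # Called once per opening tag; nothing is kept in memory afterwards.
        if name != "reading":
            return
        self.count += 1
        if "temp" in attrs:
            temp = float(attrs["temp"])
            if self.max_temp is None or temp > self.max_temp:
                self.max_temp = temp


def summarize(source):
    """Stream-parse a filename or binary file object; return (count, max_temp)."""
    handler = ReadingHandler()
    xml.sax.parse(source, handler)
    return handler.count, handler.max_temp


# Tiny in-memory stand-in for the real 45MB file (hypothetical schema).
sample = (
    b'<weather>'
    b'<reading station="A" temp="71.5"/>'
    b'<reading station="B" temp="93.0"/>'
    b'</weather>'
)
print(summarize(io.BytesIO(sample)))
```

For a real import, `startElement` would be the place to batch rows up for the database instead of just counting them.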

Translucent Databases

Interesting article on oreillynet.com in response to the recent hacking of Yale student admission information by Princeton. The gist is that sensitive data that you don’t need to physically see, but only compare/search/parse should be put into your DB hashed. Excerpt:

“For example, what if a police department needs to build a database of sexual-assault victims that lets them identify trends but hides personal information? You could use a translucent database where the first column is the hash of the victim’s name, and the second column is a hash of their full address, and the third column is a hash of their block and street. You can now group incidents together by grouping entries with identical block hashes; you can see if the incidents refer to the same person by checking to see if those hashes are different.”

More information on translucent databases can be found here.
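The excerpt above can be sketched in a few lines of Python. This is my own rough take on the idea and not code from the article: the keyed hash (HMAC-SHA256), the key, the normalization step and the sample rows are all assumptions made for the demo. The point is just that equal inputs produce equal hashes, so you can group and compare rows without ever storing the readable values.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical key for the demo; a real system would keep this out of the code.
SECRET_KEY = b"demo-key-not-for-production"


def opaque(value):
    """Return a keyed hash of the value after normalizing case and spacing."""
    normalized = " ".join(value.lower().split())
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def make_row(name, address, block):
    """Build a row of (name hash, address hash, block hash)."""
    return (opaque(name), opaque(address), opaque(block))


def group_by_block(rows):
    """Group rows whose block hashes are identical."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[2]].append(row)
    return dict(groups)


# Made-up sample incidents; only the hashes end up in the rows.
rows = [
    make_row("Jane Doe", "12 Elm St Apt 3", "Elm St 0-99"),
    make_row("Mary Roe", "48 Elm St", "elm st 0-99"),
    make_row("Jane Doe", "7 Oak Ave", "Oak Ave 0-99"),
]

for block_hash, incidents in group_by_block(rows).items():
    print(block_hash[:12], len(incidents))
```

One design note: a plain unkeyed hash of something as guessable as a name or street can be reversed by hashing a list of candidates, which is why this sketch uses a keyed hash. Whether that matches what the article recommends is something I have not checked.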

crazy browser tricks

Found this via http://cms-list.org/, my small brain can’t figure out how this would be useful, but nonetheless, try changing your <body> tag to look like this:

<body contenteditable=true>

and then view your page… type away, move images, *resize* images, delete text… wow. Kinda cool. As usual, it’s IE 5.5 (and higher) specific, although some people have written workarounds to get it to work in Mozilla.

MSDN documentation: http://msdn.microsoft.com/library/default.asp?url=/workshop/author/dhtml/reference/properties/contenteditable.asp

nyc is fun

Just got back from our weekend in NYC (pictures here). I had a great time. Drove down to NYC in a rented 2002 PT Cruiser, got to Yankee Stadium 2.5 hours before game time (4:05pm start). Yankee Stadium staff closed Monument Park before we could get in, so I didn’t get to walk through that, but we did get to hang out on the lower levels for the first couple hours and took a lot of pictures. Yankees lost 8-0 to the A’s, which made me secretly happy inside. Eric Chavez & Miguel Tejada hit home runs for the A’s. Alfonso Soriano, though he didn’t hit one out of the park, showed some stellar defense. He’s amazing.

Drove to our hotel on the Upper East Side, the Melrose Hotel (which used to be the Barbizon Hotel; I stayed here a couple times when I did work for FAO.com last year). I love this hotel! Fun place, reasonable rates, nice area. We learned that the Melrose Hotel, up until the late 1970s, was an all-women’s hotel.

We walked along Lexington and got down to Rockefeller Center where we had dinner in the Rink Bar located in the bowl where the ice skating happens in the winter time. It was probably 75 degrees with a beautiful breeze… great night.

This morning we got up and had muffins at a local cafe, walked to Central Park and sat on a bench next to Conservatory Water, trying to pick out people that live in New York (in contrast to tourists like us), watching radio controlled sailboats, and drinking in the warm sun. We walked right past the Metropolitan Museum of Art and then decided to check that out. I was naturally impressed by the fabulous pieces of armor men wore in Medieval times, but most interested in the stained glass and glass mosaic pieces, probably because Karen could offer insight as she just finished a glass mosaic class.

Finally, we walked all the way down to 34th and went ‘shopping’ at Macy’s, something Karen’s wanted to do forever… New York City is a cool place.