Blog Spam and Parasites

So I run a couple of blogs and, like many people, I haven’t solved the blog spam problem yet (even MT-Blacklist only works some of the time, and it’s a losing battle). There have obviously been a lot of people trying to solve the problem, but it’s interesting to look at it from another angle: blog spammers (I guess all spammers) are effectively parasites living off a host system. For the most part, they don’t bother the host system too much. One, two, maybe ten comments per day and we both keep on living. It’s times like last night, when I got 418 spam comments on my blog, that I think I should turn off comments entirely, but killing the host doesn’t seem like the right answer.

A couple of weeks ago I listened to Janine Benyus’ presentation on biomimicry (courtesy of Jason Kottke, you can download and listen to the presentation in MP3 format). In it she talks about 12 ways in which nature does things better than we do: self assembly, the power of shape, natural selection as innovation (amazing to hear how nature in many ways is still light years ahead of our man-made technology). It’s interesting stuff if you’re into that kind of thing, but then you probably wouldn’t have gotten this far if you weren’t. Anyway, that presentation popped back into my head today while I was deleting all 418 comments: how does nature deal with the problem of parasites? I guess it goes without saying that nature hasn’t solved the problem either, because the earth is really old… and we’ve still got a ton of parasites. Malaria, giardia, fleas and ticks are all rampant abusers of animals and humans (reading the Wikipedia entry for Malaria reminds me of the malaria pills I had to take when my pop took my brother and me to Kenya for 2 weeks… I chewed one by accident, talk about a bitter pill!). But their existence doesn’t mean that nature (or humans) hasn’t tried to get rid of parasites. Doctors and researchers have been trying to eradicate malaria from much of the world for many years using a variety of methods:

· attacking the mosquito population: Reminds me of a post by Tim Bray (Crooks in Plain Sight): why not hunt down the people who do the blog spam (we certainly know where they live on the web)? Tim, or perhaps someone else, suggests that no one wants to. Better yet, no one has the time or energy to do so. Probably true.

· genetically modifying mosquitoes so that they don’t carry malaria: I think this would be analogous to asking the search engines (Google, MSN, Yahoo, etc.) to do something about the problem (i.e., modify the Google search algorithm so that even if a spam comment gets on your blog and stays there, it doesn’t influence the PageRank of the link it contains). Would this happen? Unlikely. Any change Google makes would be countered by changes from the parasite/blog spammer. That’s a losing battle.

· distributing mosquito nets to areas of the world where the problem is most severe: A mosquito net? I think this is the MT-Blacklist / scode of our disease control toolkit. Put a net over our blogs so that the parasites can’t get in. Works for some, but complete eradication will only work if everyone has a net. The parasites will continue to flourish on all the unprotected blogs.

· releasing millions of sterile mosquitoes into the wild (Sterile insect technique): The idea here is that by releasing millions of sterile mosquitoes, males will fruitlessly attempt to fertilize eggs and females will lay eggs that can’t ever hatch. I think this would be analogous to a bunch of people getting together and making sites about incest, kid porn and bestiality. Neutralize the blog spammers’ efforts by beating them at their own game (which we’ve proven we can do).

· through chemicals like DDT: No idea what the equivalent would be in the blog world; most likely it’s like the first (attacking the mosquito population).

· by developing a vaccine for malaria: Here’s an interesting idea: instead of distributing nets like MT-Blacklist or scode (which block spam at the point of attack), why not modify our systems so that they become immune to the effects of the parasite? Maybe this is modifying the comment URL so that it’s a redirect rather than a direct link (already being done), adding comment registration (already being done), or a comment approval process.
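
To make the redirect idea a little more concrete, here’s a minimal sketch in Java of rewriting the links in a comment so that they point at a redirect on your own site instead of straight at the spammer. The class name, the /redirect?url= path and the regular expression are all made up for illustration (this is not how Movable Type or any other blog package actually does it), and it only handles double-quoted http/https links. It also assumes the redirect path is kept away from search engine crawlers; otherwise the link may still count.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentLinkRewriter {
    // Matches href="http://..." or href="https://..." (double-quoted only, for brevity).
    private static final Pattern HREF =
        Pattern.compile("href=\"(https?://[^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Hypothetical redirect endpoint served by the blog itself.
    private static final String REDIRECT_PREFIX = "/redirect?url=";

    // Returns the comment HTML with every matched link routed through the redirect.
    static String rewriteLinks(String commentHtml) {
        Matcher m = HREF.matcher(commentHtml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String encoded;
            try {
                // URL-encode the original target so it survives as a query parameter.
                encoded = URLEncoder.encode(m.group(1), "UTF-8");
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e); // UTF-8 is always supported
            }
            String replacement = "href=\"" + REDIRECT_PREFIX + encoded + "\"";
            // quoteReplacement keeps any '$' or '\' in the URL from being treated specially.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Text without links passes through untouched, so something like this could be run over every comment just before it is rendered.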

I’m not sure a solution will ever be found (regular spam doesn’t seem to be getting any better)… if nothing else, it’s an interesting problem. Oh, and if you found it interesting, drop me a comment. 😉

Mikedotnet.com v2.0

In my continuing quest for free stuff, I *helped* a buddy transfer his ASP-based website to my Movable Type installation and we had the launch party… um… actually there was no launch party. No free beer or pizza. Nada. Ahem. Anyways, you can check out the new Mikedotnet.com v2.0 now. Also, I figure by mentioning Mikedotnet a couple times on my blog, I will continue to rank higher than he does for his own site, which means he must continue to give me free stuff. You too Patrick Owens!

Commons Net FTP listFiles() returns null

Just bringing this to the top of the Google Queue™: if you’re using the listFiles() method of the org.apache.commons.net.ftp.FTPClient class (the one that returns an array of FTPFile objects) and you’re seeing that listFiles() returns null when it shouldn’t, make sure to upgrade to the latest and greatest version of commons-net (which is 1.2.2 right now). Using version 1.0.0 was causing NPEs all over the place. I think this bug on issues.apache.org discusses the issue, but it’s more specific than what I was seeing.
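
If you can’t upgrade right away, a small guard keeps the null from turning into an NPE further down. This is only a sketch: NullSafeListing and orEmpty are names I made up, and the helper is deliberately generic so it compiles without commons-net on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class NullSafeListing {
    // Hypothetical helper: turns a possibly-null array (such as the result of
    // listFiles() on the old jar) into a list that is always safe to loop over.
    static <T> List<T> orEmpty(T[] maybeNull) {
        if (maybeNull == null) {
            return Collections.emptyList();
        }
        return Arrays.asList(maybeNull);
    }
}

// Intended use, where ftp is an org.apache.commons.net.ftp.FTPClient:
//   for (FTPFile f : NullSafeListing.orEmpty(ftp.listFiles())) { ... }
```

It hides the symptom rather than fixing the bug, so a null result still looks like an empty directory; upgrading the jar is the real fix.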

Automating Application Deployment with Tomcat

I’m asking a bunch of questions here, so if you’re expecting answers, look elsewhere. In the past couple of years, I’ve waded through a bunch of different ways to configure and build Java web applications. At first, I hardcoded connection strings, settings and message strings in the source code, compiled using Eclipse and copied the JSP and class files over to the web server (lame, I know, but you gotta start somewhere). As the applications I wrote got more complex and as I got smarter, I started using Ant to perform the build and I learned about properties files and web.xml. After unpacking a lot of other open source Java applications, I’m now using log4j for logging and error messaging, JNDI for configuration (datasource and application configuration stored in Tomcat’s server.xml), resource bundles for storing internationalized message strings, JUnit for running tests and Ant for cleaning, testing and building my apps into a deployable format (war files).

And that’s where it stops being easy. Deployments suck. We have a pretty small environment (a couple test servers, a couple staging servers, a couple live servers) and deploying changes to those servers is tedious. For example, in my development environment, I like to have log4j append any and all information to the console, which lets me watch the system as it starts up and runs:

log4j.category.com.mycompany=INFO, STDOUT

But once I build the application and deploy it to the live environment, I only want error messages and I want them sent to the SMTPAPPENDER:

log4j.category.com.mycompany=ERROR, SMTPAPPENDER

so I’m stuck editing a text file every time we deploy an application. It’s not that big of a deal, but there’s more: I also have applications on each server that need the appropriate entries in Tomcat’s server.xml (environment entries, JDBC connections, etc.), sometimes Tomcat needs to be restarted after you deploy the war, applications are deployed to different directories on different machines, and sometimes the application being rolled out doesn’t work and you need to roll back to the previous version. And how do you keep track of all the live / staging / development servers? The environment I work in is pretty small, so all of this can be done by hand, but it’s tedious, boring and error prone. How do you guys that work in larger environments do it? How do you move the .war files from your staging environment to the live environment? Using Ant? Do you trigger Ant tasks on the live servers that check out source code from CVS and build the apps there? Do you restart Tomcat every time? Do you do one half of your machines at a time and then the next half? You can’t be doing this by hand! Any tips?
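
For the log4j part at least, one thing I could imagine trying (a rough sketch, not something I’m actually running) is picking the logger line at startup from a system property instead of editing the file by hand. The property name app.env and the class and method names below are invented for the example, and the STDOUT and SMTPAPPENDER appenders would still have to be defined elsewhere in the log4j configuration.

```java
import java.util.Properties;

class Log4jEnvironmentConfig {
    // "app.env" is a hypothetical system property, e.g. -Dapp.env=live passed
    // to the JVM that runs Tomcat. Defaults to "development" when unset.
    static String currentEnvironment() {
        return System.getProperty("app.env", "development");
    }

    // Returns the logger line for the given environment: errors by mail on the
    // live servers, everything at INFO to the console everywhere else.
    static Properties loggerSettingsFor(String environment) {
        Properties props = new Properties();
        if ("live".equals(environment)) {
            props.setProperty("log4j.category.com.mycompany", "ERROR, SMTPAPPENDER");
        } else {
            props.setProperty("log4j.category.com.mycompany", "INFO, STDOUT");
        }
        return props;
    }
}
```

The resulting Properties object, merged with the appender definitions, could then be handed to log4j’s PropertyConfigurator.configure(Properties) when the application starts. That still leaves server.xml, restarts and rollbacks unsolved, of course.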

Scotty Cameron Studio Store: Struts, Hibernate, Quartz, Ant, Cewolf, etc..

On Monday we launched the online store I was working on this summer for ScottyCameron.com. We brought the ecommerce capabilities in-house (it was on Yahoo! Store) so that we could leverage some of the existing functionality I built for address verification, credit card fraud protection and taxation. I was able to use a lot of Java-related technology, including Hibernate (for object persistence), Struts, Quartz (embeddable job scheduling), Ant, Cewolf (for graphing and charting), various commons projects (beanutils, httpclient, logging, etc.), the Fedex API and dom4j (XML parsing), all deployed successfully on Tomcat 5.0.25 across a couple webservers.