Category Archives: Software Development

instantFeeds 1.0.3

New version of instantFeeds: version 1.0.3. It includes two new features: you can now turn off your notifications by sending the command ‘off’ (kind of like an out-of-office feature) and turn them back on by sending the command ‘on’, and the notification you get sent now includes a summary of the latest item of approximately 255 characters. Additionally, I fixed the package naming issues (Wildfire recently had to change its name to Openfire and all the package names had to be updated as well).

As always, you can check out the release notes, the source repository or just skip to the good parts and download the plugin.

Communication Multiplexers

From an interview with one of the developers on the twitter.com team:

I think the real power of Twitter is its ability to channel over different mediums at the user’s whim. IM, SMS, email, and the web are just transports as far as Twitter is concerned. Generally, you have to go out and get information via whatever medium that information is on. With Twitter, information can come to you via whatever medium you prefer. Or, if you want some space, you can easily turn off the information tap with a simple “off” command. That’s powerful.

I linked to a blog post by Tim O’Reilly a couple days ago that summarized this feature by calling it a ‘communications multiplexer’… There are other companies that do interesting things in this space in different ways: rasasa.net, zaptxt.com, feedcrier.com, etc… It’s also one of the ways I’d like to evolve the instantFeeds plugin I wrote: be able to send an email, IM or an SMS or maybe even message into a web page: get only the information you want, delivered using the medium of your choice.
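To make the multiplexer idea a bit more concrete, here is a minimal sketch of what it could look like. Everything in it is hypothetical: the class, the Medium enum and the Transport interface are names I made up for illustration, not anything from instantFeeds or Twitter. Each user picks one preferred medium, and the ‘off’ / ‘on’ commands simply mute and unmute delivery.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from instantFeeds or Twitter.
final class NotificationMultiplexer {

    enum Medium { EMAIL, IM, SMS }

    // A transport only knows how to deliver a message over one medium.
    interface Transport {
        void deliver(String user, String message);
    }

    private final Map<Medium, Transport> transports = new EnumMap<>(Medium.class);
    private final Map<String, Medium> preferences = new HashMap<>();
    private final Set<String> muted = new HashSet<>();

    public void register(Medium medium, Transport transport) {
        transports.put(medium, transport);
    }

    public void prefer(String user, Medium medium) {
        preferences.put(user, medium);
    }

    // "off" mutes the user, "on" unmutes; anything else is ignored.
    public void handleCommand(String user, String command) {
        String normalized = command.trim().toLowerCase(Locale.ROOT);
        if (normalized.equals("off")) {
            muted.add(user);
        } else if (normalized.equals("on")) {
            muted.remove(user);
        }
    }

    // Returns true if the message was handed to a transport.
    public boolean publish(String user, String message) {
        if (muted.contains(user)) {
            return false;
        }
        Medium medium = preferences.get(user);
        Transport transport = (medium == null) ? null : transports.get(medium);
        if (transport == null) {
            return false;
        }
        transport.deliver(user, message);
        return true;
    }

    public static void main(String[] args) {
        NotificationMultiplexer mux = new NotificationMultiplexer();
        mux.register(Medium.IM, (user, msg) -> System.out.println("IM to " + user + ": " + msg));
        mux.prefer("aaron", Medium.IM);

        mux.publish("aaron", "New item in your feed");   // delivered over IM
        mux.handleCommand("aaron", "off");
        mux.publish("aaron", "Another item");            // dropped while muted
    }
}
```

The point of the design is that the publisher never knows or cares which medium a given user chose; adding email or SMS is just another register(...) call.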

Book Review: Crossing the Chasm

I don’t remember how Crossing the Chasm got onto my reading list (maybe the Fog Creek Software Management Training Program Reading List?) but I finally got around to reading it over the last couple weeks. One sentence review: It’s a great book for developers and product managers working at small software shops / startups that want to take their business to the next level. And now for a bunch of poignant excerpts…

The book hinges on the idea that a giant chasm exists in between the early adopters and the early majority (hence the title), as evidenced by this bell curve:



For posterity’s sake, the author defines the early adopter as someone who wants to buy a change agent, something that will

“… get [them] a jump on the competition, whether from lower product costs, faster time to market, more complete customer service, or some other comparable business advantage.”

In comparison, the early majority

“… want to buy a productivity improvement for existing operations. They are looking to minimize the discontinuity with the old ways. They want evolution, not revolution. They want technology to enhance, not overthrow, the established ways of doing business.”

(pg 20)

What is marketing?

… taking actions to create, grow, maintain, or defend markets… Marketing’s purpose, therefore, is to develop and shape something that is real, and not, as people sometimes want to believe, to create illusions. In other words, we are dealing with a discipline more akin to gardening or sculpting than, say, to spray painting or hypnotism… a market is

  • a set of actual or potential customers
  • for a given set of products or services
  • who have a common set of needs or wants, and
  • who reference each other when making a buying decision

(pg 28)

You need to read the book to understand why the following excerpt is important, but I think this paragraph sums up a lot of what the book is about:

Companies just starting out, as well as any marketing program operating with scarce resources must operate in a tightly bound market to be competitive. Otherwise, their “hot” marketing messages get diffused too early, the chain reaction of word-of-mouth communication dies out, and the sales force is back to selling “cold.” This is a classic chasm symptom, as the enterprise leaves behind the niche represented by the early market. It is usually interpreted as a letdown in the sales force or a cooling off in the demand when, in fact, it is simply the consequence of trying to expand into too loosely bound a market. The D-Day strategy prevents this mistake. It has the ability to galvanize an entire enterprise by focusing it on a highly specific goal that is 1) readily achievable and 2) capable of being directly leveraged into a long term success. Most companies fail to cross the chasm because, confronted with the immensity of the opportunity represented by a mainstream market, they lose their focus, chasing every opportunity that presents itself, but finding themselves unable to deliver a salable proposition to any true pragmatist buyer. The D-Day strategy keeps everyone on point — if we don’t take Normandy, we don’t have to worry about how we’re going to take Paris. And by focusing our entire might on such a small territory, we greatly increase our odds of immediate success.

(pg 67)

On product-centric / pre-chasm companies compared to market-centric / post-chasm companies:

… we must shift our marketing focus from celebrating product-centric value attributes to market-centric ones. Here is a representative list of each:
Product Centric

  • Fastest product
  • Easiest to use
  • Elegant Architecture
  • Product Price
  • Unique Functionality

Market-Centric

  • Largest installed base
  • Most third party supporters
  • De facto standard
  • Cost of ownership
  • Quality of support

(pg 137)
I’m leaving out a number of other excerpts that I dog-eared because it’s late, and the excerpts probably won’t do you any good on their own anyway: read the book, and then these might jog your memory.

The Referer header, intranets and privacy

I’ve discussed meaningful URLs a number of times on this site: one of the biggest benefits of a good blog URL is that you can infer who posted the article, when it was posted and what the blog post is about. For the most part this is all ‘a good thing’. But when you’re blogging on an intranet and you create a blog post that results in a URL like this:

http://intranet.example.com/blogs/aaron/2007/02/07/our-secret-widget-is-going-to-kill-our-competition

and then in the blog post you put a couple links to your competition and embed a picture of their latest product, you’re potentially letting secrets through the firewall without even knowing it. See, HTTP has this really nice mechanism for specifying both a) what page an image is loading in and b) what page the user was on when they clicked on a link to visit the next page. It’s called the HTTP referer and it’s commonly used for good: web statistics packages (like Google Analytics or AWStats) use the referer header to show you click paths through your site and to show you what other websites are linking to you. A typical request in an Apache HTTPD log file might look something like this:

86.105.195.89 - - [06/Feb/2007:01:54:32 -0500] "GET /blogs/aaron/ HTTP/1.1" 200 34659 "http://intranet.example.com/blogs" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1; .NET CLR 2.0.50727) Gecko/20061204 Firefox/2.0.0.1"
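If you want to see what your own pages are leaking, pulling the referer out of a log line like that one is easy enough. Here is a rough sketch, assuming the log is in Apache’s ‘combined’ format as in the sample above; the class and method names are mine, purely for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the Referer field from one line of an
// Apache "combined" format access log.
final class AccessLogReferer {

    // host ident user [time] "request" status bytes "referer" "user-agent"
    private static final Pattern COMBINED = Pattern.compile(
        "(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+) \"([^\"]*)\" \"([^\"]*)\"");

    // Returns the referer, or empty if the line doesn't parse or the
    // referer was logged as "-" (meaning the client sent none).
    static Optional<String> referer(String logLine) {
        Matcher m = COMBINED.matcher(logLine);
        if (!m.matches()) {
            return Optional.empty();
        }
        String referer = m.group(8);
        if (referer.isEmpty() || referer.equals("-")) {
            return Optional.empty();
        }
        return Optional.of(referer);
    }

    public static void main(String[] args) {
        String line = "86.105.195.89 - - [06/Feb/2007:01:54:32 -0500] "
            + "\"GET /blogs/aaron/ HTTP/1.1\" 200 34659 "
            + "\"http://intranet.example.com/blogs\" "
            + "\"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1; "
            + ".NET CLR 2.0.50727) Gecko/20061204 Firefox/2.0.0.1\"";
        System.out.println(referer(line).orElse("(no referer)"));
    }
}
```

Your competition can run exactly the same kind of thing against their logs, which is the whole problem.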

But back to the point at hand: if you’re using blogs or wikis or anything that might produce a clean, understandable, meaningful URL and you or your company are serious about security, you’ll want to make sure that HTTP Referers are blocked, because you really don’t want the president of your company breathing down your neck on a Monday morning because your competition just called… and they know. Here’s how:

  • Force anyone / everyone reading your internal site to use a Firefox plugin called RefControl, which allows you to control what gets sent in the referer field per website (you can do the same thing by modifying your Firefox config, but the plugin is easier). Unless you’re the IT guy and you can force people to use this plugin, it’s doubtful this would work.
  • Force all of your outgoing links through what’s called a dereferer. Again, this is unwieldy, can probably be subverted and may not work for images.
  • Use HTTPS for all the pages on your intranet because RFC 2616 states that:

    Clients SHOULD NOT include a Referer header field in a (non-secure) HTTP request if the referring page was transferred with a secure protocol.

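    which means that even if someone does create a link to your competition’s website on the intranet, your competition won’t find out, at least as long as the link points at a plain HTTP URL (the rule only covers non-secure requests).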
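The RFC 2616 rule quoted above boils down to a one-line check. This is just a sketch of the rule as written, with made-up class and method names; it is not how any particular browser implements it.

```java
import java.net.URI;

// Hypothetical sketch of the RFC 2616 recommendation quoted above: a client
// should not send a Referer on a non-secure request when the referring page
// was fetched over a secure protocol.
final class RefererRule {

    static boolean maySendReferer(String referringPage, String target) {
        boolean fromSecure = isHttps(referringPage);
        boolean toSecure = isHttps(target);
        // Only the secure -> non-secure hop is covered by the rule.
        return !(fromSecure && !toSecure);
    }

    private static boolean isHttps(String url) {
        return "https".equalsIgnoreCase(URI.create(url).getScheme());
    }

    public static void main(String[] args) {
        String post = "/blogs/aaron/2007/02/07/our-secret-widget-is-going-to-kill-our-competition";
        // Intranet served over HTTPS, competitor linked over plain HTTP: Referer withheld.
        System.out.println(maySendReferer("https://intranet.example.com" + post, "http://example.com/"));
        // Intranet served over plain HTTP: the secret URL goes along for the ride.
        System.out.println(maySendReferer("http://intranet.example.com" + post, "http://example.com/"));
    }
}
```

Note that the check returns true for an HTTPS page linking to another HTTPS page, so the quoted rule on its own doesn’t cover that case.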

On a semi-related note, here are a couple things I learned from reading this article by Eric Lawrence (creator of the fine HTTP Fiddler Tool for Windows):

  • Fiddler has a really cool diff feature where you can select two sessions, right click and select WinDiff from the menu
  • somehow he’s got Firefox hooked up to Fiddler… I gotta learn how.
  • example.com is reserved by RFC 2606 specifically for the purpose of blog posts like this. Try the link. Who knew?

Using Outlook 2007 as an RSS aggregator: Not so much

I installed the 60 day preview of Office 2007 a couple days ago to try out some of the blogging / RSS features it included. Publishing from Word 2007 to a blog via the MetaWeblog API? Works pretty well (but seriously, Word to create a blog post?). Outlook 2007 to read feeds? I wouldn’t recommend it. First, I exported all 251 of the feeds I subscribe to using Bloglines as an OPML file and then imported those into Outlook. That part worked great, and reading posts worked great. Now, because I’m just testing, I want to delete all of these. I start looking for the ‘manage your subscriptions’ button / option. None exists. I try the multi-select using CTRL-CLICK. No luck. So I have to hand delete 251 feeds. But wait, it gets better. Some of the feeds I apparently don’t have permission to delete even though I created the subscription! I’m sure it’s a bug (since the documentation says you can use CTRL-CLICK to delete), but it sure would be nice to have a full view / window that gives you the ability to manage all your feeds in one place.

Firefox mimeTypes.rdf corruption

Came across another interesting bug today involving Firefox and mime types. Firefox uses a file called mimeTypes.rdf (stored in your profile folder) to keep track of a) what application should be opening the file you’re downloading and b) what kind of file it should tell a server it’s sending when you upload a file. And it works … for the most part. See, if you download a PDF file from a server that (incorrectly) states that the content-type of the file is ‘application/unknown’, choose to open it using Adobe Acrobat and then check the box that says ‘Do this automatically from now on’, Firefox will store that bit of knowledge away in mimeTypes.rdf. Now go and use a web application that you upload files to and which analyzes the content-type of the files you’re uploading and upload a PDF file. If you’re using LiveHTTPHeaders, you’ll notice that you’re not sending ‘application/pdf’ but instead ‘application/x-download’.

It looks like this bug was filed in bugzilla a couple times and even acknowledged in their documentation, but has yet to be fixed. You can ‘fix’ the problem by deleting your mimeTypes.rdf file and restarting Firefox.
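If you’re on the receiving end of those uploads, one defensive option is to stop trusting the content-type the browser sends and peek at the file itself. Here’s a minimal sketch, assuming all you care about is recognizing PDFs (which start with the bytes ‘%PDF-’); the class and method names are hypothetical and this isn’t from any particular web framework.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical server-side helper: prefer the file's own signature over
// whatever Content-Type the browser claimed for the upload.
final class UploadTypeSniffer {

    // PDF files begin with the ASCII bytes "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    static String effectiveContentType(String declaredType, byte[] content) {
        if (startsWith(content, PDF_MAGIC)) {
            return "application/pdf";
        }
        // Not recognizably a PDF: fall back to what the client said.
        return declaredType;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) {
        byte[] upload = "%PDF-1.4 ...rest of file...".getBytes(StandardCharsets.US_ASCII);
        // The browser claimed application/x-download, but the bytes say PDF.
        System.out.println(effectiveContentType("application/x-download", upload));
    }
}
```

That doesn’t fix the Firefox bug, but it does keep a corrupted mimeTypes.rdf on someone’s desktop from confusing your application.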

RSS/Atom feeds, Last Modified and Etags

Sometime last week I read this piece by Sam Ruby, which, summarized, says this:

…don’t send Etag and Last-Modified headers unless you really mean it. But if you can support it, please do. It will save you some bandwidth and your readers some processing.

The product I’ve been working on at work for the last couple months (which I can talk about now) uses feeds (either Atom, RSS 1.0 or RSS 2.0, your choice) extensively but didn’t have Etag or Last-Modified support, so I spent a couple hours working on it this past weekend. We’re using ROME, so the code ended up looking something like this:

import java.util.Calendar;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.sun.syndication.feed.synd.SyndFeed;
...
HttpServletRequest request = ...
HttpServletResponse response = ...
SyndFeed feed = ...
if (!isModified(request, feed)) {
  // Client's cached copy is still current: answer 304 and send no body.
  response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
} else {
  // Send validators so the client can make a conditional request next time.
  long publishDate = feed.getPublishedDate().getTime();
  response.setDateHeader("Last-Modified", publishDate);
  response.setHeader("Etag", getEtag(feed));
}
...
// The Etag value must be wrapped in double quotes.
private String getEtag(SyndFeed feed) {
  return "\"" + String.valueOf(feed.getPublishedDate().getTime()) + "\"";
}
...
private boolean isModified(HttpServletRequest request, SyndFeed feed) {
  // Only treat the request as conditional if both validators were sent.
  if (request.getHeader("If-Modified-Since") != null && request.getHeader("If-None-Match") != null) {
    String feedTag = getEtag(feed);
    String eTag = request.getHeader("If-None-Match");
    Calendar ifModifiedSince = Calendar.getInstance();
    ifModifiedSince.setTimeInMillis(request.getDateHeader("If-Modified-Since"));
    Calendar publishDate = Calendar.getInstance();
    publishDate.setTime(feed.getPublishedDate());
    // HTTP dates have one-second resolution, so drop the milliseconds.
    publishDate.set(Calendar.MILLISECOND, 0);
    int diff = ifModifiedSince.compareTo(publishDate);
    return diff != 0 || !eTag.equalsIgnoreCase(feedTag);
  } else {
    return true;
  }
}

There are only two gotchas in the code:

  1. The value of the Etag must be quoted, hence the getEtag(...) method above returning a string wrapped in quotes. Not hard to do, but easy to miss.
  2. The first block of code above uses setDateHeader(String name, long date) to set the ‘Last-Modified’ HTTP header, which conveniently takes care of formatting the given date according to the RFC 822 specification for dates and times. The published date comes from ROME. Here’s where it gets tricky: if the client returns the ‘If-Modified-Since’ header and you retrieve said date from the request using getDateHeader(String name), you’ll get a long holding the number of milliseconds since the epoch (GMT), which you can load into a Calendar instance with setTimeInMillis(...) to compare against the publish date. Both values are absolute instants, so no manual timezone conversion is needed. But there’s still one thing left: the RFC 822 date format doesn’t carry milliseconds, so if the long value you hand to setDateHeader(...) contains a millisecond value and you then compare that same value against the ‘If-Modified-Since’ header, you’ll never get a match. The easy way around that is to zero out the milliseconds on the publish date before comparing, as the isModified(...) method above does.
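Both gotchas are easier to see with the servlet and ROME pieces stripped away. The following is a self-contained sketch using plain long timestamps; the class and method names are invented for the example, and the long parameters stand in for what getDateHeader(...) and getPublishedDate().getTime() would return.

```java
// Hypothetical, dependency-free illustration of the two gotchas above.
final class ConditionalGetSketch {

    // Gotcha 1: the Etag value has to be wrapped in double quotes.
    static String quotedEtag(long publishedMillis) {
        return "\"" + publishedMillis + "\"";
    }

    // Gotcha 2: HTTP dates only have one-second resolution, so drop the
    // milliseconds before comparing (assumes a non-negative timestamp).
    static long truncateToSeconds(long millis) {
        return (millis / 1000L) * 1000L;
    }

    // ifModifiedSinceMillis stands in for request.getDateHeader("If-Modified-Since"),
    // ifNoneMatch for request.getHeader("If-None-Match").
    static boolean isModified(long ifModifiedSinceMillis, String ifNoneMatch, long publishedMillis) {
        boolean sameDate = ifModifiedSinceMillis == truncateToSeconds(publishedMillis);
        boolean sameTag = quotedEtag(publishedMillis).equals(ifNoneMatch);
        return !(sameDate && sameTag);
    }

    public static void main(String[] args) {
        long published = 1170744872123L;                    // note the 123 ms
        long echoedByClient = truncateToSeconds(published); // what survives the header round trip

        // A naive comparison never matches because of the stray milliseconds.
        System.out.println("naive match: " + (echoedByClient == published));
        // Truncating first gives the expected "not modified" answer.
        System.out.println("modified: " + isModified(echoedByClient, quotedEtag(published), published));
    }
}
```

In a real servlet you’d also want to handle the case where getDateHeader(...) returns -1 because the header is missing.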

If you’re interested, there are a number of other blogs / articles about Etags and Last-Modified headers:

ROME, custom modules, publishdate and RSS

At work, I’ve taken on the work of migrating our RSS feeds, which are currently produced using JSP, to ROME. Since we’ve added a few custom elements to the feeds available in Jive Forums (things like message and thread counts), I’m taking advantage of the feature in ROME that gives you the ability to programmatically define namespaces in your RSS 2.0, Atom 0.3 and Atom 1.0 feeds (examples: the iTunes module and the OpenSearch module). Anyway, the code I wrote to add an item to the list of available items in a feed looked something like this:

...
entry = new SyndEntryImpl();
entry.setTitle(thread.getSubject());
entry.setLink("http://mysite.com/community/threads.jspa?id=" + 
   thread.getID());
entry.setUpdatedDate(thread.getModificationDate());
entry.setPublishedDate(thread.getCreationDate());
...
JiveForumsModule module = new JiveForumsModuleImpl();
module.setReplyCount(thread.getReplyCount());
List modules = new ArrayList();
modules.add(module);
entry.setModules(modules);
...

This code works, but if you view the feed, you don’t get a publish date on the item. I dug into the ROME source code a bit and found that the publish date is stored as part of the Dublin Core module, which I came to find out is a ‘special’ module that always exists on a SyndEntryImpl object. Take a look at the implementation of the getModules() method on the SyndEntryImpl class:

public List getModules() {
  if  (_modules==null) {
    _modules=new ArrayList();
  }
  if (ModuleUtils.getModule(_modules,DCModule.URI)==null) {
    _modules.add(new DCModuleImpl());
  }
  return _modules;
}

See how the method automatically injects a DCModuleImpl into the _modules property if the DCModule doesn’t exist? Long story short, the code I wrote blew away the _modules property on the SyndEntryImpl instance which contained a single DCModule which itself contained the publishedDate date instance. So by the time the feed was produced, the publish date I set on each SyndEntry was long gone. I should have written my code like this:

JiveForumsModule module = new JiveForumsModuleImpl();
module.setReplyCount(thread.getReplyCount());
entry.getModules().add(module);

Better yet, the ROME team could have done two things:

  1. Added documentation to the setModules(List modules) method that pointed out that any information in the existing DCModule instance will be lost if the provided list doesn’t contain the existing DCModule instance.
  2. Added a method to the SyndEntry interface called addModule(Module module).
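To show the failure mode without pulling in ROME, here is a tiny stand-in model. None of these classes are ROME’s: Entry, Module and DcModule are made up and only mimic the lazy Dublin Core injection from getModules() shown above, plus the addModule(...) convenience I’m wishing for.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical stand-ins, NOT ROME classes: they only imitate the behavior
// described above so the bug can be reproduced in isolation.
final class ModuleListSketch {

    static class Module {
        final String uri;
        Module(String uri) { this.uri = uri; }
    }

    // Plays the role of the Dublin Core module that holds the publish date.
    static final class DcModule extends Module {
        Date date;
        DcModule() { super("http://purl.org/dc/elements/1.1/"); }
    }

    static final class Entry {
        private List<Module> modules;

        // Lazily injects a DcModule, like the getModules() implementation above.
        List<Module> getModules() {
            if (modules == null) {
                modules = new ArrayList<>();
            }
            if (findDc() == null) {
                modules.add(new DcModule());
            }
            return modules;
        }

        // Replaces the whole list, including any DcModule that held a date.
        void setModules(List<Module> replacement) { this.modules = replacement; }

        // The convenience method suggested above: add without replacing.
        void addModule(Module module) { getModules().add(module); }

        void setPublishedDate(Date date) { dc().date = date; }
        Date getPublishedDate() { return dc().date; }

        private DcModule dc() {
            getModules();
            return findDc();
        }

        private DcModule findDc() {
            for (Module m : modules) {
                if (m instanceof DcModule) {
                    return (DcModule) m;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Entry lossy = new Entry();
        lossy.setPublishedDate(new Date(0L));
        List<Module> fresh = new ArrayList<>();
        fresh.add(new Module("urn:example:forums"));
        lossy.setModules(fresh);                       // wipes out the DcModule
        System.out.println("after setModules: " + lossy.getPublishedDate());

        Entry safe = new Entry();
        safe.setPublishedDate(new Date(0L));
        safe.addModule(new Module("urn:example:forums"));
        System.out.println("after addModule: " + (safe.getPublishedDate() != null));
    }
}
```

With setModules(...) the date silently comes back empty; with addModule(...) it survives, which is why I’d like to see that method on the real interface.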

Open source: I’m lovin it.