
Pancake People

But today, I see within us all (myself included) the replacement of complex inner density with a new kind of self-evolving under the pressure of information overload and the technology of the “instantly available”. A new self that needs to contain less and less of an inner repertory of dense cultural inheritance—as we all become “pancake people”—spread wide and thin as we connect with that vast network of information accessed by the mere touch of a button.

via Richard Foreman via Nicholas Carr
which is interesting and everything but then I came across this quote by Steven Johnson:

But the truth is most of our information tools still have a fuzziness built into them that can, in Richard Foreman’s words, “often open doors to new worlds.” It really depends on how you choose to use the tool. Personally, I have two modes of using Google: one very directed and goal-oriented, the other more open-ended and exploratory. Sometimes I use Google to find a specific fact: an address, the spelling of a name, the number of neurons estimated to reside in the human brain, the dates of the little ice age. In those situations, I’m not looking for mistakes, and thankfully Google’s quite good at avoiding them. But I also use Google in a far more serendipitous way, when I’m exploring an idea or a theme or an author’s work: I’ll start with a general query and probe around a little and see what the oracle turns up; sometimes I’ll follow a trail of links out from the original search; sometimes I’ll return and tweak the terms and start again. Invariably, those explorations take me to places I wasn’t originally expecting to go—and that’s precisely why I cherish them. (I have a similar tool for exploring my own research notes—a program called DevonThink that lets me see semantic associations between the thousands of short notes and quotations that I’ve assembled on my hard drive.)

which I thought was relevant to Clearspace (the word serendipitous comes up more often than you’d think in product conversations) because it shows how search is more than just a directed, singular focus kind of activity that lots of people assume it to be. The first quote is telling too: all the iPhoning, Facebooking, Twittering, Flickring, Clearspacing and Emailing leaves us stretched thin: when was the last time you sat down to read something or write something longer than a single page?

Dates, Milliseconds, Java and Firebug

In Clearspace we store dates using their millisecond representation rather than as a full-fledged date in database tables because we don’t want to have to worry about how each database represents dates, but that means that whenever you’re looking at a Clearspace database table, you see values like this:

1172822399000

instead of something like this:

2007-03-01 23:59:59.000

which makes debugging dates a little harder. Firebug and JavaScript to the rescue! Open Firebug, click into the console area and then type this:

new Date(1172822399000);

and then click ‘run’ and voila. You’ve got the date. Have I told you how much I love Firebug?
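If you find yourself doing this a lot, the same conversion can be wrapped in a tiny helper. This is a minimal sketch, assuming a modern JavaScript console; the name millisToIso is made up for illustration. Note that toISOString always prints UTC, so the result reads March 2 rather than March 1: the 23:59:59 value shown above matches this instant in a UTC-8 time zone.

```javascript
// Hypothetical helper: turn a millisecond timestamp (as stored in a
// Clearspace table) into a readable UTC string.
function millisToIso(millis) {
  return new Date(Number(millis)).toISOString();
}

// Going the other way: build the millisecond value from UTC date parts.
// Months are zero-based, so 2 means March.
const backToMillis = Date.UTC(2007, 2, 2, 7, 59, 59);

console.log(millisToIso(1172822399000)); // 2007-03-02T07:59:59.000Z
console.log(backToMillis);               // 1172822399000
```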

Creating a Firefox Sidebar for Clearspace: Part II

It looks like it was almost 2 months ago that I wrote a blog post about the Clearspace plugin for Firefox (called Clearfox), promising that I would follow up with the details on the JavaScript side of the project. I guess time flies when you’re having fun.

Getting started on the JavaScript sidebar was easy. The Mozilla folks have a nice document here that shows you how to get a really simple sidebar created, but like a lot of things in software, the last 20% of the features take 80% of the time. It’s also worth noting that Firefox extensions are deployed as XPI files, which are nothing more than glorified zip files, so if you’re curious about how an extension works (say Firebug, LiveHTTPHeaders or Del.icio.us Bookmarks), you can download the extension, unzip its contents and then poke around to your heart’s content. It’s just like viewing source on HTML, which is something that other browser extensions don’t offer out of the box. Here are a couple of things that either weren’t documented in the above document or caused me to repeatedly bang my head against the wall.

Adding Your Icon

When I originally wrote about the plug-in, a number of people asked “why?” I think one of the most important things about browser plug-ins is that they bring your web application (in this case Clearspace) front and center in your customers’ everyday browsing experience. Front and center in Firefox means that your plug-in sits right next to the Back | Forward | Reload | Stop | Home buttons in the navigation bar. If you want to put a 24×24 picture of yourself in that spot, be my guest. Since the purpose of Clearfox was to embed Clearspace in the browser, I chose (wisely I think) to use the Clearspace logo. This was one of the simpler things to accomplish. I added the following element to the overlay.xul:

<toolbox id="navigator-toolbox">
  <toolbarpalette id="BrowserToolbarPalette">
    <toolbarbutton id="clearfox-button" class="clearfoxbutton-1 chromeclass-toolbar-additional"
                   observes="viewClearfoxSidebar" />
  </toolbarpalette>
</toolbox>

and then in a CSS file (that you also reference in overlay.xul), I added this:

#clearfox-button {
  list-style-image: url("chrome://clearfox/skin/clearfox.png");
  -moz-image-region: rect(0px 24px 24px 0px);
}

Finally, you’ll need to execute some JavaScript when Firefox loads:

var toolbox = document.getElementById("navigator-toolbox");
var toolboxDocument = toolbox.ownerDocument;
var hasClearfoxButton = false;
for (var i = 0; i < toolbox.childNodes.length; ++i) {
  var toolbar = toolbox.childNodes[i];
  if (toolbar.localName == "toolbar" && toolbar.getAttribute("customizable") == "true" ) {
    if (toolbar.currentSet.indexOf("clearfox-button") > -1)
      hasClearfoxButton = true;
  }
}
if (!hasClearfoxButton) {
  for (var i = 0; i < toolbox.childNodes.length; ++i) {
    toolbar = toolbox.childNodes[i];
    if (toolbar.localName == "toolbar" &&  toolbar.getAttribute("customizable") == "true"
      && toolbar.id == "nav-bar") {
      var newSet = "";
      var child = toolbar.firstChild;
      while (child) {
        if (!hasClearfoxButton && (child.id=="clearfox-button" || child.id=="urlbar-container")) {
          newSet += "clearfox-button,";
          hasClearfoxButton = true;
        }
        newSet += child.id + ",";
        child = child.nextSibling;
      }
      newSet = newSet.substring(0, newSet.length-1);
      toolbar.currentSet = newSet;
      toolbar.setAttribute("currentset", newSet);
      toolboxDocument.persist(toolbar.id, "currentset");
      BrowserToolboxCustomizeDone(true);
      break;
    }
  }
}

The key takeaways: the ID of the toolbarbutton element is used in the CSS declaration, the list-style-image property is used to specify the image you want for the button and the observes attribute points to the ID of the broadcaster element, which is used for showing / hiding the sidebar. I'm not all that good with Fireworks / Photoshop so I didn't go the extra mile to create a separate image to show when a user mouses over the button (check out the excellent del.icio.us bookmarks extension for an example), but adding a mouseover / hover image is as simple as adding another CSS rule:

#clearfox-button:hover {
  list-style-image: url("chrome://clearfox/skin/clearfox-hover.png");
}

Loading Sidebar Content

I think I mentioned this when I wrote the original post, but the thing that really got me started with the Firefox extension was looking at the source for the twitbin Firefox extension. When I checked out the source for that extension, I was stunned to learn that they were simply loading up an HTML page from their server and then refreshing the page every couple minutes using an AJAX request. I thought it must have been way more complex than that, but HTML + JavaScript + CSS with a little bit of XUL sprinkled in and you're golden. In the broadcaster element (mentioned above), you add an attribute 'oncommand' that tells Firefox what JavaScript method(s) you want invoked when the user clicks on your button; in Clearfox I use that hook to load the HTML content from Clearspace. The JavaScript looks like this:

var sidebar = top.document.getElementById("sidebar");
sidebar.contentWindow.addEventListener("DOMContentLoaded",
  Clearfox.clearfoxContentLoaded, false);
sidebar.loadURI('http://example.com/yourpage.html');

DOMContentLoaded

Once the content has loaded, the DOMContentLoaded event is fired and then Clearfox runs the clearfoxContentLoaded method. I initially tried loading the page and then re-loading that same page every couple minutes: I was cheating and trying not to have to do the AJAX part. I added a meta refresh tag to the page to have it reload every n minutes, but the DOMContentLoaded event was only fired the first time the page was loaded. Key takeaway: if you rely on DOMContentLoaded, you're only going to get the event once, even if your page reloads.

Opening New Tabs

You can run your own little application over in the sidebar if you want to, never opening new tabs or doing anything in the main window, but the main point of the Clearfox extension was to give users the ability to see new content in Clearspace and then view the full thread, document or blog post in their main browser window as a new tab. You can't create new tabs from JavaScript outside of XUL; you have to be inside XUL to create a tab. That's why the DOMContentLoaded event is important: it's where the links are all rewritten to open new tabs rather than work as normal links. There are two parts to the clearfoxContentLoaded method: the first rewrites all the links so that clicking on a link in the sidebar opens a new tab, the second adds listeners to the document so that when new content is added via AJAX, the link rewriting is repeated. The link conversion looks like this:

var sidebar = top.document.getElementById("sidebar");
var doc = sidebar.contentDocument;
var all_links = new Array();
var links = doc.evaluate("//a", doc, null, XPathResult.ANY_TYPE, null);
var link = links.iterateNext();
while (link) {
  all_links.push(link);
  link = links.iterateNext();
}
for (var i = 0; i < all_links.length; i++) {
  link = all_links[i];
  if (!link.hasAttribute('onclick') &&
    link.hasAttribute('href') &&
    (link.hasAttribute('class') && link.getAttribute('class').indexOf('tabbable') > -1)
  ) {
    var target = link.getAttribute('href');
    link.setAttribute('onclick', 'return false;');
    link.removeAttribute('target');
    link.addEventListener('click', Clearfox.clearfoxCreateOpenFunction(target), true);
  }
}

In English: get all the links that have an href and the 'tabbable' class but no onclick attribute, and add a click listener whose callback creates a new tab:

clearfoxCreateOpenFunction: function(url) {
  return function() {
    gBrowser.selectedTab = gBrowser.addTab(url);
  };
}
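The reason for a function that returns a function is that each link needs its own copy of the URL. Here is a stripped-down sketch of the same pattern that runs outside of Firefox; the `opened` array is a hypothetical stand-in for gBrowser.addTab, and the URLs are made up:

```javascript
// Hypothetical stand-in for gBrowser.addTab: just record what was "opened".
const opened = [];

// Same shape as clearfoxCreateOpenFunction: the returned callback
// remembers the url it was created with.
function createOpenFunction(url) {
  return function () {
    opened.push(url);
  };
}

// Build one handler per (made-up) link target.
const urls = ["/thread/1", "/docs/DOC-2"];
const handlers = urls.map(function (url) {
  return createOpenFunction(url);
});

// "Click" the second link, then the first.
handlers[1]();
handlers[0]();

console.log(opened); // [ '/docs/DOC-2', '/thread/1' ]
```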

The second part of the clearfoxContentLoaded method listens for any modifications (anytime something is added to the document via an AJAX call, I remove the last n rows, so in this case I listen for the removal of nodes) to the document and adds handlers for the modification event:

var sidebar = top.document.getElementById("sidebar");
var table = sidebar.contentDocument.getElementById("clearfox-hidden-content");
table.addEventListener("DOMNodeRemoved", Clearfox.clearfoxConvertLinks, false);

Debugging

Two things about debugging your extension: a) none of the tools you've got installed for debugging JavaScript / CSS (cough! Firebug cough!) will work in the sidebar window and b) your only other option is to resort to the Firefox equivalent of Java's System.out.println:

function logClearfoxMsg(message) {
  var consoleService = Components.classes["@mozilla.org/consoleservice;1"].getService(Components.interfaces.nsIConsoleService);
  consoleService.logStringMessage("Clearfox: " + message);
}

and then to log a message:

logClearfoxMsg('your advertisement here');

To see the log message, go to Tools --> Error Console.

Showing The Sidebar When Firefox Opens

I spent way more time than I should have working on this last part: when you first install the extension, you're asked to restart Firefox, which you do and then you see the nice Clearspace button, you click on it and the sidebar opens and you're happy. Then a couple days later when you need to restart Firefox again (for whatever reason), you'll notice that the sidebar is open but that no content shows and you're sad. You probably didn't take it personally, but I did and I wanted to figure out why Firefox wouldn't load up the content if the sidebar was already open. There actually is an onload attribute that I tried using in the sidebar.xul and that didn't work for reasons I won't get into here. What ended up doing the trick for me (and I really think it's a trick but I couldn't get anything else to work and this was just about the last option I had) was to check to see if the sidebar was open when Firefox was loading:

var broadcaster = top.document.getElementById('viewClearfoxSidebar');
if (broadcaster.hasAttribute('checked')) {
  ...

and then, if it is open, *forcing* the sidebar to open again:

toggleSidebar('viewClearfoxSidebar', true);

For whatever reason, this was the only way I could get the rest of my code to work, but work it did. And now I'm happy.

And I hope you are too. If you have any questions about creating a Firefox sidebar, shoot me an email, I'll be glad to help. If you want to see all the code, you can download the Clearfox source over on the Jive Software Community site. All the JavaScript / XUL code I discussed in this post is in the 'xpi' directory off the root.

Java ZipEntry bug on Windows

I rolled out the Clearfox plugin on the Jive Software Community site a couple weeks ago and got some good feedback and some bad feedback. A number of people said they tried to install the Firefox part of the plugin, restarted Firefox and then didn’t see the Clearspace icon like my screenshots / screencast showed. There were no errors in the Clearspace error logs and no errors showed up in the Firefox JavaScript debug console. With the help of a couple customers, I was able to narrow it down to running Clearspace on Windows: for some reason the zip file (really the XPI file) that the Clearfox plugin creates on the fly was invalid, at least according to Firefox. If you opened the XPI file using any common zip file utility the contents appeared to be fine. As always, Google came to the rescue and pointed me to this bug filed on bugs.sun.com, which has two parts. The Unicode file name bug didn’t matter to me, but this one did:

Within a ZIP file, pathnames use the forward slash / as separator, as required by the ZIP spec. This requires a conversion from or to the local file.separator on systems like Windows. The API (ZipEntry) does not take care of the transformation, and the need for the programmer to deal with it is not documented.

which wouldn’t hurt so much if it hadn’t been filed back in… get this… 1999. Are you kidding me?

Anyway, long story short: if you’re writing Java, creating a zip file that has paths while on a Windows based machine and deploying said zip file to a place that actually cares about the zip file specification (or violates Postel’s Law), then make sure to do something like this in your Java code:

String zipFilePath = file.getPath();
if (File.separatorChar != '/') {
  zipFilePath = zipFilePath.replace('\\', '/');
}
ZipEntry zipAdd = new ZipEntry(zipFilePath);

noting that even the workaround they give in the aforementioned bug is incorrect because they show

... file.getName();

which doesn’t contain the path separators. Awesome.
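The fix itself is just string manipulation, so here is the same idea as a language-neutral sketch in JavaScript (the rest of this post is Java, and this is not the Java API). The function name toZipEntryName and the sample paths are made up for illustration; the separator is passed in explicitly, where the Java code reads File.separatorChar:

```javascript
// Hypothetical helper: convert a local file path into the forward-slash
// form that the ZIP spec requires for entry names.
function toZipEntryName(path, separator) {
  if (separator === "/") {
    return path; // already in ZIP form, nothing to do
  }
  return path.split(separator).join("/");
}

// A Windows-style path gets its backslashes swapped out...
console.log(toZipEntryName("chrome\\content\\overlay.xul", "\\"));
// ...while a Unix-style path is returned untouched.
console.log(toZipEntryName("chrome/content/overlay.xul", "/"));
```

Both calls print chrome/content/overlay.xul, which is the form a strict zip reader expects.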

Creating a Firefox Sidebar for Clearspace: Part I

It’s been embarrassingly quiet on this blog of late. I apologize for all the delicious links, although a case could be made that blogs were originally nothing more than sharing links so maybe I shouldn’t be apologizing, but that’s a different blog post. Today I want to talk about the thing I’ve been working on at night, my non-day job if you will. A couple months ago I was reading all the hype about how JavaScript is going to take over the world and I’d been doing a lot of JavaScript during the day but I needed a project to get me through the night. Right about that time was when Twitter started taking off and I came across twitbin, which is a cool Firefox extension that shows you all of your friends’ tweets in a Firefox sidebar (the same sidebar that livehttpheaders and selenium IDE show up in), updated in real time. Like any good hacker, I wondered “how’d they do that” and started poking around the xpi file that you download to install Firefox extensions. Lo and behold, it’s JavaScript and HTML behind the scenes. Since Clearspace is one of those addictive, constantly updating, can’t get enough of it kind of applications (unlike Twitter you can actually use more than 140 characters! amazing!), I thought it would be both useful and potentially easy to create a sidebar for Clearspace, which brings me to this blog post.

I’m not sure where I started, but I’m pretty sure the first step wasn’t to create a plugin in Clearspace. I messed around for a while with the technologies that go into creating a Firefox Extension: JavaScript, XUL (pronounced ‘zool’), install manifests, etc. I got a sample Firefox sidebar up and running and spent way more time than I should have installing, viewing, uninstalling and restarting Firefox. I spent a lot of time digesting these sites:

I’ll discuss some of the things I ran into on the Firefox / JavaScript side in a second blog post: right now I want to talk about the Clearspace side of the plugin.

Eventually, I got to a place where I was comfortable enough with the Firefox side of things to create the Clearspace part of the plugin. The content that is displayed in the sidebar is nothing more than plain HTML and CSS. If you install the Clearspace plugin you can actually view the content in any browser: go to http://example.com/clearspace/cf-view.jspa (replacing example.com with the host name of your installation). You’ll see that the content looks surprisingly similar to the content that shows up on the homepage of Clearspace, and in fact, it is the same content. The one difference between this page and other pages in Clearspace is that it never does a page reload. All the views (the login page, the settings page, the view page, etc.) are included in the resulting HTML, but hidden using CSS so that you only see one view at a time. When you click a button at the top of the page, two things usually happen: a) the current view is hidden using CSS and the new view is displayed using CSS and b) an AJAX request is sent out to a WebWork action that performs some action: logging you out, getting the updated content since the last time you viewed the page, saving your Clearfox settings, etc. This process might seem a little backwards: why not just work the same way a regular web page browsing session works where you click a link and your browser loads another page? It’s actually a limitation imposed by the Firefox extension model, so I won’t go into it here, but if you’re curious, you can do a search for ‘DOMContentLoaded firefox extension’.

So now that you (hopefully) understand how the client works, I’m going to assume that you won’t have any problems copying the ‘example’ plugin that Clearspace ships with and dive right into the pieces that are distinctive about the Clearspace part of the Clearfox plugin.

First, I needed to define the WebWork actions that Clearfox was going to use, which means I needed to create the xwork-plugin.xml file. The descriptor for the view that I mentioned earlier looks like this:

<action name="cf-view"
   class="com.jivesoftware.clearspace.plugin.clearfoxplugin.ViewAction">
   <result name="success">/plugins/clearfox/resources/view.ftl</result>
   <result name="update">/plugins/clearfox/resources/update.ftl</result>
</action>

That action is pretty standard. There are two possible results: the ‘success’ result shows the list of the 25 most recently updated pieces of content, and the ‘update’ result is used by an AJAX request to update the existing page with the n most recently updated pieces of content since the last refresh (which by default is invoked every 5 minutes). The ViewAction class extends com.jivesoftware.community.action.MainAction, which is the WebWork action that handles the display of the homepage of Clearspace, so ViewAction simply invokes

super.execute();

to get the same content you’d get if you viewed the homepage. When the Clearfox plugin refreshes every 5 or so minutes it invokes the

public String doUpdate()

method, passing in the time (in milliseconds) that the last piece of content it has a record of was updated. It gets back a (presumably) shorter list of content that it then updates the view with.
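The bookkeeping behind that refresh is simple enough to sketch. This is not the Clearfox code, just an illustration of the idea under made-up assumptions: content items are plain objects with a `modified` field holding milliseconds, and the function names are hypothetical.

```javascript
// Hypothetical: find the most recent modification time the client knows about.
function newestTimestamp(items) {
  return items.reduce(function (max, item) {
    return Math.max(max, item.modified);
  }, 0);
}

// Hypothetical: what the server side conceptually does on an update
// request, returning only the items changed after the given time.
function updatedSince(items, since) {
  return items.filter(function (item) {
    return item.modified > since;
  });
}

// Made-up sample data.
const shown = [
  { title: "Old thread", modified: 1000 },
  { title: "Newer doc", modified: 2000 }
];
const onServer = shown.concat([{ title: "Fresh blog post", modified: 3000 }]);

const since = newestTimestamp(shown);          // 2000
const fresh = updatedSince(onServer, since);   // only the blog post
console.log(fresh.map(function (item) { return item.title; }));
```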

The login, logout and settings actions are all used in combination with AJAX and since none of them need to provide any data, they all return HTTP status code headers:

<action name="cf-login"
   class="com.jivesoftware.clearspace.plugin.clearfoxplugin.LoginAction">
   <result name="success" type="httpheader">
      <param name="status">200</param>
   </result>
   <result name="unauth" type="httpheader">
      <param name="status">401</param>
   </result>
</action>
<action name="cf-logout"
   class="com.jivesoftware.clearspace.plugin.clearfoxplugin.LogoutAction">
   <result name="success" type="httpheader">
      <param name="status">200</param>
   </result>
</action>
<action name="cf-settings"
   class="com.jivesoftware.clearspace.plugin.clearfoxplugin.SettingsAction">
   <result name="success" type="httpheader">
      <param name="status">200</param>
   </result>
</action>

The last action is used by the Firefox Extension framework to determine if the plugin (on the Firefox side) needs to be updated. The action descriptor looks like this:

<action name="cf-updater"
   class="com.jivesoftware.clearspace.plugin.clearfoxplugin.UpdaterAction">
   <result name="success">
      <param name="location">/plugins/clearfox/resources/updater.ftl</param>
      <param name="contentType">text/rdf</param>
   </result>
</action>

The one trick about this one is that the result sets an HTTP contentType header, which isn’t remarkable except that Clearspace uses Sitemesh, which attempts to decorate everything it can parse with a header and footer. The contentType header should be a hint to Sitemesh that you don’t want the result to be decorated / wrapped with a header and footer, but apparently the hint is too subtle for Sitemesh because it attempts to wrap the result of this action anyway, which leads to the second distinctive part of this plugin.

The way we bypass Sitemesh decoration in the core product is by modifying a file called templates.xml, which gives us the ability to tell Sitemesh to exclude certain paths from being decorated. In Clearspace 1.7, which will be out in a couple weeks, plugins (which can’t modify templates.xml) will have the ability to tell Sitemesh about paths that they don’t want decorated. Hence the following entry in the plugin descriptor (always located at the root of the plugin and named plugin.xml):

<sitemesh>
   <excludes>
      <pattern>/cf-updater.jsp*</pattern>
   </excludes>
</sitemesh>

Third, another thing you may have noticed if you were following along with the source code (which is available from clearspace.jivesoftware.com) is that two of the action classes (ViewAction and LoginAction if you must know) are marked with the class annotation ‘AlwaysAllowAnonymous’. This annotation tells the Clearspace security interceptor that anonymous users should be allowed to invoke the action without being logged in. This is probably a good place to remind you of one of the differences between Clearspace and ClearspaceX. Clearspace, by default, is configured to *require* users to login before they can see any content while ClearspaceX uses the opposite default: you only need to login (usually) if you want to post content. So back to the AlwaysAllowAnonymous annotation: it’s important mostly for the Clearspace (not ClearspaceX) implementations because you want to give people the ability to invoke the action and then the action itself handles the display of the login page in its own specific way. The RSS related actions in Clearspace work exactly the same way: they are all annotated with the AlwaysAllowAnonymous marker and then handle security via HTTP Basic Auth (Clearspace) or simply allow anonymous usage (ClearspaceX) because feed readers

Fourth, because the Firefox part of the plugin needs to be specific to your installation, all the Firefox plugin related files need to be zipped up to create the XPI file that your users will install into Firefox. The plugin framework gives you the ability to define a plugin class:

<class>com.jivesoftware.clearspace.plugin.clearfoxplugin.ClearFoxPlugin</class>

which implements com.jivesoftware.base.plugin.Plugin. The interface specifies a method:

public void initializePlugin(PluginManager manager, PluginMetaData pluginData);

which means that your plugin will get a chance to initialize itself. The Clearfox plugin uses this initialization hook to zip up the Firefox related files. The plugin is told where it lives:

public void initializePlugin(PluginManager manager, PluginMetaData metaData) {
   File pluginDir = metaData.getPluginDirectory();
   ... 

and then goes on to zip up the files located in the xpi directory of the plugin source.

So that’s that… if you got this far you probably don’t care, but I actually did a screencast of the whole thing that lives over here or you can check it out on blip.tv.

Fun with Supporting Multiple Databases

In my day job over at Jive Software I get to work on the crazy cool Clearspace product and unless you’ve been reading the system requirements lately, you probably didn’t notice that we support six different database platforms: MySQL, Oracle, Postgres, DB2, SQL Server and HSQLDB. Clearspace borrowed a number of classes from Jive Forums so a lot of the database specific code has already been fleshed out, but I ran into a couple interesting things this past week that I thought needed to be blogged.

The first thing I ran into was a varchar case sensitivity issue: DB2, Oracle, Postgres and HSQLDB are all sensitive about case with respect to the values they store. So if you do this:

INSERT INTO appuser(username,password) VALUES('Administrator','password');

and then try to retrieve the user:

SELECT username,password FROM appuser WHERE username = 'administrator'

you’ll get different results than you will with MySQL and SQL Server (which are both case insensitive by default). Bottom line: if you plan on supporting an application on a variety of database servers, make sure you toLowerCase() or toUpperCase() the string values you store if you plan on doing look ups against those values. If you’re into this kind of thing, page 87 of the SQL 92 standard appears to discuss the case sensitivity issues, but I can’t make heads or tails of it. Any good specification readers out there?
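Here is the normalize-on-both-sides idea as a small sketch. It uses an in-memory Map as a stand-in for the appuser table, and the function names are made up; the point is only that the stored key and the lookup key go through the same toLowerCase() call, so the result no longer depends on how the database compares strings.

```javascript
// Hypothetical in-memory stand-in for the appuser table,
// keyed by the lower-cased username.
const appuser = new Map();

function storeUser(username, password) {
  // Normalize before storing...
  appuser.set(username.toLowerCase(), { username: username, password: password });
}

function findUser(username) {
  // ...and normalize the same way before looking up.
  return appuser.get(username.toLowerCase());
}

storeUser("Administrator", "password");
console.log(findUser("administrator") !== undefined); // true
console.log(findUser("ADMINISTRATOR") !== undefined); // true
console.log(findUser("someone-else") !== undefined);  // false
```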

The second thing involved Oracle, which has this wonderful feature where you can’t insert an empty string into a varchar column. Example:

INSERT INTO appuser(username, password) VALUES('administrator', '');

Bennett McElwee blogged about this feature in more detail on his blog, but the gist of this nugget of goodness is that if you attempt to insert an empty string into an Oracle varchar, your empty string will get converted to

null

which is not helpful in the least bit. Consider yourself warned.

Finally, and again Oracle. Assume you have the sample table I used above with two columns: username and password. Assume further that you want to modify the username column to be 100 chars instead of 50. You write an ALTER statement that looks something like this for SQL Server:

ALTER TABLE appuser ALTER COLUMN username nvarchar(100) NOT NULL

on MySQL you’d have this:

ALTER TABLE appuser MODIFY COLUMN username nvarchar(100) NOT NULL

on DB2:

ALTER TABLE appuser ALTER COLUMN username varchar(100) NOT NULL

All very similar. But Oracle, no, if you tried this:

ALTER TABLE appuser MODIFY(COLUMN username varchar(100) NOT NULL)

on Oracle you’d get this error:

ORA-01451: column to be modified to NULL cannot be modified to NULL

See, Oracle, for whatever reason, decided that if you are going to modify the column, you can ONLY specify the nullability of a column IF the nullability itself is changing. Which is ridiculous, er, nice.

And yes, nullability is a word.

Using ROME to get the body / summary of an item

I’ve been using ROME for a couple years now and I’m still learning new things. Today I was working on an issue in Clearspace where we give users the ability to show RSS / Atom feeds in a widget, optionally giving them the choice to show the full content of each item in the feed or just a summary of each item in the feed. The existing logic / pseudo-code looked something like this:

for (SyndEntry entry : feed.getEntries()) {
  if (showFullContent) {
    write(entry.getContents()[0].value);
  } else {
    write(entry.getDescription().value);
  }
}

The assumption was that description would return a summary and contents would return the full content. The problem is that Atom and RSS are spec’ed umm.. differently. RSS 2.0 says that ‘description’ is a synopsis of the item but then goes on in an example to show how the description can be much more than just a short plain text description. So then you’re left with descriptions that aren’t really a synopsis, it’s the full content… or it is sometimes and sometimes not. Then Atom came along with well defined atom:summary and atom:content elements which means ROME had to figure out a way to map description and content-encoded elements in RSS to atom:summary and atom:content. Dave Johnson summarized the mappings nicely in a blog post discussing the release of ROME 0.9, in short the mapping looks like this:

RSS <description> <--> SyndEntry.description <--> Atom <summary>
RSS <content:encoded> <--> SyndEntry.contents[0] <--> Atom <content>

Anyway, all this is to say that if you’re doing any work with SyndEntry, you’ll need to check both description and contents. Generally, if you’re looking for the full content, check the value of contents first. If that’s null, check the value of description. If you’re looking for a summary, check the value of description first BUT don’t assume that you’ll actually get a short summary. Use something like StringUtils.abbreviate(…) to make certain that you’ll get a short summary back and not the entire content.
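Here is that fallback logic as a runnable sketch. It is not ROME code: the entry is a simplified, hypothetical plain object (a `description` string or null, and a `contents` array of strings), and `abbreviate` is a rough stand-in for StringUtils.abbreviate. Falling back to the contents when there is no description at all is my own assumption.

```javascript
// Rough stand-in for StringUtils.abbreviate: cut the text so that the
// result, including the trailing "...", is at most maxLength characters.
function abbreviate(text, maxLength) {
  if (text.length <= maxLength) {
    return text;
  }
  return text.slice(0, maxLength - 3) + "...";
}

// Full content: prefer contents[0], fall back to description.
function fullContent(entry) {
  const first = entry.contents && entry.contents.length > 0 ? entry.contents[0] : null;
  if (first != null) {
    return first;
  }
  return entry.description != null ? entry.description : "";
}

// Summary: prefer description, but never trust it to be short.
function summary(entry, maxLength) {
  const text = entry.description != null ? entry.description : fullContent(entry);
  return abbreviate(text, maxLength);
}

// Made-up entries: an RSS-style item with only a description, and an
// Atom-style item with only content.
const rssLike = { description: "A short synopsis", contents: [] };
const atomLike = { description: null, contents: ["0123456789"] };

console.log(fullContent(rssLike));  // A short synopsis
console.log(summary(atomLike, 8));  // 01234...
```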

Debugging SOAP / XFire with ethereal

I’ve spent way more time than I should have the last couple weeks working to help migrate a website built against Jive Forums to run against a Clearspace X instance. As part of the migration, one of the things I did was to move all the data syndication that had been done with RSS and custom namespaces to use the Clearspace SOAP API, which is built on a technology called XFire. The first problem I ran into was that the production website was configured so that requests to http://example.com were redirected to http://www.example.com/, which resulted in errors like this in the logs:

Jul 5, 2007 11:30:11 PM org.apache.commons.httpclient.HttpMethodDirector isRedirectNeeded
INFO: Redirect requested but followRedirects is disabled

That error was pretty easy to fix (swap in http://www.example.com in place of http://example.com), but the next thing I ran into was way less intuitive. When I invoked a certain service, I’d get a stack trace that looked like this:

Exception in thread "main" org.codehaus.xfire.XFireRuntimeException: Could not invoke service.. 
Nested exception is org.codehaus.xfire.fault.XFireFault: Unexpected character '-' (code 45) in prolog; expected '<'
 at [row,col {unknown-source}]: [2,1]
org.codehaus.xfire.fault.XFireFault: Unexpected character '-' (code 45) in prolog; expected '<'
 at [row,col {unknown-source}]: [2,1]
	at org.codehaus.xfire.fault.XFireFault.createFault(XFireFault.java:89)
	at org.codehaus.xfire.client.Client.onReceive(Client.java:386)

which was troubling because the exact same SOAP method invocation worked fine on both my local machine and in the test environment. What was different? Two things: the production system was running on Java 6 and the production system was configured to run behind an Apache HTTP server proxied by mod_caucho versus no Apache HTTP server / proxy in development or on my machine. I needed to see what was going on between the server and the client (one of the things that makes SOAP so hard is that you can't just GET a URL to see what's being returned) so I fired up ethereal at the behest of one of my coworkers. I kicked off a couple of SOAP requests with ethereal running, recorded the packets and then analyzed the capture. Said coworker then pointed out the key to debugging HTTP requests with ethereal: right click on the TCP packet you're interested in and then click 'Follow TCP Stream'. The invocation response looked like this when run against the development environment:

HTTP/1.1 200 OK
Date: Mon, 02 Jul 2007 21:59:30 GMT
Server: Resin/3.0.14
Content-Type: multipart/related; type="application/xop+xml"; start=""; start-info="text/xml"; .boundary="----=_Part_5_25686393.1183413571061"
Connection: close
Transfer-Encoding: chunked

1dce

------=_Part_5_25686393.1183413571061
Content-Type: application/xop+xml; charset=UTF-8; type="text/xml"
Content-Transfer-Encoding: 8bit
Content-ID: 
...

and looked like this when invoked against the production instance:

HTTP/1.1 200 OK
Date: Mon, 02 Jul 2007 21:41:56 GMT
Server: Apache/2.0.52 (Red Hat)
Vary: Accept-Encoding,User-Agent
Cache-Control: max-age=0
Expires: Mon, 02 Jul 2007 21:41:56 GMT
Transfer-Encoding: chunked
Content-Type: text/plain; charset=UTF-8
X-Pad: avoid browser bug

24e

------=_Part_29_31959705.1183412516805
Content-Type: application/xop+xml; charset=UTF-8; type="text/xml"
Content-Transfer-Encoding: 8bit
Content-ID: 
...

Notice the different content type returned by the production server? So then the mystery became not 'what?' but 'who?' I googled around for a bit and found a bug filed against JIRA that had all the same symptoms as the problem I was running into: the solution posted in the comments of the bug said that the problem was with mod_caucho. I worked with the ISP that hosts the production instance of Clearspace, got them to remove mod_caucho and use mod_proxy to isolate that piece of the puzzle and sure enough, the problem went away. Our ISP recommended that we not settle for mod_proxy for the entire site and instead wrote up a nifty solution using mod_rewrite and mod_proxy, which I've pasted below:

 RewriteRule ^/clearspace/rpc/soap(/?(.*))$ to://www.example.com:8080/clearspace/rpc/soap$1
 RewriteRule ^to://([^/]+)/(.*)    http://$1/$2   [E=SERVER:$1,P,L]
 ProxyPassReverse /community/rpc/soap/ http://www.example.com/clearspace/rpc/soap/

Hope that helps someone down the road!

Free Ticket to OSCON

Gotcha! Please excuse the corporate shilling for a second: the company I work for, Jive Software, is running a pretty cool promotion right now: write up a blog post about how you or your company is using Clearspace, Jive Forums, Openfire or Spark and if your blog post is selected as the most thorough, detailed, entertaining and well.. you get the idea. Bottom line: if you win, Jive will pay for either your airfare and hotel in Portland or will buy your ticket to OSCON (which is happening in only a couple weeks!). You can get the whole scoop over on the Jive Talks blog.