Fun with Supporting Multiple Databases

In my day job over at Jive Software I get to work on the crazy cool Clearspace product and unless you’ve been reading the system requirements lately, you probably didn’t notice that we support six different database platforms: MySQL, Oracle, Postgres, DB2, SQL Server and HSQLDB. Clearspace borrowed a number of classes from Jive Forums so a lot of the database specific code has already been flushed out, but I ran into a couple interesting things this past week that I thought needed to be blogged.

The first thing I ran into was a varchar case sensitivity issue: DB2, Oracle, Postgres and HSQLDB are all sensitive about case with respect to the values they store. So if you do this:

INSERT INTO appuser(username,password) VALUES('Administrator','password');

and then try to retrieve the user:

SELECT username,password FROM appuser WHERE username = 'administrator'

you’ll get different results than you will with MySQL and SQL Server (which are both case insensitive by default). Bottom line: if you plan supporting an application on a variety of database servers, make sure you toLowerCase() or toUpperCase() the string values you store if you plan on doing look ups against those values. If you’re into this kind of thing, page 87 of the SQL 92 standard appears to discuss the case sensitivity issues, but I can’t make heads of tails of it. Any good specification readers out there?

The second thing involved Oracle, which has this wonderful feature where you can’t insert an empty string into a varchar column. Example:

INSERT INTO appuser(username, password) VALUES('administrator', '');

Bennett McElwee blogged about this feature in more detail on his blog, but the jist of this nugget of goodness is that if you attempt to insert an empty space into an Oracle varchar, your empty space will get converted to

null

which is not helpful in the least bit. Consider yourself warned.

Finally, and again Oracle. Assume you have the sample table I used above with two columns: username and password. Assume further that you want to modify the username column to be 100 chars instead of 50. You write an ALTER statement that looks something like this for SQL Server:

ALTER TABLE appuser ALTER COLUMN username nvarchar(100) NOT NULL

on MySQL you’d have this:

ALTER TABLE appuser MODIFY COLUMN username nvarchar(100) NOT NULL

on DB2:

ALTER TABLE appuser ALTER COLUMN username varchar(100) NOT NULL

All very similar. But Oracle, no, if you tried this:

ALTER TABLE appuser MODIFY(COLUMN username varchar(100) NOT NULL)

on Oracle you’d get this error:

ORA-01451: column to be modified to NULL cannot be modified to NULL

See Oracle, for whatever reason, decided that if you are going to modify the column that you can ONLY specify the nullability of a column IF the nullability itself is changing. Which is ridiculous nice.

And yes, nullability is a word.

Using ROME to get the body / summary of an item

I’ve been using ROME for a couple years now and I’m still learning new things. Today I was working on an issue in Clearspace where we give users the ability to show RSS / Atom feeds in a widget, optionally giving them the choice to show the full content of each item in the feed or just a summary of each item in the feed. The existing logic / pseudo-code looked something like this:

for (SyndEntry entry : feed.getEntries()) {
  if (showFullContent) {
    write(entry.getContents()[0].value);
  } else {
    write(entry.getDescription().value);
  }
}

The assumption was that description would return a summary and contents would return the full content. The problem is that Atom and RSS are spec’ed umm.. differently. RSS 2.0 says that ‘description’ is a synopsis of the item but then goes on in an example to show how the description can be much more than just a short plain text description. So then you’re left with descriptions that aren’t really a synopsis, it’s the full content… or it is sometimes and sometimes not. Then Atom came along with well defined atom:summary and atom:content elements which means ROME had to figure out a way to map description and content-encoded elements in RSS to atom:summary and atom:content. Dave Johnson summarized the mappings nicely in a blog post discussing the release of ROME 0.9, in short the mapping looks like this:

RSS <description> <--> SyndEntry.description <--> Atom <summary>
RSS <content:encoded> <--> SyndEntry.contents[0] <--> Atom <content>

Anyway, all this is to say that if you’re doing any work with SyndEntry, you’ll need to check both description and contents. Generally, if you’re looking for the full content, check the value of contents first. If that’s null, check the value of description. If you’re looking for a summary, check the value of description first BUT don’t assume that you’ll actually get a short summary. Use something like StringUtils.abbreviate(…) to make certain that you’ll get a short summary back and not the entire content.

Finding the number of files of a certain file type that a process has open

Used this week when trying to debug a problem with an application that was resulting in errors that said “too many files open”:

/usr/sbin/lsof -p $processID | grep .$ext | wc -l

where $processID is the ID of the process that is opening the files and $ext is the (most likely) three letter file extension of the file that is being open.

Also, if you run:

ulimit

and you get ‘unlimited’ as the result, you’d be wrong to assume that you have an unlimited number of file handles available to you:

ulimit -n

will give you the total number of file handles you’re allowed.

And that’s one to grow on.

More here.

Links: 9-8-2007