All posts by ajohnson

Bulk import of email addresses into Movable Type notifications

One of the blogs on my Movable Type installation is actually using the notifications feature built into Movable Type (which gives you the ability to let people subscribe to a blog via email) and needed to do a bulk import of email addresses that he already had in csv format. Unfortunately, Six Apart doesn’t provide a way to do a bulk import so I dug into google. Because the install is backed by MySQL, the solution was to use the LOAD DATA INFILE command, which reads a csv file from disk and loads the data into a table you specify. In case someone else ever needs to do this with Movable Type, the syntax I ended up using looked like this:

LOAD DATA LOCAL INFILE 'contacts.csv' 
INTO TABLE mt_notification 
FIELDS TERMINATED BY ',' 
LINES TERMINATED BY '\n' 
(notification_email,notification_blog_id,notification_name)

and my contacts.csv file looked like this:

ajohnson@cephas.net,10,Aaron Johnson
...

Also interesting to note that according to the MySQL documentation, the LOAD DATA INFILE command is usually 20 times faster (!) than using separate INSERT INTO statements.

Links: 9-21-2006

Links: 9-13-2006

Links: 9-8-2006

AtomicInteger

A couple weeks ago I came across some code that used an instance of an AtomicInteger, which is part of the java.util.concurrent.atomic package, which I believe the majority of which was written by Doug Lea, author of the excellent util.concurrent library. I wrote the class name down in my todo.txt thinking that it would be interesting to find out how AtomicInteger (and all the other atomic related classes) give you the ability to increment a shared integer in a multi-threaded environment, but I didn’t get around to it until this past weekend. If you’re not familiar with the atomic classes in JDK 5.0, the big takeaway is that they “…support lock-free thread-safe programming on single variables”. In code that means that if you have this:

public final class Counter {
  private long value = 0;
  public synchronized long getValue() {
    return value;
  }

  public synchronized long increment() {
    return ++value;
  }
}

you should be using something more like this:

public class NonblockingCounter {
    private AtomicInteger value;

    public int getValue() {
        return value.get();
    }

    public int increment() {
        int v;
        do {
            v = value.get();
        while (!value.compareAndSet(v, v + 1));
        return v + 1;
    }
}


Regardless of your familiarality, herewith are some notes on my research:

  • I downloaded the source code for JDK 5.0 and popped open the source for AtomicInteger, skipping down to the getAndIncrement method. The source for that method is curious, to say the least. It starts with a for loop that never ends, retrieves the current int value, adds one to that value and then fires the compareAndSet method, only returning if the compareAndSet method returns true. It took me a couple minutes to wrap my brain around this small block of code: no locking and what looks like an endless loop. Here it is if you don’t feel like downloading the code yourself:
        public final int getAndIncrement() {
            for (;;) {
                int current = get();
                int next = current + 1;
                if (compareAndSet(current, next))
                    return current;
            }
        }
    

    What’s remarkable about this block is that it seemingly ignores the fact that it was designed to be accessed simultaneously by multiple threads… and in fact that’s exactly how it was designed. If this method were a conversation between a person and a machine, it might go something like this:

    Person: Get me the current value!
    Machine: 1
    Person: Add one to that value!
    Machine: 2
    Person: If 1 is equal to the current value, then set the current value to 2, otherwise, let’s do it all over again!
    Machine: Try again.
    Person: WEEE! Let’s try again!

    I think the exclamation points are apt in this case because it turns out that the atomic classes use algorithms that are sometimes called ‘optimistic’ (which is probably not a trait that you ascribe to most programmers unless they’re talking about schedules) but are described on Wikipedia (and probably elsewhere) as ‘lock-free’ or ‘wait-free’ algorithms. They’re optimistic in the sense that they assume the best case:

    I’m sure that no other threads will be accessing the method when I’m using the method. Even if they do, I don’t care, I’ll just try again (and again and again)…

    Optimistic. The pessimistic version assumes the worst case:

    I’m sure some other thread will be trying to access the method when I’m using the method. I’ll have to block them from using it every time I invoke this method.

    If you check out the Wikipedia article I linked to above, you’ll see the implementation of these lock-free algorithms use atomic primitives, the notable one being the ‘compare and swap‘; the getAndIncrement method on the AtomicInteger class in Java has a corresponding compareAndSet method, which is backed by a class called Unsafe, which itself doesn’t have publicly available documentation. However, if you download the JDK 5.0 source code like I did and peek at it, you’ll see a method that looks like this:

    public final native boolean compareAndSwapInt(Object o, long offset, int expectedValue, int newValue);
    

    And that’s where the source code adventure stopped. If I’m understanding the method signature correctly, it means Sun has a native library (written in C?) for every OS that sets the value of the Object o at memory location ‘offset’ to value ‘newValue’ if the current value is equal to ‘expectedValue’.

  • I found a really really great article on IBM developerWorks by Brian Goetz1 called “Going atomic” (you’ll probably recognize the Counter and NonBlockingCounter examples I posted above). In it he does a great job describing compare and swap and the difference between a lock-free and wait-free implementation and then, just to show off, shows some benchmarking graphs that compare synchronization, ReentrantLock, fair Lock and AtomicLong on an 8-way Ultrasparc3 and a single-processor Pentium 4.

    You should also read his articles on developerWorks called More flexible, scalable locking in JDK and Introduction to nonblocking algorithms.

  • Dave Hale wrote an article that discusses the use of AtomicIntegers in a multi-threaded environment called “Parallel Loops with Atomic Integers“. He also pointed out an article published in the March 2005 edition of Dr. Dobb’s Journal called “The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software“.

Hope you learned something today!

1: I saw Brian speak at the No Fluff Just Stuff Symposium in Boston last fall. He’s as good in person as he is on paper. Go hear him speak if you have a chance.

XSL / CSS Processing Instructions using ROME

Have you seen the way that the smart guys at FeedBurner display RSS feeds in a browser (here’s a sample if you haven’t)? If you’re like me, the first time you see a feed they manage, you’ll probably think that you’re viewing a page that contains a link to an RSS or ATOM feed, not the actual feed. In fact what you’re seeing is the feed transformed by your browser using XSL and CSS. Take a peek at the source and you’ll see that the XSL and CSS transformations are produced by what are technically called processing instructions. I won’t go into the work that they’ve done to create that look (but if you poke around the source it’s not trivial), but the inclusion of the processing instruction… now that’s something I can help you out with. I’ve used ROME on a couple different projects now because it is trivial to create RSS, RSS2 or ATOM feeds with only a couple lines of code. There are a number of tutorials up on the ROME site that show you how to create a feed and then write to a String, a File or a Writer which you should check out if you don’t have any experience with ROME. Assuming that you do however and given that you have a feed (technically a SyndFeed in ROME), you would write it out in a servlet like this:

SyndFeedOutput output = new SyndFeedOutput();
output.output(feed, response.getWriter());

ROME abstracts you from having to work directly with an XML document, which is handy most of the time. But if you want to add a processing instruction to style your feed, it takes a little fiddling with the source. If you peek at the source of SyndFeedOutput, you’ll see that it wraps WireFeedOutput. WireFeedOutput uses a JDOM Document and the XMLOutputter class to create write the feed. Conveniently, the Document class has methods for adding processing instructions, in fact it has a ProcessingInstruction class. Putting all these things together, you’d get this if you were creating the RSS feeds for FeedBurner using ROME:

WireFeedOutput feedOutput = new WireFeedOutput();
Document doc = feedOutput.outputJDom(feed.createWireFeed());
// create the XSL processing instruction
Map xsl = new HashMap();
xsl.put("href", "http://feeds.feedburner.com/~d/styles/rss2full.xsl");
xsl.put("type", "text/xsl");
xsl.put("media", "screen");
ProcessingInstruction pXsl = new ProcessingInstruction("xml-stylesheet", xsl);
doc.addContent(0, pXsl);
// create the CSS processing instruction
Map css = new HashMap();
css.put("href", "http://feeds.feedburner.com/~d/styles/itemcontent.css");
css.put("type", "text/css");
css.put("media", "screen");
ProcessingInstruction pCss = new ProcessingInstruction("xml-stylesheet", css);
doc.addContent(1, pCss);
// write the document to the servlet response
XMLOutputter outputter = new XMLOutputter(format);
outputter.output(doc,response.getWriter());

So now that I’ve made that easy for you, who’s going to show me the website that has a bunch of easy to use CSS and XSL templates for transforming RSS and ATOM feeds? Better yet, when are browsers going to have this kind of logic baked in so that my mom doesn’t have to look at an RSS feed that looks like a bunch of gobbly gook XML?