HTTP Spider & Lucene

Spent the majority of my day today refactoring the HTTP spider & Lucene indexing application I’ve been writing on and off for the last couple months as a learning exercise. One of the first things I did was modify the 3 modules to implement the Runnable interface rather than extending the Thread class. Big thanks to Joe for his detailed thoughts on the subject. Probably the biggest reason for doing so is that Java only allows a class to extend one superclass, so implementing Runnable leaves the classes (a class that handles retrieving web pages, a class that indexes the web pages using Lucene and a class that saves the resulting web pages to a database) free to extend some type of task/thread class that I’d want to implement in the future (again, a Joe suggestion).
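Here’s a minimal sketch of what that buys you. The names TaskBase and PageFetcher are made up for illustration (they aren’t the classes in my source), and the run() method only counts instead of actually fetching anything:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical shared base class; the name and its single field are assumptions.
abstract class TaskBase {
    private final String name;

    TaskBase(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hypothetical fetcher module. Because it implements Runnable instead of
// extending Thread, it is still free to extend TaskBase.
class PageFetcher extends TaskBase implements Runnable {
    final AtomicInteger pagesFetched = new AtomicInteger();

    PageFetcher() {
        super("fetcher");
    }

    @Override
    public void run() {
        // Real code would retrieve a web page here; this sketch only counts.
        pagesFetched.incrementAndGet();
    }
}

class RunnableDemo {
    public static void main(String[] args) throws InterruptedException {
        PageFetcher fetcher = new PageFetcher();
        // The Thread is just the vehicle; the work lives in the Runnable.
        Thread worker = new Thread(fetcher, fetcher.getName());
        worker.start();
        worker.join();
        System.out.println(fetcher.getName() + " ran " + fetcher.pagesFetched.get() + " time(s)");
    }
}
```

The same PageFetcher could also be handed to a thread pool later without changing the class, which is another reason to keep the work separate from the Thread.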

After completing that, I explored the various ways in which one might interface with the software… the only way (right now) being via the command line with multiple arguments. Since remembering command line arguments can be tedious, I looked at the Properties class, which lets you load a text file of key/value pairs and then read and change them with getProperty() and setProperty(). Java.sun.com has an introduction to the System and Properties classes.
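A rough sketch of how that could look. The key names (start.url, max.threads, sleep.millis) are hypothetical, and I’m reading from a StringReader so the example runs on its own; in real use you’d pass a FileReader pointing at your properties file:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class SpiderConfig {
    // Loads key=value pairs from any Reader (a FileReader in real use).
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a spider.properties file (hypothetical keys).
        String text = "start.url=http://example.com/\nmax.threads=4\n";
        Properties props = load(new StringReader(text));

        String startUrl = props.getProperty("start.url");
        int maxThreads = Integer.parseInt(props.getProperty("max.threads", "1"));
        // The second argument is the default used when the key is missing.
        long sleepMillis = Long.parseLong(props.getProperty("sleep.millis", "500"));

        System.out.println(startUrl + " " + maxThreads + " " + sleepMillis);
    }
}
```

A command line argument could then shrink to just the path of the properties file.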

Finally, I rewrote each module (mentioned above) so that while still running inside of a while(boolean) loop, each one sleeps for 0.5 seconds before the next iteration. Hopefully (and it appears this is true) this means the loops no longer spin at full speed and the CPU isn’t stressed out too much.
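The loop looks roughly like this. It’s a sketch, not the actual module code: PollingWorker and its fields are made-up names, and the “work” is just a counter. The volatile flag is my assumption about how to stop the loop cleanly from another thread:

```java
import java.util.concurrent.atomic.AtomicInteger;

class PollingWorker implements Runnable {
    // volatile so a stop() call from another thread is seen by the loop.
    private volatile boolean running = true;
    private final long pauseMillis;
    final AtomicInteger iterations = new AtomicInteger();

    PollingWorker(long pauseMillis) {
        this.pauseMillis = pauseMillis;
    }

    void stop() {
        running = false;
    }

    @Override
    public void run() {
        while (running) {
            iterations.incrementAndGet(); // stand-in for one unit of real work
            try {
                Thread.sleep(pauseMillis); // give the CPU a break between passes
            } catch (InterruptedException e) {
                // Restore the interrupt flag and leave the loop.
                Thread.currentThread().interrupt();
                running = false;
            }
        }
    }
}

class PollingDemo {
    public static void main(String[] args) throws InterruptedException {
        PollingWorker worker = new PollingWorker(500);
        Thread thread = new Thread(worker, "poller");
        thread.start();
        Thread.sleep(1200); // let it loop a few times
        worker.stop();
        thread.join();
        System.out.println("iterations: " + worker.iterations.get());
    }
}
```

A fixed sleep is the simplest option; a blocking queue between the modules would avoid waking up when there is nothing to do, but that’s a bigger change than I made here.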

I uploaded the source here (it also requires the commons http client jar, the commons logging jar, and the lucene jar). If you’re a Java programmer, I’d love your feedback on the code, not from a feature standpoint but from a syntax and architectural standpoint (ie: I care less about whether or not you think you’d actually use this and more about what you think of the code). How would you change it? What did I do wrong? What did I do right?

IBM developerWorks

Just me and the computers this weekend. I’ve not spent a lot of time on IBM’s site, but developerWorks and alphaWorks (emerging technologies) are two really great sites. Not sure how often they update the site, but the Java technology directory on developerWorks has some nifty articles right now:

JDBC query logging made easy: “Add logging to your JDBC code with an enhanced PreparedStatement” This article shows you how you can implement the PreparedStatement interface, overriding default query behavior, to provide better debugging and logging facilities. I used PreparedStatements all over the place on karensrecipes.com; next rev I’ll be sure to add in a PreparedStatementLogging class. Also, the author mentions that the LoggableStatement class he’s written is a great example of the Decorator design pattern, which I was wondering about a couple months ago.
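To get a feel for the decorator idea without writing out every PreparedStatement method, here’s a sketch that swaps in a different technique: a java.lang.reflect.Proxy that wraps a statement and records the parameters. This is not the article’s LoggableStatement; LoggingStatement, LoggingDemo and the stub statement are all hypothetical names, and the stub stands in for a real database connection so the example runs on its own:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

// Hypothetical decorator: forwards every call to the wrapped statement,
// remembers setXxx(index, value) arguments, and logs them on executeXxx().
class LoggingStatement implements InvocationHandler {
    private final PreparedStatement target;
    private final String sql;
    private final Consumer<String> logger;
    private final Map<Integer, Object> params = new TreeMap<>();

    private LoggingStatement(PreparedStatement target, String sql, Consumer<String> logger) {
        this.target = target;
        this.sql = sql;
        this.logger = logger;
    }

    static PreparedStatement wrap(PreparedStatement target, String sql, Consumer<String> logger) {
        return (PreparedStatement) Proxy.newProxyInstance(
                LoggingStatement.class.getClassLoader(),
                new Class<?>[] { PreparedStatement.class },
                new LoggingStatement(target, sql, logger));
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        String name = method.getName();
        if (name.startsWith("set") && args != null && args.length >= 2 && args[0] instanceof Integer) {
            // setNull(index, sqlType) carries a type code, not a value.
            params.put((Integer) args[0], name.equals("setNull") ? null : args[1]);
        } else if (name.startsWith("execute")) {
            logger.accept(sql + " | params=" + params);
        }
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // surface the original SQLException
        }
    }
}

class LoggingDemo {
    // Stub statement that does nothing, so no database is needed.
    static PreparedStatement stubStatement() {
        return (PreparedStatement) Proxy.newProxyInstance(
                LoggingDemo.class.getClassLoader(),
                new Class<?>[] { PreparedStatement.class },
                (proxy, method, args) -> {
                    Class<?> type = method.getReturnType();
                    if (type == int.class) return 0;
                    if (type == boolean.class) return false;
                    return null;
                });
    }

    static List<String> runDemo() {
        List<String> log = new ArrayList<>();
        String sql = "select * from recipes where id = ? and author = ?";
        PreparedStatement ps = LoggingStatement.wrap(stubStatement(), sql, log::add);
        try {
            ps.setInt(1, 42);
            ps.setString(2, "karen");
            ps.executeUpdate();
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        runDemo().forEach(System.out::println);
    }
}
```

The calling code still sees a plain PreparedStatement, which is the whole point of a decorator: the logging is added around the object rather than inside it.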

Accessing SQL and XML content using JSTL: “Custom tag libraries for exchanging XML and database content in JSP pages” Looks like JSTL might be one or two revisions away from being as simple and easy to use as ColdFusion tags. Check out this JSTL SQL query example:

<sql:setDataSource var="dataSrc"
    url="jdbc:mysql:///taglib" driver="org.gjt.mm.mysql.Driver"
    user="admin" password="secret"/>
<sql:query var="queryResults" dataSource="${dataSrc}">
  select * from blog order by created desc limit ?
  <sql:param value="${6}"/>
</sql:query>

<table border="1">
  <tr>
    <th>ID</th>
    <th>Created</th>
    <th>Title</th>
    <th>Author</th>
  </tr>
<c:forEach var="row" items="${queryResults.rows}">
  <tr>
    <td><c:out value="${row.id}"/></td>
    <td><c:out value="${row.created}"/></td>
    <td><c:out value="${row.title}"/></td>
    <td><c:out value="${row.author}"/></td>
  </tr>
</c:forEach>
</table>

Pretty close to a cfquery isn’t it?

Flash Remoting Docs bug

Got an email from Scott who was trying to work with C# and Flash Remoting. He couldn’t get the “Creating an assembly that returns an ActionScript object” example to work from the FlashRemoting LiveDocs. Looks to me like the documentation is a bit off. From the documentation:

    public ASObject returnObject()
    {
      ASObject aso = new ASObject();
      aso.ASType = "Calculator";
      aso.Add("x", 100);
      aso.Add("y", 300);
      Flash.result = aso;
    }

should probably be:

    public ASObject returnObject()
    {
      ASObject aso = new ASObject();
      aso.ASType = "Calculator";
      aso.Add("x", 100);
      aso.Add("y", 300);
      return aso;
    }

Personal rant: I tried creating a Livedocs account, but for whatever reason, the account I created didn’t work… why does everyone make you create a username? Why not just use your email address?

Sony Ericsson J2ME SDK

Sony released the final version of their SDK for J2ME and guidelines for Java developers.

Russ mentioned a couple weeks ago that the 3650 supports Sun’s Media API; apparently the T610 supports a subset of it: “The Mobile Media API (MMAPI) extends the functionality of the J2ME platform by providing audio, video and other time-based multimedia support to resource-constrained devices. As a simple and lightweight optional package, it allows Java developers to gain access to native multimedia services available on a given device. The T610 supports a subset of the MMAPI. The T610 MMAPI implementation allows the developer to access the audio functionality on the device. In particular, the T610 implementation provides support for ToneGeneration, IMelody, AMR and MIDI audio formats. The T610 MMAPI implementation does not support any video formats.”

xPetstore 3.1 Released

Another version of the petstore application has been released, this one a re-implementation of the Sun Microsystems PetStore based on xDoclet. If you’re into learning more about EJB 2.0, CMP, CMR, Servlets, Web Filters and JSP TagLibs, then take a peek at the site.

Declarative Security for Web Applications

Catching up on the homework for the free J2EE Programming Class I’m taking. This week drills down into the security options offered by servlet containers, specifically Tomcat. One of the things I hadn’t spent much time on before was the declarative security functionality that exists (apparently) in all servlet containers. Unlike ColdFusion and ASP, servlet containers (and thus Tomcat) give system administrators (not the developer) the ability to create password protected directories, ‘realms’ and users that access the directories within a specific realm. The access rules are declared in the web.xml file of your web application (the users and roles themselves live in whatever realm the container is configured to use). Here’s an example:

<web-app>
 <!-- … -->
 <security-constraint>
   <web-resource-collection>
     <web-resource-name>Sensitive</web-resource-name>
     <url-pattern>/sensitive/*</url-pattern>
   </web-resource-collection>
   <auth-constraint>
     <role-name>administrator</role-name>
     <role-name>executive</role-name>
   </auth-constraint>
 </security-constraint>
 <login-config>
    <auth-method>FORM</auth-method>
    <form-login-config>
        <form-login-page>/login.jsp</form-login-page>
        <form-error-page>/login-error.html</form-error-page>
    </form-login-config>
 </login-config>
 <!-- … -->
</web-app>

The above locks down the ‘/sensitive/’ directory (and everything inside it) to users in the administrator and executive roles and forces anyone trying to access that directory to log in using /login.jsp.

Couple benefits that I see to this idea of declarative (rather than programmatic) security:

a) Administration chores are handled by system administrators; no developer intervention is required outside of setting up the login pages.

b) According to this article, “.. J2EE compliant servlet containers are required to track authentication information at the container level (rather than at the web application level)” which means that if you set up multiple web applications on a J2EE compliant servlet container, you can get single sign on across the applications running on that container (in Tomcat this means enabling the SingleSignOn valve on the host; see the link below). Very cool.

c) Very little coding is required of programmers, giving them more time to focus on the applications they’re building. Less security code to write also means fewer places for bugs.

Interested in reading more?

Declarative Web Application Security with Servlets and JSP: http://www.informit.com/isapi/product_id~%7B116C8D3F-BE60-47A3-B8EC-EF132654A5A3%7D/content/index.asp

Tomcat 4 Servlet/JSP Container Realm Configuration: http://jakarta.apache.org/tomcat/tomcat-4.1-doc/realm-howto.html

Tomcat 4 Single Sign On: http://jakarta.apache.org/tomcat/tomcat-4.1-doc/config/host.html

JRUN Security (no mention of any declarative security functionality): http://livedocs.macromedia.com/jrun4docs/JRun_Administrators_Guide/authentic.jsp

The Java Web Services Tutorial: Web-Tier Security: http://java.sun.com/webservices/docs/1.0/tutorial/doc/WebAppSecurity4.html

Eclipse SSH Plugin

If you use Eclipse, check out this SSH plugin from eclipse plugins: Eclipse SSH Console. Download, unzip, restart Eclipse and then go to Window -> Show View -> Other -> Eclipse SSH Console. One less window I have to have open now when developing.

Also very cool and somewhat related is the Eclipse Tail tool, described as: “… tail for eclipse (like the tail unix command). You can set keywords and your logs will be colorized under 5/6 levels : no matches, debug, info, warn, error, fatal; like in log4J.”