An interesting read on O’Reilly (onlamp.com) today: What I Hate About Your Programming Language
Category Archives: Software Development
Business Jargon
Veen comes up with a pretty humorous list of jargon he’s found while consulting for a large organization. My personal favorite is ‘boil the ocean’.
HTTP Testing Tools
After Ray mentioned the CFUnit Testing Components from the DRK today in our development meeting, we got around to talking about automated testing of websites. David mentioned httpunit as one option (you can see a couple of examples of how it might be used here). Cactus looks like it might also be an option, although it appears to be Java-specific and aimed more at testing the components of a website (EJBs, Taglibs, Servlets) than the end result (HTML). Does anyone have any experience with these tools? Do you use other tools?
Luhn formula
Need to validate a credit card number? Use the Luhn formula.
“Based on ANSI X4.13, the LUHN formula (also known as the modulus 10 — or mod 10 — algorithm ) is used to generate and/or validate and verify the accuracy of credit-card numbers.
Most credit cards contain a check digit, which is the digit at the end of the credit card number. The first part of the credit-card number identifies the type of credit card (Visa, MasterCard, American Express, etc.), and the middle digits identify the bank and customer.
To generate the check digit, the LUHN formula is applied to the number. To validate the credit-card number, the check digit is figured into the formula. Here’s how the algorithm works for verifying credit cards; the math is quite simple:
1) Starting with the second to last digit and moving left, double the value of all the alternating digits.
2) Starting from the left, take all the unaffected digits and add them to the results of all the individual digits from step 1. If the results from any of the numbers from step 1 are double digits, make sure to add the two numbers first (i.e. 18 would yield 1+8). Basically, your equation will look like a regular addition problem that adds every single digit.
3) The total from step 2 must end in zero for the credit-card number to be valid.
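The three steps above can be sketched in Java roughly like this. It is a minimal illustration, not a production validator: the class and method names (LuhnCheck, isValid) are made up for this example, and it assumes the input is a plain string of digits with no spaces or dashes.

```java
class LuhnCheck {
    /** Returns true if the digit string passes the Luhn (mod 10) check. */
    static boolean isValid(String number) {
        if (number == null || number.isEmpty()) {
            return false;
        }
        int sum = 0;
        // The rightmost digit (the check digit) is not doubled;
        // doubling starts with the second-to-last digit (step 1).
        boolean doubleIt = false;
        for (int i = number.length() - 1; i >= 0; i--) {
            char c = number.charAt(i);
            if (c < '0' || c > '9') {
                return false; // assumption: anything but digits is rejected
            }
            int d = c - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9; // same as adding the two digits, e.g. 18 -> 1 + 8 = 9
                }
            }
            sum += d; // step 2: add every resulting digit
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0; // step 3: total must end in zero
    }

    public static void main(String[] args) {
        System.out.println(isValid("4111111111111111")); // true (well-known test number)
        System.out.println(isValid("4111111111111112")); // false
    }
}
```

Walking right to left avoids having to work out in advance which digits are "alternating" for numbers of different lengths.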
The LUHN formula was created in the late 1960s by a group of mathematicians. Shortly thereafter, credit card companies adopted it. Because the algorithm is in the public domain, it can be used by anyone. The LUHN formula is also used to check Canadian Social Insurance Number (SIN) validity. In fact, the LUHN formula is widely used to generate the check digits of many different primary account numbers. Almost all institutions that create and require unique account or identification numbers use the Mod 10 algorithm.”
Many banks use the Luhn formula because it makes it easy to look up their customers and review their financial transactions or the loans they have taken out. It is an efficient and fast way of keeping track of customers without spending a ton of time, unlike older methods where bank staff would have to type out the whole ID number.
Spider/Text Indexer/Search Web Application update
I hacked a bit on my Spider/Text Indexer/Search Web Application this weekend. All of the Java I’ve written has lived in a servlet container, so (maybe out of ignorance) I haven’t spent much time thinking about concurrency. This spider project, on the other hand, will be a) fetching a URL, b) extracting links from the fetched URL, c) indexing the content, and d) persisting the content into a database. I’m sure this is something that CS grads do in their first year of school, but it’s new (and fun!) to me. So today I read up on threads in Concurrent Programming in Java, recommended by Joe. The code that I hacked at this weekend has 3 different threads going: one for fetching and extracting URLs, one for indexing with Lucene, and one for persisting to the database. Each of these classes extends java.lang.Thread. The flow of the program kinda looks like this:
Spider class
— Fetcher extends Thread (retrieve URL, extract URL, loop until all possible URLs are retrieved)
— Indexer extends Thread (use Lucene to index retrieved URLs)
— Archiver extends Thread (persist content to database so that we can offer ‘cached’ versions like google)
The fetcher thread works off of a Vector of URLs (I know that an ArrayList would be faster, but it’s not synchronized), and then feeds the retrieved documents into a Vector of documents that the Indexer picks up and indexes; the Indexer then updates a Vector of documents to be persisted to the database. The Spider class controls the stopping and starting of each Thread by modifying a boolean property within each class that extends Thread.
However, in the aforementioned book, I read this:
“… the best default strategy is to define a Runnable as a separate class and supply it in a Thread constructor. Isolating code within a distinct class relieves you of worrying about any potential interactions of synchronized methods or blocks used in the Runnable with any that may be used by methods of class Thread. More generally, this separation allows independent control over the nature of the action and the context in which it is run: The same Runnable can be supplied to threads that are otherwise initialized in different ways, as well as to other lightweight executors. Also note that subclassing Thread precludes a class from subclassing any other class.”
So does anyone have experience using Threads? How would you design something like this? Do you consider it best practice to have your classes implement the Runnable interface and then pass the instances of those classes to a generic Thread constructor?
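For what it’s worth, here is a rough sketch of what the Runnable approach from the quote might look like. Everything in it is hypothetical: FetchTask stands in for my Fetcher, the "fetching" is just a string prefix rather than a real HTTP request, and runFetch is a helper invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class RunnableSketch {
    /** Hypothetical task: the work lives in a Runnable, not in a Thread subclass. */
    static class FetchTask implements Runnable {
        private final List<String> urls;
        private final List<String> fetched;

        FetchTask(List<String> urls, List<String> fetched) {
            this.urls = urls;
            this.fetched = fetched;
        }

        @Override
        public void run() {
            for (String url : urls) {
                // Placeholder for real URL retrieval and link extraction.
                fetched.add("fetched:" + url);
            }
        }
    }

    /** Runs the task on a plain Thread and waits for it to finish. */
    static List<String> runFetch(List<String> urls) {
        List<String> fetched = new CopyOnWriteArrayList<>();
        // The Runnable is supplied to a generic Thread constructor.
        Thread worker = new Thread(new FetchTask(urls, fetched), "fetcher");
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return fetched;
    }

    public static void main(String[] args) {
        System.out.println(runFetch(Arrays.asList("http://example.com/a", "http://example.com/b")));
    }
}
```

Because FetchTask doesn’t extend Thread, it is still free to extend some other class, and the same instance could be handed to a different kind of executor later.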
Second, in my mind I’m imagining that I have an assembly line: urls get retrieved, extracted, then indexed, then archived. My assembly line right now is dumb: each Thread just keeps trying to grab something from the queue until someone presses the stop button. I’ve seen the terms “Producer” and “Consumer” thrown about.. I’m guessing that it might be better to have the threads notify each other whenever they put something into the queue. Better? Worse? Make a difference at all?
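As a sketch of the producer/consumer idea, here is one way the notification could work using a blocking queue from java.util.concurrent instead of polling a Vector. This is an assumption about how it might be done, not what my code does today; the names (PipelineSketch, runPipeline, STOP) are invented, and the fetching and indexing are reduced to string prefixes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class PipelineSketch {
    // Hypothetical sentinel telling the consumer that no more documents are coming.
    private static final String STOP = "__STOP__";

    static List<String> runPipeline(List<String> urls) {
        BlockingQueue<String> fetchedDocs = new LinkedBlockingQueue<>();
        List<String> indexed = new ArrayList<>();

        // Producer: "fetches" each URL and puts the document on the queue.
        Thread fetcher = new Thread(() -> {
            try {
                for (String url : urls) {
                    fetchedDocs.put("doc:" + url);
                }
                fetchedDocs.put(STOP);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "fetcher");

        // Consumer: take() blocks until a document arrives, so there is no busy loop.
        Thread indexer = new Thread(() -> {
            try {
                while (true) {
                    String doc = fetchedDocs.take();
                    if (STOP.equals(doc)) {
                        break;
                    }
                    indexed.add("indexed:" + doc);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "indexer");

        fetcher.start();
        indexer.start();
        try {
            fetcher.join();
            indexer.join(); // after join, the indexer's writes are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return indexed;
    }

    public static void main(String[] args) {
        System.out.println(runPipeline(Arrays.asList("a", "b")));
    }
}
```

The queue does the waiting and notifying internally, so the "stop button" becomes a sentinel value passed down the line rather than a boolean that every thread has to keep checking.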
CFMX & HttpServletRequest/HttpServletResponse
I’m doing some research for an article I’m writing for Macromedia Devnet (hopefully to be published in May or June). I won’t go into the details of the article, but part of it deals (tangentially) with CFMX and Java integration. Specifically, I’m using CFMX to call a relatively simple Java API that consists of a couple of methods, e.g.:
doSomething(javax.servlet.http.HttpServletRequest req, javax.servlet.http.HttpServletResponse res, long someObject)
Looks pretty imposing doesn’t it? It’s actually pretty easy to call using CFMX.
<cfscript>
myObject = CreateObject("java", "com.thirdpartyApp.OtherClass");
// retrieve the long value from the third party class
t = CreateObject("java", "com.thirdpartyApp.Class");
longValue = t.SOME_CONSTANT;
// get a javax.servlet.http.HttpServletRequest object from CFMX
req = getPageContext().getRequest();
// get a javax.servlet.http.HttpServletResponse object from CFMX
res = getPageContext().getResponse();
myObject.doSomething(req, res, longValue);
</cfscript>
One interesting thing to note is that if you call getClass() on the req and res objects like this:
<cfoutput>#req.getClass()#</cfoutput>
<cfoutput>#res.getClass()#</cfoutput>
you get the following class names:
req = jrun.servlet.ForwardRequest
res = coldfusion.jsp.JspWriterIncludeResponse
and although they don’t look like javax.servlet.http.HttpServletRequest and javax.servlet.http.HttpServletResponse objects, they actually extend the javax.servlet.ServletRequestWrapper and javax.servlet.ServletResponseWrapper classes, which are wrappers (as their names imply) for the javax.servlet.ServletRequest and javax.servlet.ServletResponse interfaces.
Anyway, interesting journey.. thanks to Sean Corfield for his post about the getClass() method you can use to find out the type of an object, and to Charlie Arehart for his comprehensive Java/CFMX archive.
Top 12 Reasons to Write Unit Tests
From ONJava.com: Top 12 Reasons to Write Unit Tests
- Tests Reduce Bugs in New Features
- Tests Reduce Bugs in Existing Features
- Tests Are Good Documentation
- Tests Reduce the Cost of Change
- Tests Improve Design
- Tests Allow Refactoring
- Tests Constrain Features
- Tests Defend Against Other Programmers
- Testing Is Fun
- Testing Forces You to Slow Down and Think
- Testing Makes Development Faster
- Tests Reduce Fear
Definition of Object-Oriented programming
Object-Oriented programming. The C# book I’m reading has an excellent definition; it says “… object-oriented programming encapsulates the characteristics and capabilities of an entity in a single, self-contained and self-sustaining unit of code.” Pretty obvious stuff. So my purely theoretical question for the late night: given an object-oriented system with classes that describe entities … where in the system do you put the retrieval of objects? For instance, let’s say I have an ecommerce system that has classes ‘Order’, ‘Product’, and ‘Consumer’. Order will have methods like return(), commit(), cancel(); Product will have methods like getPrice(), updateStock(), and so on… The bottom line is that methods are the way we access (getting) an object’s properties and also the way that we manipulate an object’s properties (setting). So let’s say that somewhere in this system, I want to be able to query all the orders in the system, returning open orders, closed orders, orders over $500. It doesn’t feel right to write a method like this:
[java]
public static ResultSet getOpenOrders()
[c#]
public static DataTable getOpenOrders()
for the Order object. Where does a method like this fit in the system?
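One possible answer, and I’m only guessing here, is to move the queries into a separate class whose only job is finding orders, leaving Order to describe a single order. The sketch below is purely hypothetical: OrderRepository, findOpenOrders() and findOrdersOver() are names I made up, and an in-memory list stands in for the database.

```java
import java.util.ArrayList;
import java.util.List;

class OrderFinderSketch {
    /** A single order; it knows about itself, not about how to find other orders. */
    static class Order {
        final int id;
        final boolean open;
        final double total;

        Order(int id, boolean open, double total) {
            this.id = id;
            this.open = open;
            this.total = total;
        }
    }

    /** Hypothetical home for retrieval logic; a real one would query a database. */
    static class OrderRepository {
        private final List<Order> orders = new ArrayList<>();

        void add(Order order) {
            orders.add(order);
        }

        List<Order> findOpenOrders() {
            List<Order> result = new ArrayList<>();
            for (Order o : orders) {
                if (o.open) {
                    result.add(o);
                }
            }
            return result;
        }

        List<Order> findOrdersOver(double amount) {
            List<Order> result = new ArrayList<>();
            for (Order o : orders) {
                if (o.total > amount) {
                    result.add(o);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        OrderRepository repo = new OrderRepository();
        repo.add(new Order(1, true, 120.00));
        repo.add(new Order(2, false, 750.00));
        repo.add(new Order(3, true, 980.00));
        System.out.println(repo.findOpenOrders().size());      // 2
        System.out.println(repo.findOrdersOver(500.00).size()); // 2
    }
}
```

The methods return lists of Order objects rather than a ResultSet or DataTable, so callers never see how the orders are stored.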