James Turner, a committer on the Struts project and a Senior Editor at LinuxWorld Magazine, wrote an article for the magazine a couple of days ago that caught the eye of a couple of bloggers (Beattie, Colburn). There are a bunch of interesting comments on Russell’s blog about his comments on the article. Notable among them:
… the difference between “Computer Science” and “Software Engineering” – the former being concerned with fundamental constructs and abstractions, the latter being concerned with putting them together into useful things — sort of like the distinction between Physics and Mechanical Engineering (or perhaps Civil Engineering), although much the way Computer Science is an abuse of the term Science, Software Engineering mistreats the term Engineering…. CS gives us things like Boyer-Moore search, Bloom filters, and LZ compression; SE gives us things like Apache and Google
— a link to an article that Joel Spolsky wrote back in 2001 (Don’t Let Architecture Astronauts Scare You)
— Carlos Perez chewed on the usability of APIs for a bit (Even More Wisdom on Designing Usable APIs), which I found interesting because he talks about how Microsoft has actually done research into the usability of APIs (cognitive dimensions framework).
I think Russell misses the point; he brings up Bill Day using a PHP weblog solution as proof that Java is too hard. I use Movable Type for my weblog, but that doesn’t mean I’d write a banking application in Perl or PHP. With that said, I completely agree with Mr. Turner’s assessment of the Bouncy Castle Crypto API. Last week I tried to use it for the exact same thing that James did, and I never did get it working. I bet we would both be using the Bouncy Castle API had they taken a couple of hours to write a short and sweet introduction to it.
Is Java hard or is it convoluted? Is there a unified framework or is it a hodgepodge? I remember when I was using Java more heavily in the 1.3 era. All I wanted to do was use some regular expressions. I ended up having to look around for libraries and only eventually found them at Apache.
A case in point of this whole discussion is the tool you’re going to write to parse web pages, Aaron. In Java I’m sure you’ll end up with quite a hefty chunk of code. That’s WITHOUT the UI. Write the same thing in Perl and *presto*, it works like a charm.
yo dude,
Java 1.4 makes things a bit easier because regex is built into the String class, and there is also a regex package, java.util.regex, which is new in 1.4:
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html#matches(java.lang.String)
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html#replaceAll(java.lang.String,%20java.lang.String)
etc…
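For anyone who hasn’t tried the 1.4 regex support yet, here is a minimal sketch of how those two String methods and the java.util.regex package look in use. The class and method names (RegexSketch, looksLikeZip, stripTags, firstHref) are made up for illustration, and the tag-stripping pattern is deliberately naive; it is not a real HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only String.matches, String.replaceAll,
// Pattern and Matcher come from the JDK.
class RegexSketch {
    // Compiled once and reused; captures the value of an href="..." attribute.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    // String.matches() tests the WHOLE string against the pattern.
    static boolean looksLikeZip(String s) {
        return s.matches("\\d{5}");
    }

    // String.replaceAll() replaces every match; here, anything shaped like a tag.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]+>", "");
    }

    // Pattern/Matcher are for when you need capture groups or repeated finds.
    static String firstHref(String html) {
        Matcher m = HREF.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeZip("12345"));
        System.out.println(stripTags("<td>Login <b>Now!</b></td>"));
        System.out.println(firstHref("<a href=\"http://example.com/\">x</a>"));
    }
}
```

No extra library is needed for any of this, which is the difference from the 1.3 days.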
The problem isn’t w/ regular expressions in this case. I got far enough w/ regex; it’s that even after replacing what I think is useless stuff (<img>, <td>, etc.), I’m still left with text that is duplicated again and again on every page. Imagine an HTML document from a large site: usually you have a header and footer, and said header and footer contain text like ‘Login Now!’. You don’t need/want that to show up in your summary, but I don’t know *exactly* what the text is. It’s almost like the parser needs to in some sense ‘learn’ what the duplicated text is and then remove it. It’s like noise: remove the noise and then you’ll have a clearer picture.
AJ
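One way the ‘learn the noise’ idea could be sketched: fetch several pages from the same site, count how many pages each line of text appears on, and drop any line that shows up on most of them. This is only an illustration under some assumptions: the pages have already been stripped of tags and split into lines, the class name BoilerplateFilter and the threshold parameter are invented here, and it uses Java 8 collection methods rather than anything available in 1.4.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: treats lines repeated across many pages as boilerplate.
class BoilerplateFilter {
    // For each distinct (trimmed, non-empty) line, count the pages it appears on.
    static Map<String, Integer> pageFrequency(List<List<String>> pages) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> page : pages) {
            Set<String> seenOnThisPage = new HashSet<>();
            for (String line : page) {
                String key = line.trim();
                // Count a line once per page, however often it repeats there.
                if (!key.isEmpty() && seenOnThisPage.add(key)) {
                    counts.merge(key, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Keep only lines that appear on less than `threshold` (0..1) of the pages.
    static List<String> strip(List<String> page, Map<String, Integer> counts,
                              int pageCount, double threshold) {
        List<String> kept = new ArrayList<>();
        for (String line : page) {
            String key = line.trim();
            if (key.isEmpty()) {
                continue;
            }
            double share = (double) counts.getOrDefault(key, 0) / pageCount;
            if (share < threshold) {
                kept.add(key);
            }
        }
        return kept;
    }
}
```

With three sample pages that all contain ‘Login Now!’ and a threshold of 0.8, that line would be dropped while text unique to each page is kept. The obvious weakness is that it needs several pages from the same site before it can tell noise from content.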