All posts by ajohnson

Shallow comparison of ASP.NET to Flex

Flex seems pretty interesting when you realize how similar it is to something like ASP.NET. Look how similar this snippet of Flex is:

<?xml version="1.0" encoding="iso-8859-1"?>
<mx:Application xmlns:mx="http://www.macromedia.com/2003/mxml">
   <mx:script>
   function copy() {
      destination.text=source.text
      }
   </mx:script>
   <mx:TextInput id="source" width="100"/>
   <mx:Button label="Copy" click="copy()"/>
   <mx:TextInput id="destination" width="100"/>
</mx:Application>

to this snippet of ASP.NET code:

<script language="C#" runat="server">
   public void Copy(Object src, EventArgs e) {
      destination.Text = source.Text;
   }
</script>
<form runat="server">
<asp:textbox id="source" size="30" runat="server" />
<asp:button id="copy" OnClick="Copy" text="Copy" runat="server" />
<asp:textbox id="destination" size="30" runat="server" />
</form>

I can’t wait to see the IDE rolled out for Eclipse and the .NET version. Cool stuff Macromedia!

QueryParser … in NLucene

Misleading title. I implemented the first of the examples that Erik Hatcher used in his article about the Lucene QueryParser, only I used NLucene. Lucene and NLucene are very similar, so if anything, it’s interesting only because it highlights a couple of the differences between C# and Java.

First, here’s the Java example taken directly from Erik’s article:

public static void search(File indexDir, String q) throws Exception {
  Directory fsDir = FSDirectory.getDirectory(indexDir, false);
  IndexSearcher is = new IndexSearcher(fsDir);
  Query query = QueryParser.parse(q, "contents", new StandardAnalyzer());
  Hits hits = is.search(query);
  System.out.println("Found " + hits.length() +
    " document(s) that matched query '" + q + "':");
  for (int i = 0; i < hits.length(); i++) {
    Document doc = hits.doc(i);
    // display the retrieved document
  }
}
The NLucene version looks eerily similar:

public static void Search(DirectoryInfo indexDir, string q) {
  DotnetPark.NLucene.Store.Directory fsDir = FsDirectory.GetDirectory(indexDir, false);
  IndexSearcher searcher = new IndexSearcher(fsDir);
  Query query = QueryParser.Parse(q, "contents", new StandardAnalyzer());
  Hits hits = searcher.Search(query);
  Console.WriteLine("Found " + hits.Length +
    " document(s) that matched query '" + q + "':");
  for (int i = 0; i < hits.Length; i++) {
    Document doc = hits[i].Document;
    // display the retrieved document
  }
}
The differences are mainly syntactic.

First, Erik used the variable name 'is' for his IndexSearcher. In C# 'is' is a keyword, so I switched the variable name to 'searcher'. If you're really geeky, you might want to brush up on all the Java keywords and the C# keywords.

Second, while Java uses the File class to describe directories and files, the .NET Framework uses the DirectoryInfo class.

Third, Java programmers are encouraged to capitalize class names and use camelCase for method and variable names, while C# programmers are encouraged to use Pascal casing for methods and camelCase for variables, so I switched the static method name from 'search' to 'Search'.

Next, 'Directory' is a system class, so the reference to the NLucene directory needed to be fully qualified:

DotnetPark.NLucene.Store.Directory fsDir = FsDirectory.GetDirectory(indexDir, false);

rather than this:

Directory fsDir = FsDirectory.GetDirectory(indexDir, false);

Finally, the Hits class contains a couple of differences. Java programmers use the length() method on a variety of classes, so it made sense for the Java version to use a length() method as well. C# introduced the idea of a property, which is nothing more than syntactic sweetness that allows the API developer to encapsulate the implementation of a variable while still allowing access to it as if it were a public field. The end result is that instead of writing:

for (int i = 0; i < hits.length(); i++)
in Java, you'd use this in C#:

for (int i = 0; i < hits.Length; i++)
The authors of NLucene also decided to use the C# indexer functionality (which I wrote about a couple of days ago) so that an instance of the Hits class can be accessed as if it were an array:

Document doc = hits[i].Document;

I put together a complete sample that you can download and compile yourself if you're interested in using NLucene. Download it here.

Lucene’s Query API

Erik Hatcher wrote an excellent article on the specifics of Lucene’s Query API, specifically on how the QueryParser class uses the Query subclasses including TermQuery, PhraseQuery, RangeQuery, WildcardQuery, PrefixQuery, FuzzyQuery and BooleanQuery. Very useful stuff.

Not surprisingly, he’s also writing a book on Lucene titled “Lucene in Action”, to be published by Manning.

2003 Lightweight Languages Workshop notes

Joe and I went to the Lightweight Languages Workshop at MIT this past Saturday. In short, a bunch of nerds got together for the entire day to talk about stuff like Haskell, Lisp, Lua, Scheme, and boundaries. If you’re at all interested, you can watch the webcasts (which are of surprisingly good quality) in Real Media or Windows Media here, and Joe has already written up his notes. I’m behind a bit, so here are mine, a couple of days late.

The initial session, “Toward a LL Testing Framework for Large Distributed Systems”, was especially interesting to me for a couple of reasons: a) it was based on technology deployed for DARPA called UltraLog, b) UltraLog is an “… ultra-survivable multi-agent society” and c) it uses Jabber, Python and Ruby. Specifically, they used those technologies to get an up-close and personal look at how their entire system (in this case thousands of agents) was performing. Said another way, they created a way to quickly and unobtrusively gather information from a variety of datapoints while the program is running. If you have a single website on one server, this problem doesn’t matter much to you. But imagine a system of 5000 servers (which is something I was asked to imagine this past Monday, more on that at a later time). An application running on 5000 servers would generate an unusable amount of information; simple logging statements won’t help you. I’m rambling though. The interesting takeaway from all this is the idea of creating instrumentation for your applications [google search for ‘code instrumentation’].
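To make the instrumentation idea concrete, here is a minimal sketch (all names are mine, purely illustrative, and nothing here comes from UltraLog): a hook that times a unit of work and records the measurement in a structure that could later be harvested over a channel like Jabber, rather than dumping raw log lines to disk.

```java
import java.util.*;

class Instrumentation {
    // collected measurements: operation name -> elapsed milliseconds
    private static final Map<String, Long> timings = new LinkedHashMap<>();

    // time a unit of work and record the result for later harvesting
    static void timed(String name, Runnable work) {
        long start = System.nanoTime();
        work.run();
        timings.put(name, (System.nanoTime() - start) / 1_000_000);
    }

    public static void main(String[] args) {
        timed("sort", () -> {
            int[] data = new Random(42).ints(100_000).toArray();
            Arrays.sort(data);
        });
        // in a 5000-server system this map would be reported to a collector
        System.out.println("timed operations: " + timings.keySet());
    }
}
```

The point is that the measurements live in a queryable structure instead of a log file, so an external collector can pull them while the program runs.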

URLs harvested from the other sessions:

· Web Authoring System Haskell (WASH)

· XS: Lisp on Lego MindStorms

· the idea of continuations, where:

def foo(x):
  return x+1

becomes:

def foo(x,c):
  c(x+1)

· dynamic proxies [googled] [javaworld.com] [onjava.com]

· The Great Computer Language Shootout

· lua: embeddable in C/C++, Java, Fortran, Ruby, OPL, C#…. runs in Palm OS, Brew, Playstation II, XBox and Symbian.

· c minus minus

· scheme
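The continuation bullet above is easy to make concrete. Here is a hedged sketch in Java (names are mine, purely illustrative): the direct-style function returns its result to the caller, while the continuation-passing version hands the result forward into a continuation instead.

```java
import java.util.function.IntConsumer;

class Continuations {
    // direct style: the result flows back to the caller
    static int foo(int x) { return x + 1; }

    // continuation-passing style: the result flows forward into c
    static void fooCps(int x, IntConsumer c) { c.accept(x + 1); }

    public static void main(String[] args) {
        System.out.println(foo(41));            // direct style
        fooCps(41, r -> System.out.println(r)); // CPS: same result
    }
}
```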

CFX_Lucene updates

Couple people have written me in the last couple days with updates they’ve done to the Lucene and ColdFusion tags I wrote a couple months ago.

First, Nick Burch from Torchbox updated the CFX tag so that it “… behaves better under error conditions and … the command line debug now works.” I also read here that they (Torchbox) are hoping to release an open source package written in Java “… to convert file types to plain text and a CF custom tag to interface to Lucene which will search them.”

Today, Scott piped in with a nice addition that adds the score to the query returned to the calling tag:

// Define column indexes
String[] columns = { "URL", "TITLE", "SUMMARY", "SCORE" } ;

// loop over all the results, add each to the query
for (int i = 0; i < hits.length(); i++) {
  // add the URL, TITLE, SUMMARY and SCORE of each hit to the query
}
For those of you like Scott who want to index PDF and Office documents, I'd suggest you start by taking a look at these jGuru FAQs:

Java Guru: How can I index PDF documents?

Java Guru: How can I index Word documents?

Cheers!

C# Indexers

At Mindseye we’ve written a content management system, which is really just the net result of writing and improving upon a modular code base for a bunch of different websites in a variety of programming languages (ASP.NET, ASP, ColdFusion, and Java). In this content management system, which we’ve affectionately called ‘Element’, a website is distilled down into various ‘objects’: things like events, newsletters, products, etc. Long story short, each object (in whatever language) usually has a specific type and is represented internally as an XML document. In the .NET/C# version that I’ve been working with lately, a newsletter would look vaguely like this:

public class Newsletter {
  private XmlDocument newsletter;
  public Newsletter(XmlDocument doc) {
   this.newsletter = doc;
  }
  // other methods left out
}

Putting aside your opinion on whether this is a good or bad design for a second, I’ve always struggled with how best to make the elements of the encapsulated XML document available to external classes. Right now I have a method:

public string GetProperty(string label) {
  // retrieve the appropriate element and return as string
  return theValue;
}

and this works pretty well, but it’s lengthy. Another way of doing it would be to make each element of the XmlDocument a public property, but this would require a lot of typing and would mean recompiling the class every time the data structure it represents changed. So tonight, during nerd time (i.e., extended time by myself at Barnes and Noble), I read about C# indexers. You’ve probably used an indexer before; for instance, the NameValueCollection class contains a public property called Item, which itself is an indexer for a specific entry in the NameValueCollection. Unbeknownst to me before tonight, you can create your own indexers, so instead of having to access an element of an object like this:

string newsletterLabel = newsletterInstance.GetProperty("label");

you could instead use this syntax:

string newsletterLabel = newsletterInstance["label"];

which just feels more natural to me. Implementing the indexer in your class is simple. Using the example ‘Newsletter’ class above:

public class Newsletter {
  private XmlDocument newsletter;
  public Newsletter(XmlDocument doc) {
   this.newsletter = doc;
  }
  // indexer
  public string this [string index] {
    get {
      // logic to retrieve appropriate element from xmldoc by name
      return theValue;
    }
  }
}

I’m guessing that generally the index will be an integer rather than the string that I have above; nothing much changes if the parameter is an integer:

public string this [int index] {
  get {
    // logic to retrieve appropriate element from xmldoc by index
    return theValue;
  }
}
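For comparison, Java has no indexer syntax; the closest analogue is an explicit accessor method. A minimal sketch (a Map stands in for the XML document here, purely for illustration, and all names are mine):

```java
import java.util.*;

class Newsletter {
    // a map stands in for the encapsulated XML document in this sketch
    private final Map<String, String> fields;

    Newsletter(Map<String, String> fields) {
        this.fields = fields;
    }

    // Java equivalent of the C# indexer: a plain accessor method
    String get(String label) {
        return fields.get(label);
    }

    public static void main(String[] args) {
        Map<String, String> f = new HashMap<>();
        f.put("label", "Monthly update");
        Newsletter n = new Newsletter(f);
        System.out.println(n.get("label"));
    }
}
```

So where C# lets you write `newsletterInstance["label"]`, Java callers are stuck with `newsletterInstance.get("label")`.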

Extremely handy stuff to know! Couple more thoughts on the subject:

· Indexers
· Comparison Between Properties and Indexers
· Properties
· Developer.com: Using Indexers

C# static constructors, destructors

Spent some more time on the C# logging class I’ve been working on. Per Joe’s suggestions, I modified the StreamWriter so that it is now a class variable and, as such, need not be initialized every time the class is used. Instead, you can use a static constructor (the idea exists in both C# and Java, although it goes by different names). In C#, you simply prefix a parameterless constructor with the keyword ‘static’:

public class Logger {
 static Logger() {
  // insert static resource initialization code here
 }
}

In Java it’s called a static initialization block (more here) and it looks like this:

public class Logger {
 static {
  // insert static resource initialization code here
 }
}

If you’d like a real life example of a Java static initialization block, check out the source for the LogManager class in the log4j package.

Anyway, now the Logger class declares a StreamWriter:

private static StreamWriter sw;

and then uses the static constructor to initialize it for writing to the text file:

static Logger() {
  // load the logging path
  // ...
  // if the file doesn't exist, create it
  // ...
  // open up the streamwriter for writing..
  sw = File.AppendText(logDirectory);
}

Then use the lock keyword when writing to the resource, to make sure that only one thread accesses the resource at a time:

 ...
 lock(sw) {
   sw.Write("\r\nLog Entry : ");
   ...
   sw.Flush();
 }
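For what it’s worth, the Java analogue of C#’s lock keyword is a synchronized block. A minimal sketch (a StringWriter stands in for the file-backed writer, purely for illustration, and the names are mine):

```java
import java.io.StringWriter;

class SyncDemo {
    private static final StringWriter sw = new StringWriter();

    static void log(String entry) {
        // like C#'s lock(sw): only one thread may write at a time
        synchronized (sw) {
            sw.write("Log Entry : " + entry + "\n");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[3];
        for (int i = 0; i < threads.length; i++) {
            final int id = i;
            threads[i] = new Thread(() -> log("thread " + id));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // each entry arrives intact, one per line, never interleaved
        System.out.println(sw.toString().split("\n").length);
    }
}
```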

Now all objects that call the various static methods will be using the same private StreamWriter. But you’re left with one more problem: the StreamWriter is never explicitly closed. If the StreamWriter were an instance variable, then we could solve this by implementing a destructor. The destructor would take this form:

~Logger() {
  try {
    sw.Close();
  } catch {
    // do nothing, exit..
  }
}

However, in this case the StreamWriter is a static/class variable; since no ‘instance’ of Logger ever exists in the system, the ~Logger destructor will never get called. Instead, when the StreamWriter becomes eligible for destruction, the garbage collector runs the StreamWriter’s Finalize method (which itself will presumably call the StreamWriter’s Close() method), automatically freeing up the resources the StreamWriter used.

I updated the Logger class and its personal testing assistant TestLogger (which has also been updated to use 3 threads). You can download them here:

· Logger.cs
· TestLogger.cs

Cross site scripting: removing meta-characters from user-supplied data in CGI scripts using C#, Java and ASP

Ran into some issues with cross site scripting attacks today. CERT® has an excellent article that shows exactly how you should be filtering input from forms. Specifically, it mentions that just filtering out *certain* characters in user-supplied input isn’t good enough. Developers should be doing the opposite and only explicitly allowing certain characters. Using

… this method, the programmer determines which characters should NOT be present in the user-supplied data and removes them. The problem with this approach is that it requires the programmer to predict all possible inputs that could possibly be misused. If the user uses input not predicted by the programmer, then there is the possibility that the script may be used in a manner not intended by the programmer.

They go on to show examples of proper usage in both C and Perl, but who uses C and Perl? 😉 Here are the same examples in C#, Java and ASP.

In C#, you’ll make use of the Regex class, which lives in the System.Text.RegularExpressions namespace. I left out the import statements for succinctness here (you can download the entire class using the links at the end of this post), but you simply create a new Regex object supplying the regular expression pattern you want to look for as an argument to the constructor. In this case, the regular expression is looking for any characters not A-Z, a-z, 0-9, the ‘@’ sign, a period, an apostrophe, a space, an underscore or a dash. If it finds any characters not in that list, then it replaces them with an underscore.

public static String Filter(String userInput) {
  Regex re = new Regex("([^A-Za-z0-9@.' _-]+)");
  String filtered = re.Replace(userInput, "_");
  return filtered;
}

In Java it’s even easier. Java 1.4 has a regular expression package (which you can read about here) but you don’t even need to use it. The Java String class contains a couple methods that take a regular expression pattern as an argument. In this example I’m using the replaceAll(String regex, String replacement) method:

public static String Filter(String userInput) {
  String filtered = userInput.replaceAll("([^A-Za-z0-9@.' _-]+)", "_");
  return filtered;
}

Finally, in ASP (VBScript) you’d use the RegExp object in a function like this:

Function InputFilter(userInput)
  Dim newString, regEx
  Set regEx = New RegExp
  regEx.Pattern = "([^A-Za-z0-9@.' _-]+)"
  regEx.IgnoreCase = True
  regEx.Global = True
  newString = regEx.Replace(userInput, "")
  Set regEx = nothing
  InputFilter = newString
End Function

I think the next logical step would be to write a Servlet filter for Java that analyzes the request scope and automatically filters user input for you, much like the automatic request validation that happens in ASP.NET.
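As a rough sketch of what such a filter’s core logic might look like (the servlet plumbing is omitted; a plain map stands in for the request’s parameters, and all names here are mine, not from any servlet API):

```java
import java.util.*;

class RequestFilterSketch {
    // the same allow-list replacement used in the examples above
    static String filter(String userInput) {
        return userInput.replaceAll("([^A-Za-z0-9@.' _-]+)", "_");
    }

    // what a servlet filter would do: clean every parameter value
    // before the request reaches the application
    static Map<String, String> filterParams(Map<String, String> params) {
        Map<String, String> clean = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            clean.put(e.getKey(), filter(e.getValue()));
        }
        return clean;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("name", "Aaron <script>alert(1)</script>");
        System.out.println(filterParams(params).get("name"));
    }
}
```

A real implementation would wrap the incoming request so that getParameter() returns the cleaned values, but the allow-list replacement is the heart of it.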

You can download the full code for each of the above examples here:

· InputFilter.cs
· InputFilter.java
· InputFilter.asp

Feel free to comment on the way that you do cross site scripting filtering.

Lightweight Languages Workshop at MIT

Fun stuff going on at MIT in a couple days:

LL3 will be an intense, exciting, one-day forum bringing together the best programming language implementors and researchers, from both academia and industry, to exchange ideas and information, to challenge one another, and to learn from one another.

The workshop series focuses on programming languages, tools, and processes that are usable and useful. Lightweight languages have been an effective vehicle for introducing new features to mainstream programmers.

More information here.