{"id":532,"date":"2003-10-31T16:27:25","date_gmt":"2003-10-31T20:27:25","guid":{"rendered":"http:\/\/wordpress.cephas.net\/?p=532"},"modified":"2003-10-31T16:27:25","modified_gmt":"2003-10-31T20:27:25","slug":"cross-site-scripting-removing-meta-characters-from-user-supplied-data-in-cgi-scripts-using-c-java-and-asp","status":"publish","type":"post","link":"https:\/\/cephas.net\/blog\/2003\/10\/31\/cross-site-scripting-removing-meta-characters-from-user-supplied-data-in-cgi-scripts-using-c-java-and-asp\/","title":{"rendered":"Cross site scripting: removing meta-characters from user-supplied data in CGI scripts using C#, Java and ASP"},"content":{"rendered":"<p>Ran into some issues with cross site scripting attacks today. CERT\u00ae has an <a href=\"http:\/\/www.cert.org\/tech_tips\/cgi_metacharacters.html\">excellent article<\/a> that shows exactly how you should be filtering input from forms. Specifically, it explains that just filtering out *certain* characters in user-supplied input isn&#8217;t good enough. Developers should be doing the opposite and only explicitly allowing certain characters. Using <\/p>\n<p>&#8220;<i>&#8230; this method, the programmer determines which characters should NOT be present in the user-supplied data and removes them. The problem with this approach is that it requires the programmer to predict all possible inputs that could possibly be misused. If the user uses input not predicted by the programmer, then there is the possibility that the script may be used in a manner not intended by the programmer.<\/i>&#8221;<\/p>\n<p>They go on to show examples of proper usage in both C and Perl, but who uses C and Perl? \ud83d\ude09  Here are the same examples in C#, Java and ASP.<\/p>\n<p>In C#, you&#8217;ll make use of the <a href=\"http:\/\/msdn.microsoft.com\/library\/default.asp?url=\/library\/en-us\/cpref\/html\/frlrfsystemtextregularexpressionsregexclasstopic.asp\">Regex<\/a> class, which lives in the System.Text.RegularExpressions namespace. 
I left out the using directives for brevity here (you can download the entire class using the links at the end of this post); you simply create a new Regex object, supplying the regular expression pattern you want to look for as an argument to the constructor. In this case, the regular expression is looking for any characters <b>not<\/b> A-Z, a-z, 0-9, the &#8216;@&#8217; sign, a period, an apostrophe, a space, an underscore or a dash. If it finds any characters not in that list, then it replaces each run of them with a single underscore.<br \/>\n<code><br \/>\npublic static String Filter(String userInput) {<br \/>\n&nbsp;&nbsp;Regex re = new Regex(\"([^A-Za-z0-9@.' _-]+)\");<br \/>\n&nbsp;&nbsp;String filtered = re.Replace(userInput, \"_\");<br \/>\n&nbsp;&nbsp;return filtered;<br \/>\n}<br \/>\n<\/code><br \/>\nIn Java it&#8217;s even easier.  Java 1.4 has a regular expression package (which you can read about <a href=\"http:\/\/developer.java.sun.com\/developer\/technicalArticles\/releases\/1.4regex\/\">here<\/a>) but you don&#8217;t even need to use it directly. The Java <a href=\"http:\/\/java.sun.com\/j2se\/1.4.2\/docs\/api\/java\/lang\/String.html\">String<\/a> class contains a couple of methods that take a regular expression pattern as an argument.  In this example I&#8217;m using the <a href=\"http:\/\/java.sun.com\/j2se\/1.4.2\/docs\/api\/java\/lang\/String.html#replaceAll(java.lang.String,%20java.lang.String)\">replaceAll(String regex, String replacement)<\/a> method:<br \/>\n<code><br \/>\npublic static String Filter(String userInput) {<br \/>\n&nbsp;&nbsp;String filtered = userInput.replaceAll(\"([^A-Za-z0-9@.' 
_-]+)\", \"_\");<br \/>\n&nbsp;&nbsp;return filtered;<br \/>\n}<br \/>\n<\/code><br \/>\nFinally, in ASP (VBScript) you&#8217;d use the <a href=\"http:\/\/msdn.microsoft.com\/library\/default.asp?url=\/library\/en-us\/script56\/html\/vsobjRegExp.asp?frame=true\">RegExp<\/a> object in a function like this:<br \/>\n<code><br \/>\nFunction InputFilter(userInput)<br \/>\n&nbsp;&nbsp;Dim newString, regEx<br \/>\n&nbsp;&nbsp;Set regEx = New RegExp<br \/>\n&nbsp;&nbsp;regEx.Pattern = \"([^A-Za-z0-9@.' _-]+)\"<br \/>\n&nbsp;&nbsp;regEx.IgnoreCase = True<br \/>\n&nbsp;&nbsp;regEx.Global = True<br \/>\n&nbsp;&nbsp;newString = regEx.Replace(userInput, \"\")<br \/>\n&nbsp;&nbsp;Set regEx = nothing<br \/>\n&nbsp;&nbsp;InputFilter = newString<br \/>\nEnd Function<br \/>\n<\/code><br \/>\nI think the next logical step would be to write a Servlet filter for Java that analyzes the request scope and automatically filters user input for you, much like the <a href=\"http:\/\/www.asp.net\/faq\/requestvalidation.aspx\">automatic request validation<\/a> that happens in ASP.NET.<\/p>\n<p>You can download the full code for each of the above examples here:<\/p>\n<p>&middot; <a href=\"\/images\/files\/InputFilter.cs\">InputFilter.cs<\/a><br \/>\n&middot; <a href=\"\/images\/files\/InputFilter.java\">InputFilter.java<\/a><br \/>\n&middot; <a href=\"\/images\/files\/InputFilter.asp\">InputFilter.asp<\/a><\/p>\n<p>Feel free to comment on the way that you do cross site scripting filtering.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Ran into some issues with cross site scripting attacks today. CERT\u00ae has an excellent article that shows exactly how you should be filtering input from forms. Specifically, it explains that just filtering out *certain* characters in user-supplied input isn&#8217;t good enough. Developers should be doing the opposite and only explicitly allowing certain characters. 
Using &#8220;&#8230; &hellip; <a href=\"https:\/\/cephas.net\/blog\/2003\/10\/31\/cross-site-scripting-removing-meta-characters-from-user-supplied-data-in-cgi-scripts-using-c-java-and-asp\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Cross site scripting: removing meta-characters from user-supplied data in CGI scripts using C#, Java and ASP<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[17],"tags":[],"_links":{"self":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts\/532"}],"collection":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/comments?post=532"}],"version-history":[{"count":0,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts\/532\/revisions"}],"wp:attachment":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/media?parent=532"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/categories?post=532"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/tags?post=532"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}