I’ve spent way more time than I should have the last couple weeks working to help migrate a website built against Jive Forums to run against a Clearspace X instance. As part of the migration, one of the things I did was to move all the data syndication that had been done with RSS and custom namespaces to use the Clearspace SOAP API, which is built on a technology called XFire. The first problem I ran into was that production website was configured so that requests to http://example.com were redirected to http://www.example.com/, which resulted in errors like this in the logs:
Jul 5, 2007 11:30:11 PM org.apache.commons.httpclient.HttpMethodDirector isRedirectNeeded INFO: Redirect requested but followRedirects is disabled
That error was pretty easy to fix (swap in http://www.example.com in place of http://example.com), but the next thing I ran into was way less intuitive. When I invoked a certain service, I’d get a stack trace that looked like this:
Exception in thread "main" org.codehaus.xfire.XFireRuntimeException: Could not invoke service.. Nested exception is org.codehaus.xfire.fault.XFireFault: Unexpected character '-' (code 45) in prolog; expected '<' at [row,col {unknown-source}]: [2,1] org.codehaus.xfire.fault.XFireFault: Unexpected character '-' (code 45) in prolog; expected '<' at [row,col {unknown-source}]: [2,1] at org.codehaus.xfire.fault.XFireFault.createFault(XFireFault.java:89) at org.codehaus.xfire.client.Client.onReceive(Client.java:386)
which was troubling because the exact same SOAP method invocation worked fine on both my local machine and in the test environment. What was different? Two things: the production system was running on Java 6 and the production system was configured to run behind an Apache HTTP server proxied by mod_caucho versus no Apache HTTP server / proxy in development or on my machine. I needed to see what was going on between the server and the client (one of the things that makes SOAP so hard is that you can't just GET a URL to see what's being returned) so I fired up ethereal at the behest of one of my coworkers. I kicked off a couple of SOAP requests with ethereal running, recorded the packets and then analyzed the capture. Said coworker then pointed out the key to debugging HTTP requests with ethereal: right click on the TCP packet you're interested in and then click 'Follow TCP Stream'. The invocation response looked like this when run against the development environment:
HTTP/1.1 200 OK Date: Mon, 02 Jul 2007 21:59:30 GMT Server: Resin/3.0.14 Content-Type: multipart/related; type="application/xop+xml"; start=""; start-info="text/xml"; .boundary="----=_Part_5_25686393.1183413571061" Connection: close Transfer-Encoding: chunked 1dce ------=_Part_5_25686393.1183413571061 Content-Type: application/xop+xml; charset=UTF-8; type="text/xml" Content-Transfer-Encoding: 8bit Content-ID: ...
and looked like this when invoked against the production instance:
HTTP/1.1 200 OK Date: Mon, 02 Jul 2007 21:41:56 GMT Server: Apache/2.0.52 (Red Hat) Vary: Accept-Encoding,User-Agent Cache-Control: max-age=0 Expires: Mon, 02 Jul 2007 21:41:56 GMT Transfer-Encoding: chunked Content-Type: text/plain; charset=UTF-8 X-Pad: avoid browser bug 24e ------=_Part_29_31959705.1183412516805 Content-Type: application/xop+xml; charset=UTF-8; type="text/xml" Content-Transfer-Encoding: 8bit Content-ID:...
Notice the different content type returned by the production server? So then the mystery became not 'what?' but 'who?' I googled around for a bit and found a bug filed against JIRA that had all the same symptoms as the problem I was running into: the solution posted in the comments of the bug said that the problem was with mod_caucho. I worked with the ISP that hosts the production instance of Clearspace, got them to remove mod_caucho and use mod_proxy to isolate that piece of the puzzle and sure enough, the problem went away. Our ISP recommended that we not settle for mod_proxy for the entire site and instead wrote up a nifty solution using mod_rewrite and mod_proxy, which I've pasted below:
RewriteRule ^/clearspace/rpc/soap(/?(.*))$ to://www.example.com:8080/clearspace/rpc/soap$1 RewriteRule ^to://([^/]+)/(.*) http://$1/$2 [E=SERVER:$1,P,L] ProxyPassReverse /community/rpc/soap/ http://www.example.com/clearspace/rpc/soap/
Hope that helps someone down the road!