I’ve spent way more time than I should have the last couple weeks working to help migrate a website built against Jive Forums to run against a Clearspace X instance. As part of the migration, one of the things I did was to move all the data syndication that had been done with RSS and custom namespaces to use the Clearspace SOAP API, which is built on a technology called XFire. The first problem I ran into was that production website was configured so that requests to http://example.com were redirected to http://www.example.com/, which resulted in errors like this in the logs:
Jul 5, 2007 11:30:11 PM org.apache.commons.httpclient.HttpMethodDirector isRedirectNeeded
INFO: Redirect requested but followRedirects is disabled
That error was pretty easy to fix (swap in http://www.example.com in place of http://example.com), but the next thing I ran into was way less intuitive. When I invoked a certain service, I’d get a stack trace that looked like this:
Exception in thread "main" org.codehaus.xfire.XFireRuntimeException: Could not invoke service..
Nested exception is org.codehaus.xfire.fault.XFireFault: Unexpected character '-' (code 45) in prolog; expected '<'
at [row,col {unknown-source}]: [2,1]
org.codehaus.xfire.fault.XFireFault: Unexpected character '-' (code 45) in prolog; expected '<'
at [row,col {unknown-source}]: [2,1]
at org.codehaus.xfire.fault.XFireFault.createFault(XFireFault.java:89)
at org.codehaus.xfire.client.Client.onReceive(Client.java:386)
which was troubling because the exact same SOAP method invocation worked fine on both my local machine and in the test environment. What was different? Two things: the production system was running on Java 6 and the production system was configured to run behind an Apache HTTP server proxied by mod_caucho versus no Apache HTTP server / proxy in development or on my machine. I needed to see what was going on between the server and the client (one of the things that makes SOAP so hard is that you can't just GET a URL to see what's being returned) so I fired up ethereal at the behest of one of my coworkers. I kicked off a couple of SOAP requests with ethereal running, recorded the packets and then analyzed the capture. Said coworker then pointed out the key to debugging HTTP requests with ethereal: right click on the TCP packet you're interested in and then click 'Follow TCP Stream'. The invocation response looked like this when run against the development environment:
HTTP/1.1 200 OK
Date: Mon, 02 Jul 2007 21:59:30 GMT
Server: Resin/3.0.14
Content-Type: multipart/related; type="application/xop+xml"; start=""; start-info="text/xml"; .boundary="----=_Part_5_25686393.1183413571061"
Connection: close
Transfer-Encoding: chunked
1dce
------=_Part_5_25686393.1183413571061
Content-Type: application/xop+xml; charset=UTF-8; type="text/xml"
Content-Transfer-Encoding: 8bit
Content-ID:
...
and looked like this when invoked against the production instance:
HTTP/1.1 200 OK
Date: Mon, 02 Jul 2007 21:41:56 GMT
Server: Apache/2.0.52 (Red Hat)
Vary: Accept-Encoding,User-Agent
Cache-Control: max-age=0
Expires: Mon, 02 Jul 2007 21:41:56 GMT
Transfer-Encoding: chunked
Content-Type: text/plain; charset=UTF-8
X-Pad: avoid browser bug
24e
------=_Part_29_31959705.1183412516805
Content-Type: application/xop+xml; charset=UTF-8; type="text/xml"
Content-Transfer-Encoding: 8bit
Content-ID:
...
Notice the different content type returned by the production server? So then the mystery became not 'what?' but 'who?' I googled around for a bit and found a bug filed against JIRA that had all the same symptoms as the problem I was running into: the solution posted in the comments of the bug said that the problem was with mod_caucho. I worked with the ISP that hosts the production instance of Clearspace, got them to remove mod_caucho and use mod_proxy to isolate that piece of the puzzle and sure enough, the problem went away. Our ISP recommended that we not settle for mod_proxy for the entire site and instead wrote up a nifty solution using mod_rewrite and mod_proxy, which I've pasted below:
RewriteRule ^/clearspace/rpc/soap(/?(.*))$ to://www.example.com:8080/clearspace/rpc/soap$1
RewriteRule ^to://([^/]+)/(.*) http://$1/$2 [E=SERVER:$1,P,L]
ProxyPassReverse /community/rpc/soap/ http://www.example.com/clearspace/rpc/soap/
Hope that helps someone down the road!