{"id":884,"date":"2006-12-04T23:47:50","date_gmt":"2006-12-05T07:47:50","guid":{"rendered":"http:\/\/cephas.net\/blog\/2006\/12\/04\/rssatom-feeds-last-modified-and-etags\/"},"modified":"2006-12-05T14:42:46","modified_gmt":"2006-12-05T22:42:46","slug":"rssatom-feeds-last-modified-and-etags","status":"publish","type":"post","link":"https:\/\/cephas.net\/blog\/2006\/12\/04\/rssatom-feeds-last-modified-and-etags\/","title":{"rendered":"RSS\/Atom feeds, Last Modified and Etags"},"content":{"rendered":"<p>Sometime last week I read <a href=\"http:\/\/www.intertwingly.net\/blog\/2006\/11\/22\/Detecting-Not-Modified-Reliably\">this piece<\/a> by <a href=\"http:\/\/en.wikipedia.org\/wiki\/Sam_Ruby\">Sam Ruby<\/a>, which, summarized, says this:<\/p>\n<blockquote><p>\n&#8230;don&#8217;t send Etag and Last-Modified headers unless you really mean it.  But if you can support it, please do.  It will save you some bandwidth and your readers some processing.\n<\/p><\/blockquote>\n<p>The <a href=\"http:\/\/www.jivesoftware.com\/products\/clearspace\/\">product I&#8217;ve been working on at work<\/a> (<strike>which I should be able to start talking about soon<\/strike> which I can talk about now) for the last couple months uses feeds (either Atom, RSS 1.0 or RSS 2.0, your choice) extensively but didn&#8217;t have <a href=\"http:\/\/www.w3.org\/Protocols\/rfc2616\/rfc2616-sec14.html\">Etag or Last-Modified<\/a> support, so I spent a couple hours working on it this past weekend. 
We&#8217;re using <a href=\"https:\/\/rome.dev.java.net\/\">ROME<\/a>, so the code ended up looking something like this:<\/p>\n<pre>\r\nHttpServletRequest request = ...\r\nHttpServletResponse response = ....\r\nSyndFeed feed = ...\r\nif (!isModified(request, feed)) {\r\n  \/\/ The client copy is still current: answer 304 and send no body.\r\n  response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);\r\n} else {\r\n  long publishDate = feed.getPublishedDate().getTime();\r\n  response.setDateHeader(\"Last-Modified\", publishDate);\r\n  response.setHeader(\"Etag\", getEtag(feed));\r\n}\r\n...\r\n\/\/ The Etag is the publish time in milliseconds, wrapped in double quotes.\r\nprivate String getEtag(SyndFeed feed) {\r\n  return \"\\\"\" + String.valueOf(feed.getPublishedDate().getTime()) + \"\\\"\";\r\n}\r\n...\r\nprivate boolean isModified(HttpServletRequest request, SyndFeed feed) {\r\n  \/\/ Only treat the feed as unmodified when the client sent both validators.\r\n  if (request.getHeader(\"If-Modified-Since\") != null && request.getHeader(\"If-None-Match\") != null) {\r\n    String feedTag = getEtag(feed);\r\n    String eTag = request.getHeader(\"If-None-Match\");\r\n    Calendar ifModifiedSince = Calendar.getInstance();\r\n    ifModifiedSince.setTimeInMillis(request.getDateHeader(\"If-Modified-Since\"));\r\n    Calendar publishDate = Calendar.getInstance();\r\n    publishDate.setTime(feed.getPublishedDate());\r\n    \/\/ HTTP dates carry no milliseconds, so drop them before comparing.\r\n    publishDate.set(Calendar.MILLISECOND, 0);\r\n    int diff = ifModifiedSince.compareTo(publishDate);\r\n    return diff != 0 || !eTag.equalsIgnoreCase(feedTag);\r\n  } else {\r\n    return true;\r\n  }\r\n}\r\n<\/pre>\n<p>There are only two gotchas in the code:<\/p>\n<ol>\n<li>The value of the Etag must be quoted, hence the <code>getEtag(...)<\/code> method above returning a string wrapped in quotes.  Not hard to do, but easy to miss.<\/li>\n<li>The first block of code above uses <code>setDateHeader(String name, long date)<\/code> to set the &#8216;Last-Modified&#8217; HTTP header, which conveniently takes care of formatting the given date according to the <a href=\"http:\/\/www.ietf.org\/rfc\/rfc0822.txt\">RFC 822 specification for dates and times<\/a>. 
The <a href=\"https:\/\/rome.dev.java.net\/apidocs\/0_8\/com\/sun\/syndication\/feed\/synd\/SyndFeed.html#getPublishedDate()\">published date<\/a> comes from ROME.  Here&#8217;s where it gets tricky: if the client sends the &#8216;If-Modified-Since&#8217; header and you retrieve said date from the request using <code>getDateHeader(String name)<\/code>, you&#8217;ll get a <code>long<\/code>: the number of milliseconds since January 1, 1970 GMT.  The code above wraps that value and the published date in <code>Calendar<\/code> instances and compares them with <code>compareTo<\/code>. Because both represent absolute points in time, no manual timezone conversion is needed.  But there&#8217;s still one thing left: the date specification for RFC 822 doesn&#8217;t specify a millisecond, so if the <code>long<\/code> value you hand to the <code>setDateHeader(String name, long date)<\/code> method contains a millisecond value and you then try to use the same value to compare against the &#8216;If-Modified-Since&#8217; header, you&#8217;ll never get a match.  The easy way around that is to set the millisecond field of the published date to zero before comparing it against the &#8216;If-Modified-Since&#8217; value, which is what <code>publishDate.set(Calendar.MILLISECOND, 0)<\/code> does in the code above. 
<\/li>\n<\/ol>\n<p>If you&#8217;re interested, there are a number of other blogs \/ articles about Etags and Last-Modified headers:<\/p>\n<ul>\n<li><a href=\"http:\/\/fishbowl.pastiche.org\/2002\/10\/21\/http_conditional_get_for_rss_hackers\">http:\/\/fishbowl.pastiche.org\/2002\/10\/21\/http_conditional_get_for_rss_hackers<\/a><\/li>\n<li><a href=\"http:\/\/mxblogspace.journurl.com\/users\/admin\/index.cfm?mode=article&#038;entry=1853\">http:\/\/mxblogspace.journurl.com\/users\/admin\/index.cfm?mode=article&#038;entry=1853<\/a><\/li>\n<li><a href=\"http:\/\/www.emilsit.net\/blog\/archives\/wordpress-etag-bug\/\">http:\/\/www.emilsit.net\/blog\/archives\/wordpress-etag-bug\/<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Sometime last week I read this piece by Sam Ruby, which, summarized, says this: &#8230;don&#8217;t send Etag and Last-Modified headers unless you really mean it. But if you can support it, please do. It will save you some bandwidth and your readers some processing. 
The product I&#8217;ve been working on at work (which I should &hellip; <a href=\"https:\/\/cephas.net\/blog\/2006\/12\/04\/rssatom-feeds-last-modified-and-etags\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">RSS\/Atom feeds, Last Modified and Etags<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[5,3,2,32,12],"tags":[],"_links":{"self":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts\/884"}],"collection":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/comments?post=884"}],"version-history":[{"count":0,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts\/884\/revisions"}],"wp:attachment":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/media?parent=884"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/categories?post=884"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/tags?post=884"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}