{"id":1048,"date":"2008-01-27T22:03:15","date_gmt":"2008-01-28T06:03:15","guid":{"rendered":"http:\/\/cephas.net\/blog\/2008\/01\/27\/the-data-life-cycle-of-a-blog-post\/"},"modified":"2008-01-27T22:38:18","modified_gmt":"2008-01-28T06:38:18","slug":"the-data-life-cycle-of-a-blog-post","status":"publish","type":"post","link":"https:\/\/cephas.net\/blog\/2008\/01\/27\/the-data-life-cycle-of-a-blog-post\/","title":{"rendered":"The Data Life Cycle of a Blog Post"},"content":{"rendered":"<p><a href=\"http:\/\/www.wired.com\/special_multimedia\/2008\/ff_secretlife_1602\">Cool flash infographic<\/a> in the latest issue of Wired that shows what happens to your blog post after you click  the &#8216;publish&#8217; button (I&#8217;ll save you the hassle of actually viewing it: after you click the &#8216;publish&#8217; button, exciting things like ping servers, data miners, search engines, text scrapers, aggregators, social bookmarking sites, online media, spam blogs and finally readers get involved).  Since it&#8217;s <a href=\"http:\/\/www.wired.com\/wired\/\">Wired<\/a> and not <a href=\"http:\/\/xml.sys-con.com\/\">XML Journal<\/a>, they stopped at the infographic, but man, it sure would be cool to see all the ways that data is massaged, reformatted, sliced and diced and transmitted, because there&#8217;s a lot that happens in that process.  Just for the fun of it, I&#8217;m gonna walk through the scenarios I know about.  <\/p>\n<p>First, you click the publish button.  But that might be a publish button on a desktop blogging client like <a href=\"http:\/\/windowslivewriter.spaces.live.com\/\">Windows Live Writer<\/a> or it might be <a href=\"http:\/\/office.microsoft.com\/en-us\/help\/FX102376791033.aspx\">the publish button in Microsoft Word<\/a> or it might be a real live HTML button that says &#8216;publish&#8217;. 
So before you even get to the publish part, we&#8217;ve got the possibility of the <a href=\"http:\/\/en.wikipedia.org\/wiki\/MetaWeblog\">MetaWeblog API<\/a> (which is XML-RPC, effectively XML over HTTP), <a href=\"http:\/\/www.atomenabled.org\/developers\/protocol\/\">Atom Publishing Protocol<\/a> (again effectively XML over HTTP) or a plain HTTP (or HTTPS!) POST. <\/p>\n<p>OK, so now your blog post has been published on your blog. What next?  Probably unbeknownst to you, your blog post has been <a href=\"http:\/\/en.wikipedia.org\/wiki\/Ping_blog\">automatically submitted to one or more ping servers<\/a> using XML-RPC (XML over HTTP). Because search engines got into the blogging business, you can even ping <a href=\"http:\/\/blogsearch.google.com\/ping\/RPC2\">Google<\/a> and <a href=\"http:\/\/api.my.yahoo.com\/rss\/ping\">Yahoo<\/a> (curiously not Microsoft, why?).  If you don&#8217;t want to hassle with a bunch of different sites, you can always use <a href=\"http:\/\/pingomatic.com\/\">pingomatic.com<\/a>, which will ping (as of 1\/27\/2008) twenty-one different ping servers for you.<\/p>\n<p>Oh, I forgot to mention. If you&#8217;re using TypePad, LiveJournal or Vox, the information about your blog post isn&#8217;t sent to these ping servers using XML-RPC, it&#8217;s <a href=\"http:\/\/updates.sixapart.com\/\">streamed as XML <b>in real time<\/b> over HTTP<\/a> to many of the same parties.  <\/p>\n<p>Great, your blog post has now been sent to everyone, you&#8217;re good, right? Nope. Now comes the onslaught of spiders and bots, awoken by the ping you sent, who will request your feed (RSS \/ Atom over HTTP) and your blog post (HTML over HTTP) and your firstborn child again and again and again.  
And now that your blog post is published and assuming that you&#8217;ve published something of value, you&#8217;ll see real people stop by and  comment on your blog post and maybe bookmark it in a site like <a href=\"http:\/\/del.icio.us\/\">del.icio.us<\/a> or <a href=\"http:\/\/ma.gnolia.com\/\">ma.gnolia.com<\/a>, snipping a quote from your blog post and then <a href=\"http:\/\/swem.wm.edu\/blogs\/waynegraham\/index.cfm\/2007\/5\/8\/ColdFusion-and-Lucene\">publishing that snippet to their own blogs<\/a> or <a href=\"http:\/\/jira.atlassian.com\/browse\/JRA-7604\">to their bug tracker<\/a> and now your blog post has replicated, it lives in small parts all over the web, each part getting published and spidered and syndicated and ripped again and again and again. It&#8217;s beautiful isn&#8217;t it?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Cool flash infographic in the latest issue of Wired that shows what happens to your blog post after you click the &#8216;publish&#8217; button (I&#8217;ll save you the hassle of actually viewing it: after you click the &#8216;publish&#8217; button, exciting things like ping servers, data miners, search engines, text scrapers, aggregators, social bookmarking sites, online media, &hellip; <a href=\"https:\/\/cephas.net\/blog\/2008\/01\/27\/the-data-life-cycle-of-a-blog-post\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">The Data Life Cycle of a Blog Post<\/span> <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[31,14,32,10],"tags":[],"_links":{"self":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts\/1048"}],"collection":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/comments?post=1048"}],"version-history":[{"count":0,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts\/1048\/revisions"}],"wp:attachment":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/media?parent=1048"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/categories?post=1048"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/tags?post=1048"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}