{"id":781,"date":"2006-03-14T21:57:07","date_gmt":"2006-03-15T01:57:07","guid":{"rendered":"http:\/\/wordpress.cephas.net\/?p=781"},"modified":"2006-03-14T21:57:07","modified_gmt":"2006-03-15T01:57:07","slug":"nutch-yahoo-and-hadoop","status":"publish","type":"post","link":"https:\/\/cephas.net\/blog\/2006\/03\/14\/nutch-yahoo-and-hadoop\/","title":{"rendered":"Nutch, Yahoo!, and Hadoop"},"content":{"rendered":"<p>It&#8217;s been a while since I mentioned anything about <a href=\"http:\/\/lucene.apache.org\/\">Lucene<\/a>, my favorite Java-based open source indexing and search library (which I built the <a href=\"http:\/\/cephas.net\/projects\/karakoram\/\">karakoram spider \/ search application<\/a> around). Doug Cutting, who created Lucene and who has spent the last couple years working on Nutch, was <a href=\"http:\/\/nutch.sourceforge.net\/blog\/2006\/03\/im-now-yahoo.html\">recently hired<\/a> by Yahoo!. I just have a couple questions: <\/p>\n<p>a) Why would Yahoo! want to hire a guy writing a Java-based web crawler and indexer?<\/p>\n<p>b) Where does he get all the cool names? <a href=\"http:\/\/lucene.apache.org\/nutch\/\">Nutch<\/a>? <a href=\"http:\/\/lucene.apache.org\/hadoop\/\">Hadoop<\/a>?<\/p>\n<p>c) How cool does Hadoop sound? Hadoop Distributed Filesystem (HDFS) and an implementation of MapReduce. Hmm... where else have I heard those terms <a href=\"http:\/\/labs.google.com\/papers\/gfs.html\">bandied<\/a> <a href=\"http:\/\/labs.google.com\/papers\/mapreduce.html\">about<\/a>?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It&#8217;s been a while since I mentioned anything about Lucene, my favorite Java-based open source indexing and search library (which I built the karakoram spider \/ search application around). Doug Cutting, who created Lucene and who has spent the last couple years working on Nutch, was recently hired by Yahoo!. 
I just have a couple &hellip; <a href=\"https:\/\/cephas.net\/blog\/2006\/03\/14\/nutch-yahoo-and-hadoop\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Nutch, Yahoo!, and Hadoop<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[3,19,4,2],"tags":[],"_links":{"self":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts\/781"}],"collection":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/comments?post=781"}],"version-history":[{"count":0,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/posts\/781\/revisions"}],"wp:attachment":[{"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/media?parent=781"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/categories?post=781"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cephas.net\/blog\/wp-json\/wp\/v2\/tags?post=781"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}