Processing Big Data with MapReduce

I am proud to announce my latest series of screencasts on Hadoop MapReduce. It’s published again by the good people at Pragmatic Programmers. These screencasts are the best way for a beginner to learn about Hadoop, unless they’re sitting in my class at Cloudera...
Hadoop The Definitive Guide 3rd Edition Review

My original review of Hadoop The Definitive Guide (TDG) was for the 2nd edition. The 3rd edition was released recently, and I reread the book in its entirety. The new edition covers the latest changes to the 1.x (0.20) and 2.x (0.23) release lines. The book’s examples now use...
EC2 Performance, Spot Instance ROI and EMR Scalability

A Few More Million Amazonian Monkeys

Update 5: The monkeys recreated every work of Shakespeare and went viral. See the project postmortem for my thoughts on going viral and what I learned during the project. Update 6: I created a new visualization of the monkeys’ data. Update 4: The monkeys...