Hadoop Book Reviews

Blog Summary:
  • Two Hadoop books were reviewed: "Hadoop the Definitive Guide 2nd Edition" by Tom White and "Hadoop In Action" by Chuck Lam.
  • "Hadoop the Definitive Guide 2nd Edition" is aimed more at programmers and has more real-world, applicable code examples.
  • It goes into more detail on the programming side of things, such as debugging and logging.
  • It also contains more overview chapters on Hadoop-associated projects like Pig and HBase.
  • "Hadoop In Action" is aimed more at people who want to learn about Hadoop and gives a better overview of setting up and maintaining a Hadoop cluster.

Update: Review of Hadoop the Definitive Guide 3rd Edition

I spent some time reading two Hadoop books: Hadoop the Definitive Guide 2nd Edition by Tom White and Hadoop In Action by Chuck Lam. Both books are well written, but they seem to be aimed at different audiences.

Hadoop the Definitive Guide 2nd Edition seems to be aimed more at the programmer. There are lots of code samples, and the author goes through the code line by line and does a great job of explaining why each line is important. I liked this book’s code examples better than those in Hadoop In Action because they seemed more real-world and applicable. The author also goes into better detail about the programming side of things like debugging and logging. If you know enough about MapReduce to be dangerous, but want to know about Hadoop’s implementation of it, head to chapter 6 “How MapReduce Works”. I am a visual person and enjoyed this book’s diagrams for understanding the flow; Hadoop In Action doesn’t have any diagrams. The Definitive Guide also contains more overview chapters on Hadoop-associated projects like Pig and HBase.

Hadoop In Action seems to be aimed more at people wanting to learn about Hadoop. It isn’t a cursory look at Hadoop, but it is the book I would recommend to a manager or non-programmer who wants to learn about Hadoop. I would send managers straight to chapter 7 “Cookbook”, which shows how other companies have used this technology. The book also gives a better overview of setting up and maintaining a Hadoop cluster.

If you are starting from scratch on Hadoop, I recommend you start out with Hadoop In Action. If you are going straight to coding or already have a handle on MapReduce, then I recommend you buy Hadoop the Definitive Guide 2nd Edition.
