My article on CEO.com was posted today. I talk about ways to hire and interview your first software engineer. I reference the importance of giving back to software groups as you use their help. Here is the guest post I wrote for startup communities discussing how to give back.
I am proud to announce my latest series of screencasts on Hadoop MapReduce. It’s published again by the good people at Pragmatic Programmers. These screencasts are the best way for a beginner to learn about Hadoop, unless they’re sitting in my class at Cloudera University.
Here are a few links to get started after you’ve purchased the screencasts:
First, you want a way to run Hadoop, MapReduce and Eclipse. There is a virtual machine that is set up and running with everything you need. I have a mini-screencast showing how to use Eclipse and debug things.
Finally, you’ll need the dataset for the second episode. It uses the Nasdaq daily stock prices from InfoChimps.
The focus of the screencast isn’t administration and installation. This screencast is focused on the developer side of things. The source code is written to run on the Cloudera QuickStart VM out of the box.
One of the common questions I get from students and developers relates to IDEs and MapReduce. How do you create a MapReduce project in Eclipse and debug it? I have created a short screencast showing you how.
Cloudera QuickStart VM
The Cloudera QuickStart VM lets developers get started with writing MapReduce code without having to worry about software installs and configuration. Everything is installed and ready to go. You can download the image type that corresponds to your preferred virtualization platform.
Eclipse is installed on the VM and there is a link on the desktop to start it.
MapReduce and Eclipse
You can run and debug MapReduce code in Eclipse just like any other Java program. There are a few differences between running MapReduce in a distributed cluster and in an IDE like Eclipse. When you run MapReduce code in Eclipse, Hadoop runs in a special mode called LocalJobRunner. All of the Hadoop daemons are run in a single JVM (Java Virtual Machine) instead of several different JVMs. Another difference is that all file paths default to local file paths and not HDFS file paths.
With those caveats in mind, you can start putting in your breakpoints and debug your MapReduce code like any other Java program.
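To make the single-JVM idea concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API at all; the class name `LocalWordCount` and its `map`, `reduce`, and `run` methods are made up for illustration. It only imitates the map, shuffle, and reduce phases inside one process, which is roughly the situation you are in when stepping through a job in the IDE: every phase is ordinary Java code that a breakpoint can stop in.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: NOT Hadoop code and NOT the LocalJobRunner itself.
final class LocalWordCount {

    // "Map" phase: emit one (word, 1) pair for every token in the line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(token, 1));
            }
        }
        return out;
    }

    // "Reduce" phase: sum all of the values collected for a single key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs map, then a sorted group-by-key "shuffle", then reduce, all in this JVM.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {a=2, b=2, c=1}
        System.out.println(run(Arrays.asList("a b a", "b c")));
    }
}
```

Setting a breakpoint inside `map` or `reduce` here works the same way it does in any other Java program, which is the point the LocalJobRunner caveats above are getting at.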
If you want to clone the same Git project as I do in the screencast, you can find it here. From the terminal, type:
git clone email@example.com:eljefe6a/UnoExample.git
The project will be cloned into a subdirectory of the current directory.
Note that creating Eclipse projects manually is the easy way to get started. If you are going to have Hadoop as part of an automated build process, you will want to do this in Maven. In Maven, you can create Eclipse projects. This blog post tells you how. If you want to compile Hadoop from source using Eclipse, this blog post shows how.
Whether you want to start writing some MapReduce code or debug existing code, the QuickStart VM will help you do it quickly and easily. This screencast walks you through it and gets you coding in your favorite IDE.
I’ve written a guest post for my OSCON talk that is published on the O’Reilly Programming Blog. I talk about augmenting datasets and dealing with unstructured data. See you at OSCON!
I’ve been speaking at some conferences lately. I knew I needed to get better with my submissions and hoped Alistair or Edd would write a blog post about how to improve. Alistair did, except it got out of hand and became a book.
This book became Propose, Prepare, Present. It was exactly what I needed to improve my submissions. He doesn’t just cover the tactics of what makes a submission good or bad. I appreciated this well-rounded approach.
The book shows some of the cardinal sins one can commit. These aren’t just problems with a submission that will get you rejected; these are the ones that get you into the informal black book of conferences. Some of these are obvious and some aren’t.
He also goes behind the scenes of the conference industry to show why things happen. I run our local developers group and I’ve always been curious about the inner workings of a big conference. I experience many of the same problems, but at a much, much smaller level. The book shows the politics and money that the big boys deal with. It covers the review process and the sheer number of submissions that a conference gets. A good conference isn’t lacking submissions and the odds of getting picked are low. This is when a poorly written or typo-laden submission will get you tossed quickly. It’s nothing personal; someone else simply put more effort into making theirs better.
I’d like to add a few tips from my own experience.
- Start at the company or local level and work your way up. Speaking at a national conference like Alistair’s means competing against a lot of other people. Your local user group has much less competition. Hone your craft there and move on to the national and international levels.
- Software developers may not be the best public speakers. Invest some money in helping your developers give better talks.
- Bad stands out more than good. I can remember the bad speakers at my group faster than the good ones.
- Having to contact a speaker several times to nail things down sucks. It takes time from other things, so respond promptly and the organizers will appreciate it.
- Say thanks! It’s very easy to do, it rarely happens, and the organizers appreciate it in an otherwise thankless job.
The verdict? Buy it. I’m curious to see how much my submissions improve.