- The video demonstrates live coding of a dedupe algorithm.
- The algorithm is used to remove duplicates from several data files.
- The video shows a simple version of the algorithm and a more complicated version with custom logic.
- The video is a helpful resource for those interested in learning how to write code with Hadoop MapReduce and become a Data Engineer.
- The video is accompanied by an invitation to join an online course for further learning.
In this video, I live code a dedupe algorithm. If you’re not familiar with it, a dedupe algorithm takes several data files and removes the duplicate records. I show the simple version first. Then I show a more complicated version that adds some custom logic.
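To give a feel for the idea, here is a minimal sketch in plain Java. It is not the code from the video and it does not use the Hadoop API: it only imitates the map, shuffle, and reduce steps in memory. The class name `DedupeSketch`, the `id,timestamp,payload` record layout, and the "keep the newest record per id" rule are all assumptions made for illustration, since the custom logic in the video may be different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory imitation of a MapReduce dedupe job (no Hadoop dependency).
class DedupeSketch {

    // "Map" + "shuffle": emit every line under a key, then group lines by key.
    // A TreeMap keeps keys sorted, much as a reducer receives its keys in sorted order.
    static Map<String, List<String>> mapAndShuffle(List<List<String>> files, boolean keyOnId) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (List<String> file : files) {
            for (String line : file) {
                // Simple version: the whole line is the key.
                // Custom version: only the assumed id field (text before the first comma).
                String key = keyOnId ? line.split(",", 2)[0] : line;
                groups.computeIfAbsent(key, k -> new ArrayList<>()).add(line);
            }
        }
        return groups;
    }

    // Simple "reduce": identical lines share a key, so emitting each key once removes duplicates.
    static List<String> dedupe(List<List<String>> files) {
        return new ArrayList<>(mapAndShuffle(files, false).keySet());
    }

    // Custom "reduce" (assumed rule): for each id, keep the record with the largest timestamp.
    static List<String> dedupeKeepLatest(List<List<String>> files) {
        List<String> out = new ArrayList<>();
        for (List<String> records : mapAndShuffle(files, true).values()) {
            String best = records.get(0);
            for (String record : records) {
                if (timestamp(record) > timestamp(best)) {
                    best = record;
                }
            }
            out.add(best);
        }
        return out;
    }

    // Parses the assumed second field of "id,timestamp,payload".
    static long timestamp(String record) {
        return Long.parseLong(record.split(",")[1]);
    }

    public static void main(String[] args) {
        List<List<String>> files = List.of(
                List.of("1,100,alice", "2,100,bob"),
                List.of("1,200,alice-updated", "2,100,bob"));
        System.out.println(dedupe(files));
        System.out.println(dedupeKeepLatest(files));
    }
}
```

In a real Hadoop job the grouping is done by the framework's shuffle, and the two "reduce" methods would live in a reducer class. The sketch is only meant to show why choosing the key decides what counts as a duplicate.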
If you want to learn more about how to write code with Hadoop MapReduce and become a Data Engineer, join my online course.