Professional Data Engineering Review – Sanjoy Roy

Blog Summary: (AI Summaries by Summarizes)
  • The Professional Data Engineering course by Jesse Anderson covers a well-rounded curriculum that includes data engineering and data science.
  • The course is a semester's worth of learning and covers all relevant topics, including data ingestion, processing, and visualization at scale.
  • Jesse Anderson is a phenomenal teacher who can explain complex topics in a simplified manner, including multi-threading, which is a core concept for writing programs at scale.
  • The course covers building a data pipeline and programming at scale in great detail, and the hands-on labs are essential to understanding the concepts.
  • The personal project is where the real value of the course lies, and it gives students the confidence to apply the tools, technologies, and principles they learned.

Note: this is a guest post from Sanjoy Roy, who is reviewing my Professional Data Engineering course.

Since late 2014, I have been drawn into various analytics projects that required a good mix of data engineering and data science skills. There are a lot of good MOOCs available now that cover very focused areas of data science and engineering. What I was looking for was a well-rounded course that not only covered the areas of data engineering but also had sufficient hands-on labs to put those skills into practice. That is how I landed on Jesse’s Professional Data Engineering course, which had everything I was looking for. Let me explain why this course has all the qualities I was seeking.

First and foremost, this course is a semester’s worth of learning even though it is stated as 8 weeks, because it covers the whole gamut of information and tools required for any data engineering solution. For people who have no programming experience whatsoever, it may take about 12 to 14 weeks to complete the course, provided they can stick to a regimen. To be honest, this course is pretty tough for people who do not like Java, and particularly for those who do not like to get their hands dirty with programming.

This course covers all the relevant topics: (a) what you should do before embarking on a data engineering solution, and (b) the concepts of data ingestion, data processing and data visualization at scale, with each topic dealt with in the required depth. Jesse is a phenomenal teacher; he can explain the most complex topics by simplifying them immensely. Jesse knows his stuff, but more importantly he can explain it like nobody else I could imagine. It is not easy to explain multi-threading, which is a core concept for writing programs at scale, and I believe Jesse has done it exceedingly well.

This course covers in great detail the two most important concepts in any data engineering solution: (a) building a data pipeline and (b) programming at scale. In my humble opinion, both concepts are dealt with in more than sufficient depth. Before embarking on this course, I thought I knew a thing or two about big data, but this course was a real eye-opener for me. After going through the course, and thanks to the hands-on labs, I ramped up my knowledge enough to deal with tough problems in Big Data engineering.

My advice to everyone taking this course: each video should be watched with absolute attention and each hands-on lab needs to be done, without exception. Period. Great content and videos, great labs, and the real value is the virtual machine, where all the tools are packaged and which requires minimal effort to deploy and get started.

The next thing I like about this course is how you can apply the tools to solve various data engineering problems. The commendable facet of this course is that what you get out of the programme is precisely proportional to the effort you put into (a) doing the hands-on labs and (b) most importantly, your personal project. This is where the value of the course lies. To top it all, you will have the guidance of Jesse, who is always there to help you solve your issues and to help you move in the right direction when you ask for help. It felt like taking a university course where I could approach my teacher any time I had doubts. However trivial or serious the issue was, Jesse was always there to help me. There were numerous situations where I was stuck while implementing my personal project, but every time, Jesse’s expert guidance helped me.

The personal project is where the rubber hits the road. It gives you the confidence that you have created something worthwhile by putting into practice all the tools, technologies and principles, which you come to know in absolute detail. My personal project for this course was an analysis of market indices and their impact on stocks, performing Monte Carlo simulations using those models, carrying out various risk analyses, and projecting expected returns. However, it didn’t finish there. After that start, I became interested in further analysis of such publicly available data sets and in building a more comprehensive data product that can be applied across various markets and financial conditions. This course helped me navigate in the right direction. It empowers you with lifelong learning.
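
As a rough illustration of the kind of Monte Carlo projection mentioned above (this is not Sanjoy's project code, which is not shown in this post), the sketch below simulates total returns for a single index under a geometric Brownian motion model and summarizes the expected return and a simple value-at-risk figure. The model choice, the function names `simulate_terminal_returns` and `summarize`, and the drift and volatility values are all assumptions made for the example.

```python
import math
import random
import statistics


def simulate_terminal_returns(mu, sigma, horizon_years, n_paths, seed=0):
    """Simulate total returns over the horizon, assuming geometric Brownian motion.

    mu and sigma are the (hypothetical) annualized drift and volatility.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    drift = (mu - 0.5 * sigma ** 2) * horizon_years
    vol = sigma * math.sqrt(horizon_years)
    # Each path draws one standard normal shock for the whole horizon.
    return [math.exp(drift + vol * rng.gauss(0.0, 1.0)) - 1.0
            for _ in range(n_paths)]


def summarize(returns, confidence=0.95):
    """Return the mean simulated return and the loss at the given confidence level."""
    ordered = sorted(returns)
    idx = int((1.0 - confidence) * len(ordered))
    return {
        "expected_return": statistics.fmean(ordered),
        "value_at_risk": -ordered[idx],  # reported as a positive loss
    }


if __name__ == "__main__":
    # Illustrative parameters only, not calibrated to any real market.
    simulated = simulate_terminal_returns(mu=0.07, sigma=0.20,
                                          horizon_years=1.0, n_paths=20000)
    print(summarize(simulated))
```

At scale, the same idea would be spread across many workers, each simulating its own batch of paths before the results are combined; the single-process version here is only meant to show the shape of the calculation.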

To sum it up: Data Science and Data Engineering are the buzzwords in today’s IT landscape. Everyone speaks about them, but only a few know how to deal with them. Undoubtedly, they are hard to grasp, because the challenges are unique and hence each problem needs to be addressed uniquely. This course dispels the myths of data science and engineering and teaches you to approach it the right way. It is worth mentioning over and over again: not only did I get the right guidance, but I also got a great mentor in Jesse.

The subject of data engineering is technical, and it is tough, but achieving anything worthwhile takes striving. Hence, to all techies and geeks who love data, who love Java or Scala (or any programming, really), who want to venture seriously into the field of data engineering, and who are willing to put in their best effort to learn those skills and put them into practice: this is THE DATA ENGINEERING COURSE that covers all aspects (other than streaming data). I am also looking forward to learning about dealing with streaming data in Jesse’s Real-time Data Engineering course.

Note: Sanjoy’s personal project uses Big Data with Monte Carlo simulations. You can see his presentation here.
