Breaking into Data Engineering

Authors:  || Published: 2024-05-12T20:06:00 || Updated: 2024-05-12T20:06:00 || 2 min read

People on mentoring forums often ask me how to break into data engineering. This post collects the resources and ideas I have offered in response.

  1. Courses
    1. Big Data Specialization on Coursera, offered by UC San Diego. [mandatory]
    2. ETL and Data Pipelines with Shell, Airflow and Kafka course on Coursera. [mandatory]
    3. Advanced Data Engineering course on Coursera. [mandatory]
    4. IBM Data Engineering Professional Certificate program on Coursera. This is a 13-course program that covers a wide range of data engineering topics, so it is a significant time commitment. If you do not take the whole program, I still recommend the “ETL and Data Pipelines with Shell, Airflow and Kafka” course listed above, which is part of it. [optional]
  2. YouTube Channels [ref]. I recommend watching videos from these channels to immerse yourself in the world of data engineering and pick up as many concepts as you can, as quickly as you can. It will be overwhelming in the beginning, but stick with it. It will start making sense after a while.
    1. Data Engineering Simplified
    2. Data with Zach
    3. Andreas Kretz. Related website: Learn Data Engineering.
    4. Darshil Parmar
    5. Seattle Data Guy. Related website: The Seattle Data Guy.
    6. Karolina Sowinska
    7. mehdio DataTV
    8. E-Learning Bridge
    9. Ken Jee
  3. Books
    1. Fundamentals of Data Engineering by Joe Reis, Matt Housley. [mandatory]
    2. The Data Engineering Cookbook by Andreas Kretz. [mandatory]
    3. Designing Data-Intensive Applications by Martin Kleppmann. This book is slightly dated, but still extremely useful. [optional]
  4. Meetups: Try attending some local/virtual meetups on data engineering. I would check meetup.com.
  5. Conferences: Attend some data engineering conferences. Write up a summary of the sessions you attended and what you learned from them; it makes a good talking point in interviews. Do not hesitate to send a thank-you note to a speaker whose session you enjoyed. This will help you build a network.
    1. I would check out the Data Council conferences.
    2. Databricks has a conference called Data + AI Summit.
    3. Big Data World is another conference worth checking out.
  6. Podcasts [optional]
    1. Data Engineering Podcast.
    2. The Data Engineering Show.
  7. Freelancing jobs: Try to get some freelance data engineering work. This will help you build a portfolio and get some real-world experience. You can check out websites like Upwork, Freelancer, Fiverr, etc.