Journey
I began exploring Apache Spark in 2022, when we had our first baby. My journey started with a Udemy course, Taming Big Data.
This course focused heavily on RDDs and required significant effort to set up on my laptop. After completing it, I pursued the Databricks Certified Data Engineer Associate certification, using materials from the Databricks Academy along with another Udemy course by Derar Alhussein.
I successfully passed the certification exam. However, due to limited project opportunities, my Spark journey gradually faded into the background.
Revisiting Apache Spark in 2025
In March 2025, I realized that most of my work involved SQL, which pushed me to explore areas I previously found challenging in the data domain. This led me to machine learning, where I started by learning Python and Pandas. My deep-dive into data engineering truly began after mastering Python.
Now, I am revisiting Apache Spark with the following resources:
Study Materials
- 8 Steps for a Developer to Learn Apache Spark
- Spark: The Definitive Guide by Bill Chambers and Matei Zaharia
- Advanced Apache Spark Training - Sameer Farooqui
Hands-on Practice
- Visual Studio Professional Subscription
- Google Colab