Course: Apache Spark Fundamentals

Get started processing data with Apache Spark and PySpark

With the rise of cloud computing, distributed storage and (big) data processing, many organisations are adopting Apache Spark for their data workloads. Whether for data science, data analysis or data engineering, Apache Spark can be the right tool for the job. It also underpins Azure Synapse Analytics, Microsoft Fabric and Databricks.
This training walks you through the fundamentals of working with Apache Spark, starting with what it is and how it works. You will then move on to reading, transforming and writing data using PySpark.
Finally, to make sure your code can be used safely in production, the training also focuses on development best practices.
What is Spark, where did it come from, why was it created? And how does it work?
Lessons
- History of Apache Spark
- Technical Architecture (Driver, Cluster Manager, Executors)
- RDD and DataFrame
- PySpark
- Benefits of using Spark
- Running Spark locally

After completing this module, students will be able to:
- Explain how Spark works

To work with data, we first need to retrieve it from wherever it is located. This is done through spark.read.
Lessons
- spark.read
- read options…

Price: €1,610 excl. VAT
Offered by: Info Support
Subject: Apache Spark
Level:
Duration: 2 days
Time span: 14 days
Language: English
Product type: Training
Format: Classroom
Number of participants: min. 1, max. 12
Time of day: Daytime
Dates and locations:
- Veenendaal, Mon 13 Jul 2026
- Veenendaal, Thu 13 Aug 2026
Provider accreditations:
- Microsoft Learning Partner
- Cedeo
- Cedeo Open
- Cedeo Maatwerk