Data Processing with Spark
Spark DataFrames expose a fluent, chainable API for processing structured data: each transformation returns a new DataFrame, so individual steps compose into a single pipeline.
Filtering and Aggregation
df.filter($"id" > 1).groupBy("name").count().show()
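The one-liner above assumes an existing DataFrame named `df` and the `spark.implicits._` import (which enables the `$"col"` column syntax). A self-contained sketch with hypothetical sample data might look like this:

```scala
import org.apache.spark.sql.SparkSession

// Local-mode session for illustration; on a cluster the master is set externally.
val spark = SparkSession.builder()
  .appName("FilterAggExample")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._ // required for $"col" and toDF()

// Hypothetical sample data; only rows with id > 1 survive the filter.
val df = Seq((1, "alice"), (2, "bob"), (3, "bob")).toDF("id", "name")

val counts = df.filter($"id" > 1).groupBy("name").count()
counts.show() // "bob" appears twice among the filtered rows, so its count is 2
```

Because `filter` and `groupBy` each return a new DataFrame, the whole pipeline stays lazy until an action such as `show()` or `count()` triggers execution.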
Reading from CSV
val csvDf = spark.read.option("header", "true").csv("data.csv")
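With `header` set to `"true"`, the first line supplies column names, but every column still comes back as a string unless types are inferred or declared. A sketch of both approaches, assuming a small illustrative file (written inline here so the example is self-contained):

```scala
import java.nio.file.{Files, Paths}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

val spark = SparkSession.builder()
  .appName("CsvReadExample")
  .master("local[*]")
  .getOrCreate()

// Write a tiny sample file so the example runs on its own (illustrative data).
Files.write(Paths.get("data.csv"), "id,name\n1,alice\n2,bob\n".getBytes("UTF-8"))

// Option 1: infer types by scanning the file (costs an extra pass over the data).
val inferred = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("data.csv")

// Option 2: declare the schema up front (no extra pass, and type mismatches surface early).
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType, nullable = true)
))
val typed = spark.read.option("header", "true").schema(schema).csv("data.csv")
typed.printSchema()
```

Declaring the schema is generally preferred for production jobs, since inference both rescans the input and can guess types differently as the data changes.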
Writing Output
csvDf.write.mode("overwrite").parquet("output")
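The `mode("overwrite")` setting replaces any existing output directory; the other save modes are `"append"`, `"ignore"`, and the default `"errorifexists"`. A sketch of a write followed by a round-trip read, with stand-in data and an illustrative output path:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("WriteExample")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

val csvDf = Seq((1, "alice"), (2, "bob")).toDF("id", "name") // stand-in data

// partitionBy splits the output into one subdirectory per distinct name,
// which lets later reads skip partitions that a filter rules out.
csvDf.write
  .mode("overwrite")
  .partitionBy("name")
  .parquet("output")

// Reading the directory back recovers both the data and the partition column.
val roundTrip = spark.read.parquet("output")
roundTrip.show()
```

Parquet stores the schema alongside the data, so the round-trip read needs no schema declaration.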
Spark supports many formats: CSV, Parquet, JSON, and more.
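Each built-in format has a shorthand reader and writer (`csv`, `json`, `parquet`), and `format(...)` with `load`/`save` is the generic equivalent. A minimal sketch using JSON, with illustrative data and paths:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("FormatsExample")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name") // illustrative data

// Shorthand writer: one JSON object per line (JSON Lines layout).
df.write.mode("overwrite").json("output_json")

// Generic form: equivalent to spark.read.json(...), and the shape used
// for external data source connectors as well.
val back = spark.read.format("json").load("output_json")
back.show()
```

The generic `format(...)` form is what third-party connectors (JDBC, Kafka, and so on) plug into, so the same reader/writer pattern carries across sources.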