Understanding Spark Transformations and Actions – Spark RDD Operations
Perficient Digital Transformation
JANUARY 4, 2024
Resilient Distributed Dataset (RDD): Spark tasks usually operate on RDDs, which are fault-tolerant collections of elements partitioned across the cluster so that they can be processed in parallel. In a narrow transformation, each output partition depends on a single input partition, so Spark can execute the computation within that partition without shuffling or redistributing data across the cluster.
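To make the distinction concrete, here is a minimal plain-Python sketch that models partitions as a list of lists. It is not Spark's API: the helper names `narrow_map` and `wide_group_by_key` are hypothetical and exist only for illustration. The first helper works on each partition independently, as a narrow transformation would; the second has to move records between partitions, which is what a shuffle does.

```python
from collections import defaultdict


def narrow_map(partitions, fn):
    """Narrow: each output partition is built from exactly one input partition."""
    return [[fn(x) for x in part] for part in partitions]


def wide_group_by_key(partitions, num_partitions):
    """Wide: records with the same key may come from any input partition,
    so they are redistributed (shuffled) by key before grouping."""
    shuffled = [defaultdict(list) for _ in range(num_partitions)]
    for part in partitions:
        for key, value in part:
            # Pick the target partition from the key, not from where the record lives.
            shuffled[hash(key) % num_partitions][key].append(value)
    return [dict(p) for p in shuffled]


# Hypothetical data set split into two partitions.
partitions = [[1, 2, 3], [4, 5, 6]]

# Narrow step: the partition layout is unchanged and no data moves.
doubled = narrow_map(partitions, lambda x: x * 2)

# Another narrow step: tag each number with its parity as a key.
pairs = narrow_map(partitions, lambda x: (x % 2, x))

# Wide step: values for each key are gathered from both input partitions.
grouped = wide_group_by_key(pairs, 2)

print(doubled)
print(grouped)
```

In the narrow steps, each inner list can be computed without looking at the other one. In the wide step, every output partition may need records from every input partition, which is why a real cluster would have to move data over the network at that point.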