#custom-partitioner

#rdd-api
from medium.com

Spark Scala Exercise 22: Custom Partitioning in Spark RDDs - Load Balancing and Shuffle

Implementing a custom partitioner in Spark Scala lets you co-locate related keys, balance skewed loads, and optimize reduce-side joins, giving explicit control over how records are distributed across tasks.
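As a minimal sketch of the idea (the class name, region keys, and partition count below are illustrative, not from the exercise itself), a custom partitioner extends org.apache.spark.Partitioner and decides which partition each key lands in; reusing the same partitioner in later key-based operations avoids an extra shuffle.

```scala
import org.apache.spark.{Partitioner, SparkConf, SparkContext}

// Hypothetical partitioner: send every record for the same region to the
// same partition so related keys are co-located before a reduce or join.
class RegionPartitioner(override val numPartitions: Int) extends Partitioner {
  override def getPartition(key: Any): Int = key match {
    // Non-negative hash of the region string picks the partition.
    case region: String => math.abs(region.hashCode) % numPartitions
    case _              => 0
  }

  // Spark compares partitioners via equals/hashCode to decide whether
  // already-partitioned data can skip a re-shuffle.
  override def equals(other: Any): Boolean = other match {
    case p: RegionPartitioner => p.numPartitions == numPartitions
    case _                    => false
  }
  override def hashCode: Int = numPartitions
}

object CustomPartitionerDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("custom-partitioner").setMaster("local[*]"))

    val sales = sc.parallelize(Seq(("EU", 10), ("US", 20), ("EU", 5), ("APAC", 7)))

    // partitionBy shuffles once; the following reduceByKey reuses the same
    // partitioning, so Spark does not shuffle the data again.
    val partitioned = sales.partitionBy(new RegionPartitioner(4))
    val totals     = partitioned.reduceByKey(_ + _)

    totals.collect().foreach(println)
    sc.stop()
  }
}
```

The same pattern helps with skew: instead of hashing, getPartition can spread known hot keys across several partitions, or route them to dedicated ones, to even out task sizes.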
Data science