Functional Elegance: Making Spark Applications Cleaner with the Cats Library
Briefly

From my perspective, this code is hard to read because it mixes the dataset transformations with metadata counting, and, worse, it is almost impossible to refactor as long as everything lives inside one method.
Such code is difficult to maintain and reuse: if you want to read the MedCleanData dataset and store the same metadata elsewhere in the project, you have to copy not only the dataset transformations but also the metadata-calculation code.
This function should clearly be decomposed into several smaller functions, each returning a pair of the transformed dataset and the collected metadata.
Every small transformer function (such as readMedCleanDataset or addCustomColumns) can then be reused in other parts of the project, as the sketch below shows.
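The article's full code is not reproduced in this summary, but a minimal sketch of the idea might look like the following. The metadata type (a plain log of strings) and the MedCleanData schema are assumptions made here for illustration; readMedCleanDataset and addCustomColumns are the function names mentioned above, and Cats' Writer monad is one natural way to model "a dataset paired with collected metadata" and to compose such functions.

```scala
import cats.data.Writer
import cats.implicits._
import org.apache.spark.sql.{Dataset, SparkSession}

object MedCleanPipeline {

  // Metadata collected alongside each transformation; a simple log of strings (assumed).
  type Meta       = Vector[String]
  type Tracked[A] = Writer[Meta, A]

  // Hypothetical schema of the MedCleanData dataset.
  final case class MedCleanData(id: Long, value: String)

  // Each step returns the transformed Dataset together with its piece of metadata,
  // so it can be reused on its own elsewhere in the project.
  def readMedCleanDataset(path: String)(implicit spark: SparkSession): Tracked[Dataset[MedCleanData]] = {
    import spark.implicits._
    val ds = spark.read.parquet(path).as[MedCleanData]
    Writer(Vector(s"read ${ds.count()} rows from $path"), ds)
  }

  def addCustomColumns(ds: Dataset[MedCleanData])(implicit spark: SparkSession): Tracked[Dataset[MedCleanData]] = {
    import spark.implicits._
    val enriched = ds.map(r => r.copy(value = r.value.trim)) // stand-in for the real column logic
    Writer(Vector(s"added custom columns, ${enriched.count()} rows"), enriched)
  }

  // Composing the steps: the datasets flow through the for-comprehension,
  // while the metadata logs are concatenated automatically by the Writer monad.
  def pipeline(path: String)(implicit spark: SparkSession): Tracked[Dataset[MedCleanData]] =
    for {
      raw      <- readMedCleanDataset(path)
      enriched <- addCustomColumns(raw)
    } yield enriched
}
```

Running the composed pipeline then yields both the final dataset and all accumulated metadata in one place, e.g. `val (metadata, result) = MedCleanPipeline.pipeline(somePath).run`, so the metadata can be stored wherever the project needs it without duplicating the transformation code.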
Read at Medium