Incremental Processing using Netflix Maestro and Apache Iceberg
Briefly

Incremental processing is an approach to processing new or changed data in workflows. Its key advantage is that it processes only the data that has been newly added to or updated in a dataset, instead of re-processing the complete dataset. This significantly reduces both the cost of compute resources and the execution time.
The hourly processing semantics, along with the valid-through-timestamp watermark or data signals that the Data Platform toolset provides today, satisfy many use cases but are not the best fit for low-latency batch processing.
Late-arriving data poses a challenge in workflows because it must be processed correctly to keep results accurate. The IPS solution aims to address this challenge cleanly and efficiently, with minimal migration and maintenance costs.
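To make the idea concrete, here is a minimal sketch in plain Python of incremental processing that also picks up late-arriving data. It does not use the Maestro or Iceberg APIs. The `Record` type, the `commit_id` field (standing in for the order in which rows landed in a table), and the `incremental_update` helper are all hypothetical names chosen for illustration. The assumption is that progress is tracked by when a row was committed rather than by when the event happened, so a late row for an old hour is still seen on the next run.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    commit_id: int   # hypothetical: order in which the row landed in the table
    event_hour: int  # hour in which the event actually happened
    value: int


def incremental_update(totals, records, last_commit_id):
    """Fold only records committed after last_commit_id into per-hour totals.

    Returns the new checkpoint (the highest commit_id processed so far).
    """
    checkpoint = last_commit_id
    for rec in records:
        if rec.commit_id <= last_commit_id:
            continue  # already handled by an earlier run
        totals[rec.event_hour] += rec.value
        checkpoint = max(checkpoint, rec.commit_id)
    return checkpoint


totals = defaultdict(int)
table = [Record(1, 10, 5), Record(2, 10, 7), Record(3, 11, 2)]
checkpoint = incremental_update(totals, table, last_commit_id=0)

# A late-arriving row for hour 10 lands after hour 11 was already processed.
table.append(Record(4, 10, 1))
checkpoint = incremental_update(totals, table, checkpoint)

print(dict(totals), checkpoint)  # {10: 13, 11: 2} 4
```

The second run touches only the one new row, yet the total for hour 10 is still corrected. A full re-processing approach would have re-read all four rows to reach the same result.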
Read at Medium