Incremental processing is an approach to processing new or changed data in workflows. Its key advantage is that it processes only the data newly added to or updated in a dataset, instead of re-processing the complete dataset. This significantly reduces both the cost of compute resources and the execution time.
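To make the contrast concrete, here is a minimal sketch of the idea, not the implementation described in this post. It assumes each record carries a hypothetical monotonically increasing `version` field that marks when it was last added or changed, and it uses a made-up transformation (doubling a value). The names `Record`, `full_reprocess`, and `incremental_process` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    value: int
    version: int  # hypothetical change counter; higher means changed more recently


def transform(record):
    """Stand-in for the real per-record business logic (assumption)."""
    return record.value * 2


def full_reprocess(dataset):
    """Recompute the output for every record in the dataset."""
    return {r.key: transform(r) for r in dataset}


def incremental_process(dataset, previous_output, last_version):
    """Recompute only records added or updated after last_version.

    Returns the merged output, the new checkpoint, and the number of
    records that were actually processed in this run.
    """
    changed = [r for r in dataset if r.version > last_version]
    output = dict(previous_output)
    for r in changed:
        output[r.key] = transform(r)
    checkpoint = max([last_version] + [r.version for r in changed])
    return output, checkpoint, len(changed)


# First run: everything is new, so both records are processed.
dataset = [Record("a", 1, 1), Record("b", 2, 2)]
output, checkpoint, touched = incremental_process(dataset, {}, 0)

# Second run: "b" was updated and "c" was added; "a" is left alone.
dataset = [Record("a", 1, 1), Record("b", 5, 3), Record("c", 7, 4)]
output, checkpoint, touched = incremental_process(dataset, output, checkpoint)

# The incremental result matches a full recomputation.
assert output == full_reprocess(dataset)
```

In the second run only two of the three records are touched, yet the output is identical to a full re-process. The saving grows as the unchanged part of the dataset grows.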
The hourly processing semantics, together with the valid-through-timestamp watermark or data signals that the Data Platform toolset provides today, satisfy many use cases but are not the best fit for low-latency batch processing.
Late-arriving data poses a challenge in workflows because it must still be processed correctly to keep the data accurate. IPS aims to address this challenge cleanly and efficiently, with minimal migration and maintenance costs.