
"Agoda recently described how it consolidated multiple independent data pipelines into a centralized Apache Spark-based platform to eliminate inconsistencies in financial data. The company implemented a multi-layered quality framework that combines automated validations, machine-learning-based anomaly detection, and data contracts with upstream teams to ensure the accuracy of financial metrics used in statements and strategic planning, while processing millions of daily booking transactions."
"The problem emerged from a typical enterprise pattern: Agoda's Data Engineering, Business Intelligence, and Data Analysis teams had each developed separate financial data pipelines with independent logic and definitions. While this offered simplicity and clear ownership, it created duplicate processing and inconsistent metrics across the organization. As Warot Jongboondee from Agoda's engineering team explains, these discrepancies 'could potentially impact Agoda's financial statements.'"
Agoda consolidated multiple independent financial data pipelines into a centralized Apache Spark-based platform to eliminate inconsistencies and provide a single source of truth for sales, cost, revenue, and margin calculations. The resulting Financial Unified Data Pipeline (FINUDP) processes millions of daily booking transactions and delivers hourly updates to downstream teams. The consolidation required stakeholder alignment across product, finance, and engineering. It also required cutting the pipeline's runtime from about five hours to roughly thirty minutes through query tuning and infrastructure changes. A multi-layered quality framework combines automated validations, machine-learning anomaly detection, and data contracts with upstream teams. When a business-critical rule fails, the pipeline halts processing so that inaccurate figures do not reach financial reporting.
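The idea of halting on a failed business-critical rule can be illustrated with a minimal sketch in plain Python. This is not Agoda's implementation, which runs on Apache Spark and is not shown in the source. The names Rule, CriticalRuleFailure, validate, and RULES, the field names, and the two example rules are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Rule:
    """A hypothetical validation rule applied to one booking record."""
    name: str
    check: Callable[[Dict], bool]  # returns True when the record passes
    critical: bool                 # a critical failure halts the run


class CriticalRuleFailure(Exception):
    """Raised to stop the run before bad figures are published."""


def validate(rows: Iterable[Dict], rules: List[Rule]) -> List[str]:
    """Apply every rule to every row.

    Non-critical failures are returned as warnings. If any critical
    rule fails, raise CriticalRuleFailure instead of returning.
    """
    warnings: List[str] = []
    critical: List[str] = []
    for i, row in enumerate(rows):
        for rule in rules:
            if rule.check(row):
                continue
            message = f"row {i}: {rule.name}"
            (critical if rule.critical else warnings).append(message)
    if critical:
        raise CriticalRuleFailure("; ".join(critical))
    return warnings


# Illustrative rules only; the real rule set is not described in the source.
RULES = [
    Rule(
        "margin equals revenue minus cost",
        lambda r: abs(r["margin"] - (r["revenue"] - r["cost"])) < 0.01,
        critical=True,
    ),
    Rule("currency is present", lambda r: bool(r.get("currency")), critical=False),
]
```

A scheduler would call validate on each hourly batch and publish only if no exception is raised. Separating critical rules from warnings lets minor issues be logged without blocking the downstream update.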
Read at InfoQ