Grab Adds Real-Time Data Quality Monitoring to Its Platform
Briefly

"This engine takes topic data schemas, metadata, and test rules as inputs to create a set of FlinkSQL-based test definitions. A Flink job then executes these tests, consuming messages from production Kafka topics and forwarding any errors to Grab's observability platform. FlinkSQL was selected because its ability to represent stream data as dynamic tables allowed the team to automatically generate data filters for rules that could be efficiently implemented."
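The transformation step described in the quote can be sketched as a small rule-to-SQL generator. This is a hypothetical illustration, not Grab's engine: the rule format, field names, topic name, and the shape of the generated FlinkSQL are all assumptions.

```python
# Hypothetical sketch of a rule-to-FlinkSQL transformation step: field-level
# test rules become a query that selects only rule-violating rows.
# Rule schema, topic name, and SQL shape are invented for the example.

def build_test_sql(topic: str, rules: list[dict]) -> str:
    """Turn field-level test rules into one FlinkSQL query whose result
    set is the messages that fail at least one rule."""
    predicates = []
    for rule in rules:
        field = rule["field"]
        if rule["type"] == "not_null":
            predicates.append(f"{field} IS NULL")
        elif rule["type"] == "range":
            predicates.append(f"({field} < {rule['min']} OR {field} > {rule['max']})")
    # Any row matching a predicate is a test failure to report.
    where = " OR ".join(predicates)
    return f"SELECT * FROM `{topic}` WHERE {where}"

sql = build_test_sql(
    "ride_events",  # assumed topic name
    [
        {"field": "fare", "type": "not_null"},
        {"field": "fare", "type": "range", "min": 0, "max": 1000},
    ],
)
print(sql)
# SELECT * FROM `ride_events` WHERE fare IS NULL OR (fare < 0 OR fare > 1000)
```

In the real system a Flink job would execute queries like this continuously against the Kafka-backed dynamic table, rather than a batch engine running them once.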
"The errors experienced by Grab came in two main types: syntactic and semantic. Syntactic issues are caused by errors in message structure. For example, a producer might send a string value for a field defined in the schema as an int, causing consumer applications to crash with deserialization errors. Semantic errors arise when the data values in the message are badly structured or are outside of acceptable limits."
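The two error classes can be made concrete with a minimal validator. This is a hand-rolled sketch, not Grab's tooling; the schema format, field names, and the semantic rule are invented for the example.

```python
# Illustration of syntactic vs. semantic errors on a single message.
# SCHEMA, field names, and the non-negativity rule are assumptions.

SCHEMA = {"trip_id": str, "distance_km": float}  # expected field types

def classify_errors(message: dict) -> list[str]:
    errors = []
    # Syntactic: the value's type does not match the declared schema type,
    # the kind of mismatch that crashes consumers at deserialization.
    for field, expected in SCHEMA.items():
        if field in message and not isinstance(message[field], expected):
            errors.append(f"syntactic: {field} is not {expected.__name__}")
    # Semantic: the value is well-typed but outside acceptable limits.
    if isinstance(message.get("distance_km"), float) and message["distance_km"] < 0:
        errors.append("semantic: distance_km must be non-negative")
    return errors

print(classify_errors({"trip_id": 42, "distance_km": -1.5}))
# ['syntactic: trip_id is not str', 'semantic: distance_km must be non-negative']
```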
"Grab, a Singapore-based digital service delivery platform, added data quality monitoring to their Coban internal platform to improve the quality of data delivered by Apache Kafka to downstream consumers. The changes are described in the company's engineering blog. "In the past, monitoring Kafka stream data processing lacked an effective solution for data quality validation," the team stated. "This limitation made it challenging to identify bad data, notify users in a timely manner, and prevent the cascading impact on downstream users from further escalating.""
Grab added data quality monitoring to its Coban platform to improve the quality of Apache Kafka data delivered to downstream consumers. Previously, monitoring lacked effective validation, making it difficult to identify bad data, notify users promptly, and prevent cascading downstream impact. Errors fell into two classes: syntactic issues, such as mismatched message types causing deserialization failures, and semantic issues, where values violate format or range rules. Grab's engineers implemented an architecture for data contract definition, automated testing, and alerting, centered on a test configuration and transformation engine. The engine generates FlinkSQL-based test definitions from schemas, metadata, and rules; a Flink job executes the tests against production topics and forwards errors to the observability platform. FlinkSQL's dynamic table representation enabled efficient, automatic generation of rule filters, and the platform simplified managing large numbers of field-specific rules.
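The end-to-end flow summarized above can be sketched with the Kafka consumer and observability sink stubbed out. Everything here is a hypothetical simplification: the test registry, message shapes, and alert format are assumptions, not Grab's API.

```python
# Hedged sketch of the pipeline: consume messages, run generated tests,
# and forward failures to an observability sink instead of letting bad
# data crash downstream consumers. All names are hypothetical.

def run_quality_checks(messages, tests, forward_error):
    """Apply each named test to each message; report every failure."""
    for msg in messages:
        for name, test in tests.items():
            if not test(msg):
                forward_error({"test": name, "message": msg})

alerts = []  # stand-in for the observability platform
run_quality_checks(
    messages=[{"fare": -5}, {"fare": 12}],  # stand-in for a Kafka topic
    tests={"fare_non_negative": lambda m: m.get("fare", 0) >= 0},
    forward_error=alerts.append,
)
print(alerts)  # one alert, for the message with the negative fare
```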
Read at InfoQ