#data-pipeline

from medium.com
6 days ago
Scala

How I Made My Apache Spark Jobs Schema-Agnostic (Part-2)

Dynamic transformations enable flexible schema adaptations without code changes.
Using schema metadata simplifies column management, renaming, and casting.
from medium.com
3 weeks ago
Scala

Spark Scala Exercise 24: Error Handling and Logging in Spark - Build Safe, Auditable ETL Pipelines

Build a defensive Spark ETL pipeline to ensure robust data processing.
Handle data issues like schema mismatches and corrupt records effectively.
Implement custom logging and audit trails for better failure management.
from medium.com
1 month ago
Data science

Spark Scala Exercise 8: Working with Date-Time in Spark - Extract, Transform, and Analyze

Date and time operations are vital for analysis in various sectors, enabling insights into trends and customer behavior.
from Edcrewe
1 month ago
DevOps

Talk about Cloud Prices at PyConLT 2025

Cloud pricing involves almost 5 million SKUs across major providers, necessitating a robust data pipeline for accurate estimates.
from Hackernoon
5 years ago
JavaScript

Behind Every Question-Answer AI Is a Data Pipeline Built for Scale - Here's How to Build Your Own

A data pipeline using Google Cloud services and LangChain efficiently indexes document embeddings into Redis, supporting RAG-based question-answering systems.
from New Relic
7 months ago
DevOps

Data pipeline observability

Data observability is critical for accurate invoicing in consumption-based pricing models.
from InfoQ
9 months ago
Business intelligence

Canva Opts for Amazon KDS over SNS+SQS to Save 85% with 25 Billion Events per Day

Canva chose Amazon KDS over other solutions for its Product Analytics Platform due to lower costs and high performance requirements.