AWS Glue 5.0 Introduces Spark 3.5.2 and Enhanced ETL Performance
Briefly

Amazon's AWS Glue 5.0, introduced at re:Invent, enhances serverless data integration with updated runtimes such as Spark 3.5.2 and Python 3.11. The release focuses on improving ETL job speeds and security while simplifying data handling across diverse sources. Key features include support for advanced open table formats like Apache Iceberg and Delta Lake, automatic partition pruning, and native Amazon S3 access. Performance tests showed a 58% speed increase and a 36% reduction in costs for data integration workloads compared to the previous version, AWS Glue 4.0.
AWS Glue 5.0 improves the price-performance of AWS Glue jobs: in TPC-DS tests it ran 58% faster than AWS Glue 4.0 while reducing costs by 36%.
AWS Glue 5.0, the latest release of the service, introduces upgraded runtimes and performance enhancements designed to simplify preparing and integrating data.
Read at InfoQ