From InfoQ, 13 hours ago
Yelp Publishes Blueprint for Managing S3 Server-Access Logs at Massive Scale
In essence, Yelp still writes terabytes of S3 server access logs every day, but now converts them into compact, Parquet-formatted archives that are easy to query with tools like Amazon Athena. Through periodic "compaction," raw plaintext log objects are merged into fewer, larger Parquet files, reducing storage usage by about 85% and cutting the number of objects by more than 99.99%. The result is efficient, cost-effective analytics, with quick lookups for permission debugging, cost attribution, incident investigation, and data retention analysis.
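To make the compaction idea concrete, here is a minimal sketch of how many small log objects could be planned into a few large batches, each of which would become one larger file. It is not Yelp's implementation: the `LogObject` type, the `plan_batches` and `object_reduction_pct` helpers, the key names, and all sizes are hypothetical, and the step that actually writes Parquet (which would need a library such as pyarrow) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogObject:
    """A raw log object, described only by its key and size (hypothetical type)."""
    key: str
    size_bytes: int


def plan_batches(objects, target_bytes):
    """Greedily pack objects, in order, into batches of at most target_bytes.

    An object larger than target_bytes still gets a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for obj in objects:
        # Close the current batch if adding this object would exceed the target.
        if current and current_size + obj.size_bytes > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(obj)
        current_size += obj.size_bytes
    if current:
        batches.append(current)
    return batches


def object_reduction_pct(raw_count, compacted_count):
    """Percentage drop in object count after compaction."""
    return 100.0 * (1 - compacted_count / raw_count)


# Illustrative numbers only: 100,000 raw objects of 1 KB each, 50 MB target.
raw = [LogObject(f"logs/2025-01-01/{i:06d}", 1_000) for i in range(100_000)]
batches = plan_batches(raw, target_bytes=50_000_000)
reduction = object_reduction_pct(len(raw), len(batches))
print(f"{len(raw)} objects -> {len(batches)} batches ({reduction:.3f}% fewer)")
```

The sketch only covers the object-count side of the savings. The storage reduction reported in the article comes from the Parquet conversion itself, which this planning step does not model.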
Software development








