HaMind

Category: AWS

Caching in AWS Glue Spark: Boosting Performance with Efficient Data Reuse

April 29, 2025

As a developer working with AWS Glue and Apache Spark, one of the most powerful tools in your performance optimization toolkit is caching. Caching can significantly reduce computation time and resource usage, especially in complex ETL (Extract, Transform, Load) pipelines where the same data is reused across multiple operations. In this blog, I’ll explain how…

Speeding Up Date-Based Queries in Amazon Athena: Simple Partition Tips

April 2, 2025

Amazon Athena is a handy tool for digging into data stored in Amazon S3. When your data is split into partitions (like folders), writing smart queries can save time and money. This post explains how to filter date-based partitions effectively, showing what works best for speed and efficiency. What’s Partition Pruning? Partition pruning is like…