Articles tagged with "level-300"

Handling Errors and Retries in StepFunctions

“Everything fails all the time” has been preached to us by Werner Vogels for a few years now. Every engineer who builds and maintains systems knows this to be true. Distributed systems come with their own kind of challenges, and one of the AWS services that helps deal with those is AWS Step Functions. Step Functions allows you to describe workflows as JSON and executes those workflows for you. In this blog post, we’ll use an example application to explore what happens when things inevitably go wrong and which options the service offers for error handling and retries.
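To give a flavour of what "workflows as JSON" with retries looks like, here is a minimal sketch that builds a state machine definition in Python and prints it as JSON. The `Retry` and `Catch` fields are part of the Amazon States Language; the Lambda ARN, the state names, and the `build_state_machine` helper are made up for illustration and are not taken from the article's example application.

```python
import json

# Hypothetical Lambda ARN, used only for illustration.
TASK_ARN = "arn:aws:lambda:eu-central-1:123456789012:function:flaky-task"


def build_state_machine(task_arn: str) -> dict:
    """Return a state machine definition with one retried task (illustrative)."""
    return {
        "StartAt": "CallFlakyTask",
        "States": {
            "CallFlakyTask": {
                "Type": "Task",
                "Resource": task_arn,
                # Retry up to 3 times, waiting 2s before the first retry
                # and doubling the wait for each further retry.
                "Retry": [
                    {
                        "ErrorEquals": ["States.Timeout", "States.TaskFailed"],
                        "IntervalSeconds": 2,
                        "MaxAttempts": 3,
                        "BackoffRate": 2.0,
                    }
                ],
                # Once retries are exhausted (or for any other error),
                # route the execution to a dedicated failure state.
                "Catch": [
                    {
                        "ErrorEquals": ["States.ALL"],
                        "ResultPath": "$.error",
                        "Next": "HandleFailure",
                    }
                ],
                "End": True,
            },
            "HandleFailure": {
                "Type": "Fail",
                "Error": "TaskGaveUp",
                "Cause": "Retries exhausted",
            },
        },
    }


definition = build_state_machine(TASK_ARN)
print(json.dumps(definition, indent=2))
```

With these assumed values, the task would be retried after roughly 2, 4, and 8 seconds before the `Catch` rule takes over.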

Push-Down-Predicates in Parquet and how to use them to reduce IOPS while reading from S3

Working with datasets in pandas will almost inevitably bring you to the point where your dataset doesn’t fit into memory. Parquet is especially notorious for this: it compresses so well that the data tends to explode in size when read into a dataframe. Today we’ll explore ways to limit and filter the data you read using push-down-predicates. Additionally, we’ll see how you can do that efficiently with data stored in S3 and why using pure pyarrow can be several orders of magnitude more I/O-efficient than the plain pandas version.

The beating heart of SQS - of Heartbeats and Watchdogs

Using SQS as a queue to buffer tasks is probably the most common use case for the service. Things can get tricky if these tasks have a wide range of processing durations. Today, I will show you how to implement an SQS consumer that utilizes heartbeats to dynamically extend the visibility timeout to accommodate different processing durations.
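As a rough idea of the heartbeat pattern, the sketch below runs a background thread that keeps extending a message's visibility timeout while the task is being processed. It assumes a boto3-style SQS client exposing `change_message_visibility`; the `VisibilityHeartbeat` class, its parameters, and the default intervals are hypothetical and not taken from the article.

```python
import threading


class VisibilityHeartbeat:
    """Extend a message's visibility timeout at a fixed interval (illustrative)."""

    def __init__(self, sqs_client, queue_url, receipt_handle,
                 interval_seconds=10.0, extension_seconds=30):
        self._sqs = sqs_client
        self._queue_url = queue_url
        self._receipt_handle = receipt_handle
        self._interval = interval_seconds
        self._extension = extension_seconds
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False when the interval elapses (send a heartbeat)
        # and True once stop has been requested (leave the loop).
        while not self._stop.wait(self._interval):
            self._sqs.change_message_visibility(
                QueueUrl=self._queue_url,
                ReceiptHandle=self._receipt_handle,
                VisibilityTimeout=self._extension,
            )

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Stop the heartbeat whether processing succeeded or raised.
        self._stop.set()
        self._thread.join()
        return False


# Usage sketch (sqs would be boto3.client("sqs"); process is your own function):
# with VisibilityHeartbeat(sqs, queue_url, message["ReceiptHandle"]):
#     process(message)
# sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
```

The interval should stay comfortably below the extension so that a delayed heartbeat does not let the message become visible to other consumers mid-processing.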