<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Chris Murton]]></title><description><![CDATA[Chris Murton]]></description><link>https://blog.chrismurton.uk</link><generator>RSS for Node</generator><lastBuildDate>Sat, 11 Apr 2026 21:35:23 GMT</lastBuildDate><atom:link href="https://blog.chrismurton.uk/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Amazon Aurora MySQL Parallel Query: Beware]]></title><description><![CDATA[I'll start by saying that I'm a big fan of Amazon Aurora. Plain old RDS shares a lot of the benefits by removing the 'undifferentiated heavy lifting' of running your own database cluster on EC2 but Aurora goes further by allowing you access to extra ...]]></description><link>https://blog.chrismurton.uk/amazon-aurora-mysql-parallel-query-beware</link><guid isPermaLink="true">https://blog.chrismurton.uk/amazon-aurora-mysql-parallel-query-beware</guid><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Chris Murton]]></dc:creator><pubDate>Sun, 30 Apr 2023 23:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/WNS__aBJjl4/upload/604ce923d999cd7e07c644cb84ef908a.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I'll start by saying that I'm a big fan of Amazon Aurora. Plain old RDS shares a lot of the benefits by removing the 'undifferentiated heavy lifting' of running your own database cluster on EC2 but Aurora goes further by allowing you access to extra smarts such as:</p>
<ul>
<li><p>Removing the need to provision and manage storage due to the shared layer native to Aurora</p>
</li>
<li><p>Running multi-master in a single region</p>
</li>
<li><p>Running a master in one region and having continual, millisecond latency replication to other regions with the ability to initiate swift failover via Global Database</p>
</li>
<li><p>Supporting the concept of blue/green deployments of your database cluster</p>
</li>
<li><p>Vastly and opportunistically improving the performance of certain queries by offloading them from the database engine itself to the shared storage layer beneath</p>
</li>
</ul>
<p>It is the last feature in that list that I want to talk about today.</p>
<p>The general recommendation from AWS before embarking on any compute migration (or greenfield deployment) is to experiment to find the sweet spot for performance vs. cost, i.e. right-sizing your infrastructure. I'm here to tell you the same is absolutely true for the Parallel Query functionality in Amazon Aurora for MySQL.</p>
<h2 id="heading-what-is-parallel-query">What is Parallel Query?</h2>
<p>AWS pitch it as follows:</p>
<blockquote>
<p>While some databases can parallelize query processing across CPUs in one or a handful of servers, Parallel Query takes advantage of Aurora’s unique architecture to push down and parallelize query processing across thousands of CPUs in the Aurora storage layer. By offloading analytical query processing to the Aurora storage layer, Parallel Query reduces network, CPU, and buffer pool contention with the transactional workload.</p>
</blockquote>
<p>Essentially, rather than chewing up CPU cores on the Aurora writer or reader nodes, your query will (if chosen by the optimiser) be pushed down to the storage layer, freeing up those resources to handle more concurrent queries.</p>
<h2 id="heading-why-should-i-beware">Why should I beware?</h2>
<p>Parallel Query is disabled by default, but as it appears on the surface to be entirely beneficial (who wouldn't want increased performance and reduced CPU contention?), you may be tempted to toggle it on. It works in conjunction with the query optimiser present in native MySQL and its variants, which decides the most efficient path to retrieve your query results; the optimiser itself is relatively opaque to the client. In my experience, if your schema and dataset don't fit the profile, it can <strong>slow down</strong> your query response times and cause a significant increase in your monthly AWS spend.</p>
<p>In my testing, this was especially relevant on large tables with extensive column indexes. Queries that depended almost exclusively on indexes were targeted for push-down to the storage layer, when they would have completed far sooner via the non-Parallel Query route. In some cases, a query took over 900x longer with Parallel Query than with the functionality disabled and the traditional query optimiser behaviour in play.</p>
<p>What is also not immediately obvious is that this has a direct impact on your AWS bill. Aurora charges both for compute time and for I/O against the storage layer, and whilst queries satisfied from indexes already loaded in the buffer pool generate no read traffic to the storage subsystem, <strong>every query</strong> pushed down to storage results in read IOPS.</p>
<h3 id="heading-how-to-experiment">How to experiment</h3>
<p>AWS document how to go about kicking the tyres on Parallel Query at <a target="_blank" href="https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-mysql-parallel-query.html#aurora-mysql-parallel-query-sql-explain">https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-mysql-parallel-query.html#aurora-mysql-parallel-query-sql-explain</a>. Different variables need adjusting depending on the major version of Aurora MySQL you are running, but enabling this at a session level allows you to experiment without downtime.</p>
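<p>As a concrete sketch (the table and query here are hypothetical, and the variable name depends on your engine version: Aurora MySQL 2.x and 3.x use <code>aurora_parallel_query</code>, whilst older 1.x clusters used <code>aurora_pq</code>), these are the kind of statements you would paste into a client session against an instance you can safely test on:</p>

```shell
# Build the session-level experiment as a string and print it; paste the
# statements into a mysql client session. The table and query are examples.
SQL_EXPERIMENT=$(cat <<'SQL'
SET SESSION aurora_parallel_query = ON;
EXPLAIN SELECT SUM(amount) FROM orders WHERE created_at > '2023-01-01';
-- Look for "Using parallel query" in the Extra column, then time the
-- query with the variable ON vs OFF before changing any cluster default.
SET SESSION aurora_parallel_query = OFF;
SQL
)
printf '%s\n' "$SQL_EXPERIMENT"
```

<p>Because the toggle is scoped to the session, nothing changes for other connections until you alter the cluster parameter group.</p>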
<h3 id="heading-how-much-could-it-cost">How much could it cost?</h3>
<p>This will vary drastically based on your workload, request volume, row count, and the complexity of your queries and schema.</p>
<p>The cost factor only gets a cursory mention in the AWS documentation:</p>
<blockquote>
<p>If your Aurora MySQL cluster uses parallel query, you might see an increase in <code>VolumeReadIOPS</code> values. Parallel queries don't use the buffer pool. Thus, although the queries are fast, this optimized processing can result in an increase in read operations and associated charges.</p>
</blockquote>
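<p>To put rough numbers on that, here is a back-of-envelope sketch. Every figure below is a hypothetical assumption (plug in your own query volume and pages scanned), and the I/O price used is the standard Aurora rate of $0.20 per million requests for eu-west-1 at the time of writing:</p>

```shell
# All inputs are assumptions for illustration; substitute your own.
QUERIES_PER_DAY=500000        # queries the optimiser pushes down to storage
PAGES_PER_QUERY=2000          # pages each pushed-down query scans
PRICE_PER_MILLION_IO=0.20     # USD, standard Aurora I/O pricing (eu-west-1)

# Total read requests over a 30-day month, then cost at the per-million rate.
MONTHLY_IO=$((QUERIES_PER_DAY * PAGES_PER_QUERY * 30))
MONTHLY_COST=$(awk -v io="$MONTHLY_IO" -v p="$PRICE_PER_MILLION_IO" \
    'BEGIN { printf "%.2f", io / 1000000 * p }')
echo "Extra read I/O per month: ${MONTHLY_IO} (~\$${MONTHLY_COST})"
```

<p>At that entirely hypothetical volume, the push-downs alone add roughly $6,000 a month in read I/O charges.</p>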
<p>In my experience, turning off Parallel Query support (after demonstrating it did not provide any performance improvement for a given dataset) reduced a production AWS bill by $4k a month.</p>
<h3 id="heading-conclusion">Conclusion</h3>
<p>There are undoubtedly use cases where Amazon Aurora Parallel Query makes sense. Examples show that in best-case scenarios it can dramatically improve query responsiveness and offload work from your Aurora cluster nodes, but it does not fit every use case, every query and every workload.</p>
<p>The worst-case scenario is what I experienced: slower query response times and a much larger AWS bill for the privilege of using it.</p>
]]></content:encoded></item><item><title><![CDATA[CloudWatch isn't just for the cloud]]></title><description><![CDATA[Amazon CloudWatch is an inexpensive way to monitor your solution in AWS and due to the extensive set of metrics provided for every AWS service out of the box, it’s easy and quick to adopt. Granted it’s not going to replace your enterprise monitoring ...]]></description><link>https://blog.chrismurton.uk/cloudwatch-isnt-just-for-the-cloud</link><guid isPermaLink="true">https://blog.chrismurton.uk/cloudwatch-isnt-just-for-the-cloud</guid><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Chris Murton]]></dc:creator><pubDate>Thu, 06 Apr 2017 23:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1686616655846/36caa3c4-1c97-4864-a5a4-bedf2d07a694.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Amazon CloudWatch is an inexpensive way to monitor your solution in AWS and due to the extensive set of metrics provided for every AWS service out of the box, it’s easy and quick to adopt. Granted it’s not going to replace your enterprise monitoring solution overnight, but for those who haven’t got the time or justification to spin up their own Nagios/Zabbix/OpsView solution, it can do basic event monitoring and alerting.</p>
<h3 id="heading-but-theres-more">But there's more...</h3>
<p>As well as Amazon CloudWatch supporting a bunch of metrics for most AWS services, it has powerful functionality that frankly isn’t talked up enough: <strong>custom metrics</strong>. Combine that fact with the very easy-to-use AWS CLI/API that you can run from anywhere and you can quickly start storing metrics for anything and everything in Amazon CloudWatch - even if it’s not in the cloud.</p>
<h3 id="heading-its-a-bit-chilly-in-the-garage">It's a bit chilly in the garage!</h3>
<p>For ages, I’ve had a USB CurrentCost meter hooked up in the garage so that I can track power consumption over time. As well as power consumption in watts the meter is capable of reporting the current temperature where it’s sat, so I previously set up MRTG and latterly Cacti to collect and store these values over time so I could graph it. Yep - if I can graph it and store it for analysis I will. Blame my past as a monitoring specialist for that. Of course, this is all very nice but MRTG doesn’t natively support any form of alerting and at the time Cacti’s monitoring support felt like a bit of a hack.</p>
<p>As part of my drive to minimise the amount of hardware I have at home and simplify some of the random scripts I have - I started looking at alternative ways to collect, store and hopefully alert on these values when they went out of bounds. Step forward Amazon CloudWatch.</p>
<hr />
<h3 id="heading-create-an-iam-user">Create an IAM User</h3>
<p>The first thing you’ll need to do is create an IAM user in your AWS account with sufficient privileges to talk to CloudWatch via the AWS API. If you do this via the wizard you can get it to give you an access key ID &amp; secret access key which you’ll need for later.</p>
<p>Go ahead and create the IAM user via the AWS Management Console and then give it an inline policy (or create a managed policy) like the one below:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"Version"</span>: <span class="hljs-string">"2012-10-17"</span>,
  <span class="hljs-attr">"Statement"</span>: [
    {
      <span class="hljs-attr">"Sid"</span>: <span class="hljs-string">"AllowCloudWatchPutMetricData"</span>,
      <span class="hljs-attr">"Action"</span>: <span class="hljs-string">"cloudwatch:PutMetricData"</span>,
      <span class="hljs-attr">"Effect"</span>: <span class="hljs-string">"Allow"</span>,
      <span class="hljs-attr">"Resource"</span>: <span class="hljs-string">"*"</span>
    }
  ]
}
</code></pre>
<p>This gives your IAM user the ability to publish metrics into CloudWatch and nothing else.</p>
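<p>If you prefer the CLI to the console, the same user can be created from an existing admin session along these lines (the user and policy names are arbitrary choices of mine, and <code>policy.json</code> is the document above saved locally):</p>

```bash
$ aws iam create-user --user-name cloudwatch-publisher
$ aws iam put-user-policy --user-name cloudwatch-publisher \
    --policy-name AllowCloudWatchPutMetricData \
    --policy-document file://policy.json
$ aws iam create-access-key --user-name cloudwatch-publisher
```

<p>The final command returns the access key ID and secret access key you’ll feed into <code>aws configure</code> in a moment.</p>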
<h3 id="heading-set-up-the-aws-cli">Set up the AWS CLI</h3>
<p>On the machine that had my USB CurrentCost meter attached, I needed to download and install the AWS CLI. Depending on your OS and distribution there are a variety of ways to achieve this; Ubuntu provides a package (<code>apt-get install awscli</code>), but for most other distributions you’re best off running this if you’ve got Python’s package manager installed, preferably inside a virtual environment:</p>
<pre><code class="lang-bash">pip install --upgrade --user awscli
</code></pre>
<p>Hopefully that’s got the AWS CLI all installed for you. Next pick the user that will be running the CLI to push your custom metrics to CloudWatch and sudo/su to that user. Once you’re in a shell as that user, run the following and use the credentials you obtained when you created your IAM user:</p>
<pre><code class="lang-bash">$ aws configure
AWS Access Key ID [None]: ZZZZZZZZZZZZZZZZZZZZ
AWS Secret Access Key [None]: AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTt
Default region name [None]: (any valid AWS region.. I chose eu-west-1)
Default output format [None]:
</code></pre>
<p>This configures the AWS CLI to use the credentials of your IAM user by default whenever it is run under that user account, by creating <code>~/.aws/config</code> and <code>~/.aws/credentials</code>.</p>
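<p>For reference, the two generated files are plain INI and look like this (using the placeholder values from above):</p>

```ini
# ~/.aws/credentials
[default]
aws_access_key_id = ZZZZZZZZZZZZZZZZZZZZ
aws_secret_access_key = AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTt

# ~/.aws/config
[default]
region = eu-west-1
```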
<h3 id="heading-test">Test</h3>
<p>You should now be ready to start publishing custom metrics to Amazon CloudWatch. You can incorporate a call out to the AWS CLI in any of your custom scripts like so:</p>
<pre><code class="lang-bash">$ MONITOR_VALUE=15.1
$ aws cloudwatch put-metric-data --namespace Home/Environmentals \
    --metric-name GarageTemperatureCelsius --value <span class="hljs-variable">${MONITOR_VALUE}</span>
</code></pre>
<p>Wrap that command into a cronjob or custom script that extracts the metric you’re interested in, substitute it for <code>MONITOR_VALUE</code>, change the metric name to something that represents your data, and change the namespace to protect the innocent, remembering it can be <strong>anything</strong> that doesn’t begin with “AWS/”.</p>
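<p>As a sketch of what such a script might look like for the CurrentCost meter, the snippet below parses a temperature out of a sample line of the meter’s XML output. The sample line, and the idea of swapping the final <code>echo</code> for the real AWS CLI call, are assumptions for you to adapt to your own setup:</p>

```shell
#!/bin/sh
# Hypothetical sample of the XML a CurrentCost meter emits on its serial port.
LINE='<msg><src>CC128-v0.11</src><tmpr>15.1</tmpr><ch1><watts>00342</watts></ch1></msg>'

# Pull the temperature out of the <tmpr> element.
TEMP=$(printf '%s\n' "$LINE" | sed -n 's/.*<tmpr>\([0-9.]*\)<\/tmpr>.*/\1/p')

# Dry run: swap this echo for the real call once the parsing looks right.
echo aws cloudwatch put-metric-data --namespace Home/Environmentals \
    --metric-name GarageTemperatureCelsius --value "$TEMP"
```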
<p>You now have your custom metric going into CloudWatch that you can create alarms against, sending notifications to SNS topics when that garage gets a bit nippy.</p>
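<p>As a sketch, creating such an alarm from the CLI might look like the below; the SNS topic ARN, thresholds and alarm name are placeholders to substitute with your own:</p>

```bash
$ aws cloudwatch put-metric-alarm --alarm-name garage-too-cold \
    --namespace Home/Environmentals --metric-name GarageTemperatureCelsius \
    --statistic Average --period 300 --evaluation-periods 2 \
    --threshold 5 --comparison-operator LessThanOrEqualToThreshold \
    --alarm-actions arn:aws:sns:eu-west-1:123456789012:home-alerts
```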
]]></content:encoded></item></channel></rss>