Amazon RDS is AWS's managed relational database service. You leave the setup and maintenance of your database to AWS and focus on building applications on top of it. You can launch and maintain community edition MySQL and PostgreSQL databases, as well as commercial Oracle and SQL Server databases, on Amazon RDS. A few years ago, however, AWS developed its own cloud-native, enterprise-level database engine called Amazon Aurora, which provides MySQL and PostgreSQL compatibility. In this post, I will discuss some of Aurora’s unique features and why you should use it instead of an Amazon RDS DB instance running community edition MySQL or PostgreSQL.
Recently, AWS also launched serverless and multi-master versions of Amazon Aurora, and either of these features alone can be a reason to choose it. However, in this post, we will focus on single-master Aurora deployments and their advantages over regular RDS.
What is Amazon Aurora?
First of all, Amazon Aurora is neither MySQL nor PostgreSQL. It is a different, cloud-native database engine developed by AWS, offered in versions compatible with these two databases. You launch and manage your Aurora database clusters through Amazon RDS, which is why you see Aurora in the Amazon RDS console. Aurora also provides all the advantages of RDS, so we can consider it an enhanced version of Amazon RDS.
Let’s say you need to create a MySQL or PostgreSQL database on AWS. I often see clients who already use Amazon RDS and do not even consider Aurora. However, Aurora gives you additional performance, reliability, and durability features, which matter especially if you are developing a critical, enterprise-level application. Therefore, although it is slightly more expensive per hour, I think it is the better choice in the long term, and its benefits are worth the cost. So let’s discuss what makes Aurora special.
Aurora’s storage is fault-tolerant by design
When you use regular Amazon RDS, the architecture is similar to installing the database on Amazon EC2 yourself, except that you leave the provisioning and maintenance to AWS. Of course, RDS provides many features like automatic failovers, backups, etc. In reality, however, you have an instance with an EBS volume attached to it.
To achieve reliability on this architecture, you need to enable the Multi-AZ feature on your RDS instance so that it is replicated synchronously to a standby instance in another Availability Zone. You then have two copies of your database in two Availability Zones. You can see the diagram of the regular RDS below.
Aurora, however, provides more reliability in terms of storage. Its database storage is separate from the instances. In Aurora, your data is split into 10 GB chunks, and each chunk is stored as 6 copies distributed across 3 Availability Zones. Hence, even if you have only one Aurora instance, your data will still have 6 copies.
Besides, Aurora regularly scans every copy of your data and, if it finds a failure in one of them, repairs it using the remaining healthy copies. Your database storage is reliable and fault-tolerant by design.
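To make these numbers concrete, here is a rough Python sketch of the arithmetic. The function name and the example size are my own, and the even split of copies across Availability Zones is an assumption of the sketch, not something Aurora exposes through an API.

```python
import math

SEGMENT_SIZE_GB = 10      # chunk size mentioned above
COPIES_PER_SEGMENT = 6    # copies kept of every chunk
AVAILABILITY_ZONES = 3    # zones the copies are spread over


def storage_layout(database_size_gb: float) -> dict:
    """Roughly count chunks and copies for a database of the given size.

    Hypothetical helper for illustration only. It assumes the copies are
    split evenly across the Availability Zones.
    """
    segments = max(1, math.ceil(database_size_gb / SEGMENT_SIZE_GB))
    total_copies = segments * COPIES_PER_SEGMENT
    return {
        "segments": segments,
        "total_copies": total_copies,
        "copies_per_az": total_copies // AVAILABILITY_ZONES,
    }


# A 95 GB database needs 10 chunks, so 60 chunk copies, 20 in each zone.
print(storage_layout(95))
```

The point is that these copies exist regardless of how many DB instances you run; the instance count does not appear anywhere in the calculation.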
Aurora’s performance is better and more consistent than Amazon RDS
When you use community edition MySQL or PostgreSQL, performance degrades over time as the load increases. This happens because of the synchronous replication between replicas. Thanks to its unique storage design, however, Aurora’s performance stays consistent when the load increases. AWS even says that Aurora delivers up to 3x the throughput of community edition PostgreSQL and up to 5x that of community edition MySQL.
Aurora writes logs directly to the storage without keeping log buffers. The replication to the replicas is asynchronous and only covers cached data. Because the replicas also share the same storage cluster, the replica lag is small and consistent over time.
In Aurora, the main reason for using a read replica is to achieve high availability and, sometimes, read scalability. As I said before, the storage cluster is separate from your DB instances, so you don’t need a read replica to replicate your data. Your data is durable by design.
More read replicas & reader endpoints on Aurora
On Amazon Aurora, you can scale your read queries by creating up to 15 read replicas, compared with 5 on Amazon RDS with community edition MySQL or PostgreSQL.
As you know, on Amazon RDS, there is a single endpoint which you use for your write queries. It is the DNS endpoint pointing to your current master DB instance. During a failover, RDS routes this endpoint to the new master with a simple DNS change. For read replicas, however, you have to balance the load in your application using the instance endpoints. Regular RDS does not provide a load balancer for read replicas.
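As an illustration of what balancing the load in your application can look like, here is a minimal round-robin sketch in Python. The class name and the endpoint hostnames are hypothetical, and a real application would also need health checks and connection pooling.

```python
import itertools

# Hypothetical instance endpoints of three RDS read replicas.
REPLICA_ENDPOINTS = [
    "replica-1.example.us-east-1.rds.amazonaws.com",
    "replica-2.example.us-east-1.rds.amazonaws.com",
    "replica-3.example.us-east-1.rds.amazonaws.com",
]


class ReplicaBalancer:
    """Hand out replica endpoints in round-robin order."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one replica endpoint is required")
        # cycle() repeats the list forever, one endpoint per call.
        self._cycle = itertools.cycle(endpoints)

    def next_endpoint(self) -> str:
        return next(self._cycle)


balancer = ReplicaBalancer(REPLICA_ENDPOINTS)
for _ in range(4):
    # The fourth call wraps around to the first replica again.
    print(balancer.next_endpoint())
```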
On Aurora, you still use the cluster endpoint for your write queries. But Aurora also provides a reader endpoint that acts as a load balancer for your read replicas. So you can use this endpoint for your read queries and create up to 15 read replicas behind it. In the case of a failover, one of the read replicas becomes the master and is removed from this reader set.
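A minimal Python sketch of how an application might choose between the two Aurora endpoints is shown below. The hostnames are made up, and deciding by the first SQL keyword is a deliberately naive assumption: statements such as SELECT ... FOR UPDATE would still need to go to the cluster endpoint.

```python
from dataclasses import dataclass

# Assumption for this sketch: statements starting with these verbs are reads.
READ_ONLY_VERBS = {"SELECT", "SHOW"}


@dataclass(frozen=True)
class AuroraEndpoints:
    writer: str  # cluster endpoint, always points at the current master
    reader: str  # reader endpoint, spreads connections over the replicas

    def for_query(self, sql: str) -> str:
        """Pick an endpoint from the first keyword of the statement."""
        words = sql.lstrip().split(None, 1)
        verb = words[0].upper() if words else ""
        return self.reader if verb in READ_ONLY_VERBS else self.writer


# Hypothetical endpoint names for illustration.
endpoints = AuroraEndpoints(
    writer="mydb.cluster-abc123.us-east-1.rds.amazonaws.com",
    reader="mydb.cluster-ro-abc123.us-east-1.rds.amazonaws.com",
)

print(endpoints.for_query("SELECT * FROM orders"))         # reader endpoint
print(endpoints.for_query("UPDATE orders SET paid = 1"))   # cluster endpoint
```

Because the application only knows these two names, adding or removing replicas behind the reader endpoint requires no code change.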
Failover time is faster on Aurora
On regular Amazon RDS, because of the native storage structure of both community edition DB engines, failover time depends on the load and deteriorates over time. This mainly happens because of log buffers. Unlike regular RDS, Aurora does not keep a log buffer on the instances. Instead, Aurora keeps the logs in the storage layer, which is separate from the DB instances. Therefore, during a failover, only the DNS propagation matters.
DNS propagation takes around 30 seconds, so failover on Aurora is faster and your applications get higher availability.
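On the application side, this means a client should be prepared to retry its connection for roughly that long. The following Python sketch shows one way to do it with exponential backoff. The function name, the retry counts, and the use of ConnectionError are my own assumptions; a real database driver raises its own exception types.

```python
import time


def connect_with_retry(connect, attempts=6, first_delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure.

    With the defaults the waits are 1, 2, 4, 8 and 16 seconds, about 31
    seconds in total, which is meant to cover the failover window.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts, let the caller handle it
            sleep(delay)
            delay *= 2


# Usage with a stand-in for a real driver call (hypothetical):
# connection = connect_with_retry(lambda: driver.connect(host=cluster_endpoint))
```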
What about security and backups?
Both regular RDS and Aurora share the same security and backup features of Amazon RDS. Hence, you still achieve security using the same tools like security groups, IAM authentication, encryption at rest and in transit, etc. Nothing is different in terms of security.
Both Aurora and the community edition versions of RDS provide point-in-time recovery and 1 to 35 days of backup retention, so the backup feature is also the same. This is because Aurora is just a different engine managed by Amazon RDS. What we are really comparing here is the Aurora engine against the community edition MySQL and PostgreSQL engines.
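As a small illustration, the Python sketch below collects the backup and security settings mentioned above into one dictionary and checks the retention range. The helper name is hypothetical. The keys are written to look like the keyword arguments of boto3's create_db_cluster call, but treat that as an assumption to verify against the boto3 documentation; a real call also needs credentials, networking, and other parameters that are left out here.

```python
MIN_RETENTION_DAYS = 1
MAX_RETENTION_DAYS = 35


def aurora_cluster_settings(cluster_id: str, retention_days: int = 7) -> dict:
    """Build backup and security settings for an Aurora cluster (sketch only)."""
    if not MIN_RETENTION_DAYS <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"retention must be {MIN_RETENTION_DAYS} to {MAX_RETENTION_DAYS} days"
        )
    return {
        "DBClusterIdentifier": cluster_id,
        "Engine": "aurora-postgresql",            # or "aurora-mysql"
        "BackupRetentionPeriod": retention_days,  # days of automated backups
        "StorageEncrypted": True,                 # encryption at rest
        "EnableIAMDatabaseAuthentication": True,  # IAM authentication
    }


settings = aurora_cluster_settings("orders-db", retention_days=14)
print(settings["BackupRetentionPeriod"])
```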
Conclusion
Aurora’s unique storage design provides many advantages when compared to community edition RDS databases. Although there is a small increase in hourly bills when compared to RDS, I highly recommend using it for your enterprise-level applications.
For those looking for more technical details about the storage design and how logging is handled, I recommend watching the AWS re:Invent 2018 deep dives on Aurora. I believe you will find lots of useful information there.
Thanks for reading!