Setting Object Cache Durations for Your Amazon CloudFront Distributions

Using Amazon CloudFront is crucial for the speed of your website because it caches your content at AWS Edge locations to serve it to your users faster. For example, this blog’s origin is in the Europe (Frankfurt) region (eu-central-1), which is the closest region to my location. If I did not place Amazon CloudFront in front of my S3 bucket, all requests to this blog would be served from Frankfurt. As you would guess, this would cause slower pages for most of my readers around the world.

Luckily, I have an Amazon CloudFront distribution in front of my blog. So, only the first reader close to an AWS Edge location will be served from this region. All subsequent requests around that Edge location will be served directly from the Edge location’s cache.

However, you will also need to update your website content. So, from time to time, CloudFront needs to expire your content on the Edge location’s cache, and check whether it was updated from the original location. In this blog post, I will talk about how to set caching times for the objects you serve from your CloudFront distributions.

Using the object’s ‘Cache-Control’ header, max-age and s-maxage directives

CloudFront supports the Cache-Control header with its max-age and s-maxage directives for object expiration on Edge locations. Both directives take values in seconds, but there is a slight difference between them.

  • max-age defines how long the object will be cached in your users’ browsers. It also applies to your CloudFront distribution if you do not provide an s-maxage directive with your object.
  • s-maxage means shared max-age. It affects only shared caches such as your CloudFront distribution or another CDN you use. The browsers of your users ignore it.
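As a rough illustration of the difference, the sketch below picks the lifetime that a browser and a shared cache would each take from the same header. The function names are hypothetical and the parsing is deliberately simplified; it is not how CloudFront or any browser is actually implemented.

```python
def parse_cache_control(header: str) -> dict:
    """Parse a Cache-Control value into a {directive: value} dict."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without a value (for example no-cache) map to None.
        directives[name.strip().lower()] = value.strip() or None
    return directives


def freshness_seconds(header: str, shared_cache: bool):
    """Return the lifetime in seconds for a browser or a shared cache."""
    directives = parse_cache_control(header)
    # A shared cache (such as a CDN) prefers s-maxage; browsers ignore it.
    if shared_cache and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    return None  # the header carries no explicit lifetime


header = "max-age=86400, s-maxage=172800"
print(freshness_seconds(header, shared_cache=False))  # lifetime for the browser
print(freshness_seconds(header, shared_cache=True))   # lifetime for the Edge location
```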

Today, most browsers support the Cache-Control header and the max-age directive. So, if you set a Cache-Control header with a max-age directive on an object, your users’ browsers may also cache the content for the time specified there. Hence, if one of your users accesses your website a second time during this period, they may be served directly from the browser cache, without even accessing the Edge location. I specifically say ‘may’ here because this only applies if your users have caching enabled in their browsers and do not clear their caches very often.

In practice, I mostly use only the max-age directive. However, if you need different caching times for browsers and Edge locations, your s-maxage value should be greater than the max-age value to make sense. In that case, max-age determines how often your users’ browsers go back to the Edge location to check whether the object was updated. Similarly, s-maxage determines how often the Edge location checks the object on the origin. If you use a smaller s-maxage value, the copy on the Edge location will often have expired by the time the copy in a browser does, so most requests reaching the Edge location will be forwarded to the origin.

How to set an object’s ‘Cache-Control’ header on Origin

To provide a Cache-Control header, your backends should include it with a valid value in their responses to each request. If you use an S3 bucket as the origin of your website or your assets, you can set this header on the object’s metadata.

For example, if you need to set max-age to 1 day (86400 seconds), and s-maxage to 2 days (172800 seconds), you can set a Cache-Control header like below.

Cache-Control: max-age=86400, s-maxage=172800

Of course, you can also choose to set only the max-age directive.

Cache-Control: max-age=86400
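If your origin is an S3 bucket and you upload objects with Python, a minimal sketch could look like the one below. It assumes boto3 is installed and AWS credentials are configured; the helper names, bucket and key are hypothetical. The CacheControl parameter of put_object is what stores the header in the object’s metadata.

```python
def build_cache_control(max_age, s_maxage=None):
    """Build a Cache-Control value from lifetimes given in seconds."""
    parts = [f"max-age={int(max_age)}"]
    if s_maxage is not None:
        parts.append(f"s-maxage={int(s_maxage)}")
    return ", ".join(parts)


def upload_with_cache_control(bucket, key, body, max_age, s_maxage=None):
    """Upload an object to S3 with a Cache-Control header in its metadata."""
    import boto3  # imported here so the helper above works without boto3

    s3 = boto3.client("s3")
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        CacheControl=build_cache_control(max_age, s_maxage),
    )


print(build_cache_control(86400, 172800))
# Example call with a hypothetical bucket name:
# upload_with_cache_control("my-blog-bucket", "index.html", b"<html>...</html>", 86400)
```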

For HTML pages, depending on your update frequency, you can set this value from a few hours to a few days. If you update often, you can disable caching by providing max-age=0. If you seldom update your pages, you can set a longer value such as a month. It is up to you.

But for assets like images, stylesheets, and scripts, a longer caching time from six months to one year would be a best practice. It will also affect your SEO positively. But how would you update your assets, then? Well, you can achieve this by giving them a new name in each update. You can append a timestamp or a string representing the object version to the file name. Some bundlers like Webpack do this automatically.
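One way to generate such names is to derive the version string from the file content, so the name changes only when the content does. The helper below is a hypothetical sketch of this idea and not what Webpack does internally.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_length: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_length]
    p = PurePosixPath(path)
    # assets/app.css becomes assets/app.<digest>.css
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


print(versioned_name("assets/app.css", b"body { color: red; }"))
```

Because the name changes with the content, the old object can stay cached for a year without your users ever seeing a stale version.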

Using the object’s ‘Expires’ header

An alternative to the Cache-Control header is the Expires header. Using the Expires header, you can set the date and time at which the object will expire. It is like responding with ‘Do not come back until this date! I do not expect the object to change until then.’

I recommend using the Cache-Control header because, instead of an exact date and time, you set a relative time starting from the moment the object is downloaded from the Edge location or the origin. When you use the Expires header, your backend has to calculate and set the exact expiry date for each response.
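To make the difference concrete, here is a small sketch of what a backend would have to compute for each header. The function names are hypothetical; the date format is the standard HTTP date format produced by Python’s email.utils.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(now: datetime, lifetime_seconds: int) -> str:
    """Expires needs an absolute date, recalculated for every response."""
    expiry = now.astimezone(timezone.utc) + timedelta(seconds=lifetime_seconds)
    return format_datetime(expiry, usegmt=True)


def cache_control_header(lifetime_seconds: int) -> str:
    """Cache-Control needs only the relative lifetime, the same every time."""
    return f"max-age={lifetime_seconds}"


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print("Expires:", expires_header(now, 86400))
print("Cache-Control:", cache_control_header(86400))
```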

Using CloudFront distribution behaviors

Amazon CloudFront also allows you to define default and custom behaviors for your objects. If you set caching times on them, the caching time of an object depends on whether you also provide a Cache-Control header with it.

Differences between the default and custom behaviors on your CloudFront distribution

If you set caching times on the default behavior of your CloudFront distribution, they will apply unless a request matches a custom behavior. For example, you can create a custom behavior for the assets/* path on the assumption that you serve your assets from this path. Then, because custom behaviors are evaluated before the default behavior, a request for the image on the assets/image1.jpg path will match this behavior, and its values will override the default ones. However, all other requests that do not match this custom behavior will still use the default behavior.
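The matching order can be sketched as a first-match lookup over the custom behaviors with a fallback to the default one. This is only an approximation under stated assumptions: the function name and behavior labels are hypothetical, and Python’s fnmatch supports a few wildcard forms (such as character sets in brackets) that CloudFront path patterns do not.

```python
from fnmatch import fnmatchcase


def select_behavior(request_path, custom_behaviors, default_behavior):
    """Return the first custom behavior whose path pattern matches the request.

    custom_behaviors is an ordered list of (path_pattern, behavior) pairs.
    """
    path = request_path.lstrip("/")
    for pattern, behavior in custom_behaviors:
        # Leading slashes are ignored on both sides for this sketch.
        if fnmatchcase(path, pattern.lstrip("/")):
            return behavior
    return default_behavior


behaviors = [("assets/*", "assets-behavior")]
print(select_behavior("/assets/image1.jpg", behaviors, "default-behavior"))
print(select_behavior("/index.html", behaviors, "default-behavior"))
```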

Setting Minimum TTL, Default TTL and Maximum TTL values on behaviors

For object caching, you can provide minimum, maximum, and default time to live (TTL) values in seconds for objects using CloudFront behaviors. AWS documentation shows tables to describe how long an object will be cached on the Edge location when you use behaviors and/or object headers. But, there are only a few basic rules here.

1) The Default TTL of a behavior is only valid if the object does not contain a Cache-Control or Expires header. These headers override the default caching time on the CloudFront behavior if they are also within the minimum-maximum TTL limits defined for it.
2) If the object contains both Cache-Control and Expires headers, Cache-Control overrides the Expires header.
3) If the object contains one of these headers, but its value is smaller than the Minimum TTL set on the behavior, the object will be cached for the Minimum TTL value, because the header is outside the caching limits of the CloudFront behavior.
4) Similar to rule 3, if the object contains one of these headers, but its value is larger than the Maximum TTL value set on the behavior, the object will be cached for the Maximum TTL value.
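These four rules boil down to clamping the header value between the Minimum and Maximum TTL. The function below is a simplified sketch of the rules as stated here, with hypothetical parameter names; the tables in the AWS documentation cover more edge cases, so treat it as an illustration only.

```python
def edge_cache_seconds(min_ttl, default_ttl, max_ttl,
                       cache_control_seconds=None, expires_seconds=None):
    """Estimate how long an Edge location caches an object, in seconds.

    cache_control_seconds: lifetime from Cache-Control, or None if absent.
    expires_seconds: seconds until the Expires date, or None if absent.
    """
    # Rule 2: Cache-Control takes precedence over Expires.
    if cache_control_seconds is not None:
        header_seconds = cache_control_seconds
    else:
        header_seconds = expires_seconds

    # Rule 1: without either header, the Default TTL applies.
    if header_seconds is None:
        return default_ttl

    # Rules 3 and 4: keep the header value within the behavior's limits.
    return max(min_ttl, min(header_seconds, max_ttl))


print(edge_cache_seconds(600, 86400, 604800))                              # no headers
print(edge_cache_seconds(600, 86400, 604800, cache_control_seconds=60))    # below minimum
print(edge_cache_seconds(600, 86400, 604800, cache_control_seconds=3600))  # within limits
```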

The browsers only take Cache-Control or Expires headers into account. The caching times set on the distribution behaviors do not affect them.

If you do not want to cache the objects matching a behavior on the Edge locations, you can set all of its Minimum TTL, Default TTL, and Maximum TTL values to zero. Then, all requests for them will be forwarded to the origin. I talked about why you may need this for your dynamic websites in my previous Serving Dynamic Websites with Amazon CloudFront post. You can read it to understand how a setting like this works.

What happens when the object cache expires on an Edge location?

We talked about how to set object caching times for your CloudFront distribution. After downloading an object from the origin, the Edge location will cache it for the caching time specified for it. If a request arrives at the same Edge location after this caching period ends, CloudFront will forward the request to the origin and ask whether the object changed.

  • If the object changed since last caching, the origin responds with the latest version of it and 200 OK status code.
  • If the object did not change since last caching, the origin returns only status code 304 Not Modified, and the Edge location will continue serving it from the cache for a new caching period.

How does the origin verify that the object changed?

The origin includes an ETag header in each response. Its value identifies the specific version of the object, often as a hash of its content, rather than the time it was modified. CloudFront also caches this header with the object at the Edge location.

Then, after the cache expiry, CloudFront sends this value back to the origin in the If-None-Match header of the new request. The origin compares the value it received with the ETag of its current version and decides whether the object has changed.

What if you enable compression on your CloudFront behaviors?

For faster delivery, you can enable compression on your behaviors. Then, for certain file types, CloudFront will compress the objects before serving them. So, you do not need to compress your objects at the origin for this.

However, if you enable compression, CloudFront will only cache the compressed version and remove the ETag header returned by the origin. As a result, when the object expires on the Edge location, the origin will not be able to determine whether it changed since the last request. Therefore, it will always serve the object with the response.

Removing the cached object from Edge locations manually

If you need to remove the cached version of an object from all Edge locations before its cache expires, you can invalidate the object on your distribution using AWS Management Console or AWS CLI. You can read my previous Invalidating Paths on Your Amazon CloudFront Distributions Using AWS CLI post to learn how to do this using AWS CLI.
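If you prefer Python to the AWS CLI, a minimal sketch with boto3 could look like the one below. It assumes boto3 is installed and credentials are configured; the helper names and the distribution ID are hypothetical. The structure of the batch follows the CloudFront create_invalidation API.

```python
import time


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch structure for a list of object paths."""
    # Invalidation paths start with a slash, for example /index.html.
    items = [p if p.startswith("/") else "/" + p for p in paths]
    return {
        "Paths": {"Quantity": len(items), "Items": items},
        # CallerReference must be unique for each invalidation request.
        "CallerReference": caller_reference or str(time.time()),
    }


def invalidate(distribution_id, paths):
    """Submit an invalidation for the given paths on a distribution."""
    import boto3  # imported here so the helper above works without boto3

    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )


print(build_invalidation_batch(["index.html", "/assets/*"], "example-ref"))
# Example call with a hypothetical distribution ID:
# invalidate("E1234567890ABC", ["/index.html"])
```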

Conclusion

Placing a CloudFront distribution in front of origins is an improvement but not enough. You should also align the caching times on your objects using one of these methods. If you do this, the download speed and the SEO of your pages will also be affected positively.

Thanks for reading!

Emre Yilmaz

AWS Consultant • Instructor • Founder @ Shikisoft
