Using Amazon CloudFront is crucial for the speed of your website because it caches your content at AWS Edge locations and serves it to your users from there. For example, this blog’s origin is in the Europe (Frankfurt) region (eu-central-1), which is the closest region to my location. If I had not placed Amazon CloudFront in front of my S3 bucket, all requests to this blog would be served from Frankfurt. As you would guess, this would mean slower pages for most of my readers around the world.
Luckily, I have an Amazon CloudFront distribution in front of my blog. So, only the first request arriving at an AWS Edge location is served from the origin region. All subsequent requests arriving at that Edge location are served directly from the Edge location’s cache.
However, you will also need to update your website content. So, from time to time, CloudFront needs to expire your content in the Edge location’s cache and check with the origin whether it was updated. In this blog post, I will talk about how to set caching times for the objects you serve from your CloudFront distributions.
Using the object’s ‘Cache-Control’ header, Max-Age and S-Maxage directives
CloudFront uses the Cache-Control header as well as its max-age and s-maxage directives for object expiration on Edge locations. Both directives take values in seconds. But there is a slight difference between them.
max-age defines how long the object will be cached in users’ browsers. But it is also valid for your CloudFront distribution if you do not provide an s-maxage directive with your object.
s-maxage means shared max-age. It affects only the CloudFront distribution or any other CDN you use. The browsers of your users ignore it.
Today, most browsers support the Cache-Control header and the max-age directive. So, if you set a Cache-Control header with a max-age directive on an object, the browsers of your users may also cache the content for the time specified there. Hence, if one of your users accesses your website a second time during this period, they may be served directly from the browser cache, without even reaching the Edge location. I specifically say ‘may’ here because this only holds if your users have caching enabled in their browsers and do not clear their caches very often.
As a best practice, I mostly use the max-age directive only. However, if you need different caching times for browsers and Edge locations, your s-maxage value should be greater than the max-age value to make sense. In that case, max-age determines how often users’ browsers go back to the Edge location to check whether the object was updated. Similarly, s-maxage determines how often the Edge location checks the object on the origin. If you use a smaller s-maxage value, the copy on the Edge location will usually have expired by the time a browser’s cache expires, so most of those requests end up going through to the origin.
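To make the difference between the two directives concrete, here is a minimal Python sketch of how a browser and a shared cache such as a CDN could each read the same header value. The function names are made up for this illustration, and the directive is spelled s-maxage as in the HTTP specification. It only handles the two numeric directives discussed here, not the full Cache-Control grammar.

```python
from __future__ import annotations


def parse_numeric_directives(header: str) -> dict[str, int]:
    """Collect directives of the form name=<seconds> from a Cache-Control value."""
    directives = {}
    for part in header.split(","):
        name, _, value = part.strip().partition("=")
        if value.strip().isdigit():
            directives[name.strip().lower()] = int(value)
    return directives


def cache_lifetimes(header: str) -> tuple[int | None, int | None]:
    """Return (browser_seconds, shared_cache_seconds) for a header value."""
    directives = parse_numeric_directives(header)
    browser = directives.get("max-age")
    # A shared cache prefers s-maxage and falls back to max-age without it.
    shared = directives.get("s-maxage", browser)
    return browser, shared


print(cache_lifetimes("max-age=86400, s-maxage=172800"))  # (86400, 172800)
print(cache_lifetimes("max-age=3600"))                    # (3600, 3600)
```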
How to set an object’s ‘Cache-Control’ header on Origin
To provide a Cache-Control header, your backends should include it with a valid value in their responses to each request. If you use an S3 bucket as the origin of your website or your assets, you can set this header on the object’s metadata.
For example, if you need to set max-age to 1 day (86400 seconds) and s-maxage to 2 days (172800 seconds), you can set the Cache-Control header to max-age=86400, s-maxage=172800.
Of course, you can also choose to set only the max-age directive.
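If you generate this value in a deployment script, a small helper keeps the arithmetic readable. The sketch below is only an illustration; the function name and the ONE_DAY constant are my own and not part of any AWS tool.

```python
from typing import Optional

ONE_DAY = 24 * 60 * 60  # 86400 seconds


def cache_control_value(max_age: int, s_maxage: Optional[int] = None) -> str:
    """Build a Cache-Control value with max-age and, optionally, s-maxage."""
    if max_age < 0 or (s_maxage is not None and s_maxage < 0):
        raise ValueError("caching times must not be negative")
    parts = [f"max-age={max_age}"]
    if s_maxage is not None:
        parts.append(f"s-maxage={s_maxage}")
    return ", ".join(parts)


print(cache_control_value(ONE_DAY, 2 * ONE_DAY))  # max-age=86400, s-maxage=172800
print(cache_control_value(ONE_DAY))               # max-age=86400
```

If you upload to S3 with boto3, a string like this is what you would pass as the CacheControl argument of put_object.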
For HTML pages, depending on your update frequency, you can set this value from a few hours to a few days. If you update often, you can disable caching by providing max-age=0. If you rarely update your pages, you can set a longer value such as a month. It is up to you.
But for assets like images, stylesheets, and scripts, a longer caching time from six months to one year would be a best practice. It will also affect your SEO positively. But how would you update your assets, then? Well, you can achieve this by giving them a new name on each update. You can append a timestamp or a string representing the object version. Some bundlers like Webpack do this automatically.
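As a rough sketch of the renaming idea, the snippet below derives the version string from a hash of the file content, so the name changes only when the content changes. This is one possible scheme, not what Webpack does internally, and the function name and the 8-character digest length are arbitrary choices.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_chars: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old = versioned_name("assets/app.js", b"console.log('v1');")
new = versioned_name("assets/app.js", b"console.log('v2');")
print(old)
print(new)
print(old != new)  # True: changed content yields a new object name
```

Because the old and new names differ, browsers and Edge locations treat the updated asset as a brand-new object, and the long caching time of the old one no longer matters.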
Using the object’s ‘Expires’ header
An alternative to the Cache-Control header is the Expires header. Using the Expires header, you can set the date/time the object will expire. It is like responding with ‘Do not come back until this date! I do not expect the object to change until then.’
I recommend using the Cache-Control header because, instead of an exact date/time, you set a relative time starting from the moment the object is downloaded from the Edge or origin. When you use the Expires header, your backend has to calculate and set the exact expiry date in each response.
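The extra work for the Expires header can be seen in a short sketch. Assuming a backend written in Python, it would have to compute an HTTP date on every response, while the equivalent Cache-Control value stays a constant string. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_value(now: datetime, lifetime_seconds: int) -> str:
    """Return an HTTP date that lies lifetime_seconds after now (UTC)."""
    return format_datetime(now + timedelta(seconds=lifetime_seconds), usegmt=True)


# A fixed clock keeps the example reproducible; a real backend would use
# datetime.now(timezone.utc) for each response.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print("Expires:", expires_value(now, 86400))  # Expires: Tue, 02 Jan 2024 12:00:00 GMT
print("Cache-Control: max-age=86400")         # the same lifetime, no date arithmetic
```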
Using CloudFront distribution behaviors
Amazon CloudFront also allows you to provide default and custom behaviors for your objects. If you set them, the caching time of an object depends on whether you also provide a Cache-Control or Expires header with it.
Differences between the default and custom behaviors on your CloudFront distribution
If you set caching times on the default behavior of your CloudFront distribution, they apply to every request that does not match a custom behavior. For example, assuming that you serve your assets from the assets/ path, you can create a custom behavior for the assets/* path pattern. Then, because custom behaviors are evaluated before the default behavior, a request for the image on the assets/image1.jpg path will match this behavior, and its values will override the default ones. However, all other requests unmatched by this custom behavior will still use the default behavior.
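The matching order can be pictured with a small sketch. It is an approximation that uses Python’s fnmatch wildcards to stand in for CloudFront path patterns, and fnmatch also accepts bracket expressions that CloudFront patterns do not have. The behavior names are invented for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical custom behaviors as (path pattern, behavior name), in evaluation order.
CUSTOM_BEHAVIORS = [
    ("assets/*", "assets-behavior"),
]


def select_behavior(request_path, behaviors=CUSTOM_BEHAVIORS, default="default-behavior"):
    """Return the first custom behavior whose pattern matches, else the default."""
    path = request_path.lstrip("/")
    for pattern, name in behaviors:
        if fnmatchcase(path, pattern.lstrip("/")):
            return name
    return default


print(select_behavior("/assets/image1.jpg"))  # assets-behavior
print(select_behavior("/index.html"))         # default-behavior
```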
Setting Minimum TTL, Default TTL and Maximum TTL values on behaviors
For object caching, you can provide minimum, maximum, and default time to live (TTL) values in seconds using CloudFront behaviors. The AWS documentation has tables describing how long an object will be cached on the Edge location when you use behaviors and/or object headers. But there are only a few basic rules here.
1) The Default TTL of a behavior is only valid if the object does not contain a Cache-Control or Expires header. These headers overwrite the default caching time on the CloudFront behavior if they are also within the minimum-maximum TTL limits defined for it.
2) If the object contains both Cache-Control and Expires headers, Cache-Control overrides the Expires header.
3) If the object contains one of these headers, but its value is smaller than the Minimum TTL set on the behavior, it will be cached for the Minimum TTL value, because the header is outside the caching limits of the CloudFront behavior.
4) Similar to rule 3, if the object contains one of these headers, but its value is larger than the Maximum TTL value set on the behavior, it will be cached for the Maximum TTL value.
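These rules can be condensed into a few lines of Python. The sketch below covers only the four rules listed here and takes the header lifetimes as already-parsed numbers of seconds. It deliberately ignores the extra cases in the AWS tables, such as no-cache or no-store directives, and the function name is my own.

```python
from typing import Optional


def edge_ttl(min_ttl: int, default_ttl: int, max_ttl: int,
             cache_control_seconds: Optional[int] = None,
             expires_seconds: Optional[int] = None) -> int:
    """Seconds an Edge location keeps the object, per the four rules above."""
    # Rule 2: Cache-Control takes precedence over Expires.
    if cache_control_seconds is not None:
        origin_ttl = cache_control_seconds
    else:
        origin_ttl = expires_seconds
    # Rule 1: without either header, the Default TTL applies.
    if origin_ttl is None:
        return default_ttl
    # Rules 3 and 4: clamp the header value to the [Minimum, Maximum] range.
    return max(min_ttl, min(origin_ttl, max_ttl))


print(edge_ttl(60, 3600, 86400))                                # 3600
print(edge_ttl(60, 3600, 86400, cache_control_seconds=10))      # 60
print(edge_ttl(60, 3600, 86400, cache_control_seconds=999999))  # 86400
print(edge_ttl(60, 3600, 86400, cache_control_seconds=7200, expires_seconds=100))  # 7200
```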
The browsers only take the Cache-Control and Expires headers into account. The caching times set on the distribution behaviors do not affect them.
If you do not want to cache the objects matching a behavior on the Edge locations, you can set all of the Minimum TTL, Default TTL, and Maximum TTL values on it to zero. Then, all requests for them will be forwarded to the origin. I talked about why you may need this for your dynamic websites in my previous Serving Dynamic Websites with Amazon CloudFront post. You can read it to understand how a setting like this works.
What happens when the object cache expires on an Edge location?
We talked about how to set object caching times for your CloudFront distribution. After downloading an object from the origin, the Edge location caches it for the caching time specified for it. If a request arrives at the same Edge location after this caching period ends, CloudFront forwards the request to the origin and asks whether the object has changed.
- If the object changed since it was last cached, the origin responds with the latest version of it and a 200 OK status code.
- If the object did not change since it was last cached, the origin returns only a 304 Not Modified status code, and the Edge location will continue serving it from the cache for a new caching period.
How does the origin verify that the object changed?
The origin includes an ETag header in each response. Its value identifies the specific version of the object. CloudFront also caches this header with the object at the Edge location.
Then, after the cache expiry, it sends the cached ETag value back to the origin in the If-None-Match header of the new request. The origin compares the value it received with the ETag of its current copy and decides whether the object has changed.
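The exchange can be simulated in a few lines. This is a toy origin, not how S3 or any particular server computes its ETag: here the tag is simply a hash of the body, and the function names are invented for the illustration.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Toy validator: a quoted hash of the content (real origins may differ)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def origin_response(current_body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, body) the way a revalidating origin would."""
    etag = make_etag(current_body)
    if if_none_match == etag:
        # Unchanged: no body is sent, and the cache keeps serving its copy.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, current_body


status, headers, body = origin_response(b"<h1>v1</h1>")
cached_etag = headers["ETag"]  # the Edge location stores this with the object
print(status)                                          # 200
print(origin_response(b"<h1>v1</h1>", cached_etag)[0])  # 304
print(origin_response(b"<h1>v2</h1>", cached_etag)[0])  # 200
```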
What if you enable compression on your CloudFront behaviors?
For faster delivery, you can enable compression on your behaviors. Then, for certain file types, CloudFront will compress the objects before serving them. So, you do not need to compress your objects at your origin for this.
However, compression also affects the ETag header. When CloudFront compresses an object, it caches the compressed version, and according to the current AWS documentation it converts a strong ETag returned by the origin into a weak one. Earlier, CloudFront removed the ETag header from compressed objects altogether. In that case, when the object expires on the Edge location, the origin is not able to determine whether it changed since the last request, so it always sends the full object in the response.
Removing the cached object from Edge locations manually
If you need to remove the cached version of an object from all Edge locations before its cache expires, you can invalidate the object on your distribution using AWS Management Console or AWS CLI. You can read my previous Invalidating Paths on Your Amazon CloudFront Distributions Using AWS CLI post to learn how to do this using AWS CLI.
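If you script invalidations in Python instead, the request is mostly a small data structure. The sketch below only builds that structure; the helper name is my own, and the layout follows the InvalidationBatch parameter of the CloudFront CreateInvalidation API as I understand it, so check it against the API reference before relying on it.

```python
import time
from typing import Iterable, Optional


def invalidation_batch(paths: Iterable[str], caller_reference: Optional[str] = None) -> dict:
    """Build an InvalidationBatch-shaped dict for the given object paths."""
    # Invalidation paths are written with a leading slash, e.g. /assets/image1.jpg.
    items = [p if p.startswith("/") else "/" + p for p in paths]
    return {
        "Paths": {"Quantity": len(items), "Items": items},
        # Any unique string works; a timestamp is a common choice.
        "CallerReference": caller_reference or str(time.time_ns()),
    }


batch = invalidation_batch(["index.html", "/assets/*"], caller_reference="example-1")
print(batch["Paths"])  # {'Quantity': 2, 'Items': ['/index.html', '/assets/*']}
```

With boto3, a dict like this would be passed as the InvalidationBatch argument of the CloudFront client’s create_invalidation call, together with your DistributionId.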
Placing a CloudFront distribution in front of your origins is an improvement, but it is not enough. You should also tune the caching times of your objects using one of these methods. If you do, both the download speed and the SEO of your pages will benefit.
Thanks for reading!
Need help? Let’s get in touch!
Do you need help with setting up a CloudFront distribution to speed up your website delivery and improve the user experience on your website? Then, you can get in touch with me using the contact form on the Shikisoft website.
I will be happy to discuss the details to see how I can help you according to my schedule.
For further reference, I recommend you read more on the links below.