Since 2011, this blog has been served on a variety of platforms, including Ghost hosted on Heroku. However, I’ve been wanting to serve this blog as a static site from an S3 bucket, since it seems like overkill to provision and maintain a Postgres DB (which Ghost requires). That’s how I landed on Hugo.

Rather than focusing on the Ghost -> Hugo conversion process, this post will cover what I learned while deploying this blog on S3, CloudFront, ACM, and Route53, along with some “gotcha” stumbling points along the way.

Also, as a small disclaimer: This is a stack combo I chose as a learning exercise in integrating and automating these services. Depending on the use case and end users, it’s not something I’d necessarily recommend as a replacement for WYSIWYG blog platforms.

Why CloudFront?

CloudFront provides some helpful integration “glue” with AWS Certificate Manager (ACM), which stores and maintains SSL/TLS certs. ACM also integrates nicely with Route53, which makes domain validation easy.

Another bonus: ACM automatically renews SSL/TLS certs on an ongoing basis, as long as the cert is set up with DNS validation.

That seems especially convenient when I recall past years of:

  • running cert provisioning / renewal commands
  • writing / managing automation scripts to periodically run those commands

This combo of integrations should help future-proof my blog against surprise SSL/TLS cert expirations. Whoo!
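For a rough idea of what a DNS-validated cert looks like in Terraform, here’s a sketch with hypothetical resource names (the Route53 hosted zone is assumed to already exist as aws_route53_zone.blog):

# Request a cert and validate it via DNS, so ACM can renew it automatically.
# (In practice this needs a us-east-1 provider alias, since certs used by
# CloudFront must be issued in us-east-1.)
resource "aws_acm_certificate" "blog" {
  domain_name               = "lorainekv.com"
  subject_alternative_names = ["www.lorainekv.com"]
  validation_method         = "DNS"
}

# Publish the validation records ACM asks for into the Route53 hosted zone.
resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.blog.domain_validation_options :
    dvo.domain_name => dvo
  }

  zone_id = aws_route53_zone.blog.zone_id
  name    = each.value.resource_record_name
  type    = each.value.resource_record_type
  ttl     = 300
  records = [each.value.resource_record_value]
}

# Wait for ACM to see the records and mark the cert as issued.
resource "aws_acm_certificate_validation" "blog" {
  certificate_arn         = aws_acm_certificate.blog.arn
  validation_record_fqdns = [for record in aws_route53_record.cert_validation : record.fqdn]
}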

Trade-offs

By default, S3 content is cached by CloudFront for 24 hours. So to make the freshest updates available to visitors upon deploy, I’d have to take an extra step. Either:

  • invalidate the cache for the page(s) I wish to update (low lift)
  • version the filenames (high lift)

I don’t plan to make a high volume of updates to this blog, so cache invalidation (the first 1,000 invalidation paths each month are free) will suit my needs for now.
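For context, the 24-hour figure is CloudFront’s default TTL of 86,400 seconds. In Terraform, those knobs live on the distribution’s cache behavior; here’s the relevant slice of an aws_cloudfront_distribution resource (a sketch, not my exact config, with a hypothetical origin id):

# Inside an aws_cloudfront_distribution resource:
default_cache_behavior {
  target_origin_id       = "blog-origin" # hypothetical origin id
  viewer_protocol_policy = "redirect-to-https"
  allowed_methods        = ["GET", "HEAD"]
  cached_methods         = ["GET", "HEAD"]

  min_ttl     = 0
  default_ttl = 86400    # 24 hours - CloudFront's default
  max_ttl     = 31536000 # 1 year

  forwarded_values {
    query_string = false
    cookies {
      forward = "none"
    }
  }
}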

S3 Baseline

First I got hugo deploy set up with my S3 bucket and verified that uploads were working. Then I enabled S3 Website Hosting and hit the bucket endpoint in my browser. Rad, the blog rendered as expected! But to use my custom domains with HTTPS, I would need to introduce CloudFront, ACM, and Route53 into the mix.

The road to 200 is paved with 403s

I looked at this documentation to begin serving the site via CloudFront: How do I use CloudFront to serve a static website hosted on Amazon S3?

Per those docs, the recommended path seemed to be the Origin Access Control (OAC) pattern on S3, rather than S3 Website Hosting. OAC provides more granular access controls between a CloudFront distribution and S3. There are a few other differences, which you can read about here: Website Endpoints. So I disabled S3 Website Hosting in favor of the OAC path.
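For the curious, the OAC wiring in Terraform looks roughly like this (hypothetical resource names; I ultimately moved away from this setup, as described later):

# Origin Access Control lets CloudFront sign its requests to the S3 REST endpoint.
resource "aws_cloudfront_origin_access_control" "blog" {
  name                              = "blog-oac"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

# The bucket policy only allows reads coming from this specific distribution.
resource "aws_s3_bucket_policy" "blog" {
  bucket = aws_s3_bucket.blog.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "cloudfront.amazonaws.com" }
      Action    = "s3:GetObject"
      Resource  = "${aws_s3_bucket.blog.arn}/*"
      Condition = {
        StringEquals = { "AWS:SourceArn" = aws_cloudfront_distribution.blog.arn }
      }
    }]
  })
}

# ...and the distribution's origin block points at the bucket's REST endpoint:
# origin {
#   domain_name              = aws_s3_bucket.blog.bucket_regional_domain_name
#   origin_id                = "blog-origin"
#   origin_access_control_id = aws_cloudfront_origin_access_control.blog.id
# }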

However, when I hit the apex domain (https://lorainekv.com), I got 403 errors. With so many systems in the mix (Route53, CloudFront, S3), it can be hard to determine where the issue lies - AWS has a whole list of docs for troubleshooting where those 403s might be coming from.

Finding the “Seams” Between Services

Since I’d decided to go with the more restrictive OAC access pattern (rather than the public S3 website hosting pattern), I no longer had a publicly-accessible S3 endpoint to hit. So I began my troubleshooting one level up, with the CloudFront distribution domain - the auto-generated URL you receive after provisioning a CloudFront distribution. Hitting it directly would help me test the connection between CloudFront and the S3 origin.

CloudFront’s Default Root Object

Sadly, the CloudFront URL returned a 403. So I ran through this list of possible “gotchas” for CloudFront with an S3 REST API origin and came across this note:

If clients request the root of your distribution, then you must define a default root object.

Aha! I needed to tell CloudFront to send visitors to the top-level index.html page in my S3 bucket.

I went back to my CloudFront distribution and set Default Root Object to index.html.
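In Terraform terms, that’s a single attribute on the distribution (a sketch of the relevant line):

# Inside the aws_cloudfront_distribution resource: serve index.html
# when a visitor requests the root of the distribution.
default_root_object = "index.html"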

Once the distribution was deployed, the CloudFront URL returned a 200. Whoo!

Maintain Parity between Alternate Domain Names, CNAME records, and SSL/TLS Cert Domains

Once my CloudFront URL was up and running, I began testing https and http variations of www.lorainekv.com and the apex domain, lorainekv.com.

I noticed that the www subdomain was still returning a 403 error from CloudFront. I was puzzled, since I’d already added it to my DNS CNAME records and to the SSL/TLS cert.

After looking through this list of CloudFront 403s, I realized that I’d forgotten to add www.lorainekv.com to the distribution’s list of Alternate Domain Names.

Per the docs:

If you entered Alternate domain names (CNAMEs) for your distribution, then the CNAMEs must match the SSL certificate that you select

Once I updated that list, the www subdomain returned a 200.
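In Terraform, that parity requirement shows up as the distribution’s aliases needing to line up with the names on the cert it references (a sketch; resource names are hypothetical):

# Inside the aws_cloudfront_distribution resource:
aliases = ["lorainekv.com", "www.lorainekv.com"]

viewer_certificate {
  # This cert must cover every name listed in aliases above.
  acm_certificate_arn      = aws_acm_certificate.blog.arn
  ssl_support_method       = "sni-only"
  minimum_protocol_version = "TLSv1.2_2021"
}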

CloudFront + Hugo Gotcha: Origin Access Control (OAC) vs S3 Website Hosting

TL;DR: For compatibility with Hugo’s default URL structure, I ended up re-enabling S3 Website Hosting and restricting access with a Referer header. See why below:

Once I got the apex domain up and running with the blog homepage displaying, I noticed that all of the individual blog post URLs, such as https://lorainekv.com/post/filing-a-pronunciation-bug-ticket-with-apple/ returned 404s! Why?

I took a look at how Hugo stores blog post files and noticed that each post is written as an index.html file nested under its own some-post-path directory. Appending index.html to the path worked: https://lorainekv.com/post/filing-a-pronunciation-bug-ticket-with-apple/index.html. Did that mean I’d have to override all of Hugo’s blog URLs to use OAC?

After some Googling, I ran across this Hugo support conversation, where the OP had already done a significant amount of digging and corresponding with AWS support on this same issue. It turns out Hugo’s pretty URLs rely on the web server implicitly serving index.html for directory requests. If an S3 bucket is configured to act as a static website host, you get that implicit behavior…but you don’t when using the OAC option, which serves the bucket through the S3 REST endpoint. I didn’t want to override Hugo’s default behavior, so I took the AWS support suggestion and ultimately went with this path: Using a website endpoint as the origin, with access restricted by a Referer header

With the custom header in place, only requests that come through CloudFront (which attaches the secret Referer value) should succeed. Hitting bare S3 website links like http://lorainekv.com.s3-website-us-west-2.amazonaws.com/post/filing-a-pronunciation-bug-ticket-with-apple/index.html directly should now always return a 403.
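A rough Terraform sketch of that pattern, assuming the secret header value comes in as a variable (var.referer_secret) and using hypothetical resource names:

# Serve the bucket through its website endpoint so directory requests
# implicitly resolve to index.html, the way Hugo expects.
resource "aws_s3_bucket_website_configuration" "blog" {
  bucket = aws_s3_bucket.blog.id

  index_document {
    suffix = "index.html"
  }
}

# Only allow reads that carry the secret Referer value CloudFront attaches.
# (The bucket's Block Public Access settings must permit this policy.)
resource "aws_s3_bucket_policy" "blog" {
  bucket = aws_s3_bucket.blog.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = "*"
      Action    = "s3:GetObject"
      Resource  = "${aws_s3_bucket.blog.arn}/*"
      Condition = {
        StringEquals = { "aws:Referer" = var.referer_secret }
      }
    }]
  })
}

# Inside the aws_cloudfront_distribution resource, the origin points at the
# website endpoint (which only speaks HTTP) and adds the secret header:
# origin {
#   domain_name = aws_s3_bucket_website_configuration.blog.website_endpoint
#   origin_id   = "blog-origin"
#
#   custom_origin_config {
#     http_port              = 80
#     https_port             = 443
#     origin_protocol_policy = "http-only"
#     origin_ssl_protocols   = ["TLSv1.2"]
#   }
#
#   custom_header {
#     name  = "Referer"
#     value = var.referer_secret
#   }
# }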

Automation and alerting considerations

Now that I’d learned more about the nuances of this system, I thought about how I might automate blog updates and use Infrastructure as Code (IaC) with Terraform to provision and maintain these resources:

  1. Running hugo && hugo deploy before Terraform. This would help ensure the hosting bucket is up-to-date with all the latest static blog files.

  2. During the Terraform run, ensure custom domain list parity by referencing a list with all the custom domain names in the following areas:

  • the CloudFront distribution aliases
  • provisioning CNAME records and mapping them to distribution URLs
  • setting the domain_name and subject_alternative_names on ACM certs

This might look something like:

variables.tf

variable "custom_domains" {
  type    = list(string)
  default = ["lorainekv.com", "www.lorainekv.com"]
}
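The parity itself could then be enforced by threading that one variable through each resource, roughly like this (a sketch with hypothetical resource names):

# ACM cert: the first entry is the primary domain, the rest become SANs.
resource "aws_acm_certificate" "blog" {
  domain_name               = var.custom_domains[0]
  subject_alternative_names = slice(var.custom_domains, 1, length(var.custom_domains))
  validation_method         = "DNS"
}

# CloudFront: every custom domain becomes an alternate domain name.
# (Inside the aws_cloudfront_distribution resource.)
# aliases = var.custom_domains

# Route53: one record per custom domain, pointed at the distribution.
# Using alias records here, since the apex domain can't use a plain CNAME.
resource "aws_route53_record" "blog" {
  for_each = toset(var.custom_domains)

  zone_id = aws_route53_zone.blog.zone_id
  name    = each.value
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.blog.domain_name
    zone_id                = aws_cloudfront_distribution.blog.hosted_zone_id
    evaluate_target_health = false
  }
}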
  3. Automate CloudFront invalidation on anything that needs to be served fresh (one low-tech option is sketched below)
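Terraform doesn’t manage one-off invalidations as a first-class resource, so one low-tech option is to shell out to the AWS CLI from a null_resource on each apply (a sketch; running the same CLI command from CI after hugo deploy would work just as well):

# Re-runs on every apply (via timestamp()) and invalidates everything.
# A more surgical version could limit --paths to the pages that changed.
resource "null_resource" "invalidate_cache" {
  triggers = {
    always_run = timestamp()
  }

  provisioner "local-exec" {
    command = "aws cloudfront create-invalidation --distribution-id ${aws_cloudfront_distribution.blog.id} --paths '/*'"
  }
}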

Alerting

To ensure ongoing availability, it would also be nice to have alerts set up for the following (one possible setup is sketched after the list):

  • Uptime on the:
    • apex URL
    • subdomains
    • CloudFront distribution URL
  • Bare S3 endpoint URLs continuing to return 403s
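One way to cover the uptime checks with the same stack is a Route53 health check per URL plus a CloudWatch alarm on its status (a sketch with hypothetical names; note that Route53 health check metrics only surface in us-east-1):

# Health check that polls the apex domain over HTTPS.
resource "aws_route53_health_check" "apex" {
  fqdn              = "lorainekv.com"
  type              = "HTTPS"
  port              = 443
  resource_path     = "/"
  request_interval  = 30
  failure_threshold = 3
}

# Alarm when the health check reports unhealthy.
resource "aws_cloudwatch_metric_alarm" "apex_down" {
  alarm_name          = "blog-apex-down"
  namespace           = "AWS/Route53"
  metric_name         = "HealthCheckStatus"
  dimensions          = { HealthCheckId = aws_route53_health_check.apex.id }
  statistic           = "Minimum"
  comparison_operator = "LessThanThreshold"
  threshold           = 1
  period              = 60
  evaluation_periods  = 1
  alarm_actions       = [aws_sns_topic.alerts.arn] # e.g. an SNS topic that emails me
}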

Overall, it was a fun experience getting these services to play together nicely. It was somewhat disappointing to learn about the Hugo + OAC gotcha so late in the process - I could have discovered it sooner had I clicked into an individual post while testing the CloudFront URL layer. Luckily, the Hugo support thread provided ample context on how to adapt.