Deploying a Static Website via Azure

Continuous integration, deployment, and delivery of a static Jekyll site via GitHub, CircleCI, and Azure.

Posted by cryps1s on December 28, 2019 · 32 mins read

While AWS remains the market leader for public cloud providers, I have personally found Azure to be significantly more security-conscious and pleasurable to work with. As part of the learning process, I’ve slowly been migrating and deploying services onto Azure – including this blog.

This post is going to focus on deploying a simple, static website onto an Azure storage account (AWS S3 equivalent). As part of this deployment, I will front the storage account with a content delivery network (CDN), enable a valid HTTPS certificate, configure reasonable caching defaults, and set up continuous integration for deployment via CircleCI. The goal of this project is to have a remarkably secure website with minimal time, energy, and resources committed to maintaining it.

I will caveat this post by stating that I am definitively not a web developer, and most of these web technologies are outside my area of expertise.

The Case for a Static Web Page

So the first question we should answer is: why do we want a static web page? There are a few compelling reasons why static web pages are so attractive:

  • Static web pages can be incredibly secure. A static web page has no moving parts, plugins, servers, or dynamic content that present attack surface. While this blog has absolutely nothing of significance or value if hacked, a static deployment means one less thing to worry about.
  • Static web pages don’t need a traditional hosting provider. You don’t need to shell out large amounts of cash for WordPress or another hosting provider. You can simply throw your static web page in a cheap bucket and call it a day.
  • Static web pages are dead simple. There are no dynamically generated components to maintain, which means every visitor receives the same experience. Static web pages are typically also small and incredibly fast to load.
  • Static web pages are compatible with git and CI/CD. You can edit your web pages in markdown, manage versions through git, and auto-deploy with CI/CD. It’s a pretty magical process.

The static web page zeitgeist likely originated with the creation of Jekyll, a static site generator (SSG) which powers GitHub Pages functionality. Since Jekyll, SSGs have exploded in popularity and created a rich ecosystem of frameworks. While you can craft an artisanal static website by hand, these frameworks make it trivial to get started and deploy a new project.

There are a variety of SSG frameworks available, but Hugo, Jekyll, and Gatsby.js are perhaps the most well-known and popular. Each of these has its own language preferences, features, and benefits, but all serve the same purpose. A cross-comparison of these frameworks is outside the scope of this post (and beyond the depth of my knowledge), but I ultimately selected Jekyll for my personal blog.

Once you’ve selected a framework and found a free theme that appeals to you, you’ll need to get a local development environment ready.

Windows Development Environment

As most of my devices run with application whitelisting enabled, I needed to spin up a local development environment. If you’re not foolish enough to use application whitelisting, if you run another operating system (e.g. macOS), or if you already have a local development environment, this section may not be useful for you. Feel free to skip it.

For Jekyll, we’ll need a few components installed:

  • Windows Subsystem for Linux (WSL)
  • Git
  • Ruby
  • Jekyll (Ruby Gem)
  • HTML-Proofer (Ruby Gem)
  • Other ruby gems for your site (e.g. “jekyll-sitemap”, “jekyll-seo-tag”, etc.)

In my instance, I spun up a new Windows 10 developer environment in Hyper-V, but you could just as easily do this on your host.

  1. Install WSL using PowerShell.

     Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux
    
  2. Install Ubuntu 16.04 LTS via PowerShell and Reboot.

     Invoke-WebRequest -Uri https://aka.ms/wsl-ubuntu-1604 -OutFile Ubuntu.appx -UseBasicParsing
     Add-AppxPackage .\Ubuntu.appx
    
  3. Update WSL.

     sudo apt-get update && sudo apt-get upgrade -y
    
  4. Install Basic Tools.

     sudo apt-get install gnupg2
    
  5. Install Jekyll.

     sudo apt-add-repository ppa:brightbox/ruby-ng
     sudo apt-get update
     sudo apt-get install ruby2.5 ruby2.5-dev build-essential dh-autoreconf
     sudo gem update
     sudo gem install jekyll bundler html-proofer
    
  6. Create a New Site OR Install a Theme.
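
     If you’re starting from scratch, Jekyll can scaffold a new site for you (“myblog” below is just a placeholder name); if you’re using a theme, clone the theme’s repository and customize it instead:

     jekyll new myblog
     cd myblog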

At the end of this process, you should have a Jekyll-compatible environment ready and either a new site, or a templated site, ready for configuration. This is the part where you actually make your blog, configure your template, and add content.

Protip: To test your local changes, you can use the jekyll serve command. This will open up a listener on http://127.0.0.1:4000 where you can preview your changes.

Configuring Your Repository

Now that we’ve got a rough skeleton for our blog, we’ll throw it in a GitHub.com repository.

Create a .gitignore file

First, we’ll create a local .gitignore file for your repository and add the following contents:

_site
.sass-cache
node_modules
.jekyll-cache/
.jekyll-metadata

This allows us to version control the website source without committing the generated HTML pages. We’ll build those from the source files as part of our CI/CD pipeline.

Upload Web Site Files

Next, we’ll commit everything and upload it to our repository. We can now version control all changes to our web site using our GitHub repository. An example of what this looks like is below.

[Image: Private GitHub repo for blog.dane.io.]

We’ll come back and make some additional changes to our repository later but, for now, we’re going to move over to our Azure account and get that configured.

Configuring Azure

Protip: If you’re new to Azure, you can register for a free account and get $200 worth of credit and 12 months of some free services.

Double Protip: If you’re a Visual Studio subscriber, you get $50 a month in Azure credits in addition to software access (e.g. Windows 10, Server 2019) and other benefits. You might consider purchasing a subscription, or convincing your workplace to sponsor it, if you intend to play with Azure and the Windows platform long-term.

We’ll need an Azure account for hosting our website. If you use Office 365, you already have an Azure account. If not, you’ll need to create one. Go ahead and log in or create your Azure account now.

Create the Storage Account

Note: By default, all storage accounts in Azure are encrypted using server side encryption (SSE) by Microsoft. We don’t need to do anything special for encryption at rest.

Next, we’ll create our storage account for hosting our static website. As our website content is stored in a GitHub repository, we don’t need to worry about backups, redundancy, or other availability or integrity protection mechanisms at the storage account level.

  1. Navigate to Storage Accounts within the Azure Portal.
  2. Create a new storage account with the following specifications:
    • Resource Group: New (e.g. blog_dane_io)
    • Storage Account Name: something representative (e.g. daneio). Note: this must be globally unique within Azure.
    • Location: Choose your location (e.g. West US 2)
    • Performance: Standard
    • Account Kind: StorageV2 (general purpose v2)
    • Replication: Locally-redundant storage (LRS)
    • Access tier: Hot (can change this later if needed)
    • Connectivity method: Public endpoint (all networks)
    • Secure transfer required: Enabled
  3. Navigate to the storage account you created.
  4. Configure the following settings on the storage account:
    • Static website: Enabled. (Use default index.html and 404.html paths)

[Image: Storage account configuration.]

[Image: Static website configuration.]

We now have a storage account ready for hosting our static website content. If you use the Storage Explorer, you’ll notice that a default $web container now exists and is ready to serve up our website.

[Image: Static website container ($web).]
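
If you’d rather script these steps, a rough Azure CLI equivalent (a sketch that assumes the Azure CLI is installed, you’re logged in via az login, and you reuse the example names above) looks like this:

az group create --name blog_dane_io --location westus2
az storage account create --name daneio --resource-group blog_dane_io \
  --location westus2 --sku Standard_LRS --kind StorageV2 \
  --access-tier Hot --https-only true
az storage blob service-properties update --account-name daneio \
  --static-website --index-document index.html --404-document 404.html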

If we only wanted to serve out of the bucket itself, we could simply configure a domain name and stop here. However, we want to do a few more things before we can call this project finished. Let’s go set up our custom domain name.

Configure Custom Domain Name

If you’re into vanity domains (and who isn’t?), you might consider using a custom domain or subdomain for your website. As I use Azure for managing my DNS, I’ll configure it to use the blog.dane.io subdomain.

  1. Navigate to DNS Zones within the Azure Portal.
  2. Create a new Resource Group and Instance (e.g. dane.io).
  3. Navigate to the newly created instance and grab the name server information:

     Name server 1: ns1-06.azure-dns.com.
     Name server 2: ns2-06.azure-dns.net.
     Name server 3: ns3-06.azure-dns.org.
     Name server 4: ns4-06.azure-dns.info.
    
  4. Update your domain registrar to point at the Azure name servers.
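
If you prefer the CLI, the zone can be created and its assigned name servers queried with (same example names as above):

az network dns zone create --resource-group blog_dane_io --name dane.io
az network dns zone show --resource-group blog_dane_io --name dane.io --query nameServers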

[Image: Validation of DNS changes.]

We are now using Azure to manage our DNS. We’ll be able to create the custom records for our CDN by creating a record set within the Azure DNS console.

Enabling the Azure CDN

We’re going to front our website with an Azure content delivery network (CDN) to improve speeds, reduce bandwidth usage of our bucket, and distribute our content to geographically distributed points of presence.

  1. Navigate to your storage account you created within the Azure portal.
  2. Under “Azure CDN”, create a new endpoint with the following specifications:
    • Create new CDN profile.
    • Name: Use a representative name (e.g. daneio)
    • Pricing: Standard Microsoft.
    • CDN Endpoint Name: Use a representative name (e.g. daneio.azureedge.net). Note: this must be globally unique within Azure.
    • Origin hostname: Grab the name from the “Primary Static Website Endpoint” under the Properties tab. (Example: https://daneio.z5.web.core.windows.net)
    • Click create.
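
For the CLI-inclined, a sketch of the same setup (substitute your own names and the origin hostname from your Properties tab):

az cdn profile create --name daneio --resource-group blog_dane_io --sku Standard_Microsoft
az cdn endpoint create --name daneio --profile-name daneio --resource-group blog_dane_io \
  --origin daneio.z5.web.core.windows.net \
  --origin-host-header daneio.z5.web.core.windows.net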

[Image: Custom origin information.]

Next, we’re going to configure our DNS record to point to the CDN endpoint that we specified above (e.g. https://daneio.azureedge.net).

  1. Navigate to the DNS Zone you created within the Azure portal.
  2. Create a new record set with the following specifications:
    • Name: blog (the record is created inside the dane.io zone, yielding blog.dane.io; use your own subdomain label)
    • Type: CNAME
    • TTL: 1 hour
    • Alias: daneio.azureedge.net.

This will point any requests for our subdomain (e.g. blog.dane.io) at the Azure CDN endpoint.
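
The equivalent record can also be created from the CLI (zone and record names follow the example above):

az network dns record-set cname set-record --resource-group blog_dane_io \
  --zone-name dane.io --record-set-name blog --cname daneio.azureedge.net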

[Image: Successfully validated DNS record.]

Once the record has been created, we’ll need to associate the domain with our Azure CDN endpoint.

  1. Navigate to the Azure CDN endpoint you created.
  2. Under “Custom Domains”, add a new custom domain with the following specifications:
    • Endpoint hostname: daneio.azureedge.net (or whatever your Azure CDN endpoint is)
    • Custom hostname: blog.dane.io (or whatever you used for the domain above)
  3. Click add.
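
Or, from the CLI (blog-dane-io is just an arbitrary resource name I’ve picked for the custom domain):

az cdn custom-domain create --resource-group blog_dane_io --profile-name daneio \
  --endpoint-name daneio --name blog-dane-io --hostname blog.dane.io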

Enabling TLS Encryption

Once we have confirmed the DNS record, we can have Azure provision and manage a digital certificate for us.

  1. Navigate to the Azure CDN endpoint you created.
  2. Under “Custom Domains”, select the custom domain you added.
  3. Enable custom domain HTTPS with a CDN-managed certificate.
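
The CLI can kick this off as well, using the custom domain resource name from the previous step:

az cdn custom-domain enable-https --resource-group blog_dane_io --profile-name daneio \
  --endpoint-name daneio --name blog-dane-io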

Once this has been kicked off, it may take a few hours for the TLS certificate to be provisioned.

[Image: Successfully issued TLS certificate.]

Configuring CDN Compression

Next, we’re going to ensure that compression is enabled for content delivered via the Azure CDN. While images are likely already compressed, we can save some bandwidth and improve delivery speed by compressing other MIME formats.

  1. Navigate to the Azure CDN endpoint you created.
  2. Under “Compression” ensure that Compression is enabled.

By default, fonts, XML, plaintext, CSV, HTML, and other MIME formats will be compressed. You may add additional MIME types to this list to enable on-the-fly compression for them.

[Image: MIME types compressed during CDN delivery.]

Configuring CDN Cache

Next, we’re going to configure our CDN cache. This is especially important as assets cached via the CDN will be retained until the time-to-live (TTL) expires. If we fail to configure reasonable caching, updates to our website will be painful.

By default, Azure storage accounts set a cache on a per-object basis with a default of 7 days. While this is fine for static content (e.g. image assets, fonts), it will be a very poor experience for updates to HTML pages. While we could set the TTL for each object individually as we add it to the bucket, there is a really lazy way to solve this problem.

We’ll set the general CDN caching rule for our CDN:

  1. Navigate to the Azure CDN endpoint you created.
  2. Under “Caching rules” ensure that the query string caching behavior is set to ignore query strings.

[Image: Default caching behavior.]

Next, we’ll create some custom cache rules using the rules engine. Our goal will be as follows:

  • Set the default TTL to a short-lived value (e.g. 5 minutes) for all assets. This allows for quick updates when critical files (e.g. HTML) change.
  • Explicitly override the TTL with a long-lived value (e.g. 7 days) for static assets (e.g. images, fonts, CSS).

Managing the cache in this way ensures that we can centrally adjust values instead of setting them on a per-object basis in the storage account.

To do so, perform the following:

  1. Navigate to the Azure CDN endpoint you created.
  2. Navigate to “Rules Engine”.
  3. Under the global rule click add action and add the following:
    • Cache expiration: Override with a 5 minute TTL.
  4. Click add rule and add the following:
    • Name: CacheControl
    • Logic: If URL path contains ‘/assets/’ (to lowercase), then override the cache expiration with a 7 day TTL.

In this instance, I have configured two specific rules:

  • Rule 1 (Global): Default assets get a 5 minute TTL.
  • Rule 2 (CacheControl): Any file delivered out of the /assets/ folder is given a 7 day TTL. We’ll use this folder for images, javascript, CSS, etc.

This combination of short and long TTLs ensures that our CDN is only delivering compressed text (e.g. HTML, CSS) on a frequent basis, but all large and static assets (e.g. images, gifs, fonts) are cached. When we make production changes to our website, it takes around 5 minutes for the HTML CDN cache to expire and be refreshed, making it a seamless user browsing experience.
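
If you want to script the rules engine instead of clicking through the portal, az cdn endpoint rule can manage both rules. This is only a sketch under the example names above; the duration format is [d.]hh:mm:ss, and the rules engine flags have shifted between CLI versions, so verify against az cdn endpoint rule --help:

# Global rule: override the default cache expiration with a 5 minute TTL
az cdn endpoint rule action add --resource-group blog_dane_io --profile-name daneio \
  --name daneio --rule-name Global \
  --action-name CacheExpiration --cache-behavior Override --cache-duration 00:05:00

# CacheControl rule: 7 day TTL for anything under /assets/
az cdn endpoint rule add --resource-group blog_dane_io --profile-name daneio \
  --name daneio --rule-name CacheControl --order 1 \
  --match-variable UrlPath --operator Contains --match-values '/assets/' \
  --action-name CacheExpiration --cache-behavior Override --cache-duration 7.00:00:00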

[Image: Custom caching behavior rules.]

Adding Basic Security Features

Note: Due to issues with the Microsoft CDN and Twitter card support, I switched over to Standard Akamai. Unfortunately, the Akamai CDN does not allow custom header manipulation. As such, I’m leaving this documentation for those who might still need to use the Microsoft CDN.

Next, we’ll want to configure a few basic security features. While many of these are not strictly necessary given the static nature of the website, they’re fairly trivial to add and deploy, so we’ll do it for completeness’ sake.

We’ll start with HTTPS redirection. This is important as, without it, the CDN will happily serve content over plain HTTP.

To do so, perform the following:

  1. Navigate to the Azure CDN endpoint you created.
  2. Navigate to “Rules Engine”.
  3. Click add rule and add the following:
    • Name: EnforceHTTPS
    • Logic: If request protocol equals HTTP, then URL redirect found (302) to protocol HTTPS.

[Image: Custom EnforceHTTPS rule.]
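
The same redirect can be expressed through the CLI’s rules engine support (again, a sketch; verify the flags against your CLI version):

az cdn endpoint rule add --resource-group blog_dane_io --profile-name daneio \
  --name daneio --rule-name EnforceHTTPS --order 2 \
  --match-variable RequestProtocol --operator Equal --match-values HTTP \
  --action-name UrlRedirect --redirect-type Found --redirect-protocol Https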

Next, we’ll configure HSTS and a Content Security Policy for the website. HSTS ensures that browsers will only connect to the website over HTTPS, and the CSP will help prevent cross site scripting (XSS), as much of a rarity as that might be.

To do so, perform the following:

  1. Navigate to the Azure CDN endpoint you created.
  2. Navigate to “Rules Engine”.
  3. Under the global rule click add action and add the following:
    • And modify response header: Append Strict-Transport-Security with value max-age=315360000; preload.
    • Click add action again.
    • And modify response header: Append Content-Security-Policy with the value default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'.
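
Scripted, each header becomes one ModifyResponseHeader action appended to the global rule (a sketch with the same caveats as the caching rules above):

az cdn endpoint rule action add --resource-group blog_dane_io --profile-name daneio \
  --name daneio --rule-name Global \
  --action-name ModifyResponseHeader --header-action Append \
  --header-name Strict-Transport-Security --header-value "max-age=315360000; preload"

az cdn endpoint rule action add --resource-group blog_dane_io --profile-name daneio \
  --name daneio --rule-name Global \
  --action-name ModifyResponseHeader --header-action Append \
  --header-name Content-Security-Policy \
  --header-value "default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'"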

[Image: HSTS and CSP configured.]

Next, we’ll prevent our site from being embedded on other websites (e.g. X-Frame-Options), prevent MIME sniffing (X-Content-Type-Options), and configure a referrer policy.

To do so, perform the following:

  1. Navigate to the Azure CDN endpoint you created.
  2. Navigate to “Rules Engine”.
  3. Click add rule and add the following:
    • Name: SecurityHeaders
    • Logic: If request method equals GET, then modify response header to append X-Content-Type-Options with value nosniff.
    • Click add action again.
    • Logic: Then modify response header to append Referrer-Policy with value strict-origin-when-cross-origin.
    • Click add action again.
    • Logic: Then modify response header to append X-Frame-Options with value DENY.

[Image: X-Frame-Options and HSTS.]

We’ll go ahead and do a quick scan via Security Headers and validate things look good:

[Image: While not an A+, it’s good enough for Government work.]

Configuring CircleCI

Note: It’s really easy to spill secrets via CircleCI and GitHub. I highly recommend you keep your repository private to reduce the likelihood of accidental misconfiguration.

Hooking CircleCI to GitHub

Once we have built our website, configured the storage account, configured the Azure CDN, and have a valid TLS certificate, we’re ready to hook everything together. We’ll first configure a CircleCI project for our GitHub repository:

  1. Authenticate to CircleCI using your GitHub account.
  2. Click Set Up Project for your website repository. Ignore the suggested CircleCI YAML file for now; we’ll add our own later.
  3. Under your new project in CircleCI, navigate to Advanced Settings. Change the following:
    • Build Forked Pull Requests: Disable this.
    • Pass Secrets to Builds From Forked Pull Requests: Disable this.

CircleCI now has a deploy key from the GitHub repository, and we’ve disabled building of forked pull requests. This is especially important if your repository is public, as adversaries can potentially steal secrets from environmental variables in your CircleCI node if these settings are enabled.

[Image: Also known as the “wreck my world” buttons.]

Next, we’re going to go grab some credentials for our storage account in Azure:

  1. Navigate to the Azure storage account you created.
  2. Navigate to “Access Keys”.
  3. Copy the key specified under key1.

This is your access key. Keep it safe; anyone with access to this key will be able to do whatever they’d like to your storage account. We’re going to go ahead and give it to CircleCI so it’ll be able to modify the bucket (and pray CircleCI never has a breach).

To do so, perform the following:

  1. Navigate to the settings for your project in CircleCI.
  2. Under Build Settings, navigate to Environment Variables.
  3. Add the following two environment variables:
    • Name: AZURE_STORAGE_ACCOUNT with value daneio (or whatever your bucket name is.)
    • Name: AZURE_STORAGE_KEY with value <paste your key here>.

[Image: Using environment variables keeps credentials out of files in your repository.]
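
Before wiring these into CI, you can sanity-check the credentials locally; the az storage commands read these same environment variables automatically:

export AZURE_STORAGE_ACCOUNT=daneio
export AZURE_STORAGE_KEY=<paste your key here>
az storage blob sync --source ./_site --container '$web'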

Generating a CircleCI YAML File

Now we have CircleCI configured and ready to rock. The last step here will be generating a CircleCI YAML file for controlling when to build containers with our code. This is part art-form, part science, and may take a few (dozen) tries to get it right. I’ve included a copy of my current config.yml file below, which I’ll explain in further detail. Whether you use mine, grab a premade one, or make your own, you’ll need to throw it in your GitHub repository as .circleci/config.yml.

version: 2
jobs:
  build:
    docker:
      - image: circleci/ruby:latest
    working_directory: ~/repo
    steps:
      - checkout
      - restore_cache:
          keys:
            - rubygems-v2-{{ checksum "Gemfile.lock" }}
            - rubygems-v2-fallback
      - run:
          name: Install Dependencies
          command: |
            bundle install --jobs=4 --retry=3 --path vendor/bundle && bundle clean
      - save_cache:
          key: rubygems-v2-{{ checksum "Gemfile.lock" }}
          paths:
            - vendor/bundle
      - run:
          name: Jekyll build
          command: bundle exec jekyll build
      - run:
          name: HTMLProofer tests
          command: |
            bundle exec htmlproofer ./_site \
            --allow-missing-href \
            --allow-hash-href \
            --check-favicon \
            --check-html \
            --disable-external \
            --only-4xx
      - run:
          name: Cleanup filters
          command: |
            rm -f gulpfile.js jekyll-theme-clean-blog.gemspec LICENSE README.md package-lock.json package.json
      - persist_to_workspace:
          root: ./
          paths:
            - _site
  deploy:
    docker:
      - image: circleci/python:latest
    working_directory: ~/repo
    steps:
      - attach_workspace:
          at: ./
      - run:
          name: Install Azure CLI
          command: curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
      - run:
          name: Upload to Azure bucket
          command: az storage blob sync --source ./_site --container='$web'
workflows:
  version: 2
  Production Deployment:
    jobs:
      - build
      - deploy:
          requires:
            - build
          filters:
            branches:
              only: master

This YAML file is configured with two specific jobs: build and deploy.

The build job runs against every commit to the repository and performs the following:

  • A new Linux container is spun up using one of Circle’s ruby images.
  • (Optional) Some caching shenanigans are used to speed up deploys.
  • The GitHub repository is checked out using the SSH deploy key.
  • (Optional) The ruby bundle for my theme is installed and cleaned.
  • Jekyll runs and builds the website. The output is saved to the _site folder locally within the CircleCI container.
  • HTMLProofer runs against the output website and looks for broken links, errors, etc.
  • Some junk files are deleted since I don’t want them hanging out in my webroot.
  • The output of the _site folder is saved for later use by the deploy job.

The deploy job only runs against changes to the master branch and performs the following:

  • A new Linux container is spun up using one of Circle’s python images.
  • We re-attach the build workspace containing the contents of the _site folder.
  • We install the AzureCLI tool.
  • We use the AzureCLI tool to synchronize the storage account with the files we have locally.

Configuring GitHub Checks

The final step of this project is configuring branch protection and status checks for our GitHub repository. This will force us to use pull requests for merging to master, and force successful CircleCI builds as part of that pull request. This will hopefully prevent us from pushing something broken into production by relying on our CircleCI build jobs as a gate.

To do so, perform the following:

  1. Navigate to your GitHub repository and go to Settings -> Branches.
    • Enforce branch protection for master.
    • Enable Require status checks to pass before merging.
    • Enable Require branches to be up to date before merging.
    • Select the status check for ci/circleci: build as required.
    • Select include administrators to force compliance for yourself.

[Image: Branch protections requiring a build CI Job to pass.]

Example CircleCI Workflow

When properly configured, every commit to our website will automatically trigger the build job and identify any Jekyll or HTML issues. When we feel comfortable with the final results and merge to the master branch, the deploy job will execute, updating our production website. The synchronize command manages all of our file uploads and deletes, making this CI job rather trivial to maintain.

To perform an update to the website, simply:

  1. Commit all changes to a new branch.
  2. Open a pull request to master from your branch and wait for the CircleCI build job to pass.
  3. Merge to master and the deploy CircleCI job will run. You’re done.

[Image: Pull requests need to pass a CI job to deploy.]

[Image: Successful build and deploy.]

Pricing

Lastly, how much does all of this cost? Well, so far, it’s cost about $0.25 for a handful of days. Most of the costs incurred have been from figuring out the services and experimenting with CircleCI jobs (e.g. syncing lots of files to storage).

I anticipate that (a) it will typically cost between $10 and $15 per month, and (b) some clown will likely decide to try and drive up the costs substantially through malicious abuse.

Luckily, you can set a spending limit on Azure subscriptions to prevent costs from going through the roof. We’ll see how this shakes out after a month or two of operation.

[Image: Temporary pricing chart.]

Further Reading and Acknowledgements

  • BlackRock Digital for my Jekyll theme.
  • Scott Helme has a great website that discusses HTTP security headers.
  • Jessica Deen for helping solve some CDN/Twitter shenanigans.
  • CircleCI, Jekyll, and Azure documentation were stellar and assisted quite a bit on this project.
  • I cobbled my CircleCI yaml file together from quite a few sources which I neglected to document. Thanks to whoever you were.