
My Server Setup Part 4: Continuous Delivery

In this series of posts I am describing the infrastructure that hosts all of my websites and development projects.


GitHub

This is where I’ve chosen to keep the source for my projects. Each service has its own repository, and the common feature is a Dockerfile in the root that builds the deployable image. No other build steps are allowed; everything needs to happen in the Dockerfile for this process to work.

What this means in practice is that most of the Dockerfiles are multi-stage builds, producing the smallest possible image containing only the files required to run the task.
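As an illustration, a minimal multi-stage Dockerfile for a Go service might look something like this; the Go version, source layout, and binary name are placeholders rather than anything taken from my actual repositories:

# Build stage: compile a static binary using the full Go toolchain.
FROM golang:1.20 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /service .

# Final stage: start from an empty image and copy in only the binary.
FROM scratch
COPY --from=build /service /service
ENTRYPOINT ["/service"]

The final image contains nothing but the compiled binary, which is exactly what the deployment process needs.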

Automatic tags

The rest of the automated build and deployment process is triggered by the creation of a new tag. To automate that as well, I use a GitHub Action that creates a tag whenever the main/master branch receives a merge.

name: auto_tag
on:
  push:
    branches:
      - main
jobs:
  tag_repo:
    name: Tag the main branch with the current date and time
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Get datetime
        id: date
        run: echo "::set-output name=date::$(date +'%Y%m%d%H%M%S')"
      - name: Tag and push
        run: |
          git tag "${{ steps.date.outputs.date }}"
          git push origin "${{ steps.date.outputs.date }}"          

This is really simple so I won’t describe it in detail.

Example tag: 20230103082900

DockerHub

I have a paid account at DockerHub so I can have private repositories. For each GitHub repository I want deployed to the cluster I have a DockerHub repository pointed at the GitHub repository.

The build settings are set to watch for new tags consisting only of numbers, i.e. the tags created by the action above. When it detects a new tag it kicks off a build and tags the resulting image release-{tag}. This process is pretty seamless and rarely fails, but when it does DockerHub emails me the details.
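The build rule itself is roughly the following, assuming I’m remembering DockerHub’s placeholder syntax correctly; {sourceref} expands to the matched source tag:

Source type: Tag
Source:      /^[0-9]+$/
Docker tag:  release-{sourceref}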

Nomad job definitions

Noman is the next part of the process, but before we cover it it’s important to understand how the Nomad jobs are defined in the Terraform files, and specifically where the image tags come from.

Nomad has support for consul-template built in. This has several useful features. The first is that it allows us to define a file template within the job that will be immediately read back into the evaluation engine as environment variables. The second is that Nomad will then watch any Consul keys being used for changes, and will trigger a re-evaluation should any of the values change.

Finally, we can use an environment variable as the image tag for the job. Hopefully you can see where this is going, but I will spell it out so it’s clear.

The template in the job:

template {
  data = <<EOF
{{ range ls "configs/stut.dev" }}
{{.Key}}={{.Value}}
{{ end }}
EOF
  destination = "local/env"
  env         = true
  change_mode = "restart"
}

The data here converts every key under a particular Consul prefix into environment variables. For CD purposes we only need the one named IMAGE_TAG, but we grab them all to be a bit more flexible.
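As an example, the rendered local/env file for this site might contain something like the following; IMAGE_TAG follows the release-{tag} naming from DockerHub, and the second key is made up purely for illustration:

IMAGE_TAG=release-20230103082900
LOG_LEVEL=info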

The change_mode specifies that if any of the values change this job should be restarted.

In the task config we then have this:

image = "stut/stut.dev:$${IMAGE_TAG}"

This uses the IMAGE_TAG environment variable to specify the image tag. The doubled dollar sign is just Terraform escaping: it stops Terraform from trying to interpolate ${IMAGE_TAG} itself and passes it through to Nomad untouched.
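Putting the two pieces together, a stripped-down version of the task definition looks roughly like this; the task name is invented and everything not relevant to the image tag has been elided:

task "web" {
  driver = "docker"

  template {
    data = <<EOF
{{ range ls "configs/stut.dev" }}
{{.Key}}={{.Value}}
{{ end }}
EOF
    destination = "local/env"
    env         = true
    change_mode = "restart"
  }

  config {
    image = "stut/stut.dev:$${IMAGE_TAG}"
  }
}

You should now be able to guess what the Noman service is doing.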

Noman

It’s a tiny Go service that presents a public HTTPS endpoint. Each service has a secret key which gives it a unique URL at that endpoint, and the DockerHub repo has that URL configured as a webhook.

When a new image has been successfully built the webhook passes the details through to Noman. Noman takes that information and updates the corresponding IMAGE_TAG key in Consul which triggers Nomad to restart the job. The restarted job will use the new image tag.
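The core of it is little more than an HTTP handler and a Consul KV write. A rough sketch of the idea in Go might look like the following; the payload fields are the ones I believe DockerHub sends in its webhooks, the Consul key follows the configs/stut.dev example above, and the secret path and port are made up, so treat this as an outline rather than the actual Noman source:

package main

import (
	"encoding/json"
	"log"
	"net/http"

	consul "github.com/hashicorp/consul/api"
)

// dockerHubPush is the subset of the DockerHub webhook payload we need.
type dockerHubPush struct {
	PushData struct {
		Tag string `json:"tag"`
	} `json:"push_data"`
}

func main() {
	// Connect to the local Consul agent using the default configuration.
	client, err := consul.NewClient(consul.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// In the real service the secret key in the URL identifies the service;
	// here both the secret and the Consul key are hard-coded for illustration.
	http.HandleFunc("/hook/some-secret-key", func(w http.ResponseWriter, r *http.Request) {
		var payload dockerHubPush
		if err := json.NewDecoder(r.Body).Decode(&payload); err != nil || payload.PushData.Tag == "" {
			http.Error(w, "bad payload", http.StatusBadRequest)
			return
		}

		// Update the tag in Consul; Nomad's template watch takes care of the rest.
		kv := &consul.KVPair{
			Key:   "configs/stut.dev/IMAGE_TAG",
			Value: []byte(payload.PushData.Tag),
		}
		if _, err := client.KV().Put(kv, nil); err != nil {
			http.Error(w, "consul write failed", http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
	})

	// TLS termination is handled elsewhere in reality; plain HTTP keeps the sketch short.
	log.Fatal(http.ListenAndServe(":8080", nil))
}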

Outstanding issues

There is one major issue with this setup. The change_mode in the template is set to restart, which causes the task to restart. This is great, except that every instance of the task across the cluster restarts at more or less the same time. This bypasses the normal update specification which, in my case, is set to do a rolling upgrade so the task never goes offline. Currently this results in around 10 seconds of downtime whenever a new version is deployed.

I’ve yet to find a solution to this, but please hit me up if you have one.

Conclusion

From a commit or merge into the main/master branch on the GitHub repository, the rest of the process is completely automated and each part of it will notify me if something goes wrong.

As a conclusion to this series of posts I can happily report that I’m very pleased with what I have here and have found that it rarely falls short of what I need it to be. I’m sure over time my requirements will evolve but I’m confident that these foundations will be able to keep up.