Patch Your EC2 Instances Automatically using Systems Manager and Terraform

Shadab Ambat
11 min read · Dec 5, 2023


Photo by Scott Rodgerson on Unsplash

Overview

It’s nearing the end of the year and you know what that means. A new article!

This article is part of Akatsuki Games’ Advent Calendar 2023 series. In case you’re interested, the previous article in the series was ‘A New Year’s gift lottery machine using Raspberry Pi Pico’ by Eichi Ito (it’s in Japanese, so please use auto-translate!).

Monitoring your instances for security flaws caused by outdated binaries, and keeping them patched and up to date, is very important. Linux already has several built-in tools for this, and many of you probably run scheduled cron tasks to handle it already.

Today we’ll be exploring AWS Patch Manager, which lets us automate patching using the AWS ecosystem. This allows us to integrate with other AWS products such as CloudWatch and SNS for extra flexibility, and lets you target entire instance fleets without having to modify the instances directly.

As always, let’s skip the filler and jump right into the main story!

What we’ll do

  • We’ll create an EC2 instance and use AWS SSM (Systems Manager) Patch Manager to automatically apply security patches to the instance on a schedule
  • We’ll have a mechanism to only target specific instances
  • We’ll send the outputs to Cloudwatch for auditing and debugging purposes
  • We’ll also auto-update the SSM agent
  • We’ll create the entire infrastructure in Terraform so that it’s reproducible and easy to maintain in the future

Prerequisites

Before we start, here are some prerequisites for those following along.

Knowledge

  • Knowledge of Terraform
  • Working knowledge of AWS (experience with any other cloud provider is fine too)
  • Bonus - an open mind :)

Tools

  • Terraform
  • An AWS account and AWS CLI
  • Your favourite IDE or a Text Editor

Note that you’ll be charged for any resources you create! So if you’re following along, please remember to delete unneeded resources later!

If you want to skip all the explanation and just see the code, the complete project can be found on GitHub

I’ll be using local state, but the GitHub project contains a backend.tf file that you can uncomment and modify to store your state in S3 with state locking.
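If you’d rather sketch it yourself, a minimal S3 backend with DynamoDB state locking looks something like this (the bucket and table names below are placeholders, not the ones from the GitHub project):

# backend.tf (placeholder names - replace with your own bucket and table)
terraform {
  backend "s3" {
    bucket         = "your-terraform-state-bucket"
    key            = "aws-autopatching-terraform/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "your-terraform-lock-table"
    encrypt        = true
  }
}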

Let’s start!

Setup

I’ve already explained setting up Terraform and the AWS CLI in my previous articles, but I’ll summarize the steps again in case this is your first time.

You can skip this section if you already have Terraform and the AWS CLI set up.

Terraform

I’ll be using tfenv to manage Terraform installations. It’s a Terraform version manager that lets you easily install and switch between Terraform versions, so I highly recommend it. It’s not required, however, so if you don’t want to use it, please install Terraform directly.

I specify the Terraform version I want to use in the .terraform-version file, and tfenv automatically installs it and sets it as the default for the project.

# .terraform-version file
1.6.5

$ terraform --version
Terraform v1.6.5

AWS CLI

Install the latest AWS CLI and configure your credentials

$ aws configure

You’ll be asked for your AWS Access Key ID and Secret Access Key, which you can create and manage in the IAM console.

Verify that your credentials are set up correctly.

$ aws sts get-caller-identity
{
    "Account": "123456789012",
    "UserId": "AR#####:#####",
    "Arn": "arn:aws:sts::123456789012:assumed-role/role-name/role-session-name"
}

If you get an error at this point, your credentials are not configured correctly, so I’d recommend checking out the AWS documentation for more information.

Providers and Versions

Create a directory to store all your Terraform files (I’ll refer to this as the project directory from now on), then create a providers.tf file and add the Terraform and provider version constraints.

terraform {
  required_version = "~> 1.0"

  required_providers {
    aws = "~> 5.0"
  }
}

Next let’s initialise Terraform inside the project directory.

$ cd ~/examples/aws-autopatching-terraform
$ terraform init

Initializing the backend...

Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 5.0"...
- Installing hashicorp/aws v5.29.0...
- Installed hashicorp/aws v5.29.0 (signed by HashiCorp)

Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

Create a variables.tf file for all our project variables

variable "project" {
description = "The Project name to use for naming and tagging the resources"
}

variable "region" {
description = "The default region to deploy to"
}

variable "profile" {
description = "The AWS profile to use for deployment"
}
variable "public_key" {
description = "The public SSH key to use for the EC2 instance keypair"
}

Let’s define the values for the variables. Create a terraform.tfvars.json file for this

{
  "project": "patch-example",
  "region": "us-east-1",
  "profile": "default",
  "public_key": "<YOUR_PUBLIC_KEY>"
}

Make sure to replace the value of the public_key variable with your own public key, which will be attached to the EC2 instance as an AWS key pair.

Also feel free to change the other variable values, like the region, as needed. Alternatively, you can use Session Manager to access the instance more securely.
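One thing the snippets above don’t show is the provider configuration that consumes the region and profile variables. A minimal version, added to providers.tf, would look like this (the published project may differ slightly):

# Wire the project variables into the AWS provider
provider "aws" {
  region  = var.region
  profile = var.profile
}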

IAM Roles and Policies

As with most AWS resources, you’ll need certain permissions/policies in order to allow SSM to apply patches to the EC2 instance.

This is usually the most tedious part IMHO. But since there’s no way out of it, let’s go ahead and set this up first :P

Create an iam.tf file with the following policies

data "aws_iam_policy_document" "ec2-assume-role-policy-doc" {
statement {
sid = "AssumeServiceRole"
actions = ["sts:AssumeRole"]
effect = "Allow"

principals {
type = "Service"
identifiers = ["ec2.amazonaws.com"]
}
}
}

resource "aws_iam_role" "patch-example" {
name = "PatchExampleRole"
description = "Custom role for the Patch Example server"
assume_role_policy = data.aws_iam_policy_document.ec2-assume-role-policy-doc.json

tags = {
Project = var.project
}
}

# To attach to the EC2 instance
resource "aws_iam_instance_profile" "patch-example" {
name = "PatchExampleProfile"
role = aws_iam_role.patch-example.name
}

resource "aws_iam_role_policy_attachment" "patch-example-ssm-maintenance-window" {
role = aws_iam_role.patch-example.id
policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonSSMMaintenanceWindowRole"
}

resource "aws_iam_role_policy_attachment" "patch-example-ssm-managed-instance-core" {
role = aws_iam_role.patch-example.id
policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

data "aws_iam_policy_document" "ssm-send-command-policy-doc" {
statement {
actions = ["ssm:SendCommand"]
effect = "Allow"
resources = ["arn:aws:ssm:us-east-1::document/AWS-RunRemoteScript"]
}
}

resource "aws_iam_role_policy" "patch-example-ssm-send-command-role-policy" {
name = "PatchExampleSsmSendCommandRolePolicy"
role = aws_iam_role.patch-example.id
policy = data.aws_iam_policy_document.ssm-send-command-policy-doc.json
}

# Allows running SSM remote commands on EC2 instances
data "aws_iam_role" "aws-service-role-for-amazon-ssm" {
name = "AWSServiceRoleForAmazonSSM"
}

We need to attach a few SSM-related policies to the EC2 instance itself. I’m attaching these to the aws_iam_role.patch-example resource, which I’ll later attach to the EC2 instance as the PatchExampleProfile instance profile.

We’re also looking up the built-in AWSServiceRoleForAmazonSSM role, which makes life easier. We’ll attach this to the SSM maintenance window task later.

Run terraform plan, and if everything looks OK you can go ahead and run terraform apply as well. Otherwise you can skip this and all subsequent terraform apply steps.

EC2 Resources

Now let’s create the EC2 instance that we’ll target for autopatching

Create an instances.tf file in the project directory

locals {
  patching = {
    amazon_linux = {
      tag         = "amazon-linux"
      description = "Security Patch Tag group to target Amazon Linux instances"
    }
  }
}

resource "aws_key_pair" "patch-example" {
  key_name   = "${var.project}-server-keypair"
  public_key = var.public_key
}

resource "aws_instance" "patch-example" {
  ami                         = "ami-012261b9035f8f938" # Amazon Linux 2023 AMI 2023.2.20231113.0 x86_64 HVM kernel-6.1
  instance_type               = "t2.micro"
  key_name                    = aws_key_pair.patch-example.key_name
  iam_instance_profile        = aws_iam_instance_profile.patch-example.name
  associate_public_ip_address = true
  disable_api_termination     = false
  monitoring                  = false

  root_block_device {
    volume_type           = "standard"
    volume_size           = 10
    encrypted             = true
    delete_on_termination = true
  }

  tags = {
    Name          = var.project
    Project       = var.project
    "Patch Group" = local.patching.amazon_linux.tag
    AutoPatch     = "true"
  }
}

I’m using Amazon Linux 2023 here, but you can use a different OS.
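Hardcoded AMI IDs also go stale over time. If you’d rather always launch the latest Amazon Linux 2023 image, one option is to resolve it from the public SSM parameter that AWS publishes (a sketch; double-check the parameter path for your architecture and region):

# Resolve the latest Amazon Linux 2023 AMI from AWS's public SSM parameter
data "aws_ssm_parameter" "al2023" {
  name = "/aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64"
}

# Then, in aws_instance.patch-example:
#   ami = data.aws_ssm_parameter.al2023.value

Keep in mind that pinning a specific AMI ID (as this article does) keeps the plan reproducible, while the lookup trades that for freshness and will replace the instance whenever a new AMI is published.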

You’ll notice that I’m adding a couple of special tags here

  • “Patch Group” = local.patching.amazon_linux.tag, we’ll use this to target this EC2 instance for autopatching
  • AutoPatch = “true”, we’ll use this to auto-update the SSM agent later (this is optional but it helps keep the SSM agent updated)
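Since we attached a keypair, you may also want to SSH in later to check the patch level yourself. The project doesn’t define any outputs, but a small one like this saves a trip to the console:

# Optional convenience output (not in the original project)
output "patch_example_public_ip" {
  description = "Public IP of the patch-example instance"
  value       = aws_instance.patch-example.public_ip
}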

Now let’s run terraform plan to verify there are no errors

$ terraform plan

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # aws_instance.patch-example will be created
  + resource "aws_instance" "patch-example" {
      + ami                                  = "ami-012261b9035f8f938"
      + arn                                  = (known after apply)
      + associate_public_ip_address          = true
      + availability_zone                    = (known after apply)
      + cpu_core_count                       = (known after apply)
      + cpu_threads_per_core                 = (known after apply)
      + disable_api_stop                     = (known after apply)
      + disable_api_termination              = false
      + ebs_optimized                        = (known after apply)
      + get_password_data                    = false
      + host_id                              = (known after apply)
      + host_resource_group_arn              = (known after apply)
      + iam_instance_profile                 = (known after apply)
      + id                                   = (known after apply)
      + instance_initiated_shutdown_behavior = (known after apply)
      + instance_lifecycle                   = (known after apply)
      + instance_state                       = (known after apply)
      + instance_type                        = "t2.micro"
      + ipv6_address_count                   = (known after apply)
      + ipv6_addresses                       = (known after apply)
      + key_name                             = "patch-example-server-keypair"
      + monitoring                           = false
      + outpost_arn                          = (known after apply)
      + password_data                        = (known after apply)
      + placement_group                      = (known after apply)
      + placement_partition_number           = (known after apply)
      + primary_network_interface_id         = (known after apply)
      + private_dns                          = (known after apply)
      + private_ip                           = (known after apply)
      + public_dns                           = (known after apply)
      + public_ip                            = (known after apply)
      + secondary_private_ips                = (known after apply)
      + security_groups                      = (known after apply)
      + source_dest_check                    = true
      + spot_instance_request_id             = (known after apply)
      + subnet_id                            = (known after apply)
      + tags                                 = {
          + "AutoPatch"   = "true"
          + "Name"        = "patch-example"
          + "Patch Group" = "amazon-linux"
          + "Project"     = "patch-example"
        }
      + tags_all                             = {
          + "AutoPatch"   = "true"
          + "Name"        = "patch-example"
          + "Patch Group" = "amazon-linux"
          + "Project"     = "patch-example"
        }
      + tenancy                              = (known after apply)
      + user_data                            = (known after apply)
      + user_data_base64                     = (known after apply)
      + user_data_replace_on_change          = false
      + vpc_security_group_ids               = (known after apply)

      + root_block_device {
          + delete_on_termination = true
          + device_name           = (known after apply)
          + encrypted             = true
          + iops                  = (known after apply)
          + kms_key_id            = (known after apply)
          + throughput            = (known after apply)
          + volume_id             = (known after apply)
          + volume_size           = 10
          + volume_type           = "standard"
        }
    }

  # aws_key_pair.patch-example will be created
  + resource "aws_key_pair" "patch-example" {
      + arn             = (known after apply)
      + fingerprint     = (known after apply)
      + id              = (known after apply)
      + key_name        = "patch-example-server-keypair"
      + key_name_prefix = (known after apply)
      + key_pair_id     = (known after apply)
      + key_type        = (known after apply)
      + public_key      = "<YOUR_PUBLIC_KEY>"
      + tags_all        = (known after apply)
    }

Plan: 2 to add, 0 to change, 0 to destroy.

The EC2 instance and its corresponding keypair will be created.

$ terraform apply
An execution plan has been generated and is shown below
...
...
Plan: 2 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value:

AutoPatch Resources

And finally, let’s create the SSM patch-related resources.

Create a new patch.tf file

# Auto Patch all targeted instances running Amazon Linux 2023
resource "aws_ssm_patch_baseline" "patch-example" {
  name             = "patch-example-baseline"
  description      = "Amazon Linux 2023 Patch Baseline"
  operating_system = "AMAZON_LINUX_2023"

  approval_rule {
    enable_non_security = false # Set to true to install non-security updates
    approve_after_days  = 7

    patch_filter {
      key    = "CLASSIFICATION"
      values = ["*"]
    }
  }
}

resource "aws_ssm_patch_group" "patch-example" {
  baseline_id = aws_ssm_patch_baseline.patch-example.id
  patch_group = local.patching.amazon_linux.tag
}

resource "aws_ssm_maintenance_window" "patch-example" {
  name        = "patch-example-maintenance-install"
  schedule    = "cron(0 0 ? * SUN *)" # Every Sunday at 12 AM UTC
  description = local.patching.amazon_linux.description
  duration    = 3
  cutoff      = 1
}

resource "aws_ssm_maintenance_window_target" "patch-example" {
  window_id     = aws_ssm_maintenance_window.patch-example.id
  resource_type = "INSTANCE"
  description   = local.patching.amazon_linux.description

  targets {
    key    = "tag:Patch Group"
    values = [local.patching.amazon_linux.tag]
  }
}

resource "aws_ssm_maintenance_window_task" "patch-example" {
  window_id        = aws_ssm_maintenance_window.patch-example.id
  description      = local.patching.amazon_linux.description
  task_type        = "RUN_COMMAND"
  task_arn         = "AWS-RunPatchBaseline"
  priority         = 1
  service_role_arn = data.aws_iam_role.aws-service-role-for-amazon-ssm.arn
  max_concurrency  = "100%"
  max_errors       = "100%"

  targets {
    key    = "WindowTargetIds"
    values = [aws_ssm_maintenance_window_target.patch-example.id]
  }

  task_invocation_parameters {
    run_command_parameters {
      comment          = "Amazon Linux 2023 Patch Baseline Install"
      document_version = "$LATEST"
      timeout_seconds  = 3600

      cloudwatch_config {
        cloudwatch_log_group_name = aws_cloudwatch_log_group.patch-example.id
        cloudwatch_output_enabled = true
      }

      parameter {
        name   = "Operation"
        values = ["Install"]
      }
    }
  }
}

resource "aws_cloudwatch_log_group" "patch-example" {
  name              = var.project
  retention_in_days = 7

  tags = {
    Project = var.project
  }
}

# Auto Update SSM agents on existing instances
resource "aws_ssm_association" "patch-example-ssm-agent-update" {
  name                = "AWS-UpdateSSMAgent"
  association_name    = "CustomAutoUpdateSSMAgent"
  schedule_expression = "cron(0 0 ? * SAT *)" # Every Saturday at 12 AM UTC
  max_concurrency     = "100%"
  max_errors          = "100%"

  targets {
    key    = "tag:AutoPatch"
    values = ["true"]
  }
}

  • We first define a Patch Baseline aws_ssm_patch_baseline - This defines which patches are approved for installation and when. In this example I’m only applying Amazon Linux 2023 security patches, but you can include non-security patches too (an example of tweaking the approval rule is sketched after this list)
  • Next we create a Patch Group aws_ssm_patch_group - This associates the Patch Baseline with a string tag that we’ll use to target specific instances
  • The maintenance window is defined next aws_ssm_maintenance_window - This defines the schedule to run the patch updates in SSM’s cron format. The duration (in hours) caps how long the window stays open, and the cutoff stops new tasks from starting in the final hour. It’s usually a good idea to run this during weekends or off-business hours
  • We then define the targets we want to apply this patch to aws_ssm_maintenance_window_target - If you only want to select a few EC2 instances, you can do it here. The key = “tag:Patch Group” is the key here (literally and figuratively :P); if you recall, we added the tag “Patch Group” = local.patching.amazon_linux.tag to the EC2 instance earlier precisely so we could target it
  • To tie it all together we define the SSM task aws_ssm_maintenance_window_task - This brings together all the previously defined SSM resources to run the patch on the targeted EC2 instance. I’m also sending the output to CloudWatch, which will help keep track of all the patches applied in the past. Note the log retention is currently set to 1 week, but you can customise it as needed
  • Finally, as a bonus, we create an SSM association aws_ssm_association - This keeps the SSM agent itself auto-updated. In this case I’m only targeting EC2 instances with the tag AutoPatch = true, since some instances might be using the SSM agent for something other than installing patches
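For example, if you wanted to auto-approve only high-severity security fixes rather than everything, you could tighten the approval_rule block inside aws_ssm_patch_baseline along these lines (a hypothetical variant, not what this project ships; check the Patch Manager docs for the CLASSIFICATION and SEVERITY values valid for your OS):

# Hypothetical stricter rule: only Critical/Important security updates
approval_rule {
  enable_non_security = false
  approve_after_days  = 3

  patch_filter {
    key    = "CLASSIFICATION"
    values = ["Security"]
  }

  patch_filter {
    key    = "SEVERITY"
    values = ["Critical", "Important"]
  }
}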

Let’s do a final terraform plan followed by terraform apply to create the Patch resources!

That’s it!

Verify and Cleanup

Your patch resources should be visible under AWS Systems Manager -> Patch Manager, where you’ll be able to view the patch install history, targets, links to the CloudWatch logs, etc.
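You can also check compliance from the CLI, for example (substituting your own instance ID; the patch group name is the tag value we set earlier):

$ aws ssm describe-instance-patch-states --instance-ids i-0123456789abcdef0
$ aws ssm describe-patch-group-state --patch-group amazon-linux

The first shows per-instance patch counts (installed, missing, failed), while the second summarises instance compliance for the whole patch group.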

And finally let’s cleanup!

$ terraform destroy

Summary

In this article we created an autopatching EC2 solution using AWS SSM Patch Manager and Terraform. The patches run on a predefined schedule, and the logs and past patch installations are all managed directly in AWS.

The source for this project can be found on GitHub.

Thanks and see you again in the next episode!
