How to sync containers from GitHub Container Registry to AWS ECS

Back in June last year I wrote about syncing containers from DockerHub to AWS ECR.

How to sync containers to AWS ECS the easy way

This was used to cache containers locally within an AWS Account that was running GitHub Actions Self-Hosted Runners.

While I still need this functionality, more containers are making their way into the GitHub Container Registry, and that requires a few tweaks to our CloudFormation stack.

This work was prompted by the following issue in the super-linter public repo

The solution

The solution works exactly as it did with DockerHub. However, to retrieve a container from a public GitHub repo you need to authenticate with a Personal Access Token (PAT) that has the read:packages permission.
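
As a rough sketch of that authentication flow, the snippet below mirrors what the buildspec further down does: swap the username and PAT for a short-lived pull token, then use the token to list tags. The function names are hypothetical; the environment variable names are the ones the stack passes to CodeBuild, and curl and jq are assumed to be installed.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical. GITHUB_CR, GITHUB_USER,
# GITHUB_PAT and IMAGE_REPO_NAME mirror the stack's environment variables.

# Build the token endpoint URL. $1 = registry host, $2 = image repo.
ghcr_token_url() {
  printf 'https://%s/token?scope=repository:%s:pull\n' "$1" "$2"
}

# Build the tag listing URL. $1 = registry host, $2 = image repo.
ghcr_tags_url() {
  printf 'https://%s/v2/%s/tags/list\n' "$1" "$2"
}

# Exchange the PAT for a pull token, then print one tag per line.
# Not called here; it needs network access and a valid PAT.
ghcr_list_tags() {
  token=$(curl -fsS -u "${GITHUB_USER}:${GITHUB_PAT}" \
    "$(ghcr_token_url "$GITHUB_CR" "$IMAGE_REPO_NAME")" | jq -r .token)
  curl -fsS -H "Authorization: Bearer ${token}" \
    "$(ghcr_tags_url "$GITHUB_CR" "$IMAGE_REPO_NAME")" | jq -r '.tags[]'
}

# Show the two URLs for the example repo used in this post.
ghcr_token_url ghcr.io super-linter/super-linter
ghcr_tags_url ghcr.io super-linter/super-linter
```

With GITHUB_CR, GITHUB_USER, GITHUB_PAT and IMAGE_REPO_NAME exported, calling ghcr_list_tags should print the tags available at the source.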

NOTE:- I had all sorts of fun with CodeBuild and colons, as you will see from the variable substitutions :)
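
The variable substitutions in question are the COLON variable in the buildspec. A likely culprit is YAML itself: inside an unquoted list item, a colon followed by a space is read as a key/value separator, so a command containing "Authorization: Bearer" can be parsed as a mapping instead of a string. The sketch below shows the same workaround using printf rather than python; the token value is a placeholder.

```shell
#!/usr/bin/env bash
# Build the colon at runtime so it never appears literally in the command.
# printf '\072' emits the character with octal code 072 (decimal 58), a colon.
COLON=$(printf '\072')

# Placeholder token, for illustration only.
TOKEN="example-token"

HEADER="Authorization${COLON} Bearer ${TOKEN}"
echo "$HEADER"
# prints: Authorization: Bearer example-token
```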

The CloudFormation Stack

The stack takes four parameters, the last of which has a default

  • ImageRepoName - GitHub Container Registry image repo name - super-linter/super-linter
  • GitHubUserName - Username of GitHub User - jonesy1234
  • GitHubPat - Personal Access Token of GitHub User. Only requires read:packages permission - ghp_**********
  • GitHubCr - GitHub Container Registry host. Defaults to ghcr.io
---
AWSTemplateFormatVersion: 2010-09-09
Description: Creates a CodeBuild project to sync containers from GitHub Container Registry

Parameters:

  ImageRepoName:
    Description: 'Target Image Repo, this name should be identical to the one hosted on GitHub i.e. super-linter/super-linter'
    Type: String

  GitHubUserName:
    Description: 'Username of GitHub User'
    Type: String

  GitHubPat:
    Description: 'Personal Access Token of GitHub User. Only requires read:packages permission'
    Type: String

  GitHubCr:
    Description: 'GitHub Container Registry host. Defaults to ghcr.io'
    Type: String
    Default: 'ghcr.io'

Resources:
  Repository:
    Type: AWS::ECR::Repository
    Properties:
      RepositoryName: !Ref ImageRepoName
      ImageScanningConfiguration:
        ScanOnPush: true

  CodeBuild:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Join
        - '-'
        - - !Ref 'AWS::StackName'
          - CodeBuild
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          -
            Action: 'sts:AssumeRole'
            Effect: Allow
            Principal:
              Service:
                - codebuild.amazonaws.com
      Policies:
        - PolicyName: ECR
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Action:
                  - ecr:BatchCheckLayerAvailability
                  - ecr:CompleteLayerUpload
                  - ecr:GetAuthorizationToken
                  - ecr:InitiateLayerUpload
                  - ecr:PutImage
                  - ecr:UploadLayerPart
                  - ecr:ListImages
                Effect: Allow
                Resource: '*'
        - PolicyName: LogGroup
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Effect: Allow
                Resource: !Sub 'arn:aws:logs:ap-southeast-2:${AWS::AccountId}:log-group:/aws/codebuild/*'

  GitHubCrSync:
    Type: AWS::CodeBuild::Project
    Properties:
      Artifacts:
        Type: NO_ARTIFACTS
      Source:
        Type: NO_SOURCE
        BuildSpec: |
            version: 0.2

            phases:
              pre_build:
                commands:
                  - echo "Logging in to Amazon ECR..."
                  - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
                  - echo "Logging in to GHCR"
                  - echo "${GITHUB_PAT}" | docker login "${GITHUB_CR}" --username ${GITHUB_USER} --password-stdin
              build:
                commands:
                  - COLON=$(python -c "print(chr(58))")
                  - echo "Retrieving Docker image tags..."
                  - EXISTING_TAGS=$(aws ecr list-images --repository-name "${IMAGE_REPO_NAME}" | jq -r '[.imageIds[].imageTag] | unique | join("\n")')
                  - echo "The following tags exist already in ECR - $EXISTING_TAGS"
                  - TOKEN=$(curl -u ${GITHUB_USER}:${GITHUB_PAT} https"${COLON}"//"${GITHUB_CR}"/token\?scope\="repository:${IMAGE_REPO_NAME}:pull" | jq -r .token)
                  - REMOTE_TAGS=$(curl https"${COLON}"//"${GITHUB_CR}"/v2/"${IMAGE_REPO_NAME}"/tags/list -H 'Authorization'"${COLON}"' Bearer '"${TOKEN}" | jq -r '.tags | sort_by(.) | join("\n")' | sed s'/"//g')
                  - echo "The following tags exist at the source - $REMOTE_TAGS"
                  - TAG_LIST=$(echo "${EXISTING_TAGS} ${REMOTE_TAGS}" | tr ' ' '\n' | sort | uniq -u)
                  - echo "Attempting to sync the following tags - latest $TAG_LIST"
                  - for TAG in $(echo latest $TAG_LIST); do echo "Syncing Image Tag - ${TAG}"; docker pull "${GITHUB_CR}"/"${IMAGE_REPO_NAME}":"${TAG}";done
                  - for TAG in $(echo latest $TAG_LIST); do echo "Pushing Image Tag - ${TAG}"; docker tag "${GITHUB_CR}"/"${IMAGE_REPO_NAME}":"${TAG}" $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$TAG;done
              post_build:
                commands:
                  - echo "Pushing image tags to ECR..."
                  - for TAG in $(echo latest $TAG_LIST); do docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$TAG;done
      Environment:
        ComputeType: "BUILD_GENERAL1_SMALL"
        Image: "aws/codebuild/standard:7.0"
        Type: "LINUX_CONTAINER"
        PrivilegedMode: true
        EnvironmentVariables:
          - Name: AWS_DEFAULT_REGION
            Type: PLAINTEXT
            Value: "ap-southeast-2"
          - Name: AWS_ACCOUNT_ID
            Type: PLAINTEXT
            Value: !Ref AWS::AccountId
          - Name: IMAGE_REPO_NAME
            Type: PLAINTEXT
            Value: !Ref ImageRepoName
          - Name: GITHUB_USER
            Type: PLAINTEXT
            Value: !Ref GitHubUserName
          - Name: GITHUB_PAT
            Type: PLAINTEXT
            Value: !Ref GitHubPat
          - Name: GITHUB_CR
            Type: PLAINTEXT
            Value: !Ref GitHubCr
      Description: !Sub 'Syncs ${ImageRepoName} image tags from GitHub Container Registry to local ECR repo'
      ServiceRole: !GetAtt CodeBuild.Arn
      TimeoutInMinutes: 300

  LogGroup:
    Type: 'AWS::Logs::LogGroup'
    Properties:
      LogGroupName: !Sub '/aws/codebuild/${GitHubCrSync}-${ImageRepoName}'
      RetentionInDays: 3

  EventRole:
    Type: AWS::IAM::Role
    Properties:
      Description: IAM role to allow Amazon CloudWatch Events to trigger AWS CodeBuild build
      AssumeRolePolicyDocument:
        Statement:
          - Action: sts:AssumeRole
            Effect: Allow
            Principal:
              Service: events.amazonaws.com
      Policies:
        - PolicyDocument:
            Statement:
              - Action: codebuild:StartBuild
                Effect: Allow
                Resource: !GetAtt 'GitHubCrSync.Arn'
          PolicyName: !Join
            - '-'
            - - !Ref 'AWS::StackName'
              - CloudWatchEventPolicy
      RoleName: !Join
        - '-'
        - - !Ref 'AWS::StackName'
          - CloudWatchEventRule

  NightlyEvent:
    Type: AWS::Events::Rule
    Properties:
      Description: Rule for Amazon CloudWatch Events to trigger a build every night
      ScheduleExpression: "cron(0 0 * * ? *)"
      Name: !Join
        - '-'
        - - !Ref 'AWS::StackName'
          - NightlyBuild
      State: ENABLED
      Targets:
        - Arn: !GetAtt 'GitHubCrSync.Arn'
          Id: NightlyCheck
          RoleArn: !GetAtt 'EventRole.Arn'
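
For reference, here is a minimal sketch of creating the stack from the AWS CLI and, optionally, starting a sync by hand instead of waiting for the nightly rule. The template filename (ghcr-sync.yaml), the function names and the DRY_RUN switch are assumptions for illustration; the parameter names come from the template above. CAPABILITY_NAMED_IAM is needed because the template sets explicit RoleName values.

```shell
#!/usr/bin/env bash
# Sketch only: ghcr-sync.yaml, the function names and DRY_RUN are assumptions.

# Run a command, or just print it when DRY_RUN=1.
# Careful: the dry run prints the PAT along with the other arguments.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: deploy_sync_stack <stack-name> <image-repo> <github-user> <github-pat>
# GitHubCr is left at its ghcr.io default.
deploy_sync_stack() {
  run aws cloudformation deploy \
    --stack-name "$1" \
    --template-file ghcr-sync.yaml \
    --capabilities CAPABILITY_NAMED_IAM \
    --parameter-overrides \
      "ImageRepoName=$2" \
      "GitHubUserName=$3" \
      "GitHubPat=$4"
}

# Usage: start_sync <codebuild-project-name>
start_sync() {
  run aws codebuild start-build --project-name "$1"
}
```

A call would look something like deploy_sync_stack ghcr-sync super-linter/super-linter jonesy1234 "$GITHUB_PAT". The template does not name the CodeBuild project, so the generated name has to be looked up (in the console, for example) before calling start_sync.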

Once the stack has been created, the CodeBuild job will kick off on the schedule and perform the sync.

The output will look something like this.

NOTE:- The latest tag is synced on every run, which is a little different from the behaviour of the original DockerHub stack.

[Container] 2023/07/25 00:00:54 Running command echo "Logging in to Amazon ECR..."
Logging in to Amazon ECR...

[Container] 2023/07/25 00:00:54 Running command aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

[Container] 2023/07/25 00:01:08 Running command echo "Logging in to GHCR"
Logging in to GHCR

[Container] 2023/07/25 00:01:08 Running command echo "${GITHUB_PAT}" | docker login "${GITHUB_CR}" --username ${GITHUB_USER} --password-stdin
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

[Container] 2023/07/25 00:01:09 Phase complete: PRE_BUILD State: SUCCEEDED
[Container] 2023/07/25 00:01:09 Phase context status code:  Message: 
[Container] 2023/07/25 00:01:09 Entering phase BUILD
[Container] 2023/07/25 00:01:09 Running command COLON=$(python -c "print(chr(58))")

[Container] 2023/07/25 00:01:11 Running command echo "Retrieving Docker image tags..."
Retrieving Docker image tags...

[Container] 2023/07/25 00:01:11 Running command EXISTING_TAGS=$(aws ecr list-images --repository-name "${IMAGE_REPO_NAME}" | jq -r '[.imageIds[].imageTag] | unique | join("\n")')

[Container] 2023/07/25 00:01:11 Running command echo "The following tags exist already in ECR - $EXISTING_TAGS"
The following tags exist already in ECR - 
latest
slim-latest
slim-v5
slim-v5.1.0
slim-v5.1.1
slim-v5.2.0
v5
v5.1.0
v5.1.1
v5.2.0

[Container] 2023/07/25 00:01:11 Running command TOKEN=$(curl -u ${GITHUB_USER}:${GITHUB_PAT} https"${COLON}"//"${GITHUB_CR}"/token\?scope\="repository:${IMAGE_REPO_NAME}:pull" | jq -r .token)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100    69  100    69    0     0    215      0 --:--:-- --:--:-- --:--:--   216

[Container] 2023/07/25 00:01:13 Running command REMOTE_TAGS=$(curl https"${COLON}"//"${GITHUB_CR}"/v2/"${IMAGE_REPO_NAME}"/tags/list -H 'Authorization'"${COLON}"' Bearer '"${TOKEN}" | jq -r '.tags | sort_by(.) | join("\n")' | sed s'/"//g')
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   153  100   153    0     0    481      0 --:--:-- --:--:-- --:--:--   482

[Container] 2023/07/25 00:01:13 Running command echo "The following tags exist at the source - $REMOTE_TAGS"
The following tags exist at the source - latest
slim-latest
slim-v5
slim-v5.1.0
slim-v5.1.1
slim-v5.2.0
v5
v5.1.0
v5.1.1
v5.2.0

[Container] 2023/07/25 00:01:13 Running command TAG_LIST=$(echo "${EXISTING_TAGS} ${REMOTE_TAGS}" | tr ' ' '\n' | sort | uniq -u)

[Container] 2023/07/25 00:01:13 Running command echo "Attempting to sync the following tags - latest $TAG_LIST"
Attempting to sync the following tags - latest 

[Container] 2023/07/25 00:01:13 Running command for TAG in $(echo latest $TAG_LIST); do echo "Syncing Image Tag - ${TAG}"; docker pull "${GITHUB_CR}"/"${IMAGE_REPO_NAME}":"${TAG}";done
Syncing Image Tag - latest
latest: Pulling from super-linter/super-linter
31e352740f53: Pulling fs layer
a956beba7a42: Download complete
Digest: sha256:e87924a99c317abe51bbf8f0bae2dd99666c5c7b5ca95907f92de627a3afff2b
Status: Downloaded newer image for ghcr.io/super-linter/super-linter:latest
ghcr.io/super-linter/super-linter:latest

[Container] 2023/07/25 00:02:48 Running command for TAG in $(echo latest $TAG_LIST); do echo "Pushing Image Tag - ${TAG}"; docker tag "${GITHUB_CR}"/"${IMAGE_REPO_NAME}":"${TAG}" $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$TAG;done
Pushing Image Tag - latest

[Container] 2023/07/25 00:02:48 Phase complete: BUILD State: SUCCEEDED
[Container] 2023/07/25 00:02:48 Phase context status code:  Message: 
[Container] 2023/07/25 00:02:48 Entering phase POST_BUILD
[Container] 2023/07/25 00:02:48 Running command echo "Pushing image tags to ECR..."
Pushing image tags to ECR...

[Container] 2023/07/25 00:02:48 Running command for TAG in $(echo latest $TAG_LIST); do docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$TAG;done
The push refers to repository [894702234348.dkr.ecr.ap-southeast-2.amazonaws.com/super-linter/super-linter]
99b2b0665e02: Preparing
78a822fe2a2d: Layer already exists
latest: digest: sha256:e87924a99c317abe51bbf8f0bae2dd99666c5c7b5ca95907f92de627a3afff2b size: 7477

[Container] 2023/07/25 00:02:48 Phase complete: POST_BUILD State: SUCCEEDED
[Container] 2023/07/25 00:02:48 Phase context status code:  Message:
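
The tag comparison in the build phase is easy to try locally. The sketch below uses made-up tag lists: symmetric_diff is the same pipeline as the buildspec, and remote_only is a variation of my own, not part of the stack. The difference shows up when a tag exists only in ECR (say it was deleted upstream): uniq -u keeps it, so the job would try to pull a tag the source no longer has. Listing the existing tags twice drops them from the result.

```shell
#!/usr/bin/env bash
# Sample data only; in the build these lists come from ECR and GHCR.
EXISTING_TAGS=$(printf '%s\n' latest v5 v5.1.0 old-tag)
REMOTE_TAGS=$(printf '%s\n' latest v5 v5.1.0 v5.2.0)

# Same pipeline as the buildspec: keep tags that occur exactly once
# across both lists (present on one side only).
symmetric_diff() {
  echo "$1 $2" | tr ' ' '\n' | sort | uniq -u
}

# Variation (not in the stack): feed the existing tags in twice so they can
# never be unique. Only tags found solely at the source survive uniq -u.
remote_only() {
  echo "$1 $1 $2" | tr ' ' '\n' | sort | uniq -u
}

echo "Buildspec pipeline:"
symmetric_diff "$EXISTING_TAGS" "$REMOTE_TAGS"
echo "Remote-only variation:"
remote_only "$EXISTING_TAGS" "$REMOTE_TAGS"
```

With the sample data, the first call lists old-tag and v5.2.0, while the second lists only v5.2.0.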

And here are our containers, in our private repo for consumption

ECR Repo

I could be more security conscious and put the PAT into Secrets Manager; however, that would only increase the cost, and CloudFormation doesn't support Secure SSM params. In my defense, the permissions are minimal.

All the code is available here

Hope this helps someone else

Cheers!

For more articles on ECS click here!

Support Stephen Jones by becoming a sponsor. Any amount is appreciated!