
Add Jenkins pipeline for airgap infrastructure deployment and testing #498

Open
floatingman wants to merge 42 commits into rancher:main from floatingman:feature/convert-airgap-tfp-tests

Conversation

@floatingman (Contributor) commented Feb 4, 2026

Implement a Jenkins pipeline to facilitate airgap infrastructure deployment and testing. This includes enhancements to the Dockerfile for Go testing, improvements in the Jenkinsfile for better logging and test handling, and the addition of stages for admin token injection and verification. The pipeline also integrates a retry mechanism for Ansible playbook executions and incorporates Qase reporting for test results.

rancher/qa-tasks#2125
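For reference, a minimal sketch of the kind of gotestsum invocation such a test stage can use to emit both JUnit XML for Jenkins and JSON for Qase reporting; the package path, -run filter, and timeout below are illustrative placeholders, not the exact values used by this pipeline:

stage('Run Go Tests') {
    // Hypothetical example; the actual package path and -run filter differ per job
    sh """
        gotestsum --format standard-verbose \
            --junitfile results.xml \
            --jsonfile gotestsum.json \
            -- -timeout 60m -run TestAirgap ./validation/...
    """
}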

Copilot AI left a comment

Pull request overview

This PR adds a new Jenkins pipeline to provision airgapped RKE2/Rancher infrastructure, inject an admin token, run Go-based validation tests, and publish results (including to Qase), plus a dedicated Docker image for these Go test runs.

Changes:

  • Introduces Jenkinsfile.airgap.go-tests to clone the tests and qa-infra repos, build an infra tools image with Go, deploy airgapped infra via Tofu/Ansible, inject an admin token, run Go tests with gotestsum, and optionally report to Qase.
  • Adds Dockerfile.airgap-go-tests to build an Alpine-based infra tools image that includes OpenTofu, Ansible, AWS CLI, Go, and gotestsum for use in the pipeline.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

  • validation/pipeline/Jenkinsfile.airgap.go-tests — Defines the end-to-end Jenkins pipeline for airgapped infra setup, admin token generation, Go test execution, artifact publishing, and optional Qase reporting.
  • validation/pipeline/Dockerfile.airgap-go-tests — Builds the infra tools Docker image with the Go toolchain and gotestsum used by the new pipeline stages.

Comment on lines 17 to 20
def testsBranch = env.GO_REPO_BRANCH ?: 'main'
def testsRepo = env.GO_REPO_URL ?: 'https://github.com/rancher/tests'
def qaInfraBranch = env.QA_INFRA_REPO_BRANCH ?: 'main'
def qaInfraRepo = env.QA_INFRA_REPO_URL ?: 'https://github.com/rancher/qa-infra-automation'
Copilot AI Feb 4, 2026

This pipeline introduces GO_REPO_BRANCH/GO_REPO_URL for the tests repo, which diverges from the established RANCHER_TEST_REPO_BRANCH/RANCHER_TEST_REPO_URL naming used by other airgap pipelines (for example validation/pipeline/Jenkinsfile.setup.airgap.rke2:13-14 and validation/pipeline/Jenkinsfile.destroy.airgap.rke2:11-12). Reusing the existing env var names (or at least falling back to them) would keep job configuration consistent and avoid confusion when wiring Jenkins jobs to different airgap pipelines.
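A fallback chain that accepts both the established names and the new ones would keep older job configs working; roughly (sketch):

// Prefer the established names, fall back to the new ones, then to defaults
def testsBranch = env.RANCHER_TEST_REPO_BRANCH ?: env.GO_REPO_BRANCH ?: 'main'
def testsRepo = env.RANCHER_TEST_REPO_URL ?: env.GO_REPO_URL ?: 'https://github.com/rancher/tests'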

Contributor Author

This is a good point.

Comment on lines 428 to 448
if ((env.DESTROY_ON_FAILURE ?: 'true').toBoolean() && workspaceName) {
echo 'DESTROY_ON_FAILURE is enabled. Cleaning up infrastructure...'
try {
stage('Cleanup on Failure') {
tofu.selectWorkspace(dir: tofuModulePath, name: workspaceName)
tofu.destroy(dir: tofuModulePath, varFile: 'terraform.tfvars', autoApprove: true)
tofu.deleteWorkspace(dir: tofuModulePath, name: workspaceName)
}
} catch (cleanupErr) {
echo "Cleanup failed: ${cleanupErr.message}"
}
}
throw err
} finally {
if (destroyAfterTests && workspaceName) {
echo 'Destroying infrastructure after tests (configured)'
try {
stage('Destroy After Tests') {
tofu.selectWorkspace(dir: tofuModulePath, name: workspaceName)
tofu.destroy(dir: tofuModulePath, varFile: 'terraform.tfvars', autoApprove: true)
tofu.deleteWorkspace(dir: tofuModulePath, name: workspaceName)
Copilot AI Feb 4, 2026

On failure the catch block already performs a best‑effort destroy/deleteWorkspace when DESTROY_ON_FAILURE is enabled, and then the finally block can run the same destroy logic again if DESTROY_AFTER_TESTS is also true. This double attempt to clean up the same workspace can create noisy errors in logs and makes it harder to reason about which flag controls teardown; consider centralizing the destroy logic in one place (or tracking whether cleanup has already succeeded) to keep the control flow clearer.
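One possible shape for centralizing this, reusing the tofu helper calls quoted above plus a flag that records that teardown already happened (sketch only, not the pipeline's actual code):

def infrastructureCleaned = false

// Single place that knows how to tear the workspace down
def destroyInfra = { String reason ->
    echo "Destroying infrastructure (${reason})"
    tofu.selectWorkspace(dir: tofuModulePath, name: workspaceName)
    tofu.destroy(dir: tofuModulePath, varFile: 'terraform.tfvars', autoApprove: true)
    tofu.deleteWorkspace(dir: tofuModulePath, name: workspaceName)
    infrastructureCleaned = true
}

// catch block: call destroyInfra('failure') when DESTROY_ON_FAILURE is enabled
// finally block: only run again if nothing has cleaned up yet
if (destroyAfterTests && workspaceName && !infrastructureCleaned) {
    destroyInfra('configured after tests')
}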

Contributor Author

This is a good point.


# Install gotestsum for JUnit reporting
ENV GOBIN=/usr/local/bin
RUN go install gotest.tools/gotestsum@latest
Copilot AI Feb 4, 2026

Installing gotestsum with go install gotest.tools/gotestsum@latest pulls executable code from a mutable, third-party module reference at build time, which introduces a supply-chain risk. If the gotest.tools/gotestsum module or its distribution channel is ever compromised or a malicious version is published, the resulting binary will run inside this image (and thus in Jenkins jobs) with access to AWS credentials and other secrets used by the pipeline. Pin this dependency to a specific, vetted version (for example, a fixed tag or commit) and, if possible, enforce integrity verification via checksums or vendored binaries to prevent untrusted code from being pulled implicitly.
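A pinned install, for example driven by the GOTESTUM_VERSION build arg the Dockerfile already declares, might look like this (Dockerfile sketch; the exact tag should be whatever vetted release the team settles on):

# Pin gotestsum instead of pulling @latest at build time
# (the ARG must be declared after FROM to be in scope for this RUN)
ARG GOTESTUM_VERSION=1.13.0
ENV GOBIN=/usr/local/bin
RUN go install gotest.tools/gotestsum@v${GOTESTUM_VERSION}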

Contributor Author

I'll look into this.

@floatingman self-assigned this Feb 12, 2026
@floatingman added the team/pit-crew (slack notifier for pit crew) label Feb 12, 2026
… test command and improve test result handling
…ization with qaConfig and dockerPlatform, and update test command execution to use dynamic container name and infraToolsImage
…ialization and directly use config for dockerPlatform and infraToolsImage
…mage before building to ensure a clean build
@floatingman force-pushed the feature/convert-airgap-tfp-tests branch from 294090e to 0457967 on February 13, 2026 16:37
ARG GO_VERSION=1.25.5
ARG GOTESTUM_VERSION=1.13.0

FROM --platform=linux/amd64 alpine:3.22
Collaborator

We typically try to use SUSE-based images.

Contributor

I think this still needs to be addressed

Contributor

In fact, I think instead of using this file we probably either:
1- Use Dockerfile.infra and make it FROM registry.suse.com/bci/golang:1.25
2- Use Dockerfile.e2e
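For option 1, the change could be as small as swapping the base image; a rough sketch, assuming the BCI Go image provides the toolchain and the remaining tooling is layered on top as today:

# Option 1: base the infra tools image on the SUSE BCI Go image
FROM registry.suse.com/bci/golang:1.25

# gotestsum (and the other infra tooling) would still be installed on top
ENV GOBIN=/usr/local/bin
RUN go install gotest.tools/gotestsum@v1.13.0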

Comment on lines 77 to 84
stage('Verify Infra Tools Tooling') {
echo 'Verifying gotestsum availability inside infra tools image'
sh """
docker run --rm --platform ${dockerPlatform} \
${infraToolsImage} \
sh -c 'set -e; echo \"PATH=$PATH\"; which gotestsum || true; ls -al /root/go/bin || true; ls -al /usr/local/bin/gotestsum || true'
"""
}
Collaborator

This seems unnecessary; in what use case would it not be available (unless the build failed outright)?

Comment on lines 86 to 128
stage('Configure SSH Key') {
infrastructure.writeSshKey(
keyContent: env.AWS_SSH_PEM_KEY,
keyName: env.AWS_SSH_PEM_KEY_NAME,
dir: '.ssh'
)
}

stage('Configure Tofu Variables') {
echo 'Writing Terraform configuration'

def terraformConfig = infrastructure.parseAndSubstituteVars(
content: env.TERRAFORM_CONFIG,
envVars: [
'AWS_ACCESS_KEY_ID': env.AWS_ACCESS_KEY_ID,
'AWS_SECRET_ACCESS_KEY': env.AWS_SECRET_ACCESS_KEY,
'HOSTNAME_PREFIX': env.HOSTNAME_PREFIX,
'AWS_SSH_PEM_KEY_NAME': env.AWS_SSH_PEM_KEY_NAME
]
)

infrastructure.writeConfig(
path: "${tofuModulePath}/terraform.tfvars",
content: terraformConfig
)
}

stage('Initialize Tofu Backend') {
tofu.initBackend(
dir: tofuModulePath,
bucket: env.S3_BUCKET_NAME,
key: env.S3_KEY_PREFIX,
region: env.S3_BUCKET_REGION,
backendInitScript: tofuBackendInitScript
)
}

stage('Create Workspace') {
workspaceName = infrastructure.generateWorkspaceName(
prefix: 'jenkins_airgap_ansible_workspace',
suffix: env.HOSTNAME_PREFIX,
includeTimestamp: false
)
Collaborator

I get wanting more granular logs, but now there are too many stages to view on a standard screen (when looking at the Jenkins job). Since each of these takes less than 10 seconds, can you consolidate them into one stage?
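For instance, the four short setup steps could become a single stage while the individual log lines stay; a sketch built from the calls quoted above:

stage('Prepare Tofu Environment') {
    infrastructure.writeSshKey(
        keyContent: env.AWS_SSH_PEM_KEY,
        keyName: env.AWS_SSH_PEM_KEY_NAME,
        dir: '.ssh'
    )

    echo 'Writing Terraform configuration'
    infrastructure.writeConfig(
        path: "${tofuModulePath}/terraform.tfvars",
        content: infrastructure.parseAndSubstituteVars(
            content: env.TERRAFORM_CONFIG,
            envVars: [
                'AWS_ACCESS_KEY_ID': env.AWS_ACCESS_KEY_ID,
                'AWS_SECRET_ACCESS_KEY': env.AWS_SECRET_ACCESS_KEY,
                'HOSTNAME_PREFIX': env.HOSTNAME_PREFIX,
                'AWS_SSH_PEM_KEY_NAME': env.AWS_SSH_PEM_KEY_NAME
            ]
        )
    )

    tofu.initBackend(
        dir: tofuModulePath,
        bucket: env.S3_BUCKET_NAME,
        key: env.S3_KEY_PREFIX,
        region: env.S3_BUCKET_REGION,
        backendInitScript: tofuBackendInitScript
    )

    workspaceName = infrastructure.generateWorkspaceName(
        prefix: 'jenkins_airgap_ansible_workspace',
        suffix: env.HOSTNAME_PREFIX,
        includeTimestamp: false
    )
}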

Comment on lines 198 to 217
stage('Setup SSH Keys on Nodes') {
retry(3) {
ansible.runPlaybook(
dir: ansiblePath,
inventory: 'inventory/inventory.yml',
playbook: 'playbooks/setup/setup-ssh-keys.yml'
)
}
}

stage('Deploy RKE2 Cluster') {
retry(3) {
ansible.runPlaybook(
dir: ansiblePath,
inventory: 'inventory/inventory.yml',
playbook: 'playbooks/deploy/rke2-tarball-playbook.yml'
)
}
}

Collaborator

If you consolidate these two with the previous stage, I think it would fit on one screen again.

}
}

stage('Deploy Rancher (Optional)') {
Collaborator

IIRC, optional stages break the view in Jenkins if the stage is actually non-existent when it's not specified. In this case, it might be OK since there's the def + echo?

}

stage('Deploy RKE2 Cluster') {
retry(3) {
Collaborator

Clarifying question on error handling: if it fails on the 3rd attempt, does it stop the Jenkins job too? (Assuming so, but just double-checking.)

Comment on lines 280 to 285
// Run the Ansible playbook to generate and inject the admin token
// The playbook uses the rancher_token role which:
// - Reads external_lb_hostname from inventory (or uses explicit rancher_url)
// - Authenticates with Rancher API
// - Creates an API token with configurable TTL and description
// - Updates cattle-config.yaml with the generated token
Collaborator

Since this is in the qa-infra-automation docs already and basically no one looks at Jenkinsfiles, I think you should remove these comments.

resultsJSON: 'gotestsum.json'
])

if (testArgs && testArgs[-1]?.endsWith(';')) {
Collaborator

why is this necessary?

Comment on lines 435 to 450
stage('Cleanup on Failure') {
tofu.selectWorkspace(dir: tofuModulePath, name: workspaceName)
tofu.destroy(dir: tofuModulePath, varFile: 'terraform.tfvars', autoApprove: true)
tofu.deleteWorkspace(dir: tofuModulePath, name: workspaceName)
infrastructureCleaned = true
}
} catch (cleanupErr) {
echo "Cleanup failed: ${cleanupErr.message}"
}
}
throw err
} finally {
if (destroyAfterTests && workspaceName && !infrastructureCleaned) {
echo 'Destroying infrastructure after tests (configured)'
try {
stage('Destroy After Tests') {
Collaborator

Having conditional stages like this with different names will cause the job history to not show up properly in Jenkins. It looks like 'Cleanup on Failure' and 'Destroy After Tests' are doing the same thing? In that case, if you use the same stage name in both places you can avoid this problem.

If they do different things, can you update them so each stage persists, even if there's nothing for it to do on a particular run? See the rancher comment for a potential solution.

Contributor Author

Fixed to use the same name. The two stages were doing different things. One destroys the infrastructure on failure; the other destroys it at the end of the build in case you want to investigate something manually.
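A rough sketch of what the shared-name stage could look like; buildFailed here is a hypothetical flag set in the catch block, not an existing variable in the pipeline:

stage('Destroy Infrastructure') {
    // buildFailed is a hypothetical flag recorded when the catch block runs
    echo buildFailed ? 'Cleaning up infrastructure after failure' :
                       'Destroying infrastructure after tests (configured)'
    tofu.selectWorkspace(dir: tofuModulePath, name: workspaceName)
    tofu.destroy(dir: tofuModulePath, varFile: 'terraform.tfvars', autoApprove: true)
    tofu.deleteWorkspace(dir: tofuModulePath, name: workspaceName)
    infrastructureCleaned = true
}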


property.useWithProperties(['AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY', 'AWS_SSH_PEM_KEY', 'AWS_SSH_PEM_KEY_NAME']) {
try {
stage('Checkout') {
deleteDir()
Contributor

Surely this is me being dumb with Jenkins, but why is this here?

Comment on lines 94 to 95
stage('Configure Tofu Variables') {
echo 'Writing Terraform configuration'
Contributor

do we need both of these?

def s3Path = "env:/${workspaceName}/terraform.tfvars"
def tfvarsPath = "${tofuModulePath}/terraform.tfvars"
def workspace = pwd()
sh """
Contributor

Again possibly a dumb question, but why sh """?

Comment on lines 232 to 244
def deployRancher = env.ANSIBLE_VARIABLES?.contains('deploy_rancher: true')
if (deployRancher) {
echo 'Deploying Rancher...'
retry(3) {
ansible.runPlaybook(
dir: ansiblePath,
inventory: 'inventory/inventory.yml',
playbook: 'playbooks/deploy/rancher-helm-deploy-playbook.yml'
)
}
} else {
echo 'Skipping Rancher deployment (not enabled in ANSIBLE_VARIABLES)'
}
Contributor

I will use this opportunity to comment on something I have been thinking about regarding the Ansible stuff: we have a bunch of variables like deploy_rancher that basically make the playbook that uses them a no-op when not set, so instead of setting deploy_rancher: false, why not just not run the playbook at all?

Again, no implications for this PR, I have just been thinking about this for a while.

Comment on lines +298 to +311
sh """
docker run --rm --platform ${dockerPlatform} \
--name generate-token \
-v ${workspace}:/workspace \
-w /workspace/${ansiblePath} \
${infraToolsImage} \
ansible-playbook -i inventory/inventory.yml /workspace/qa-infra-automation/ansible/rancher/token/generate-admin-token.yml \
-e rancher_token_password=${adminPassword} \
-e rancher_cattle_config_file=${cattleConfigPath} \
-e rancher_token_ttl=${tokenTtl} \
-e rancher_token_description=${tokenDescription} \
-e rancher_token_output_format=json \
-e rancher_token_output_file=/workspace/rancher-token.json
"""
Contributor

Is it not possible to use runPlaybook here?
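If the shared library's runPlaybook helper can take extra vars, the docker run could collapse to something like the sketch below. Note that the extraVars parameter (and the relative playbook path) is an assumption about the helper's signature; the calls elsewhere in this file only pass dir, inventory, and playbook, so the helper may need extending first:

// extraVars is assumed, not a documented parameter of ansible.runPlaybook
ansible.runPlaybook(
    dir: ansiblePath,
    inventory: 'inventory/inventory.yml',
    playbook: 'rancher/token/generate-admin-token.yml',
    extraVars: [
        rancher_token_password: adminPassword,
        rancher_cattle_config_file: cattleConfigPath,
        rancher_token_ttl: tokenTtl,
        rancher_token_description: tokenDescription,
        rancher_token_output_format: 'json',
        rancher_token_output_file: "${workspace}/rancher-token.json"
    ]
)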

]
}
// If Ansible variables defined a bootstrap password, default commonly used value
lines += ["RANCHER_BOOTSTRAP_PASSWORD=${env.RANCHER_BOOTSTRAP_PASSWORD ?: 'rancherrocks'}"]
Contributor

Doesn't ansible already use this if nothing is set?

…dencies, streamline installation steps, and enhance error handling