security: avoid persisting GitHub SSH keys in PR pipelines #4477

Merged
ti-chi-bot[bot] merged 2 commits into main from ci-architect/audit-private-clone-cred-exposure on Apr 13, 2026

@wuhuizuo wuhuizuo commented Apr 7, 2026

Summary

  • remove git.setSshKey(GIT_CREDENTIALS_ID) from PR / ghpr pipelines that run untrusted repository code
  • keep private-repo access scoped to the checkout step via prow.checkoutRefs(..., credentialsId = GIT_CREDENTIALS_ID, ...)
  • switch the affected TiFlash PR pipelines from empty checkout credentials to checkout-scoped SSH credentials so submodule/private fetch still works without leaving a reusable key on disk

Risk Being Fixed

The audited PR pipelines cloned private repositories or private submodules with the github-sre-bot-ssh credential and then persisted the SSH private key into ~/.ssh/id_rsa via git.setSshKey().

Once that happened, build/test scripts coming from the target PR repository could reuse the same key to:

  • print or copy the key material
  • clone additional private repositories
  • push new refs or delete remote refs that the key can access

This PR removes that persistent-key path from PR jobs and narrows credential availability to the checkout helper's sshagent scope.
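
To make the difference concrete, here is a minimal sketch of what a checkout-scoped helper could look like. It is an illustration only and not the actual body of prow.checkoutRefs: the function name `checkoutWithScopedKey` and its parameters are hypothetical, and it assumes the Jenkins SSH Agent plugin (which provides the `sshagent` step) is installed.

```groovy
// Hypothetical helper, for illustration only; not the real prow.checkoutRefs.
// Assumes the Jenkins SSH Agent plugin, which provides the sshagent step.
def checkoutWithScopedKey(String credentialsId, String repoUrl, String ref) {
    // Pass values through the environment so the shell script itself stays a
    // fixed string and PR-supplied values are never interpolated into it.
    withEnv(["REPO_URL=${repoUrl}", "REF=${ref}"]) {
        // The key is served by an ssh-agent that lives only for this block;
        // it is not left behind in ~/.ssh/id_rsa.
        sshagent(credentials: [credentialsId]) {
            sh '''
                git init -q .
                git fetch --no-tags "$REPO_URL" "$REF"
                git checkout -q FETCH_HEAD
                git submodule update --init --recursive
            '''
        }
    }
    // Once the sshagent block exits, the agent is gone, so later build/test
    // steps that run PR-controlled scripts have no key or agent to reuse.
}
```

The point of the sketch is the ordering: only fixed git commands run while the credential is reachable, and anything that executes repository code runs after the block has closed.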

Audit Notes

Confirmed safe patterns that were left unchanged:

  • prow.checkoutRefs(..., credentialsId = GIT_CREDENTIALS_ID, ...) because it scopes SSH auth to the checkout helper
  • component.checkout(...) / checkoutPRWithPreMerge(...) because they use Jenkins checkout or sshagent during checkout only
  • legacy jenkins/ GitSCM checkouts that use credentialsId without copying keys into ~/.ssh/id_rsa

Validation

  • git diff --check
  • verified no PR / ghpr pipeline under pipelines/ still contains git.setSshKey(...)
  • verified the TiFlash PR jobs that previously relied on git.setSshKey() now use checkout-scoped credentials instead of credentialsId = ''
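
The second check above could be scripted roughly as follows. This is a sketch under assumptions, not the command that was actually run: it assumes PR pipelines are the files under pipelines/ whose names start with `pull_` or `ghpr_`, and the function name `findPersistedKeyCalls` is made up for illustration. It matches on plain text, so a comment that mentions git.setSshKey( would also be reported.

```groovy
import groovy.io.FileType

// Hypothetical audit helper: lists "path:line" for every git.setSshKey( call
// found in files named pull_*.groovy or ghpr_*.groovy below the given root.
def findPersistedKeyCalls(File root) {
    def hits = []
    if (!root.isDirectory()) {
        return hits
    }
    root.eachFileRecurse(FileType.FILES) { f ->
        // Assumption: the file-name prefix identifies PR / ghpr pipelines.
        if (f.name ==~ /(pull|ghpr)_.*\.groovy/) {
            f.eachLine { line, n ->
                if (line.contains('git.setSshKey(')) {
                    hits << "${f.path}:${n}".toString()
                }
            }
        }
    }
    return hits.sort()
}

def hits = findPersistedKeyCalls(new File('pipelines'))
hits.each { println it }
println(hits ? "FOUND ${hits.size()} call(s)" : 'OK: no git.setSshKey calls in PR pipelines')
```

Run from the repository root with `groovy`, the script prints one line per remaining call, or the OK message when none are found.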

Scope

This intentionally updates PR / ghpr pipelines only. Remaining git.setSshKey() usage is in merged / trusted jobs and can be reviewed separately if we want to remove that pattern repo-wide later.

@ti-chi-bot ti-chi-bot bot left a comment

I have already done a preliminary review for you, and I hope to help you do a better job.

Summary

This PR removes the git.setSshKey(GIT_CREDENTIALS_ID) calls from untrusted pull request pipelines across multiple Groovy scripts to mitigate security risks associated with persisting SSH keys. Instead, the pipelines are updated to use checkout-scoped credentials via prow.checkoutRefs. The approach effectively limits SSH key usage to the checkout operation, preventing untrusted code from reusing the keys in subsequent steps. The changes are consistent, well-documented, and address a clear security concern without introducing additional complexity.


Critical Issues

None identified. The changes effectively mitigate the security risk of persisting SSH credentials in PR pipelines.


Code Improvements

  1. Error Logging for retry(2) Operations
    Files: All modified pipeline scripts
    Issue: The retry(2) block currently lacks explicit logging or error handling for failed retries. If checkout fails twice, the pipeline may terminate without clear visibility into the cause.
    Suggestion: Add error logging to provide better visibility into checkout failures.

    retry(2) {
        try {
            prow.checkoutRefs(REFS, credentialsId = GIT_CREDENTIALS_ID, timeout = 5, withSubmodule = true, gitBaseUrl = 'https://github.com')
        } catch (Exception e) {
            println "Checkout failed: ${e.message}"
            throw e
        }
    }
  2. Parameterization of Timeout Values
    Files: All modified pipeline scripts
    Issue: The timeout parameter in prow.checkoutRefs is hardcoded to 5. This could cause problems if longer repository fetch times are needed, especially for large submodules.
    Suggestion: Use a configurable timeout parameter to allow flexibility across environments.

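    // Jenkins exposes env values as strings, so convert before passing a numeric timeout.
    def checkoutTimeout = (env.CHECKOUT_TIMEOUT ?: '5').toInteger()
    prow.checkoutRefs(REFS, credentialsId = GIT_CREDENTIALS_ID, timeout = checkoutTimeout, withSubmodule = true, gitBaseUrl = 'https://github.com')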

Best Practices

  1. Testing Coverage for Updated Pipelines
    Files: All modified pipeline scripts
    Issue: The PR description mentions validation steps but does not specify whether automated tests were added to verify the updated pipeline behavior.
    Suggestion: Ensure automated tests cover scenarios where prow.checkoutRefs is used with checkout-scoped credentials, including edge cases with submodules and large repositories.

  2. Documentation for Credential Changes
    Files: All modified pipeline scripts
    Issue: While the PR description explains the rationale for removing git.setSshKey, the pipeline scripts themselves lack inline comments explaining the security improvements.
    Suggestion: Add comments to clarify why prow.checkoutRefs is used instead of git.setSshKey.

    // Use prow.checkoutRefs with credentials scoped to the checkout operation to avoid persisting SSH keys.
    prow.checkoutRefs(REFS, credentialsId = GIT_CREDENTIALS_ID, timeout = 5, withSubmodule = true, gitBaseUrl = 'https://github.com')
  3. Consistency in Credential Usage
    Files: pipelines/pingcap/tiflash/latest/pull_integration_next_gen.groovy, pipelines/pingcap/tiflash/latest/pull_unit_test.groovy, etc.
    Issue: Some TiFlash pipelines previously used credentialsId = '' for checkout, which could cause issues if private submodules require authentication. This PR fixes the issue by using GIT_CREDENTIALS_ID, but it's worth confirming consistency across all pipelines to avoid regressions.
    Suggestion: Audit any remaining pipelines where credentialsId = '' might still be in use outside this PR scope.


Conclusion

The PR addresses a critical security issue by removing persistent SSH key usage and implementing checkout-scoped credentials. The changes are solid, but adding error logging, parameterizing timeout values, improving testing coverage, and enhancing inline documentation would further strengthen the implementation.

@wuhuizuo wuhuizuo requested a review from dillon-zheng April 7, 2026 11:35

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors Git credential handling across numerous Jenkins pipeline files for TiDB and TiFlash. It removes redundant git.setSshKey(GIT_CREDENTIALS_ID) calls and updates prow.checkoutRefs to use GIT_CREDENTIALS_ID directly, ensuring consistent credential management across various release branches. I have no feedback to provide.

@wuhuizuo

wuhuizuo commented Apr 8, 2026

/retest

@ti-chi-bot ti-chi-bot bot left a comment

I have already done a preliminary review for you, and I hope to help you do a better job.

Summary

This PR removes the usage of git.setSshKey(GIT_CREDENTIALS_ID) from PR pipelines across multiple Groovy script files to prevent persistent SSH key storage in untrusted PR jobs. Instead, it scopes SSH credentials to the checkout step by using prow.checkoutRefs with credentialsId. The approach enhances security by limiting the exposure of sensitive credentials and ensures private repo access works without leaving reusable SSH keys on disk. The changes are straightforward and appear well-tested, with no obvious regressions introduced.


Critical Issues

No critical issues detected in the changes.


Code Improvements

  • File: pipelines/pingcap/tiflash/latest/pull_integration_test.groovy (and similar TiFlash pipeline files)
    • Issue: The credentialsId = '' in prow.checkoutRefs was replaced with credentialsId = GIT_CREDENTIALS_ID, which is a safer approach. However, the PR does not clarify whether credentialsId is properly scoped to the checkout step and prevents further misuse within the pipeline.
    • Suggestion: Ensure that prow.checkoutRefs with credentialsId is using proper scoping mechanisms such as sshagent internally to avoid accidental credential leakage beyond the checkout step.

Best Practices

  1. Testing Coverage Gaps

    • Files Affected: All modified pipeline files.
    • Issue: While the PR mentions validation was done (git diff --check and manual verification), there is no indication of automated tests for these pipeline changes. Any subtle changes in Groovy pipeline behavior could lead to build/test failures.
    • Suggestion: Add automated tests (mock jobs) to verify that pipelines execute successfully with these changes and that SSH keys are not accessible beyond the checkout step.
  2. Documentation Updates

    • Files Affected: All modified pipeline files.
    • Issue: There is no inline documentation explaining the rationale for removing git.setSshKey or how prow.checkoutRefs ensures scoped authentication.
    • Suggestion: Add comments in the pipeline files to clarify:
      // Removed git.setSshKey(GIT_CREDENTIALS_ID) to prevent persistent SSH key storage.
      // prow.checkoutRefs ensures SSH credentials are scoped to this step only.
  3. Consistency in Credential Handling

    • Files Affected: pipelines/pingcap/tiflash/latest/pull_integration_test.groovy and others with TiFlash-specific changes.
    • Issue: TiFlash pipelines previously used credentialsId = '', which may indicate an intentional lack of credentials. This change introduces credentialsId = GIT_CREDENTIALS_ID, which assumes that SSH credentials are required everywhere. The PR does not justify why credentials are now necessary in these specific scenarios.
    • Suggestion: Review TiFlash-specific pipelines to confirm that credentialsId = GIT_CREDENTIALS_ID is truly required and does not introduce unnecessary credential dependencies.
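
One possible shape for the "SSH keys are not accessible beyond the checkout step" check from item 1 is a guard stage placed right after checkout. This is a hedged sketch, not part of the PR: the stage name is hypothetical, it assumes a scripted pipeline already running inside a `node` block on a POSIX shell agent, and it assumes the agent image does not legitimately ship its own ~/.ssh/id_rsa.

```groovy
// Hypothetical guard stage; place it after checkout and before any step
// that runs PR-controlled scripts.
stage('Assert no leftover SSH credentials') {
    sh '''
        set -eu
        # A persisted key would appear here if git.setSshKey() were reintroduced.
        if [ -e "$HOME/.ssh/id_rsa" ]; then
            echo "unexpected private key at $HOME/.ssh/id_rsa" >&2
            exit 1
        fi
        # sshagent exports SSH_AUTH_SOCK only inside its own block.
        if [ -n "${SSH_AUTH_SOCK:-}" ]; then
            echo "unexpected ssh-agent socket: $SSH_AUTH_SOCK" >&2
            exit 1
        fi
    '''
}
```

A failing guard stops the job before untrusted code runs, which turns the manual verification into something that is enforced on every build.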

Conclusion

The PR significantly improves security by removing persistent SSH key storage from untrusted pipelines, but it lacks clarity about the testing and credential scoping mechanisms in the checkout step. Addressing these gaps and adding inline documentation will ensure robustness and maintainability of these changes.

@ti-chi-bot

ti-chi-bot bot commented Apr 13, 2026

@wuhuizuo: The following test failed. Say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-replay-jenkins-pipelines | dfdab37 | link | false | /test pull-replay-jenkins-pipelines |

@wuhuizuo

wuhuizuo commented Apr 13, 2026

Jenkins Replay Status

  • pipelines/pingcap-inc/tidb/release-8.5/pull_build.groovy -> pingcap-inc/tidb/release-8.5/pull_build
  • pipelines/pingcap/tiflash/latest/pull_unit_test.groovy -> pingcap/tiflash/pull_unit_test
    • status: skipped
    • replay: unavailable
    • note: Jenkins job has no historical builds (builds_count=0, nextBuildNumber=1), so replay has no source build

@wuhuizuo

/approve

@ti-chi-bot

ti-chi-bot bot commented Apr 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wuhuizuo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the approved label Apr 13, 2026
@ti-chi-bot ti-chi-bot bot merged commit bbfd59c into main Apr 13, 2026
5 of 6 checks passed
@ti-chi-bot ti-chi-bot bot deleted the ci-architect/audit-private-clone-cred-exposure branch April 13, 2026 10:54