CI: Velox Staging branch Creation FollowUP#156
Conversation
…branch on target repo
ected_file in cudf directory check
|
Wait... does this PR have the right root in GH? I thought it was just one commit to update the README file, but it's showing all the original implementation commits too... |
| type: boolean | ||
| required: false | ||
| default: true | ||
| pr_label: |
There was a problem hiding this comment.
Nit: pr_labels, since this can be multiple labels.
The same comment applies to other files.
| needs: sync-and-aggregate | ||
| if: inputs.build_and_run_tests != false # Defaults to true for scheduled runs | ||
| uses: ./.github/workflows/velox-test.yml | ||
| velox-create-staging: |
There was a problem hiding this comment.
Per my previous comment, we should also take in parameters for additional non-PR repos/branches that will be merged in. This will be needed (at least, temporarily) for the cuDF exchange changes.
There was a problem hiding this comment.
Added, merge conflict occurs when I tried to pass IBM/velox, ibm-research-preview as additional repo and branch. Relevant CI Run.
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
|
|
||
| # Presto Staging Branch Creator |
There was a problem hiding this comment.
I think this verbose comment should be a --help description.
Same comment applies to other files.
There was a problem hiding this comment.
Out of curiosity, did you consider doing this in python? It may make sense for this level of complexity and allow for easier unit testing.
… on velox-create-staging workflow
mattgara
left a comment
There was a problem hiding this comment.
LGTM, just had a few comments related to quality-of-life. We can move any of them to a follow up PR if they would be too complex to implement in this PR.
| auto_fetch_prs: | ||
| description: Auto-fetch PRs with 'ready-to-merge' label (disable to manually specify PRs) | ||
| description: >- | ||
| Auto-fetch non-draft PRs with 'cudf' label |
There was a problem hiding this comment.
Can this work for a general base repository (for example in the case of Presto?)
We can do this in a follow up PR if not.
There was a problem hiding this comment.
Yes, the auto-fetch PR occurs on the repository that is passed as base_repository, not specific to facebookincubator/velox.
| Target path: ${DEFAULT_TARGET_PATH} | ||
|
|
||
| Examples: | ||
| ./presto/scripts/create_staging.sh --manual-pr-numbers "1,2,3" # Merge specific PRs (required) |
There was a problem hiding this comment.
Typically we have the convention that we expect to call these scripts from their base folder (as in ./create_staging.sh).
However, this script may have to be called from the base of the git repo to work correctly? If so, please document what the expectation is around where the script is to be called from.
There was a problem hiding this comment.
Good catch, but the script could be called from any directory and it should work without any issues. I made examples of calling this script from project root to differentiate between Velox and Presto create_staging.sh scripts.
|
|
||
| exec "${PARENT_SCRIPT}" \ | ||
| --target-path "${DEFAULT_TARGET_PATH}" \ | ||
| --base-repository "prestodb/presto" \ |
There was a problem hiding this comment.
hmm can you please explain why this hard-coded call here?
There was a problem hiding this comment.
Currently, we are only considering PRs raised against the upstream repositories facebookincubator/velox and presto/prestodb. Guess there might be no change in this unless there is a specific requirement, so I hardcoded the base_repo and base_branch inputs in the sub-scripts.
If we want to add an extra layer on top of base_branch, we can pass the branch and repository where it belongs as additional_repository and additional_branch.
| [[ -n "${value}" ]] || die "Missing ${name}. Run the appropriate prior step to set it." | ||
| } | ||
|
|
||
| retry() { |
There was a problem hiding this comment.
Hmm can you please explain which commands we expect might fail initially but suceeed on a retry?
There was a problem hiding this comment.
All the git fetch commands (helps to grab the PRs from upstream remote) are guarded with retry. This should avoid entire workflow to fail if a single git fetch command failed because of network failure which might likely to occur in a runner machine.
There was a problem hiding this comment.
might likely to occur in a runner machine.
Oh interesting, I've never seen a git fetch command fail for me locally, but on a github runner this seems to be a possibility. Good to know, thanks!
| log "All PRs can merge cleanly with ${BASE_BRANCH}." | ||
| } | ||
|
|
||
| test_pairwise_compatibility() { |
There was a problem hiding this comment.
I believe at one point in this PRs history it would output the conflicts formatted as a table/matrix + a separate table that showed the PR #, link to PR and the PR author.
Could we add this as an optional pretty-print please? This would really help visualize the conflicts almost immediately.
| return 0 | ||
| fi | ||
|
|
||
| maybe_confirm_reset |
There was a problem hiding this comment.
Can we document clearly what state we expect the local git repo to be when using this tool? For example, can you have dirty/unchecked files (probably a common case.)
There was a problem hiding this comment.
Good point, documented this point. It expects the target repo in clean state, if not then it automatically notifies the user about the dirty file and instruct them to stash those before continuing.
|
@mattgara I have addressed all of your review comments. Let me know if anything was missed. |
paul-aiyedun
left a comment
There was a problem hiding this comment.
I had a few minor comments, but changes overall look good to me.
| required: false | ||
| default: false | ||
| additional_repository: | ||
| description: 'Additional repository to merge from (e.g., rapidsai/cudf)' |
There was a problem hiding this comment.
Remove the rapidsai/cudf reference?
|
|
||
| additional_merge: | ||
| {{ADDITIONAL_MERGE_SECTION}} | ||
| merged_prs: |
There was a problem hiding this comment.
Nit: Add a new line before this line.
| BASE_REPO="facebookincubator/velox" | ||
| BASE_BRANCH="main" | ||
| TARGET_REPO="" | ||
| TARGET_BRANCH="staging" | ||
| TARGET_PATH="" | ||
| WORK_DIR="velox" |
There was a problem hiding this comment.
If we intend on using this script for both Velox and Presto, then we should not have Velox default values.
| additional_repository: | ||
| description: 'Additional repository to merge from (e.g., IBM/velox for cuDF exchange)' | ||
| type: string | ||
| required: false | ||
| default: 'rapidsai/velox' | ||
| additional_branch: | ||
| description: 'Branch from additional repository to merge (e.g., ibm-research-preview for cuDF exchange)' |
There was a problem hiding this comment.
Remove IBM references.
|
@paul-aiyedun Thanks for your review, I have addressed all of your comments! |
Summary
This PR enhances the staging branch creation workflows with support for merging additional repositories/branches before applying upstream PRs. This enables integration of external feature branches (e.g., cuDF exchange code) into staging branches alongside selected PRs.
Workflow Order