fix(ci): switch pytest distribution from loadscope to worksteal#413
fix(ci): switch pytest distribution from loadscope to worksteal#413dmihalcik-virtru wants to merge 8 commits intoopentdf:mainfrom
Conversation
setup-uv already handles Python installation, making the separate setup-python action redundant. This also removes the pip install step in resolve-versions, which will use uv instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: David Mihalcik <dmihalcik@virtru.com>
Add shfmt configuration for consistent shell script formatting: bash dialect, 2-space indent, space redirects, switch case indent. Apply formatting to all existing shell scripts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: David Mihalcik <dmihalcik@virtru.com>
Rewrite AGENTS.md as comprehensive agent guide with test framework overview, key concepts, debugging workflow, and best practices. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: David Mihalcik <dmihalcik@virtru.com>
Add new Python package for managing Docker services, platform, KAS instances, health checks, log viewing, and environment export. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: David Mihalcik <dmihalcik@virtru.com>
Add new SDK management package for version resolution, artifact installation, and checkout. Refactor shell scripts to thin wrappers delegating to otdf-sdk-mgr. Update CI workflows with step IDs, conditional guards, and otdf-sdk-mgr version resolution. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: David Mihalcik <dmihalcik@virtru.com>
Better CPU utilization for parallel test execution by using worksteal distribution strategy instead of loadscope. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: David Mihalcik <dmihalcik@virtru.com>
|
Summary of ChangesHello @dmihalcik-virtru, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the OpenTDF test infrastructure by migrating core environment and SDK management logic from disparate shell scripts to unified Python-based command-line tools, Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Ignored Files
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a significant amount of new code, including two new CLI tools (otdf-local and otdf-sdk-mgr) and major refactoring of the development and CI workflows. While these changes are a great improvement, the PR's title, fix(ci): switch pytest distribution from loadscope to worksteal, is misleading and does not reflect the vast scope of the changes. A PR of this magnitude should be broken down into smaller, more focused PRs to facilitate easier review and maintain a clean, understandable git history. My review focuses on the code as presented, but I strongly recommend splitting up such large changes in the future.
| if dist_dir.exists(): | ||
| shutil.rmtree(dist_dir) | ||
| # Exit with error so caller knows to fall back to source build | ||
| sys.exit(1) |
There was a problem hiding this comment.
Calling sys.exit(1) from within a library function like install_java_release is generally considered bad practice. It makes the function difficult to test and reuse in other contexts where an immediate exit is not desirable. It would be better to raise a custom exception (e.g., ArtifactNotFoundError) and let the calling CLI code handle the exception and determine the appropriate exit code.
| try: | ||
| import requests | ||
|
|
||
| platform = get_platform_service(settings) | ||
| if platform.is_running(): | ||
| resp = requests.get( | ||
| f"{settings.platform_url}/.well-known/opentdf-configuration", | ||
| timeout=5, | ||
| ) | ||
| if resp.ok: | ||
| config = resp.json() | ||
| if "version" in config: | ||
| env_vars["PLATFORM_VERSION"] = config["version"] | ||
| except Exception: | ||
| # If we can't get the version, that's okay | ||
| pass |
There was a problem hiding this comment.
The requests library is imported and used here, but httpx is already a project dependency and is used elsewhere for HTTP requests. For consistency and to avoid an unnecessary dependency, httpx should be used here instead of requests. Additionally, the import should be moved to the top of the file.
| try: | |
| import requests | |
| platform = get_platform_service(settings) | |
| if platform.is_running(): | |
| resp = requests.get( | |
| f"{settings.platform_url}/.well-known/opentdf-configuration", | |
| timeout=5, | |
| ) | |
| if resp.ok: | |
| config = resp.json() | |
| if "version" in config: | |
| env_vars["PLATFORM_VERSION"] = config["version"] | |
| except Exception: | |
| # If we can't get the version, that's okay | |
| pass | |
| try: | |
| platform = get_platform_service(settings) | |
| if platform.is_running(): | |
| with httpx.Client() as client: | |
| resp = client.get( | |
| f"{settings.platform_url}/.well-known/opentdf-configuration", | |
| timeout=5, | |
| ) | |
| if resp.is_success: | |
| config = resp.json() | |
| if "version" in config: | |
| env_vars["PLATFORM_VERSION"] = config["version"] | |
| except Exception: | |
| # If we can't get the version, that's okay | |
| pass |
| def _do_clean(settings, keep_logs: bool) -> None: | ||
| """Perform cleanup.""" | ||
| import shutil |
There was a problem hiding this comment.
The shutil module is imported inside the _do_clean function. According to Python's style guide (PEP 8), imports should be placed at the top of the file. This improves readability and makes it easier to see the module's dependencies at a glance.
| def _do_clean(settings, keep_logs: bool) -> None: | |
| """Perform cleanup.""" | |
| import shutil | |
| import shutil | |
| def _do_clean(settings, keep_logs: bool) -> None: | |
| """Perform cleanup.""" |
| def resolve(sdk: str, version: str, infix: str | None) -> ResolveResult: | ||
| """Resolve a version spec to a concrete SHA and tag.""" | ||
| sdk_url = SDK_GIT_URLS[sdk] | ||
| try: | ||
| repo = Git() | ||
| if version == "main" or version == "refs/heads/main": | ||
| all_heads = [r.split("\t") for r in repo.ls_remote(sdk_url, heads=True).split("\n")] | ||
| sha, _ = [tag for tag in all_heads if "refs/heads/main" in tag][0] | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": version, | ||
| "head": True, | ||
| "sha": sha, | ||
| "tag": "main", | ||
| } | ||
|
|
||
| if re.match(SHA_REGEX, version): | ||
| ls_remote = [r.split("\t") for r in repo.ls_remote(sdk_url).split("\n")] | ||
| matching_tags = [(sha, tag) for (sha, tag) in ls_remote if sha.startswith(version)] | ||
| if not matching_tags: | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": version[:7], | ||
| "sha": version, | ||
| "tag": version, | ||
| } | ||
| if len(matching_tags) > 1: | ||
| for sha, tag in matching_tags: | ||
| if tag.startswith("refs/pull/"): | ||
| pr_number = tag.split("/")[2] | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": version, | ||
| "head": True, | ||
| "sha": sha, | ||
| "tag": f"pull-{pr_number}", | ||
| } | ||
| for sha, tag in matching_tags: | ||
| mq_match = re.match(MERGE_QUEUE_REGEX, tag) | ||
| if mq_match: | ||
| to_branch = mq_match.group("branch") | ||
| pr_number = mq_match.group("pr_number") | ||
| if to_branch and pr_number: | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": version, | ||
| "head": True, | ||
| "pr": pr_number, | ||
| "sha": sha, | ||
| "tag": f"mq-{to_branch}-{pr_number}", | ||
| } | ||
| suffix = tag.split("refs/heads/gh-readonly-queue/")[-1] | ||
| flattag = "mq--" + suffix.replace("/", "--") | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": version, | ||
| "head": True, | ||
| "sha": sha, | ||
| "tag": flattag, | ||
| } | ||
| head = False | ||
| if tag.startswith("refs/heads/"): | ||
| head = True | ||
| tag = tag.split("refs/heads/")[-1] | ||
| flattag = tag.replace("/", "--") | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": version, | ||
| "head": head, | ||
| "sha": sha, | ||
| "tag": flattag, | ||
| } | ||
|
|
||
| return { | ||
| "sdk": sdk, | ||
| "alias": version, | ||
| "err": ( | ||
| f"SHA {version} points to multiple tags, unable to differentiate: " | ||
| f"{', '.join(tag for _, tag in matching_tags)}" | ||
| ), | ||
| } | ||
| (sha, tag) = matching_tags[0] | ||
| if tag.startswith("refs/tags/"): | ||
| tag = tag.split("refs/tags/")[-1] | ||
| if infix: | ||
| tag = tag.split(f"{infix}/")[-1] | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": version, | ||
| "sha": sha, | ||
| "tag": tag, | ||
| } | ||
|
|
||
| if version.startswith("refs/pull/"): | ||
| merge_heads = [ | ||
| r.split("\t") for r in repo.ls_remote(sdk_url).split("\n") if r.endswith(version) | ||
| ] | ||
| pr_number = version.split("/")[2] | ||
| if not merge_heads: | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": version, | ||
| "err": f"pull request {pr_number} not found in {sdk_url}", | ||
| } | ||
| sha, _ = merge_heads[0] | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": version, | ||
| "head": True, | ||
| "pr": pr_number, | ||
| "sha": sha, | ||
| "tag": f"pull-{pr_number}", | ||
| } | ||
|
|
||
| remote_tags = [r.split("\t") for r in repo.ls_remote(sdk_url).split("\n")] | ||
| all_listed_tags = [ | ||
| (sha, tag.split("refs/tags/")[-1]) for (sha, tag) in remote_tags if "refs/tags/" in tag | ||
| ] | ||
|
|
||
| all_listed_branches = { | ||
| tag.split("refs/heads/")[-1]: sha | ||
| for (sha, tag) in remote_tags | ||
| if tag.startswith("refs/heads/") | ||
| } | ||
|
|
||
| if version in all_listed_branches: | ||
| sha = all_listed_branches[version] | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": version, | ||
| "head": True, | ||
| "sha": sha, | ||
| "tag": version, | ||
| } | ||
|
|
||
| if infix and version.startswith(f"{infix}/"): | ||
| version = version.split(f"{infix}/")[-1] | ||
|
|
||
| listed_tags = all_listed_tags | ||
| if infix: | ||
| listed_tags = [ | ||
| (sha, tag.split(f"{infix}/")[-1]) | ||
| for (sha, tag) in listed_tags | ||
| if f"{infix}/" in tag | ||
| ] | ||
| semver_regex = r"v?\d+\.\d+\.\d+$" | ||
| listed_tags = [(sha, tag) for (sha, tag) in listed_tags if re.search(semver_regex, tag)] | ||
| listed_tags.sort(key=lambda item: list(map(int, item[1].strip("v").split(".")))) | ||
| alias = version | ||
| matching_tags = [] | ||
| if version == "latest": | ||
| # For Java, check if CLI artifacts are available and fall back to source build if not | ||
| if sdk == "java": | ||
| from otdf_sdk_mgr.registry import list_java_github_releases | ||
|
|
||
| gh_releases = list_java_github_releases() | ||
| # Find the latest version with CLI artifact | ||
| versions_with_cli = [r for r in gh_releases if r.get("has_cli", False)] | ||
| if versions_with_cli: | ||
| # Use the latest version that has CLI | ||
| latest_with_cli_tag = versions_with_cli[-1]["version"] | ||
| matching_tags = [ | ||
| (sha, tag) | ||
| for (sha, tag) in listed_tags | ||
| if tag in [latest_with_cli_tag, latest_with_cli_tag.lstrip("v")] | ||
| ] | ||
| if not matching_tags: | ||
| # No versions with CLI found, fall back to building latest from source | ||
| sha, tag = listed_tags[-1] | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": alias, | ||
| "head": True, # Mark as head to trigger source checkout | ||
| "sha": sha, | ||
| "tag": tag, | ||
| } | ||
| else: | ||
| matching_tags = listed_tags[-1:] | ||
| else: | ||
| if version == "lts": | ||
| version = LTS_VERSIONS[sdk] | ||
| matching_tags = [ | ||
| (sha, tag) for (sha, tag) in listed_tags if tag in [version, f"v{version}"] | ||
| ] | ||
| if not matching_tags: | ||
| raise ValueError(f"Tag [{version}] not found in [{sdk_url}]") | ||
| sha, tag = matching_tags[-1] | ||
| release = tag | ||
| if infix: | ||
| release = f"{infix}/{release}" | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": alias, | ||
| "release": release, | ||
| "sha": sha, | ||
| "tag": tag, | ||
| } | ||
| except Exception as e: | ||
| return { | ||
| "sdk": sdk, | ||
| "alias": version, | ||
| "err": f"Error resolving version {version} for {sdk}: {e}", | ||
| } |
There was a problem hiding this comment.
The resolve function is quite long and complex, handling many different cases for version resolution. To improve readability and maintainability, consider refactoring this function into several smaller, more focused helper functions. For example, you could have separate helpers for resolving SHAs, branches, and special tags like 'latest' or 'lts'.
| #!/usr/bin/env python3 | ||
| # Use: python3 resolve-version.py <sdk> <tag...> | ||
| # | ||
| # Tag can be: | ||
| # main: the main branch | ||
| # latest: the latest release of the app (last tag) | ||
| # lts: one of a list of hard-coded 'supported' versions | ||
| # <sha>: a git SHA | ||
| # v0.1.2: a git tag that is a semantic version | ||
| # refs/pull/1234: a pull request ref | ||
| # | ||
| # The script will resolve the tags to their git SHAs and return it and other metadata in a JSON formatted list of objects. | ||
| # Fields of the object will be: | ||
| # sdk: the SDK name | ||
| # alias: the tag that was requested | ||
| # head: true if the tag is a head of a live branch | ||
| # tag: the resolved tag or branch name, if found | ||
| # sha: the current git SHA of the tag | ||
| # err: an error message if the tag could not be resolved, or resolved to multiple items | ||
| # pr: if set, the pr number associated with the tag | ||
| # release: if set, the release page for the tag | ||
| # | ||
| # The script will also check for duplicate SHAs and remove them from the output. | ||
| # | ||
| # Sample Input: | ||
| # | ||
| # python3 resolve-version.py go 0.15.0 latest decaf01 unreleased-name | ||
| # | ||
| # Sample Output: | ||
| # ```json | ||
| # [ | ||
| # { | ||
| # "sdk": "go", | ||
| # "alias": "0.15.0", | ||
| # "env": "ADDITIONAL_OPTION=per build metadata", | ||
| # "release": "v0.15.0", | ||
| # "sha": "a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0", | ||
| # "tag": "v0.15.0" | ||
| # }, | ||
| # { | ||
| # "sdk": "go", | ||
| # "alias": "latest", | ||
| # "release": "v0.15.1", | ||
| # "sha": "c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8s9t0a1b2", | ||
| # "tag": "v0.15.1" | ||
| # }, | ||
| # { | ||
| # "sdk": "go", | ||
| # "alias": "decaf01", | ||
| # "head": true, | ||
| # "pr": "1234", | ||
| # "sha": "decaf016g7h8i9j0k1l2m3n4o5p6q7r8s9t0a1b2", | ||
| # "tag": "refs/pull/1234/head" | ||
| # }, | ||
| # { | ||
| # "sdk": "go", | ||
| # "err": "not found", | ||
| # "tag": "unreleased-name" | ||
| # } | ||
| # ] | ||
| # ``` | ||
| """Backward-compatible wrapper. Use `otdf-sdk-mgr versions resolve` instead.""" |
There was a problem hiding this comment.
The docstring for this file is now inaccurate. The file has been refactored into a backward-compatible wrapper that calls the new otdf-sdk-mgr tool. The docstring should be updated to reflect this change, indicating that it's a wrapper and that otdf-sdk-mgr versions resolve is the preferred command for new use cases.




Summary
--dist loadscopeto--dist workstealin 3 pytest invocations in xtest.ymlParent PRs
feat/otdf-sdk-mgr)Test plan
loadscopereferences in xtest.yml🤖 Generated with Claude Code
Part of stacked PR series decomposing
chore/the-claudiest-day-tmux