
Conversation

@frostebite (Member) commented Sep 8, 2025

Changes

  • New Features
    • Support for LocalStack (local AWS emulator) and service-specific AWS endpoints.
    • Experimental rclone storage provider with built-in cache/build hooks.
    • Windows execution path for local runs; Kubernetes endpoint normalization.
    • Experimental plugin system for loading a Cloud Runner provider from an executable file or a git repository
  • Improvements
    • More robust AWS interactions (retry/backoff), logging, caching, and repo checkout.
    • Container-aware workflow behavior and consistent environment handling.
  • Documentation
    • Added guides for LocalStack usage and rclone storage.
  • Chores
    • New integrity workflows; removed legacy Cloud Runner pipeline.
    • CI uses dedicated Jest config and script.
    • Lint configuration update.

Related Issues

  • ...

Related PRs

  • ...

Successful Workflow Run Link

PRs don't have access to secrets, so you will need to provide a link to a successful run of the workflows from your own repository.

  • ...

Checklist

  • Read the contribution guide and accept the
    code of conduct
  • Docs (if new inputs or outputs were added, or behavior changed in a way that should be documented, please make a PR in the documentation repo)
  • Readme (updated or not needed)
  • Tests (added, updated or not needed)

Summary by CodeRabbit

  • New Features

    • Dynamic provider loading with local caching, provider-source parsing, centralized AWS client factory, rclone-backed cache hooks, Windows command handling, and CI-friendly test runner.
  • Bug Fixes / Reliability

    • LocalStack-aware routing, AWS retry/throttling resilience, backend-agnostic workspace locking, disk-space-aware caching, richer pod diagnostics, and optional/non-fatal hook handling.
  • Tests

    • CI-specific Jest config, environment-gated unit/e2e tests, and boolean BuildSucceeded reporting.
  • Documentation

    • Provider loader README added.
  • Chores

    • Legacy CI pipeline removed.


- Implemented a primary attempt to pull LFS files using GIT_PRIVATE_TOKEN.
- Added a fallback mechanism to use GITHUB_TOKEN if the initial attempt fails.
- Configured git to replace SSH and HTTPS URLs with token-based authentication for the fallback.
- Improved error handling to log specific failure messages for both token attempts.

This change ensures more robust handling of LFS file retrieval in various authentication scenarios.
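
A minimal sketch of that fallback flow (illustrative only; `run` stands in for the project's shell helper such as CloudRunnerSystem.Run, and the exact URL-rewrite keys may differ in the real remote client):

// Sketch: pull LFS files with GIT_PRIVATE_TOKEN first, fall back to GITHUB_TOKEN.
async function pullLfsWithFallback(run: (cmd: string) => Promise<string>): Promise<void> {
  const { GIT_PRIVATE_TOKEN, GITHUB_TOKEN } = process.env;
  try {
    if (!GIT_PRIVATE_TOKEN) throw new Error('GIT_PRIVATE_TOKEN is not set');
    // Primary attempt: rewrite HTTPS GitHub URLs to embed the private token.
    await run(`git config --global url."https://${GIT_PRIVATE_TOKEN}@github.com/".insteadOf "https://github.com/"`);
    await run('git lfs pull');
  } catch (privateTokenError) {
    if (!GITHUB_TOKEN) throw privateTokenError;
    // Fallback: clear the previous rewrite (if any) and retry with the workflow token.
    if (GIT_PRIVATE_TOKEN) {
      await run(`git config --global --unset-all url."https://${GIT_PRIVATE_TOKEN}@github.com/".insteadOf || true`);
    }
    await run(`git config --global url."https://${GITHUB_TOKEN}@github.com/".insteadOf "https://github.com/"`);
    await run('git lfs pull');
  }
}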
- Added permissions for packages, pull-requests, statuses, and id-token to enhance workflow capabilities.
- This change improves the CI pipeline's ability to manage pull requests and access necessary resources.
…ation

- Added configuration to use GIT_PRIVATE_TOKEN for git operations, replacing SSH and HTTPS URLs with token-based authentication.
- Improved error handling to ensure GIT_PRIVATE_TOKEN availability before attempting to pull LFS files.
- This change streamlines the process of pulling LFS files in environments requiring token authentication.
…entication

- Enhanced the process of configuring git to use GIT_PRIVATE_TOKEN and GITHUB_TOKEN by clearing existing URL configurations before setting new ones.
- Improved the clarity of the URL replacement commands for better readability and maintainability.
- This change ensures a more robust setup for pulling LFS files in environments requiring token authentication.
… pipeline

- Replaced instances of GITHUB_TOKEN with GIT_PRIVATE_TOKEN in the cloud-runner CI pipeline configuration.
- This change ensures consistent use of token-based authentication across various jobs in the workflow, enhancing security and functionality.
…L unsetting

- Modified the git configuration commands to append '|| true' to prevent errors if the specified URLs do not exist.
- This change enhances the reliability of the URL clearing process in the RemoteClient class, ensuring smoother execution during token-based authentication setups.
…tion

- Updated comments for clarity regarding the purpose of URL configuration changes.
- Simplified the git configuration commands by removing redundant lines while maintaining functionality for HTTPS token-based authentication.
- This change enhances the readability and maintainability of the RemoteClient class's git setup process.
# Conflicts:
#	dist/index.js
#	dist/index.js.map
#	jest.config.js
#	yarn.lock
…logs; tests: retained workspace AWS assertion (#381)
…nd log management; update builder path logic based on provider strategy
…sed on provider strategy and credentials; update binary files
…ained markers; hooks: include AWS S3 hooks on aws provider
…t:ci script; fix(windows): skip grep-based version regex tests; logs: echo CACHE_KEY/retained markers; hooks: include AWS hooks on aws provider

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (6)
src/model/cloud-runner/providers/k8s/kubernetes-pods.ts (2)

13-19: Wrap events fetch in try-catch to prevent masking pod failures.

If listNamespacedEvent fails (e.g., RBAC restrictions, API timeout), the exception will propagate and obscure the actual pod failure. Since events are diagnostic-only, a failure here shouldn't prevent reporting the pod's failed state.

This issue was previously identified in past review comments. Consider implementing the suggested fix:

-      const events = (await kubeClient.listNamespacedEvent(namespace)).body.items
-        .filter((x) => x.involvedObject?.name === podName)
-        .map((x) => ({
-          message: x.message || '',
-          reason: x.reason || '',
-          type: x.type || '',
-        }));
+      let events: Array<{ message: string; reason: string; type: string }> = [];
+      try {
+        events = (await kubeClient.listNamespacedEvent(namespace)).body.items
+          .filter((x) => x.involvedObject?.name === podName)
+          .map((x) => ({
+            message: x.message || '',
+            reason: x.reason || '',
+            type: x.type || '',
+          }));
+      } catch (eventError) {
+        CloudRunnerLogger.logWarning(`Failed to fetch pod events: ${eventError}`);
+      }

35-84: Multi-container logic flaw: only the last container's exit status is retained.

The loop at lines 39-57 overwrites containerExitCode and containerSucceeded on each iteration. Consequently, the check at line 70 (containerSucceeded && containerExitCode === 0) only verifies whether the last container succeeded, ignoring failures in earlier containers.

Scenario: If a pod has multiple containers where the last one exits with code 0 but an earlier one failed, the code will incorrectly return false at line 83, treating the pod as successful and bypassing the error throw at line 130.

This issue was previously identified in past review comments. Consider the suggested fix to check all containers:

-      let containerExitCode: number | undefined;
-      let containerSucceeded = false;
+      let allContainersSucceeded = true;
+      let anyContainerExitCode: number | undefined;
       
       if (containerStatuses.length > 0) {
         containerStatuses.forEach((cs, idx) => {
           if (cs.state?.waiting) {
             errorDetails.push(
               `Container ${idx} (${cs.name}) waiting: ${cs.state.waiting.reason} - ${cs.state.waiting.message || ''}`,
             );
           }
           if (cs.state?.terminated) {
             const exitCode = cs.state.terminated.exitCode;
-            containerExitCode = exitCode;
+            anyContainerExitCode = exitCode;
             if (exitCode === 0) {
-              containerSucceeded = true;
+              // Keep checking all containers
+            } else {
+              allContainersSucceeded = false;
             }
             errorDetails.push(
               `Container ${idx} (${cs.name}) terminated: ${cs.state.terminated.reason} - ${
                 cs.state.terminated.message || ''
               } (exit code: ${exitCode})`,
             );
           }
         });
       }

Then update line 70 to: if (allContainersSucceeded && anyContainerExitCode === 0)
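
For illustration, the all-containers check could be factored into a small helper along these lines (a sketch only, built on the V1ContainerStatus type from @kubernetes/client-node):

import { V1ContainerStatus } from '@kubernetes/client-node';

// Sketch: true only if no terminated container has a non-zero exit code.
// Also appends one human-readable detail line per waiting/terminated container.
function allContainersSucceeded(statuses: V1ContainerStatus[], errorDetails: string[]): boolean {
  let succeeded = true;
  statuses.forEach((cs, idx) => {
    if (cs.state?.waiting) {
      errorDetails.push(`Container ${idx} (${cs.name}) waiting: ${cs.state.waiting.reason || ''}`);
    }
    if (cs.state?.terminated) {
      const exitCode = cs.state.terminated.exitCode;
      if (exitCode !== 0) {
        succeeded = false;
      }
      errorDetails.push(`Container ${idx} (${cs.name}) terminated with exit code ${exitCode}`);
    }
  });
  return succeeded;
}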

src/model/cloud-runner/remote-client/caching.ts (4)

153-171: retrySucceeded flag is unreachable noise; simplify the retry block.

If the retry tar command throws, the inner catch already throws a new error, so execution never reaches the if (!retrySucceeded) check. If it succeeds, retrySucceeded is true, so the condition is false. The flag and subsequent if are therefore dead code.

You can simplify to just a try/catch:

-              // Retry the tar operation once after cleanup
-              let retrySucceeded = false;
               try {
                 await CloudRunnerSystem.Run(
                   `tar -cf ${cacheArtifactName}.tar${compressionSuffix} "${path.basename(sourceFolder)}"`,
                 );
-                // If retry succeeds, mark it - we'll continue normally without throwing
-                retrySucceeded = true;
               } catch (retryError: any) {
                 throw new Error(
                   `Failed to create cache archive after cleanup. Original error: ${errorMessage}. Retry error: ${
                     retryError?.message || retryError
                   }`,
                 );
               }
-              // If retry succeeded, don't throw the original error - let execution continue after catch block
-              if (!retrySucceeded) {
-                throw error;
-              }
-              // If we get here, retry succeeded - execution will continue after the catch block

96-107: Restrict cache cleanup find scope to avoid deleting unrelated .tar files.

Using cacheParent = path.dirname(cacheFolder) (and later cacheRoot = path.resolve(cacheParent, '..')) means the cleanup can target . or even /, and the pattern -name "*.tar*" will match any .tar-like file under those trees. On shared runners this risks deleting non-cache archives or other .tar data.

Consider tightening this by:

  • Limiting cleanup to a well-defined cache root (e.g., cacheFolder or a dedicated cache base), not its parent or grandparent.
  • Using a stricter filename pattern (e.g., known cache prefixes / suffixes) instead of "*.tar*" so only your cache archives are affected.
  • Optionally making the TTLs (360/60/30 minutes) configurable rather than hard-coded.

Also applies to: 135-150


104-107: Shell commands interpolate unvalidated paths/artifact names, leaving command‑injection risk.

cacheArtifactName, cacheFolder, cacheParent, and cacheRoot flow from CLI/env and are interpolated directly into shell commands (rm, tar, find, mv) without robust validation. The .replace(' ', '') on cacheArtifactName does not prevent shell metacharacters like ;, &, |, $, or backticks. A crafted artifact name or cache path could break out of the intended command and execute arbitrary shell.

To harden this, either:

  • Prefer Node APIs (fs.promises.rm, fs.promises.readdir/stat, a tar library, etc.) with path.join/path.resolve instead of shell rm/find/tar/mv, or
  • Introduce strict validation/quoting for any value interpolated into shell strings (e.g., whitelisting [A-Za-z0-9._/-] and rejecting anything else, or wrapping via a dedicated quoting helper).

For example, a minimal mitigation pattern:

+const safeShellArg = (value: string): string => {
+  if (!/^[A-Za-z0-9._\/-]+$/.test(value)) {
+    throw new Error(`Invalid shell argument: ${value}`);
+  }
+  return value;
+};
...
-        await CloudRunnerSystem.Run(`rm -f ${cacheArtifactName}.tar${compressionSuffix} 2>/dev/null || true`);
+        await CloudRunnerSystem.Run(
+          `rm -f ${safeShellArg(cacheArtifactName)}.tar${compressionSuffix} 2>/dev/null || true`,
+        );
...
-              `find ${cacheParent} -name "*.tar*" -type f -mmin +360 -delete 2>/dev/null || true`,
+              `find ${safeShellArg(cacheParent)} -name "*.tar*" -type f -mmin +360 -delete 2>/dev/null || true`,

Similar treatment should be applied to the other find, tar, and mv invocations that include user-controlled values (including existing ones outside this diff).

Also applies to: 117-120, 125-127, 135-150, 155-157


129-132: Disk‑space classification based only on error message strings is brittle.

The check:

if (errorMessage.includes('No space left') || errorMessage.includes('Wrote only')) {

depends on specific tar/shell message text, which may vary by platform, locale, or implementation. This risks missing genuine ENOSPC conditions or misclassifying other errors.

Prefer also keying off structured error data where available (e.g., exit code or error.code === 'ENOSPC', or an explicit flag from CloudRunnerSystem.Run), and include other stable tokens like 'ENOSPC' in the message check.
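
A sketch of such a combined check (assuming the caught error may expose a Node.js errno-style code property; adjust to whatever CloudRunnerSystem.Run actually surfaces):

// Illustrative only: prefer structured error data, fall back to message heuristics.
function isDiskSpaceError(error: any): boolean {
  const message = String(error?.message ?? error);
  return (
    error?.code === 'ENOSPC' || // errno code set on Node.js fs/child_process errors
    message.includes('ENOSPC') ||
    message.includes('No space left') ||
    message.includes('Wrote only')
  );
}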

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c216e3b and f783857.

⛔ Files ignored due to path filters (2)
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (2)
  • src/model/cloud-runner/providers/k8s/kubernetes-pods.ts (1 hunks)
  • src/model/cloud-runner/remote-client/caching.ts (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: StandaloneOSX on 6000.0.36f1
  • GitHub Check: iOS on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 2021.3.45f1
  • GitHub Check: iOS on 2021.3.45f1
  • GitHub Check: WebGL on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: StandaloneOSX on 2023.2.2f1
  • GitHub Check: iOS on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: Android on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2022.3.13f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 2022.3.13f1
  • GitHub Check: StandaloneOSX on 2021.3.32f1
  • GitHub Check: iOS on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: Cloud Runner Integrity / Cloud Runner Tests (K8s)
  • GitHub Check: Cloud Runner Integrity / Cloud Runner Tests (LocalStack)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (11)
src/model/cloud-runner/providers/k8s/kubernetes-pods.ts (3)

13-19: Wrap events fetch in try-catch to prevent masking pod failures.

If listNamespacedEvent fails (e.g., RBAC restrictions, API timeout), the exception will propagate and obscure the actual pod failure. Since events are diagnostic-only, a failure here shouldn't prevent reporting the pod's failed state.

Apply this diff to wrap the events fetch:

-      const events = (await kubeClient.listNamespacedEvent(namespace)).body.items
-        .filter((x) => x.involvedObject?.name === podName)
-        .map((x) => ({
-          message: x.message || '',
-          reason: x.reason || '',
-          type: x.type || '',
-        }));
+      let events: Array<{ message: string; reason: string; type: string }> = [];
+      try {
+        events = (await kubeClient.listNamespacedEvent(namespace)).body.items
+          .filter((x) => x.involvedObject?.name === podName)
+          .map((x) => ({
+            message: x.message || '',
+            reason: x.reason || '',
+            type: x.type || '',
+          }));
+      } catch (eventError) {
+        CloudRunnerLogger.logWarning(`Failed to fetch pod events: ${eventError}`);
+      }

35-58: Multi-container logic flaw: only the last container's exit status is retained.

The loop iterates over all container statuses but overwrites containerExitCode and containerSucceeded on each iteration. If a pod has multiple containers and the last one succeeded but an earlier one failed, the code will incorrectly treat the pod as successful and skip error throwing at line 147.

Consider checking if all containers succeeded:

-      let containerExitCode: number | undefined;
-      let containerSucceeded = false;
+      let allContainersSucceeded = true;
+      let anyContainerExitCode: number | undefined;
+      let anyContainerFailed = false;
       
       if (containerStatuses.length > 0) {
         containerStatuses.forEach((cs, idx) => {
           if (cs.state?.waiting) {
             errorDetails.push(
               `Container ${idx} (${cs.name}) waiting: ${cs.state.waiting.reason} - ${cs.state.waiting.message || ''}`,
             );
           }
           if (cs.state?.terminated) {
             const exitCode = cs.state.terminated.exitCode;
-            containerExitCode = exitCode;
+            anyContainerExitCode = exitCode;
             if (exitCode === 0) {
-              containerSucceeded = true;
+              // Keep checking all containers
+            } else {
+              allContainersSucceeded = false;
+              anyContainerFailed = true;
             }
             errorDetails.push(
               `Container ${idx} (${cs.name}) terminated: ${cs.state.terminated.reason} - ${
                 cs.state.terminated.message || ''
               } (exit code: ${exitCode})`,
             );
           }
         });
       }

Then update line 71 to use allContainersSucceeded && !anyContainerFailed and refactor the subsequent logic accordingly.


89-123: Retry logic only checks the first container, inconsistent with multi-container handling.

Line 101 explicitly accesses updatedPod.status.containerStatuses[0], checking only the first container. This is inconsistent with the earlier loop (lines 39-57) that iterates over all containers, and it assumes single-container pods.

Issue: In multi-container pods, if the first container succeeds (exit code 0) but another container fails, the retry logic at line 108 will return false, incorrectly treating the pod as successful.

Consider iterating over all container statuses in the retry logic:

           const updatedPod = (
             await kubeClient.listNamespacedPod(namespace)
           ).body.items.find((x) => podName === x.metadata?.name);
           if (updatedPod?.status?.containerStatuses && updatedPod.status.containerStatuses.length > 0) {
-            const updatedContainerStatus = updatedPod.status.containerStatuses[0];
-            if (updatedContainerStatus.state?.terminated) {
-              const updatedExitCode = updatedContainerStatus.state.terminated.exitCode;
-              if (updatedExitCode === 0) {
-                CloudRunnerLogger.logWarning(
-                  `Pod ${podName} container succeeded (exit code 0) after waiting. PreStopHook failure is non-fatal.`,
-                );
-                return false; // Pod is not running, but container succeeded
-              } else {
-                CloudRunnerLogger.log(
-                  `Pod ${podName} container failed with exit code ${updatedExitCode} after waiting.`,
-                );
-                errorDetails.push(`Container terminated after wait: exit code ${updatedExitCode}`);
-                containerExitCode = updatedExitCode;
-                containerSucceeded = false;
-                break;
+            let allSucceeded = true;
+            let anyTerminated = false;
+            for (const cs of updatedPod.status.containerStatuses) {
+              if (cs.state?.terminated) {
+                anyTerminated = true;
+                const exitCode = cs.state.terminated.exitCode;
+                if (exitCode !== 0) {
+                  allSucceeded = false;
+                  errorDetails.push(
+                    `Container (${cs.name}) terminated after wait: exit code ${exitCode}`,
+                  );
+                }
               }
             }
+            if (anyTerminated) {
+              if (allSucceeded) {
+                CloudRunnerLogger.logWarning(
+                  `Pod ${podName} all containers succeeded after waiting. PreStopHook failure is non-fatal.`,
+                );
+                return false;
+              }
+              break; // At least one container failed
+            }
           }
.github/workflows/cloud-runner-integrity.yml (7)

274-292: Replace GITHUB_TOKEN override with default job token (LocalStack job: all 10 test steps).

All test steps in the localstack job override GITHUB_TOKEN with ${{ secrets.GIT_PRIVATE_TOKEN }}, violating least-privilege. Keep GIT_PRIVATE_TOKEN as a separate variable; let the code's fallback logic use it. Replace across all steps at lines 291–292, 315–316, 339–340, 363–364, 387–388, 411–412, 435–436, 459–460, 483–484, and 507–508.

Example diff for the first test step:

       - name: Run cloud-runner-end2end-locking test
         timeout-minutes: 60
         run: yarn run test "cloud-runner-end2end-locking" --detectOpenHandles --forceExit --runInBand
         env:
           UNITY_EMAIL: ${{ secrets.UNITY_EMAIL }}
           UNITY_PASSWORD: ${{ secrets.UNITY_PASSWORD }}
           UNITY_SERIAL: ${{ secrets.UNITY_SERIAL }}
           PROJECT_PATH: test-project
           TARGET_PLATFORM: StandaloneWindows64
           cloudRunnerTests: true
           versioning: None
           KUBE_STORAGE_CLASS: local-path
           PROVIDER_STRATEGY: aws
           AWS_ACCESS_KEY_ID: test
           AWS_SECRET_ACCESS_KEY: test
           AWS_ENDPOINT: http://localhost:4566
           AWS_ENDPOINT_URL: http://localhost:4566
           GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
-          GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
+          GITHUB_TOKEN: ${{ github.token }}

Apply the same change to all remaining test steps in the localstack job.


274-292: Add AWS_S3_FORCE_PATH_STYLE and AWS_EC2_METADATA_DISABLED to all LocalStack test steps.

The k8s job (lines 157–158, 197–198, 237–238) correctly includes these variables for LocalStack compatibility. All 10 test steps in the localstack job are missing them, leading to inconsistent behavior. Add to all test steps in the localstack job.

Starting with the first test step (cloud-runner-end2end-locking):

       - name: Run cloud-runner-end2end-locking test
         timeout-minutes: 60
         run: yarn run test "cloud-runner-end2end-locking" --detectOpenHandles --forceExit --runInBand
         env:
           UNITY_EMAIL: ${{ secrets.UNITY_EMAIL }}
           UNITY_PASSWORD: ${{ secrets.UNITY_PASSWORD }}
           UNITY_SERIAL: ${{ secrets.UNITY_SERIAL }}
           PROJECT_PATH: test-project
           TARGET_PLATFORM: StandaloneWindows64
           cloudRunnerTests: true
           versioning: None
           KUBE_STORAGE_CLASS: local-path
           PROVIDER_STRATEGY: aws
           AWS_ACCESS_KEY_ID: test
           AWS_SECRET_ACCESS_KEY: test
           AWS_ENDPOINT: http://localhost:4566
           AWS_ENDPOINT_URL: http://localhost:4566
+          AWS_S3_FORCE_PATH_STYLE: 'true'
+          AWS_EC2_METADATA_DISABLED: 'true'
           GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
           GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}

Apply the same additions (AWS_S3_FORCE_PATH_STYLE: 'true' and AWS_EC2_METADATA_DISABLED: 'true') to the env blocks of all remaining test steps in the localstack job (lines 298–316, 322–340, 346–364, 370–388, 394–412, 418–436, 442–460, 466–484, 490–508).


135-160: Replace GITHUB_TOKEN override with default job token (k8s job: first test).

Setting GITHUB_TOKEN to ${{ secrets.GIT_PRIVATE_TOKEN }} overrides the runner-provided token and exposes a broader-scope PAT. Keep GIT_PRIVATE_TOKEN separate if private-repo access is needed; let the code's fallback logic handle it.

Apply this diff:

       - name: Run cloud-runner-end2end-caching test
         timeout-minutes: 60
         run: yarn run test "cloud-runner-end2end-caching" --detectOpenHandles --forceExit --runInBand
         env:
           UNITY_EMAIL: ${{ secrets.UNITY_EMAIL }}
           UNITY_PASSWORD: ${{ secrets.UNITY_PASSWORD }}
           UNITY_SERIAL: ${{ secrets.UNITY_SERIAL }}
           PROJECT_PATH: test-project
           TARGET_PLATFORM: StandaloneWindows64
           cloudRunnerTests: true
           versioning: None
           KUBE_STORAGE_CLASS: local-path
           PROVIDER_STRATEGY: k8s
           containerCpu: '512'
           containerMemory: '512'
           AWS_ACCESS_KEY_ID: test
           AWS_SECRET_ACCESS_KEY: test
           AWS_S3_ENDPOINT: http://localhost:4566
           AWS_ENDPOINT: http://localhost:4566
           INPUT_AWSS3ENDPOINT: http://localhost:4566
           INPUT_AWSENDPOINT: http://localhost:4566
           AWS_S3_FORCE_PATH_STYLE: 'true'
           AWS_EC2_METADATA_DISABLED: 'true'
           GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
-          GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
+          GITHUB_TOKEN: ${{ github.token }}

176-200: Replace GITHUB_TOKEN override in retaining test (k8s job).

Same token concern as caching test. Use the default job token instead.

Apply this diff:

       - name: Run cloud-runner-end2end-retaining test
         timeout-minutes: 60
         run: yarn run test "cloud-runner-end2end-retaining" --detectOpenHandles --forceExit --runInBand
         env:
           UNITY_EMAIL: ${{ secrets.UNITY_EMAIL }}
           UNITY_PASSWORD: ${{ secrets.UNITY_PASSWORD }}
           UNITY_SERIAL: ${{ secrets.UNITY_SERIAL }}
           PROJECT_PATH: test-project
           TARGET_PLATFORM: StandaloneWindows64
           cloudRunnerTests: true
           versioning: None
           KUBE_STORAGE_CLASS: local-path
           PROVIDER_STRATEGY: k8s
           containerCpu: '512'
           containerMemory: '512'
           AWS_ACCESS_KEY_ID: test
           AWS_SECRET_ACCESS_KEY: test
           AWS_S3_ENDPOINT: http://localhost:4566
           AWS_ENDPOINT: http://localhost:4566
           INPUT_AWSS3ENDPOINT: http://localhost:4566
           INPUT_AWSENDPOINT: http://localhost:4566
           AWS_S3_FORCE_PATH_STYLE: 'true'
           AWS_EC2_METADATA_DISABLED: 'true'
           GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
-          GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
+          GITHUB_TOKEN: ${{ github.token }}

216-240: Replace GITHUB_TOKEN override in hooks test (k8s job).

Same token concern. Use the default job token instead.

Apply this diff:

       - name: Run cloud-runner-hooks test
         timeout-minutes: 60
         run: yarn run test "cloud-runner-hooks" --detectOpenHandles --forceExit --runInBand
         env:
           UNITY_EMAIL: ${{ secrets.UNITY_EMAIL }}
           UNITY_PASSWORD: ${{ secrets.UNITY_PASSWORD }}
           UNITY_SERIAL: ${{ secrets.UNITY_SERIAL }}
           PROJECT_PATH: test-project
           TARGET_PLATFORM: StandaloneWindows64
           cloudRunnerTests: true
           versioning: None
           KUBE_STORAGE_CLASS: local-path
           PROVIDER_STRATEGY: k8s
           containerCpu: '512'
           containerMemory: '512'
           AWS_ACCESS_KEY_ID: test
           AWS_SECRET_ACCESS_KEY: test
           AWS_S3_ENDPOINT: http://localhost:4566
           AWS_ENDPOINT: http://localhost:4566
           INPUT_AWSS3ENDPOINT: http://localhost:4566
           INPUT_AWSENDPOINT: http://localhost:4566
           AWS_S3_FORCE_PATH_STYLE: 'true'
           AWS_EC2_METADATA_DISABLED: 'true'
           GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
-          GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
+          GITHUB_TOKEN: ${{ github.token }}

70-73: Add region-aware S3 bucket creation for LocalStack.

S3 requires LocationConstraint outside us-east-1. The current code doesn't account for this and will silently fail.

Apply this diff:

       - name: Create S3 bucket for tests (host LocalStack)
         run: |
-          awslocal s3 mb s3://$AWS_STACK_NAME || aws --endpoint-url=http://localhost:4566 s3 mb s3://$AWS_STACK_NAME || true
+          if [ "${AWS_REGION:-us-east-1}" = "us-east-1" ]; then
+            awslocal s3api create-bucket --bucket "$AWS_STACK_NAME" || aws --endpoint-url=http://localhost:4566 s3api create-bucket --bucket "$AWS_STACK_NAME" || true
+          else
+            awslocal s3api create-bucket --bucket "$AWS_STACK_NAME" --create-bucket-configuration LocationConstraint="${AWS_REGION}" || aws --endpoint-url=http://localhost:4566 s3api create-bucket --bucket "$AWS_STACK_NAME" --create-bucket-configuration LocationConstraint="${AWS_REGION}" || true
+          fi
           awslocal s3 ls || aws --endpoint-url=http://localhost:4566 s3 ls || echo "S3 bucket listing completed"

40-43: Pin k3d version and avoid curl | bash installation.

Line 42 uses an unpinned install script, which poses a supply-chain risk. Replace with a pinned binary download:

      - name: Install k3d
        run: |
-         curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
+         set -euo pipefail
+         K3D_VERSION=v5.8.3
+         curl -fsSL -o /usr/local/bin/k3d "https://github.com/k3d-io/k3d/releases/download/${K3D_VERSION}/k3d-linux-amd64"
+         chmod +x /usr/local/bin/k3d
          k3d version | cat

Consider storing K3D_VERSION in the workflow env section for easier updates.

src/model/cloud-runner/remote-client/caching.ts (1)

96-188: Command injection risk, over‑broad cleanup scope, fragile ENOSPC detection, and dead retrySucceeded logic.

Several concerns in this block:

  1. Command injection & unsafe interpolation

    • Interpolated values like cacheFolder, cacheArtifactName, compressionSuffix, cacheParent, and cacheRoot are inserted directly into shell strings for find, rm, tar, mkdir -p, etc. (e.g., Lines 100–107, 119, 125–127, 139–142, 147–149).
    • They ultimately come from CLI/options/environment and can contain spaces or shell metacharacters, making this a command‑injection risk. The current .replace(' ', '') only strips the first space and doesn’t prevent other dangerous characters.

    Suggestion: Refactor this logic to avoid building raw shell command strings:

    • Prefer Node’s fs.promises and path for deletes and directory cleanup instead of find/rm.
    • If you absolutely must call shell utilities, expose a CloudRunnerSystem.runCommand(cmd: string, args: string[]) (or similar) that uses spawn/execFile without going through a shell, and validate all user‑controlled inputs with a strict whitelist (e.g. /^[A-Za-z0-9._-]+$/ for artifact names and a vetted set of cache base directories).
    • At minimum, validate cacheFolder/cacheParent/cacheRoot against an allow‑listed base path before using them in destructive commands.
  2. Over‑broad cleanup scope → potential data loss

    • Cleanup uses cacheParent = path.dirname(cacheFolder) and cacheRoot = path.resolve(cacheParent, '..'), then runs find ${cacheParent} ... -delete and find ${cacheRoot} ... -delete on "*.tar*" (Lines 100–107, 139–142, 144–149).
    • If cacheFolder is relative (e.g., 'cache'), cacheParent may be '.' and cacheRoot may be the project root, so you may end up deleting any .tar* file in the working directory and its parent, not just cache artifacts.
    • This is the same “overly broad cleanup scope” concern raised in the previous review and remains unresolved.

    Suggestion: Restrict deletion strictly to the configured cache directory tree and the expected filename pattern:

    • Use cacheFolder (not its parent) as the search root.
    • Match a more specific pattern like "${cacheArtifactName}.tar${compressionSuffix}" or a well‑defined prefix (e.g., unity-cache-*.tar${compressionSuffix}), not all *.tar*.
    • Consider a configurable retention policy with a clear cap on the number/age of cache entries.
  3. Fragile disk‑space error detection

    • The retry path checks only errorMessage.includes('No space left') || errorMessage.includes('Wrote only') (Line 131). Different platforms/tar implementations may use different messages, so real ENOSPC errors could slip through or be misclassified.

    Suggestion: Make the detection more robust by also checking standard error codes/errno:

    • Include checks like error?.code === 'ENOSPC' or errorMessage.includes('ENOSPC') and (if available) process exit codes, in addition to the current string patterns.
  4. Dead / unnecessary retrySucceeded flag

    • retrySucceeded is initialized to false, set to true if the retry succeeds, and the retry’s catch unconditionally throws (Lines 153–166).
    • This means the if (!retrySucceeded) { throw error; } branch (Lines 168–170) is effectively unreachable: either the retry succeeds (flag true, condition false) or the retry throws (you never reach the if).

    You can simplify this block:

-              // Retry the tar operation once after cleanup
-              let retrySucceeded = false;
-              try {
-                await CloudRunnerSystem.Run(
-                  `tar -cf ${cacheArtifactName}.tar${compressionSuffix} "${path.basename(sourceFolder)}"`,
-                );
-                // If retry succeeds, mark it - we'll continue normally without throwing
-                retrySucceeded = true;
-              } catch (retryError: any) {
-                throw new Error(
-                  `Failed to create cache archive after cleanup. Original error: ${errorMessage}. Retry error: ${
-                    retryError?.message || retryError
-                  }`,
-                );
-              }
-              // If retry succeeded, don't throw the original error - let execution continue after catch block
-              if (!retrySucceeded) {
-                throw error;
-              }
-              // If we get here, retry succeeded - execution will continue after the catch block
+              // Retry the tar operation once after cleanup
+              try {
+                await CloudRunnerSystem.Run(
+                  `tar -cf ${cacheArtifactName}.tar${compressionSuffix} "${path.basename(sourceFolder)}"`,
+                );
+                // If we get here, retry succeeded and execution will continue after the catch block
+              } catch (retryError: any) {
+                throw new Error(
+                  `Failed to create cache archive after cleanup. Original error: ${errorMessage}. Retry error: ${
+                    retryError?.message || retryError
+                  }`,
+                );
+              }

These issues were previously flagged in earlier review comments and still apply to the updated implementation.
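
A minimal sketch of the no-shell helper suggested above, using Node's child_process.execFile; the runCommand name is the reviewer's proposal, not an existing CloudRunnerSystem API, and the example variable names are hypothetical:

import { execFile } from 'child_process';
import { promisify } from 'util';

const execFileAsync = promisify(execFile);

// Runs a binary with an argument array and no intermediate shell, so
// metacharacters in paths or artifact names are passed through literally.
async function runCommand(command: string, args: string[]): Promise<string> {
  const { stdout } = await execFileAsync(command, args);
  return stdout;
}

// Example usage for the archive step:
// await runCommand('tar', ['-cf', `${cacheArtifactName}.tar${compressionSuffix}`, path.basename(sourceFolder)]);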

🧹 Nitpick comments (1)
src/model/cloud-runner/remote-client/caching.ts (1)

82-95: Disk-usage parsing is fragile but acceptable; consider tightening.

df . 2>/dev/null || df /data ... plus /(\d+)%/ works in many environments but is somewhat brittle (locale/format differences, multiple filesystems). Optional improvement: use df -P for a more stable parse format and explicitly pick the last line’s %use (e.g., split by newline, then whitespace) instead of the first regex match.
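
A sketch of that tighter parse, assuming POSIX-style df -P output (columns: Filesystem, blocks, Used, Available, Capacity, Mounted on):

import { execFileSync } from 'child_process';

// Sketch: read the Capacity column ("42%") from the last data row of `df -P <path>`.
// Returns undefined when the output cannot be parsed instead of guessing.
function diskUsagePercent(targetPath: string): number | undefined {
  const output = execFileSync('df', ['-P', targetPath]).toString().trim();
  const lastLine = output.split('\n').pop() ?? '';
  const capacityField = lastLine.split(/\s+/)[4]; // fifth column is Capacity, e.g. "42%"
  const parsed = Number.parseInt((capacityField ?? '').replace('%', ''), 10);
  return Number.isNaN(parsed) ? undefined : parsed;
}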

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ad9f2d3 and 459b929.

⛔ Files ignored due to path filters (2)
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (3)
  • .github/workflows/cloud-runner-integrity.yml (1 hunks)
  • src/model/cloud-runner/providers/k8s/kubernetes-pods.ts (1 hunks)
  • src/model/cloud-runner/remote-client/caching.ts (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: StandaloneOSX on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 2022.3.13f1
  • GitHub Check: Cloud Runner Integrity / Cloud Runner Tests (LocalStack)
  • GitHub Check: Cloud Runner Integrity / Cloud Runner Tests (K8s)
  • GitHub Check: Tests
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: tvOS on 2023.2.2f1
  • GitHub Check: WSAPlayer on 2023.2.2f1
  • GitHub Check: WSAPlayer on 2022.3.13f1
  • GitHub Check: tvOS on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: tvOS on 2021.3.32f1
  • GitHub Check: Android on 2023.2.2f1
  • GitHub Check: Android on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: Android on 2021.3.32f1
  • GitHub Check: WSAPlayer on 2021.3.32f1
🔇 Additional comments (1)
src/model/cloud-runner/remote-client/caching.ts (1)

192-196: Good defensive check for cacheFolder before mv.

Re‑ensuring cacheFolder exists before moving the archive (in case cleanup removed empty directories) is a solid defensive addition and should prevent intermittent mv failures.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
.github/workflows/cloud-runner-integrity.yml (2)

40-43: Unsafe k3d installation pattern remains; pin version instead.

The workflow still uses curl | bash to install k3d, which has supply-chain risk and poor reproducibility. A previous review flagged this (marked as "Addressed in commit 69731ba") but the unsafe pattern persists.

Replace with a pinned release download:

-      - name: Install k3d
+      - name: Install k3d (pinned)
         run: |
+          set -euo pipefail
+          K3D_VERSION=v5.7.3
-          curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
+          mkdir -p /tmp/k3d && cd /tmp/k3d
+          curl -fsSL -o k3d "https://github.com/k3d-io/k3d/releases/download/${K3D_VERSION}/k3d-linux-amd64"
+          chmod +x k3d
+          sudo mv k3d /usr/local/bin/k3d
           k3d version | cat

164-165: Stop overriding GITHUB_TOKEN with GIT_PRIVATE_TOKEN across all test steps.

The workflow sets GITHUB_TOKEN to ${{ secrets.GIT_PRIVATE_TOKEN }} in 13 test steps across both jobs (k8s: 3 steps; localstack: 10 steps). This violates least-privilege by unnecessarily exposing broader-scope PATs and overrides GitHub's automatically-provided workflow token.

Replace all occurrences with:

- GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
+ GITHUB_TOKEN: ${{ github.token }}

Keep GIT_PRIVATE_TOKEN as a separate variable if private repo access is truly required.

Locations: 164-165, 204-205, 244-245, 296-297, 320-321, 344-345, 368-369, 392-393, 416-417, 440-441, 464-465, 488-489, 512-513

🧹 Nitpick comments (4)
.github/workflows/cloud-runner-integrity.yml (2)

73-76: S3 bucket creation should use region-aware API for robustness.

The current approach uses awslocal s3 mb with an AWS CLI fallback, which doesn't handle region-aware bucket creation. For regions other than us-east-1, S3 requires a LocationConstraint in the CreateBucket request.

Replace with region-aware logic:

-          awslocal s3 mb s3://$AWS_STACK_NAME || aws --endpoint-url=http://localhost:4566 s3 mb s3://$AWS_STACK_NAME || true
-          awslocal s3 ls || aws --endpoint-url=http://localhost:4566 s3 ls || echo "S3 bucket listing completed"
+          # Use s3api with region-aware LocationConstraint
+          REGION=${AWS_REGION:-us-east-1}
+          if [ "$REGION" = "us-east-1" ]; then
+            awslocal s3api create-bucket --bucket "$AWS_STACK_NAME" 2>/dev/null || true
+          else
+            awslocal s3api create-bucket --bucket "$AWS_STACK_NAME" \
+              --create-bucket-configuration LocationConstraint="$REGION" 2>/dev/null || true
+          fi
+          awslocal s3 ls || echo "S3 bucket listing completed"

118-139: Consider extracting repeated K8s cleanup logic into a reusable step or composite action.

The K8s job repeats the cleanup logic (delete jobs, pods, PVCs, secrets, cache, prune Docker) three times verbatim. This creates maintenance burden—changes must be made in three places.

Consider extracting to a composite action or parameterized step for maintainability:

- name: Clean up K8s resources and disk
  uses: ./.github/actions/cleanup-k8s-test-resources
  with:
    namespace: default

For now, this is a minor maintainability observation; functionality is correct.

Also applies to: 166-180, 206-220

src/model/cloud-runner/providers/k8s/kubernetes-job-spec-factory.ts (2)

66-66: Termination grace vs preStop hook look consistent; preStop command is dense but OK

Setting terminationGracePeriodSeconds: 90 matches the preStop hook doing a sleep 60 plus a short license‑return script, so the pod should have enough time for the hook to complete before Kubernetes force‑kills it. That alignment looks good.

The preStop shell command itself is quite compact and uses &&/|| chaining, which is correct but a bit hard to read/maintain. If you touch this again, consider extracting the full command string into a named constant or slightly simplifying it (e.g., dropping the cd if /steps/return_license.sh is always absolute), but there’s no immediate functional issue here.

Also applies to: 120-130


25-49: Reconsider hard-coded k3d endpoint rewriting without cluster detection

The AWS endpoint rewriting for k3d is sensible for the local dev case, but the implementation has one concern:

  • The host.k3d.internal mapping is applied unconditionally to AWS endpoint vars whenever localhost URLs are detected. However, line 138 shows that the code can detect CLOUD_RUNNER_MINIKUBE environments. Other clusters (minikube, kind, EKS) use different gateway hostnames or may not need this rewrite at all. Consider either:
    • Adding a similar detection flag for k3d and gating the rewrite logic
    • Making the gateway hostname configurable (e.g., CLOUD_RUNNER_HOST_GATEWAY env var)

The object literal mapping on line 48 is fine as-is; CloudRunnerEnvironmentVariable has only name and value properties, so no fields are being dropped.
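
A sketch of the gated rewrite; CLOUD_RUNNER_HOST_GATEWAY and the k3d detection flag are hypothetical names for the configuration suggested above:

// Illustrative only: rewrite localhost endpoints to an in-cluster gateway host
// when one is configured or a known local-cluster environment is detected.
function rewriteLocalhostEndpoint(value: string): string {
  const gatewayHost =
    process.env.CLOUD_RUNNER_HOST_GATEWAY || // proposed explicit override
    (process.env.CLOUD_RUNNER_K3D === 'true' ? 'host.k3d.internal' : undefined);
  if (!gatewayHost) {
    return value; // no rewrite outside known local clusters
  }
  return value.replace(/\/\/(localhost|127\.0\.0\.1)/, `//${gatewayHost}`);
}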

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 459b929 and bfac73b.

⛔ Files ignored due to path filters (2)
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (2)
  • .github/workflows/cloud-runner-integrity.yml (1 hunks)
  • src/model/cloud-runner/providers/k8s/kubernetes-job-spec-factory.ts (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: Tests
  • GitHub Check: WebGL on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 6000.0.36f1
  • GitHub Check: iOS on 2021.3.45f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: tvOS on 2023.2.2f1
  • GitHub Check: WSAPlayer on 2023.2.2f1
  • GitHub Check: Android on 2023.2.2f1
  • GitHub Check: WSAPlayer on 2022.3.13f1
  • GitHub Check: tvOS on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: Android on 2022.3.13f1
  • GitHub Check: tvOS on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: WSAPlayer on 2021.3.32f1
  • GitHub Check: Android on 2021.3.32f1
🔇 Additional comments (1)
.github/workflows/cloud-runner-integrity.yml (1)

12-16: Minor: Consider re-adding id-token: write if OIDC operations are needed.

The permissions were correctly tightened per prior feedback, removing actions: write and pull-requests: write. However, the original suggestion included id-token: write. If this workflow uses OIDC or assume-role operations, restore it. If not needed, this is fine.

Comment on lines +294 to +297
AWS_ENDPOINT: http://localhost:4566
AWS_ENDPOINT_URL: http://localhost:4566
GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}

⚠️ Potential issue | 🟠 Major

Add AWS_S3_FORCE_PATH_STYLE and AWS_EC2_METADATA_DISABLED to all LocalStack job test steps.

The k8s job correctly includes AWS_S3_FORCE_PATH_STYLE: 'true' and AWS_EC2_METADATA_DISABLED: 'true' for LocalStack compatibility, but the localstack job omits these across all 10 test steps. This creates inconsistent behavior and may cause unexpected LocalStack interactions.

Add these variables to every test step in the localstack job. For example, lines 294–297:

         AWS_ENDPOINT: http://localhost:4566
         AWS_ENDPOINT_URL: http://localhost:4566
+        AWS_S3_FORCE_PATH_STYLE: 'true'
+        AWS_EC2_METADATA_DISABLED: 'true'
         GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
         GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}

Apply the same additions to all other test steps in the localstack job (lines 318–321, 342–345, 366–369, 390–393, 414–417, 438–441, 462–465, 486–489, 510–513).

Also applies to: 318-321, 342-345, 366-369, 390-393, 414-417, 438-441, 462-465, 486-489, 510-513

🤖 Prompt for AI Agents
.github/workflows/cloud-runner-integrity.yml around lines 294-297 (and similarly
at 318-321, 342-345, 366-369, 390-393, 414-417, 438-441, 462-465, 486-489,
510-513): the localstack job's test steps are missing LocalStack-specific env
vars; add AWS_S3_FORCE_PATH_STYLE: 'true' and AWS_EC2_METADATA_DISABLED: 'true'
to the environment block of every test step in this localstack job so they match
the k8s job and ensure consistent LocalStack behavior across all listed steps.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (6)
src/model/cloud-runner/providers/k8s/kubernetes-pods.ts (3)

13-19: Wrap events fetch in try-catch to prevent masking pod failures.

If listNamespacedEvent fails (e.g., RBAC restrictions, API timeout), the exception will propagate and obscure the actual pod failure. Since events are diagnostic-only, a failure here shouldn't prevent reporting the pod's failed state.

-      const events = (await kubeClient.listNamespacedEvent(namespace)).body.items
-        .filter((x) => x.involvedObject?.name === podName)
-        .map((x) => ({
-          message: x.message || '',
-          reason: x.reason || '',
-          type: x.type || '',
-        }));
+      let events: Array<{ message: string; reason: string; type: string }> = [];
+      try {
+        events = (await kubeClient.listNamespacedEvent(namespace)).body.items
+          .filter((x) => x.involvedObject?.name === podName)
+          .map((x) => ({
+            message: x.message || '',
+            reason: x.reason || '',
+            type: x.type || '',
+          }));
+      } catch (eventError) {
+        CloudRunnerLogger.logWarning(`Failed to fetch pod events: ${eventError}`);
+      }

35-58: Multi-container logic flaw: only the last container's exit status is retained.

The loop iterates over all container statuses but overwrites containerExitCode and containerSucceeded on each iteration (lines 47-49). If a pod has multiple containers and the last one succeeded but an earlier one failed, the code will incorrectly treat the pod as successful and skip error throwing at line 159.

Consider checking if all containers succeeded:

-      let containerExitCode: number | undefined;
-      let containerSucceeded = false;
+      let allContainersSucceeded = true;
+      let anyContainerExitCode: number | undefined;
       
       if (containerStatuses.length > 0) {
         containerStatuses.forEach((cs, idx) => {
           if (cs.state?.waiting) {
             errorDetails.push(
               `Container ${idx} (${cs.name}) waiting: ${cs.state.waiting.reason} - ${cs.state.waiting.message || ''}`,
             );
           }
           if (cs.state?.terminated) {
             const exitCode = cs.state.terminated.exitCode;
-            containerExitCode = exitCode;
+            anyContainerExitCode = exitCode;
             if (exitCode === 0) {
-              containerSucceeded = true;
+              // Keep checking all containers
+            } else {
+              allContainersSucceeded = false;
             }
             errorDetails.push(
               `Container ${idx} (${cs.name}) terminated: ${cs.state.terminated.reason} - ${
                 cs.state.terminated.message || ''
               } (exit code: ${exitCode})`,
             );
           }
         });
       }

Then update the logic at lines 71, 89, 126, 138, and 148 to use allContainersSucceeded and anyContainerExitCode.


89-123: Retry logic only checks the first container, inconsistent with multi-container handling.

Line 101 explicitly accesses updatedPod.status.containerStatuses[0], checking only the first container. This is inconsistent with the earlier loop (lines 39-57) that iterates over all containers, and assumes single-container pods.

Issue: In multi-container pods, if the first container succeeds (exit code 0) but another container fails, the retry logic at line 108 will return false, incorrectly treating the pod as successful.

Consider iterating over all container statuses in the retry logic:

           const updatedPod = (
             await kubeClient.listNamespacedPod(namespace)
           ).body.items.find((x) => podName === x.metadata?.name);
           if (updatedPod?.status?.containerStatuses && updatedPod.status.containerStatuses.length > 0) {
-            const updatedContainerStatus = updatedPod.status.containerStatuses[0];
-            if (updatedContainerStatus.state?.terminated) {
-              const updatedExitCode = updatedContainerStatus.state.terminated.exitCode;
-              if (updatedExitCode === 0) {
-                CloudRunnerLogger.logWarning(
-                  `Pod ${podName} container succeeded (exit code 0) after waiting. PreStopHook failure is non-fatal.`,
-                );
-                return false; // Pod is not running, but container succeeded
-              } else {
-                CloudRunnerLogger.log(
-                  `Pod ${podName} container failed with exit code ${updatedExitCode} after waiting.`,
-                );
-                errorDetails.push(`Container terminated after wait: exit code ${updatedExitCode}`);
-                containerExitCode = updatedExitCode;
-                containerSucceeded = false;
-                break;
+            let allSucceeded = true;
+            let anyTerminated = false;
+            for (const cs of updatedPod.status.containerStatuses) {
+              if (cs.state?.terminated) {
+                anyTerminated = true;
+                const exitCode = cs.state.terminated.exitCode;
+                if (exitCode !== 0) {
+                  allSucceeded = false;
+                  errorDetails.push(
+                    `Container (${cs.name}) terminated after wait: exit code ${exitCode}`,
+                  );
+                }
               }
             }
+            if (anyTerminated) {
+              if (allSucceeded) {
+                CloudRunnerLogger.logWarning(
+                  `Pod ${podName} all containers succeeded after waiting. PreStopHook failure is non-fatal.`,
+                );
+                return false;
+              }
+              break; // At least one container failed
+            }
           }
.github/workflows/cloud-runner-integrity.yml (3)

48-51: Use region-aware S3 API to support non-us-east-1 regions.

S3 requires LocationConstraint outside us-east-1. The current awslocal s3 mb command will fail for non-us-east-1 regions. This was flagged in prior feedback as accepted but not yet applied.

Apply this diff to support all AWS regions:

       - name: Create S3 bucket for tests (host LocalStack)
         run: |
+          if [ "${AWS_REGION:-us-east-1}" = "us-east-1" ]; then
+            awslocal s3api create-bucket --bucket "$AWS_STACK_NAME" || true
+          else
+            awslocal s3api create-bucket --bucket "$AWS_STACK_NAME" \
+              --create-bucket-configuration LocationConstraint="${AWS_REGION}" || true
+          fi
-          awslocal s3 mb s3://$AWS_STACK_NAME || true
           awslocal s3 ls

252-267: Add AWS_S3_FORCE_PATH_STYLE and AWS_EC2_METADATA_DISABLED to all 10 LocalStack test steps.

The k8s job correctly includes AWS_S3_FORCE_PATH_STYLE: 'true' and AWS_EC2_METADATA_DISABLED: 'true', but all 10 test steps in the localstack job omit these required environment variables. This creates inconsistent LocalStack behavior between jobs.

Add these variables to every test step's env block in the localstack job:

         PROVIDER_STRATEGY: aws
         AWS_ACCESS_KEY_ID: test
         AWS_SECRET_ACCESS_KEY: test
         AWS_ENDPOINT: http://localhost:4566
         AWS_ENDPOINT_URL: http://localhost:4566
+        AWS_S3_FORCE_PATH_STYLE: 'true'
+        AWS_EC2_METADATA_DISABLED: 'true'
         GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
         GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}

Apply to: cloud-runner-end2end-locking, cloud-runner-end2end-caching, cloud-runner-end2end-retaining, cloud-runner-caching, cloud-runner-environment, cloud-runner-image, cloud-runner-hooks, cloud-runner-local-persistence, cloud-runner-locking-core, and cloud-runner-locking-get-locked tests.


40-43: Pin k3d version and avoid curl | bash to reduce supply-chain risk.

The workflow currently downloads and executes an unpinned install script from the main branch, which is unverifiable and poses a supply-chain risk. The official k3d documentation recommends downloading the binary directly instead.

Apply this diff to pin the version and download the binary with checksum verification:

-      - name: Install k3d
+      - name: Install k3d (pinned)
         run: |
+          set -euo pipefail
+          K3D_VERSION=v5.7.3
+          curl -fsSL -o /usr/local/bin/k3d "https://github.com/k3d-io/k3d/releases/download/${K3D_VERSION}/k3d-linux-amd64"
+          chmod +x /usr/local/bin/k3d
-          curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
           k3d version | cat
🧹 Nitpick comments (1)
.github/workflows/cloud-runner-integrity.yml (1)

88-110: Extract repetitive K8s cleanup logic into a reusable step or helper.

The K8s cleanup logic is duplicated verbatim across three cleanup steps (lines 88–110, 136–150, 176–190). This violates DRY and makes future maintenance harder.

Consider extracting this into a composite action or a separate shell script that can be sourced. Example using a composite action:

- name: Clean up K8s test resources
  uses: ./.github/actions/k8s-cleanup

Then create .github/actions/k8s-cleanup/action.yml with the shared cleanup logic.

Also applies to: 136-150, 176-190

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bfac73b and f0730fa.

⛔ Files ignored due to path filters (2)
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (2)
  • .github/workflows/cloud-runner-integrity.yml (1 hunks)
  • src/model/cloud-runner/providers/k8s/kubernetes-pods.ts (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: Cloud Runner Integrity / Cloud Runner Tests (LocalStack)
  • GitHub Check: Cloud Runner Integrity / Cloud Runner Tests (K8s)
  • GitHub Check: StandaloneOSX on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 6000.0.36f1
  • GitHub Check: iOS on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: iOS on 2021.3.45f1
  • GitHub Check: Android on 2023.2.2f1
  • GitHub Check: WSAPlayer on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: WSAPlayer on 2023.2.2f1
  • GitHub Check: tvOS on 2022.3.13f1
  • GitHub Check: tvOS on 2021.3.32f1
  • GitHub Check: tvOS on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: Android on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: Android on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1

Comment on lines +134 to +135
GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}

⚠️ Potential issue | 🟠 Major

Stop overriding GITHUB_TOKEN with GIT_PRIVATE_TOKEN across 13 test steps.

The workflow sets GITHUB_TOKEN to ${{ secrets.GIT_PRIVATE_TOKEN }} in all test steps (k8s job: lines 135, 175, 215; localstack job: lines 267, 291, 315, 339, 363, 387, 411, 435, 459, 483). This:

  • Exposes a broader-scope PAT unnecessarily to each job
  • Breaks the intended fallback pattern in the code (try GIT_PRIVATE_TOKEN first, then fall back to GITHUB_TOKEN)
  • Violates principle of least privilege

Replace the pattern across all test steps to use the default runner-provided token:

  GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
- GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
+ GITHUB_TOKEN: ${{ github.token }}

Keep GIT_PRIVATE_TOKEN as a separate variable if private repo access is required; the code has proper fallback logic to use it.

Also applies to: 175-175, 215-215, 267-267, 291-291, 315-315, 339-339, 363-363, 387-387, 411-411, 435-435, 459-459, 483-483
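
For reference, a minimal sketch of the fallback pattern described above; the helper name is hypothetical and does not correspond to code in this repository:

// Illustrative only: prefer the broader-scope PAT when configured,
// otherwise fall back to the runner-provided token.
function resolveGitToken(): string | undefined {
  const privateToken = process.env.GIT_PRIVATE_TOKEN;
  if (privateToken && privateToken.length > 0) {
    return privateToken;
  }

  return process.env.GITHUB_TOKEN;
}

Setting GITHUB_TOKEN to the PAT collapses both branches into the same value, so the least-privileged path can never be exercised.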

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (4)
.github/workflows/cloud-runner-integrity.yml (4)

40-43: Replace unsafe curl | bash with pinned k3d release.

The workflow still uses an unpinned curl | bash invocation, which poses supply-chain risk and reproducibility concerns. This was flagged in previous reviews.

Apply this diff to download a specific k3d release:

-      - name: Install k3d
+      - name: Install k3d (pinned)
         if: ${{ matrix.provider == 'k8s' }}
         run: |
-          curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
+          set -euo pipefail
+          K3D_VERSION=v5.7.3
+          curl -fsSL -o /usr/local/bin/k3d "https://github.com/k3d-io/k3d/releases/download/${K3D_VERSION}/k3d-linux-amd64"
+          chmod +x /usr/local/bin/k3d
           k3d version | cat

48-51: Use region-aware S3 bucket creation for LocalStack.

The workflow creates the S3 bucket without specifying a region, which fails for non-us-east-1 deployments. S3 requires LocationConstraint outside us-east-1. This was flagged in previous reviews.

Apply this diff to handle region-aware bucket creation:

       - name: Create S3 bucket for tests (host LocalStack)
         run: |
-          awslocal s3 mb s3://$AWS_STACK_NAME || true
+          if [ "${AWS_REGION:-us-east-1}" = "us-east-1" ]; then
+            awslocal s3api create-bucket --bucket "$AWS_STACK_NAME" || true
+          else
+            awslocal s3api create-bucket --bucket "$AWS_STACK_NAME" \
+              --create-bucket-configuration LocationConstraint="${AWS_REGION}" || true
+          fi
           awslocal s3 ls

Ensure AWS_REGION is set in the env block (currently lines 17–25), or default to us-east-1 if not set.
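
The same region rule applies if bucket creation is ever done through the SDK rather than awslocal; a minimal TypeScript sketch, illustrative only, using @aws-sdk/client-s3:

// Illustrative: S3 rejects an explicit LocationConstraint of us-east-1,
// so the configuration block must be omitted for that region.
import { CreateBucketCommand, S3Client } from '@aws-sdk/client-s3';
import type { BucketLocationConstraint } from '@aws-sdk/client-s3';

async function createBucketRegionAware(client: S3Client, bucket: string, region: string): Promise<void> {
  if (region === 'us-east-1') {
    await client.send(new CreateBucketCommand({ Bucket: bucket }));

    return;
  }
  await client.send(
    new CreateBucketCommand({
      Bucket: bucket,
      CreateBucketConfiguration: { LocationConstraint: region as BucketLocationConstraint },
    }),
  );
}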


147-147: Stop overriding GITHUB_TOKEN with GIT_PRIVATE_TOKEN across 13 test steps.

The workflow sets GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }} in all test steps, which:

  • Exposes a broader-scope PAT unnecessarily to each job
  • Breaks the intended fallback pattern in the code (try GIT_PRIVATE_TOKEN first, then fall back to GITHUB_TOKEN)
  • Violates the principle of least privilege

This was flagged in previous reviews and remains unaddressed.

Replace the pattern across all test steps to use the default runner-provided token:

  GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
- GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
+ GITHUB_TOKEN: ${{ github.token }}

Keep GIT_PRIVATE_TOKEN as a separate variable if private repo access is required; the code has proper fallback logic to use it.

Also applies to: 187-187, 227-227, 279-279, 303-303, 327-327, 351-351, 375-375, 399-399, 423-423, 447-447, 471-471, 495-495


276-279: Add AWS_S3_FORCE_PATH_STYLE and AWS_EC2_METADATA_DISABLED to all LocalStack test steps.

The k8s job correctly includes AWS_S3_FORCE_PATH_STYLE: 'true' and AWS_EC2_METADATA_DISABLED: 'true' (lines 144–145, 184–185, 224–225) for LocalStack compatibility, but the localstack job omits these across all 10 test steps. This creates inconsistent behavior and was flagged in previous reviews.

Add these variables to every test step in the localstack job. For example, lines 276–279 should become:

         AWS_ENDPOINT: http://localhost:4566
         AWS_ENDPOINT_URL: http://localhost:4566
+        AWS_S3_FORCE_PATH_STYLE: 'true'
+        AWS_EC2_METADATA_DISABLED: 'true'
         GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}

Apply the same additions to all other test steps in the localstack job (lines 300–303, 324–327, 348–351, 372–375, 396–399, 420–423, 444–447, 468–471, 492–495).

Also applies to: 300-303, 324-327, 348-351, 372-375, 396-399, 420-423, 444-447, 468-471, 492-495

🧹 Nitpick comments (3)
.github/workflows/cloud-runner-integrity.yml (1)

74-81: Remove duplicate cleanup code in k8s job.

The block at lines 74–81 repeats the same cleanup logic twice (disk cleanup and docker system prune). Keep a single copy and drop the duplicate to improve readability:

           # Clean up disk space on the k3d node to prevent evictions
           echo "Cleaning up disk space on k3d nodes..."
           docker exec k3d-unity-builder-agent-0 sh -c "df -h && docker system prune -af --volumes || true" || true
           docker system prune -af --volumes || true
-          # Clean up disk space on the node to prevent evictions
-          echo "Cleaning up disk space on k3d nodes..."
-          docker exec k3d-unity-builder-agent-0 sh -c "df -h && docker system prune -af --volumes || true" || true
-          docker system prune -af --volumes || true
src/model/cloud-runner/providers/k8s/kubernetes-task-runner.ts (2)

56-65: Consider clarifying the fallback logic and using a constant for error matching.

The fallback recovery path is well-designed, but the condition at line 57 is complex. Consider:

  1. Adding a comment explaining that both isRunning (pre-error) and continueStreaming (post-error) are checked to ensure the pod wasn't running before and after the error occurred.
  2. Extracting the error message string 'previous terminated container' to a named constant to improve maintainability and make the error detection more explicit.

Example refactor:

+  // kubectl error message when previous container logs are unavailable
+  static readonly PREVIOUS_CONTAINER_NOT_FOUND = 'previous terminated container';
+
   static async runTask(
     ...
   ) {
     ...
       } catch (error: any) {
         await new Promise((resolve) => setTimeout(resolve, 3000));
         const continueStreaming = await KubernetesPods.IsPodRunning(podName, namespace, kubeClient);
         CloudRunnerLogger.log(`K8s logging error ${error} ${continueStreaming}`);
-        // If pod is not running and we tried --previous but it failed, try without --previous
-        if (!isRunning && !continueStreaming && error?.message?.includes('previous terminated container')) {
+        // If pod wasn't running initially and still isn't running after error, 
+        // and we got a "previous container not found" error, try fetching current container logs
+        if (!isRunning && !continueStreaming && error?.message?.includes(KubernetesTaskRunner.PREVIOUS_CONTAINER_NOT_FOUND)) {

74-78: Verify non-fatal handling is appropriate for all scenarios.

Treating the "previous terminated container" error as non-fatal improves resilience, but ensure this behavior is acceptable for all use cases where previous container logs might be critical for debugging.

Also consider using the same constant suggested in the previous comment for consistency.

-        if (error?.message?.includes('previous terminated container')) {
+        if (error?.message?.includes(KubernetesTaskRunner.PREVIOUS_CONTAINER_NOT_FOUND)) {
           CloudRunnerLogger.logWarning(`Could not fetch previous container logs, but continuing...`);
           break;
         }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f0730fa and f61478b.

⛔ Files ignored due to path filters (2)
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (3)
  • .github/workflows/cloud-runner-integrity.yml (1 hunks)
  • src/model/cloud-runner/providers/k8s/kubernetes-job-spec-factory.ts (5 hunks)
  • src/model/cloud-runner/providers/k8s/kubernetes-task-runner.ts (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/model/cloud-runner/providers/k8s/kubernetes-job-spec-factory.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: StandaloneOSX on 6000.0.36f1
  • GitHub Check: iOS on 2022.3.13f1
  • GitHub Check: StandaloneOSX on 2021.3.45f1
  • GitHub Check: StandaloneLinux64 on 2023.2.2f1
  • GitHub Check: iOS on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: StandaloneLinux64 on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 2023.2.2f1
  • GitHub Check: StandaloneLinux64 on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: Android on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: WebGL on 2021.3.32f1
  • GitHub Check: iOS on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
🔇 Additional comments (2)
.github/workflows/cloud-runner-integrity.yml (1)

513-513: Update deprecated aws-actions/configure-aws-credentials@v1 to v4 in commented AWS job.

Although the AWS job is currently commented out (line 496), the action reference at line 513 uses the deprecated v1, which will fail on current runners if/when the job is enabled.

Apply this diff to future-proof the commented job:

     - name: Configure AWS Credentials
-      uses: aws-actions/configure-aws-credentials@v1
+      uses: aws-actions/configure-aws-credentials@v4
       with:
         aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
src/model/cloud-runner/providers/k8s/kubernetes-task-runner.ts (1)

31-35: Good optimization: caching pod status check.

Caching the IsPodRunning result improves clarity and avoids redundant checks when determining log flags. The separate check at line 54 after an error is appropriate since pod state may have changed.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/model/cloud-runner/providers/k8s/kubernetes-task-runner.ts (1)

55-90: Fallback and retry logic looks robust; consider clarifying retry counter semantics

The new fallback when previous‑container logs are unavailable, plus the special‑case handling of "previous terminated container" errors, makes the log streaming much more resilient and avoids failing builds just because prior container logs are missing.

At the same time, retriesAfterFinish is now used as a general retry counter for several cases (fallback attempts, generic log errors, and “no end marker” scenarios), not only “after finish”. The name is a bit misleading and makes it harder to reason about when the loop will actually stop.

To improve readability and future maintenance, consider:

  • Renaming retriesAfterFinish to something more generic like logRetryCount or logErrorRetries.
  • Adding a short comment near its declaration documenting that it governs both fallback and general log‑error retries, and that exceeding maxRetry for "previous terminated container" errors is treated as non‑fatal, while other errors are re‑thrown.

This doesn’t change behavior but makes the control flow easier to understand for future contributors.

Also applies to: 100-107
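
A compact, runnable sketch of the suggested shape (illustrative; streamOnce, logRetryCount, and maxLogRetries are assumed names, not the file's current identifiers):

// Illustrative rename and documentation for the retry counter discussed above.
// logRetryCount governs both the --previous fallback and generic log-streaming errors;
// exceeding the limit for "previous terminated container" errors is non-fatal,
// while other errors are re-thrown.
async function streamLogsWithRetries(streamOnce: () => Promise<void>): Promise<void> {
  const maxLogRetries = 5; // assumed limit for the sketch
  let logRetryCount = 0;
  while (true) {
    try {
      await streamOnce();

      return;
    } catch (error: any) {
      logRetryCount += 1;
      if (error?.message?.includes('previous terminated container')) {
        if (logRetryCount >= maxLogRetries) {
          return; // non-fatal: continue the build without previous-container logs
        }
      } else if (logRetryCount >= maxLogRetries) {
        throw error; // other log errors remain fatal once the retry budget is spent
      }
    }
  }
}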

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f61478b and 192cb2e.

⛔ Files ignored due to path filters (2)
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (1)
  • src/model/cloud-runner/providers/k8s/kubernetes-task-runner.ts (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/model/cloud-runner/providers/k8s/kubernetes-task-runner.ts (3)
src/model/cloud-runner/services/core/cloud-runner-logger.ts (1)
  • error (24-26)
src/model/cloud-runner/services/core/cloud-runner-system.ts (1)
  • CloudRunnerSystem (4-69)
src/model/cloud-runner/services/core/follow-log-stream-service.ts (1)
  • FollowLogStreamService (7-57)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: StandaloneLinux64 on 2023.2.2f1
  • GitHub Check: WebGL on 2021.3.32f1
  • GitHub Check: iOS on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: WSAPlayer on 2022.3.13f1
  • GitHub Check: tvOS on 2022.3.13f1
  • GitHub Check: Android on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: Android on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: iOS on 2021.3.45f1
  • GitHub Check: StandaloneOSX on 2022.3.13f1
  • GitHub Check: StandaloneOSX on 2021.3.45f1
  • GitHub Check: tvOS on 2021.3.32f1

Comment on lines +31 to 34
const isRunning = await KubernetesPods.IsPodRunning(podName, namespace, kubeClient);
let extraFlags = ``;
extraFlags += (await KubernetesPods.IsPodRunning(podName, namespace, kubeClient))
? ` -f -c ${containerName} -n ${namespace}`
: ` --previous -n ${namespace}`;
extraFlags += isRunning ? ` -f -c ${containerName} -n ${namespace}` : ` --previous -n ${namespace}`;

⚠️ Potential issue | 🟠 Major

Missing -c flag for --previous logs can break multi‑container pods

In the non‑running case, extraFlags does not specify the container name:

const isRunning = await KubernetesPods.IsPodRunning(podName, namespace, kubeClient);
let extraFlags = ``;
extraFlags += isRunning ? ` -f -c ${containerName} -n ${namespace}` : ` --previous -n ${namespace}`;

For pods with multiple containers, kubectl logs <pod> --previous without -c may either fail or select the wrong container, even though containerName is available.

Consider including -c ${containerName} in the --previous branch for consistency and correctness:

-      extraFlags += isRunning ? ` -f -c ${containerName} -n ${namespace}` : ` --previous -n ${namespace}`;
+      extraFlags += isRunning
+        ? ` -f -c ${containerName} -n ${namespace}`
+        : ` --previous -c ${containerName} -n ${namespace}`;
🤖 Prompt for AI Agents
In src/model/cloud-runner/providers/k8s/kubernetes-task-runner.ts around lines
31 to 34, the branch that builds extraFlags for non-running pods omits the -c
container flag which can break multi-container pods; change the ternary so both
branches include `-c ${containerName}` (i.e., use ` -f -c ${containerName} -n
${namespace}` for running and ` --previous -c ${containerName} -n ${namespace}`
for non‑running) to ensure the correct container is selected.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

♻️ Duplicate comments (3)
src/model/cloud-runner/remote-client/index.ts (2)

282-282: Fix contradictory LFS folder assertion as previously flagged.

The assertion checks that the LFS folder exists (fs.existsSync() returns true), but the message states it "should not exist before caching." This contradiction was flagged in a previous review.

Either invert the check to !fs.existsSync(path.join('.git', 'lfs')) or update the message to match the actual assertion.


242-268: Fix unused catch bindings as previously flagged.

This issue was already identified in a previous review. Remove unused catch bindings (_error, _error2, _error3) and use parameterless catch blocks where the error is not used, or name the caught error appropriately when it is thrown.

Based on past review comments, the recommended fix is to use catch without parameters for unused errors, and preserve the error variable where it's re-thrown.
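
For reference, a small sketch of that pattern (illustrative; run stands in for CloudRunnerSystem.Run, and optional catch bindings require ES2019 / TypeScript 2.5+):

// Omit the binding where the error is discarded; keep it only where it is re-thrown.
async function checkoutWithFallback(run: (cmd: string) => Promise<string>, sha: string, branch: string): Promise<void> {
  try {
    await run(`git checkout ${sha}`);
  } catch {
    // SHA checkout failed; the error itself is not needed, fall back to the branch
    try {
      await run(`git checkout ${branch}`);
    } catch (error) {
      if (branch.startsWith('pull/')) {
        await run(`git checkout origin/${branch}`);
      } else {
        throw error; // binding kept because the error is re-thrown
      }
    }
  }
}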

src/model/cloud-runner/workflows/build-automation-workflow.ts (1)

201-206: Non-containerized providers do not execute the build.

The non-containerized code path never copies the build scripts or executes /entrypoint.sh, so no actual Unity build occurs. The workflow only logs markers and runs post-build tasks against non-existent build outputs.

This confirms the issue flagged in the previous review: "Local provider no longer executes build entrypoint... local runs will immediately log success without invoking any build logic, so local provider builds will silently do nothing."

The non-containerized path should stage and execute the build scripts similar to the containerized paths. Apply this structure:

 return `
     echo "game ci start"
     echo "game ci start" >> "$LOG_FILE"
+    mkdir -p ${`${CloudRunnerFolders.ToLinuxFolder(CloudRunnerFolders.projectBuildFolderAbsolute)}/build`}
+    cd ${CloudRunnerFolders.ToLinuxFolder(CloudRunnerFolders.projectPathAbsolute)}
+    cp -r "${CloudRunnerFolders.ToLinuxFolder(path.join(distFolder, 'default-build-script'))}" "/UnityBuilderAction"
+    cp -r "${CloudRunnerFolders.ToLinuxFolder(path.join(ubuntuPlatformsFolder, 'entrypoint.sh'))}" "/entrypoint.sh"
+    cp -r "${CloudRunnerFolders.ToLinuxFolder(path.join(ubuntuPlatformsFolder, 'steps'))}" "/steps"
+    chmod -R +x "/entrypoint.sh"
+    chmod -R +x "/steps"
+    /entrypoint.sh 2>&1 | tee -a "$LOG_FILE" &
     timeout 3s node ${builderPath} -m remote-cli-log-stream --logFile "$LOG_FILE" || true
     node ${builderPath} -m remote-cli-post-build`;
🧹 Nitpick comments (2)
src/model/cloud-runner/remote-client/index.ts (1)

128-134: Consider consolidating logging calls.

The success message is logged three separate ways (CloudRunnerLogger, stdout, console). If CloudRunnerLogger already outputs to stdout/console, this may be redundant.

src/model/cloud-runner/workflows/build-automation-workflow.ts (1)

95-106: Consider preserving git error output for troubleshooting.

The multi-tier fallback strategy (checking the remote branch, then falling back to known branches) is well-designed. However, redirecting stderr to /dev/null on line 98 may hide authentication failures or network issues whose output would help diagnose clone problems.

Consider logging git errors while still allowing the fallback chain to proceed:

-if [ -n "$(git ls-remote --heads \"$REPO\" \"$BRANCH\" 2>/dev/null)" ]; then
+if git ls-remote --heads "$REPO" "$BRANCH" 2>&1 | grep -q "refs/heads/$BRANCH"; then
   git clone -q -b "$BRANCH" "$REPO" "$DEST"
 else
   echo "Remote branch $BRANCH not found in $REPO; falling back to a known branch"
-  git clone -q -b cloud-runner-develop "$REPO" "$DEST" \
-    || git clone -q -b main "$REPO" "$DEST" \
-    || git clone -q "$REPO" "$DEST"
+  git clone -q -b cloud-runner-develop "$REPO" "$DEST" 2>&1 \
+    || git clone -q -b main "$REPO" "$DEST" 2>&1 \
+    || git clone -q "$REPO" "$DEST" 2>&1 \
+    || { echo "ERROR: All git clone attempts failed for $REPO"; exit 1; }
 fi
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 192cb2e and 46e3ba8.

⛔ Files ignored due to path filters (2)
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (2)
  • src/model/cloud-runner/remote-client/index.ts (3 hunks)
  • src/model/cloud-runner/workflows/build-automation-workflow.ts (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-01-30T18:01:50.339Z
Learnt from: MichaelBuhler
Repo: game-ci/unity-builder PR: 685
File: .github/workflows/build-tests-ubuntu.yml:146-146
Timestamp: 2025-01-30T18:01:50.339Z
Learning: The build steps duplication in the workflow is intentional to implement a retry mechanism with different delay intervals between attempts.

Applied to files:

  • src/model/cloud-runner/workflows/build-automation-workflow.ts
🧬 Code graph analysis (1)
src/model/cloud-runner/workflows/build-automation-workflow.ts (1)
src/model/index.ts (1)
  • CloudRunner (27-27)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: WSAPlayer on 2021.3.32f1
  • GitHub Check: Android on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2022.3.13f1
  • GitHub Check: StandaloneLinux64 on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: WebGL on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: iOS on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 2023.2.2f1
  • GitHub Check: iOS on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 2022.3.13f1
  • GitHub Check: StandaloneOSX on 2021.3.45f1
  • GitHub Check: iOS on 2021.3.45f1
🔇 Additional comments (9)
src/model/cloud-runner/remote-client/index.ts (4)

66-68: LGTM: Enhanced observability for cache key.

The explicit cache key logging aids debugging and test assertions.


70-89: LGTM: Defensive cache push guards.

The existence and content checks prevent pushing empty or missing directories, improving robustness.


91-110: LGTM: Consistent defensive pattern.

Build cache push uses the same guarded pattern as Library cache, maintaining consistency.


113-120: LGTM: Safe cleanup guard.

Checking both path variants before cleanup prevents spurious errors.

src/model/cloud-runner/workflows/build-automation-workflow.ts (5)

53-60: Container-aware path resolution looks correct.

The isContainerized flag appropriately distinguishes between containerized providers (aws, k8s, local-docker) and non-containerized ones, with the builderPath correctly resolving to absolute paths for containers and process.cwd() for local execution.


64-82: Conditional environment setup is well-structured.

The workflow correctly skips package installation for local-docker (where the Docker image provides tooling) and non-containerized providers, while maintaining appropriate paths for workspace and log files based on the execution context.


189-198: Containerized build execution is correct.

The non-local-docker containerized path properly stages the Unity build scripts, executes /entrypoint.sh, and streams logs through the remote CLI. The build logic is intact for these providers.


150-161: Git LFS handling is appropriately defensive.

The conditional Git LFS setup correctly checks for a .git directory before attempting LFS operations and gracefully handles environments where Git LFS is unavailable or unnecessary. The error tolerance (|| true) is appropriate since LFS is optional for many repositories.


182-185: Cache mirroring supports testing workflows.

The cache mirroring from /data/cache back into the workspace enables test assertions to validate cache behavior in local-docker runs. The error tolerance is appropriate since this is a diagnostic aid rather than a critical build step.


return `export GIT_DISCOVERY_ACROSS_FILESYSTEM=1
${cloneBuilderCommands}
echo "log start" >> /home/job-log.txt

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

rg -nP --type=ts 'LOG_FILE|job-log\.txt' src/model/cloud-runner/workflows/build-automation-workflow.ts

Repository: game-ci/unity-builder

Length of output: 1146


🏁 Script executed:

sed -n '75,90p' src/model/cloud-runner/workflows/build-automation-workflow.ts

Repository: game-ci/unity-builder

Length of output: 1110


🏁 Script executed:

sed -n '108,126p' src/model/cloud-runner/workflows/build-automation-workflow.ts

Repository: game-ci/unity-builder

Length of output: 923


🏁 Script executed:

sed -n '165,175p' src/model/cloud-runner/workflows/build-automation-workflow.ts

Repository: game-ci/unity-builder

Length of output: 1335


🏁 Script executed:

sed -n '190,210p' src/model/cloud-runner/workflows/build-automation-workflow.ts

Repository: game-ci/unity-builder

Length of output: 1354


Replace hardcoded log path with $LOG_FILE variable for consistency.

Line 119 should use $LOG_FILE instead of hardcoding /home/job-log.txt. Although the hardcoded path matches the containerized value set on line 82, the variable is already defined and should be used consistently throughout the script, as seen in lines 130, 204, and 205.

🤖 Prompt for AI Agents
In src/model/cloud-runner/workflows/build-automation-workflow.ts around line
119, the script echoes to the hardcoded path "/home/job-log.txt" instead of
using the existing $LOG_FILE variable; change that echo command to append to
"$LOG_FILE" so the script consistently uses the configured log path (matching
usage on lines 82, 130, 204, 205).

Comment on lines +128 to +131
return `export GIT_DISCOVERY_ACROSS_FILESYSTEM=1
mkdir -p "$(dirname "$LOG_FILE")"
echo "log start" >> "$LOG_FILE"
echo "CACHE_KEY=$CACHE_KEY"`;

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cd src/model/cloud-runner/workflows && wc -l build-automation-workflow.ts

Repository: game-ci/unity-builder

Length of output: 97


🏁 Script executed:

cat -n src/model/cloud-runner/workflows/build-automation-workflow.ts | sed -n '80,140p'

Repository: game-ci/unity-builder

Length of output: 3331


🏁 Script executed:

# Search for how builderPath is used
rg -n 'builderPath' src/model/cloud-runner/workflows/build-automation-workflow.ts -A 3 -B 1

Repository: game-ci/unity-builder

Length of output: 3882


🏁 Script executed:

# Search for non-containerized provider strategy usage
rg -n 'providerStrategy.*local' src/model/cloud-runner/ -t ts --type-add='ts:*.ts' -C 3

Repository: game-ci/unity-builder

Length of output: 11083


🏁 Script executed:

# Look for initialization of builder in non-containerized scenarios
rg -n 'isContainerized\s*=\s*false|!isContainerized' src/model/cloud-runner/ -t ts --type-add='ts:*.ts' -C 5

Repository: game-ci/unity-builder

Length of output: 47


🏁 Script executed:

# Search for how "local" provider is configured and whether builder is pre-installed
rg -n "providerStrategy.*=.*['\"]local['\"]" src/model/cloud-runner/ -t ts --type-add='ts:*.ts' -B 5 -A 10

Repository: game-ci/unity-builder

Length of output: 3008


🏁 Script executed:

# Look for any documentation or setup instructions for local provider
fd -e md -e txt | xargs grep -l "local provider\|local strategy" 2>/dev/null | head -5

Repository: game-ci/unity-builder

Length of output: 107


🏁 Script executed:

# Search for how the builder is expected to exist before setupCommands is called
rg -n "process\.cwd\(\)" src/model/cloud-runner/ -t ts --type-add='ts:*.ts' -B 3 -A 3

Repository: game-ci/unity-builder

Length of output: 6027


🏁 Script executed:

# Check if there's any initialization for non-containerized paths
rg -n "dist.*index\.js\|builderPath" src/model/cloud-runner/ -t ts --type-add='ts:*.ts' -B 2 -A 2 | head -80

Repository: game-ci/unity-builder

Length of output: 47


🏁 Script executed:

cat -n src/model/cloud-runner/providers/README.md

Repository: game-ci/unity-builder

Length of output: 8318


🏁 Script executed:

# Look for local provider implementation and initialization
fd "local" src/model/cloud-runner/providers/ -t f | head -10

Repository: game-ci/unity-builder

Length of output: 47


🏁 Script executed:

# Search for where the builder dist folder might be copied/prepared for local provider
rg -n "local.*provider\|LocalProvider" src/model/cloud-runner/ -t ts --type-add='ts:*.ts' -B 3 -A 10 | head -100

Repository: game-ci/unity-builder

Length of output: 47


🏁 Script executed:

# Search for where the builder might be built/compiled before running in non-containerized mode
rg -n "npm run build\|npm run package\|dist/" src/ -t ts --type-add='ts:*.ts' | grep -E "local|setup|build" | head -40

Repository: game-ci/unity-builder

Length of output: 47


🏁 Script executed:

# Check if there's a build step that creates dist/index.js before running workflows
rg -n "compile\|package\|build" src/index.ts src/main.ts -B 3 -A 3

Repository: game-ci/unity-builder

Length of output: 116


🏁 Script executed:

# Look at how the workflow is actually executed - check the main entry point
fd "index.ts|main.ts" src/ -type f | head -5

Repository: game-ci/unity-builder

Length of output: 235


🏁 Script executed:

# Find package.json to understand entry point and build setup
cat package.json | grep -A 5 -B 5 '"main"\|"bin"\|"dist"'

Repository: game-ci/unity-builder

Length of output: 379


🏁 Script executed:

# Check if dist/index.js is pre-built or built at runtime
ls -la dist/ 2>/dev/null | head -20

Repository: game-ci/unity-builder

Length of output: 867


🏁 Script executed:

# Look at the root index.ts or entry point
git ls-files | grep -E "^(src/)?index\.(ts|js)$|^main\.(ts|js)$"

Repository: game-ci/unity-builder

Length of output: 77


🏁 Script executed:

# Check the actual structure of src directory
git ls-files src/ | head -20

Repository: game-ci/unity-builder

Length of output: 666


🏁 Script executed:

# Check how local and local-system providers are actually used
rg -n "local-system\|providerStrategy.*local" src/model/cloud-runner/ -t ts --type-add='ts:*.ts' -B 2 -A 5 | head -80

Repository: game-ci/unity-builder

Length of output: 47


🏁 Script executed:

# Verify if there's any documentation about how non-containerized mode works
fd -e md | xargs grep -l "local.*provider\|non-containerized" 2>/dev/null

Repository: game-ci/unity-builder

Length of output: 107


🏁 Script executed:

# Check the git history or comments around the setupCommands method
git log --oneline -n 20 -- src/model/cloud-runner/workflows/build-automation-workflow.ts 2>/dev/null | head -10

Repository: game-ci/unity-builder

Length of output: 84


🏁 Script executed:

# Check if there are any comments or documentation about why non-containerized doesn't clone
rg -n "non-containerized\|isContainerized" src/model/cloud-runner/workflows/build-automation-workflow.ts -B 5 -A 2 | grep -E "//|/\*"

Repository: game-ci/unity-builder

Length of output: 47


🏁 Script executed:

# Verify that node ${builderPath} commands are actually used in both paths
rg -n "node \${builderPath}" src/model/cloud-runner/workflows/build-automation-workflow.ts

Repository: game-ci/unity-builder

Length of output: 173


🏁 Script executed:

# Search for node command with builderPath
rg -n 'node.*builderPath' src/model/cloud-runner/workflows/build-automation-workflow.ts

Repository: game-ci/unity-builder

Length of output: 837


Non-containerized setup doesn't clone the builder repository while assuming it exists downstream.

The non-containerized code path (lines 128–131) only initializes logging but never clones the Unity builder repository. Since builderPath for non-containerized mode points to process.cwd()/dist/index.js, subsequent node ${builderPath} commands (lines 205–206) will fail unless the builder is already present in the working directory. This creates a critical inconsistency: containerized providers clone the builder (lines 108–126), while non-containerized providers do not, yet both expect the builder to exist.

Clarify whether non-containerized providers are expected to have the builder pre-installed, and if so, document this requirement. Otherwise, the non-containerized path should include clone logic similar to the containerized path.

Comment on lines +167 to +170
if ! command -v node > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/node && chmod +x /usr/local/bin/node; fi
if ! command -v npm > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/npm && chmod +x /usr/local/bin/npm; fi
if ! command -v n > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/n && chmod +x /usr/local/bin/n; fi
if ! command -v yarn > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/yarn && chmod +x /usr/local/bin/yarn; fi

⚠️ Potential issue | 🟠 Major

Stub binaries may mask missing tool dependencies.

Creating no-op stubs for node, npm, n, and yarn when they're missing (lines 167-170) will cause any scripts that genuinely require these tools to silently fail. If the local-docker image is expected to provide these tools, their absence likely indicates a misconfiguration that should be diagnosed rather than hidden.

Consider at minimum logging a warning before creating stubs:

+echo "WARNING: Creating stub for missing 'node' command - build may fail if node is required"
 if ! command -v node > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/node && chmod +x /usr/local/bin/node; fi
+echo "WARNING: Creating stub for missing 'npm' command - build may fail if npm is required"
 if ! command -v npm > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/npm && chmod +x /usr/local/bin/npm; fi
+echo "WARNING: Creating stub for missing 'n' command - build may fail if n is required"
 if ! command -v n > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/n && chmod +x /usr/local/bin/n; fi
+echo "WARNING: Creating stub for missing 'yarn' command - build may fail if yarn is required"
 if ! command -v yarn > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/yarn && chmod +x /usr/local/bin/yarn; fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if ! command -v node > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/node && chmod +x /usr/local/bin/node; fi
if ! command -v npm > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/npm && chmod +x /usr/local/bin/npm; fi
if ! command -v n > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/n && chmod +x /usr/local/bin/n; fi
if ! command -v yarn > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/yarn && chmod +x /usr/local/bin/yarn; fi
echo "WARNING: Creating stub for missing 'node' command - build may fail if node is required"
if ! command -v node > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/node && chmod +x /usr/local/bin/node; fi
echo "WARNING: Creating stub for missing 'npm' command - build may fail if npm is required"
if ! command -v npm > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/npm && chmod +x /usr/local/bin/npm; fi
echo "WARNING: Creating stub for missing 'n' command - build may fail if n is required"
if ! command -v n > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/n && chmod +x /usr/local/bin/n; fi
echo "WARNING: Creating stub for missing 'yarn' command - build may fail if yarn is required"
if ! command -v yarn > /dev/null 2>&1; then printf '#!/bin/sh\nexit 0\n' > /usr/local/bin/yarn && chmod +x /usr/local/bin/yarn; fi
🤖 Prompt for AI Agents
In src/model/cloud-runner/workflows/build-automation-workflow.ts around lines
167-170, the current behavior silently creates no-op stubs for node/npm/n/yarn
when those binaries are missing; change this to avoid masking missing tool
dependencies by (1) removing the silent no-op replacements and instead emitting
a clear warning/error that the expected tool is absent (log to stderr or your
workflow logger with the binary name and suggested fix), and (2) if stubs are
still required for a specific compatibility case, only create them after logging
the warning and documenting the condition in code comments; ensure the workflow
exits with a non-zero status or fails the job when a genuine runtime dependency
is missing unless an explicit opt-in flag allows stubbing.

tar -cf "/data/cache/$CACHE_KEY/build/build-$BUILD_GUID.tar" --files-from /dev/null || touch "/data/cache/$CACHE_KEY/build/build-$BUILD_GUID.tar"
fi
# Run post-build tasks - ensure output is captured even if command fails
node ${builderPath} -m remote-cli-post-build || echo "Post-build command completed with warnings"

⚠️ Potential issue | 🟠 Major

Post-build error suppression may hide critical failures.

The || echo "Post-build command completed with warnings" pattern (line 180) converts all post-build failures into a success state with a misleading message. If remote-cli-post-build fails while uploading artifacts or persisting cache, the build will appear to succeed despite incomplete outputs.

Consider either:

  1. Propagating the failure if post-build tasks are critical, or
  2. Logging the actual error and exit code:
-node ${builderPath} -m remote-cli-post-build || echo "Post-build command completed with warnings"
+node ${builderPath} -m remote-cli-post-build || { EXIT_CODE=$?; echo "WARNING: Post-build command failed with exit code $EXIT_CODE"; exit $EXIT_CODE; }

Or if post-build failures are truly non-critical:

-node ${builderPath} -m remote-cli-post-build || echo "Post-build command completed with warnings"
+node ${builderPath} -m remote-cli-post-build || { EXIT_CODE=$?; echo "WARNING: Post-build command failed with exit code $EXIT_CODE, continuing anyway"; }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
node ${builderPath} -m remote-cli-post-build || echo "Post-build command completed with warnings"
node ${builderPath} -m remote-cli-post-build || { EXIT_CODE=$?; echo "WARNING: Post-build command failed with exit code $EXIT_CODE"; exit $EXIT_CODE; }
🤖 Prompt for AI Agents
In src/model/cloud-runner/workflows/build-automation-workflow.ts around line
180, the shell line `node ${builderPath} -m remote-cli-post-build || echo
"Post-build command completed with warnings"` hides failures; change it to
either let the command fail (remove the `||` fallback) if post-build steps are
critical, or capture the exit code and log the error and exit code before
exiting non-zero (e.g., run the command, check `$?`, log context and exit with
that code), or — only if truly non-critical — replace the echo with a detailed
warning that includes the captured exit code and stderr output so failures are
visible for debugging.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (4)
.github/workflows/cloud-runner-integrity.yml (4)

56-59: Use region-aware S3 API for bucket creation.

The past review suggested using s3api create-bucket with LocationConstraint for non-us-east-1 regions, which was accepted but not applied. S3 requires this constraint outside us-east-1.

       - name: Create S3 bucket for tests (host LocalStack)
         run: |
-          awslocal s3 mb s3://$AWS_STACK_NAME || true
+          if [ "${AWS_REGION:-us-east-1}" = "us-east-1" ]; then
+            awslocal s3api create-bucket --bucket "$AWS_STACK_NAME" || true
+          else
+            awslocal s3api create-bucket --bucket "$AWS_STACK_NAME" \
+              --create-bucket-configuration LocationConstraint="${AWS_REGION:-us-east-1}" || true
+          fi
           awslocal s3 ls

162-163: Stop overriding GITHUB_TOKEN with GIT_PRIVATE_TOKEN.

The workflow overrides GITHUB_TOKEN with GIT_PRIVATE_TOKEN across 13 test steps. This violates least-privilege by exposing a broader-scope PAT and breaks the intended fallback logic (try private token first, then fall back to runner token).

Use the default runner-provided token and keep GIT_PRIVATE_TOKEN as a separate variable only if private repo access is truly needed:

  GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
- GITHUB_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}
+ GITHUB_TOKEN: ${{ github.token }}

Apply this fix to all 13 test steps: lines 162–163 (k8s), 202–203 (k8s), 242–243 (k8s), and lines 294–295, 318–319, 342–343, 366–367, 390–391, 414–415, 438–439, 462–463, 486–487, 510–511 (localstack job).


288-295: Add LocalStack-specific environment variables to all localstack job test steps.

The k8s job correctly includes AWS_S3_FORCE_PATH_STYLE: 'true' and AWS_EC2_METADATA_DISABLED: 'true' for LocalStack compatibility, but the localstack job omits these across all 10 test steps. This creates inconsistent behavior and may cause unexpected LocalStack interactions.

Add these variables to every test step in the localstack job. For example, lines 288–295:

          AWS_ENDPOINT: http://localhost:4566
          AWS_ENDPOINT_URL: http://localhost:4566
+         AWS_S3_FORCE_PATH_STYLE: 'true'
+         AWS_EC2_METADATA_DISABLED: 'true'
          GIT_PRIVATE_TOKEN: ${{ secrets.GIT_PRIVATE_TOKEN }}

Apply the same additions to all other test steps: cloud-runner-end2end-caching, cloud-runner-end2end-retaining, cloud-runner-caching, cloud-runner-environment, cloud-runner-image, cloud-runner-hooks, cloud-runner-local-persistence, cloud-runner-locking-core, and cloud-runner-locking-get-locked.


40-43: Pin k3d version to avoid supply-chain risk.

The code installs k3d from the main branch without version pinning, which creates a supply-chain risk. Download a specific k3d release directly with version pinning instead.

      - name: Install k3d
        run: |
-         curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
+         set -euo pipefail
+         K3D_VERSION=v5.7.3
+         curl -fsSL -o /usr/local/bin/k3d "https://github.com/k3d-io/k3d/releases/download/${K3D_VERSION}/k3d-linux-amd64"
+         chmod +x /usr/local/bin/k3d
          k3d version | cat
🧹 Nitpick comments (1)
.github/workflows/cloud-runner-integrity.yml (1)

512-553: Update commented AWS job to use aws-actions/configure-aws-credentials@v4.

The commented AWS job references aws-actions/configure-aws-credentials@v1 (line 529), which is deprecated and will fail on current runners. If this job is intended for future use, update to v4:

     - name: Configure AWS Credentials
-      uses: aws-actions/configure-aws-credentials@v1
+      uses: aws-actions/configure-aws-credentials@v4
       with:
         aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
         aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
         aws-region: ${{ env.AWS_REGION }}

If this job will remain unused, consider removing the block entirely to reduce maintenance overhead.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46e3ba8 and 939aa6b.

📒 Files selected for processing (1)
  • .github/workflows/cloud-runner-integrity.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: tvOS on 2023.2.2f1
  • GitHub Check: Tests
  • GitHub Check: StandaloneLinux64 on 2023.2.2f1
  • GitHub Check: WebGL on 2023.2.2f1
  • GitHub Check: WebGL on 6000.0.36f1
  • GitHub Check: StandaloneLinux64 on 2023.2.2f1
  • GitHub Check: Android on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: iOS on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2022.3.13f1
  • GitHub Check: StandaloneLinux64 on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: iOS on 2021.3.45f1
  • GitHub Check: StandaloneOSX on 2022.3.13f1
  • GitHub Check: iOS on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 2021.3.45f1
  • GitHub Check: StandaloneOSX on 2023.2.2f1

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (4)
src/model/cloud-runner/remote-client/index.ts (4)

288-289: Align LFS folder assertion with its message.

The assertion currently checks that .git/lfs exists while the message says “LFS folder should not exist before caching”, which is contradictory:

assert(fs.existsSync(path.join(`.git`, `lfs`)), 'LFS folder should not exist before caching');

Either invert the condition or update the message; based on the surrounding flow (clone without LFS and only then pull from cache), the intent seems to be to assert non-existence:

-    assert(fs.existsSync(path.join(`.git`, `lfs`)), 'LFS folder should not exist before caching');
+    assert(!fs.existsSync(path.join(`.git`, `lfs`)), 'LFS folder should not exist before caching');

248-285: Fix unused catch bindings and improve checkout fallback clarity.

The new SHA/branch/PR fetch and checkout flow is good, but the catch bindings (_error, _error2, _error3, later _error) are either unused or only rethrown, which is what ESLint had already flagged in earlier reviews. You can simplify the try/catch structure and remove unused bindings while preserving semantics:

-    // Ensure refs exist (tags and PR refs)
-    await CloudRunnerSystem.Run(`git fetch --all --tags || true`);
+    // Ensure refs exist (tags and PR refs)
+    await CloudRunnerSystem.Run(`git fetch --all --tags || true`);
...
-    if (targetSha) {
-      try {
-        await CloudRunnerSystem.Run(`git checkout ${targetSha}`);
-      } catch (_error) {
-        try {
-          await CloudRunnerSystem.Run(`git fetch origin ${targetSha} || true`);
-          await CloudRunnerSystem.Run(`git checkout ${targetSha}`);
-        } catch (_error2) {
-          RemoteClientLogger.logWarning(`Falling back to branch checkout; SHA not found: ${targetSha}`);
-          try {
-            await CloudRunnerSystem.Run(`git checkout ${targetBranch}`);
-          } catch (_error3) {
-            if ((targetBranch || '').startsWith('pull/')) {
-              await CloudRunnerSystem.Run(`git checkout origin/${targetBranch}`);
-            } else {
-              throw _error2;
-            }
-          }
-        }
-      }
-    } else {
-      try {
-        await CloudRunnerSystem.Run(`git checkout ${targetBranch}`);
-      } catch (_error) {
-        if ((targetBranch || '').startsWith('pull/')) {
-          await CloudRunnerSystem.Run(`git checkout origin/${targetBranch}`);
-        } else {
-          throw _error;
-        }
-      }
-      RemoteClientLogger.log(`buildParameter Git Sha is empty`);
-    }
+    if (targetSha) {
+      try {
+        await CloudRunnerSystem.Run(`git checkout ${targetSha}`);
+      } catch {
+        try {
+          await CloudRunnerSystem.Run(`git fetch origin ${targetSha} || true`);
+          await CloudRunnerSystem.Run(`git checkout ${targetSha}`);
+        } catch (error) {
+          RemoteClientLogger.logWarning(`Falling back to branch checkout; SHA not found: ${targetSha}`);
+          try {
+            await CloudRunnerSystem.Run(`git checkout ${targetBranch}`);
+          } catch {
+            if ((targetBranch || '').startsWith('pull/')) {
+              await CloudRunnerSystem.Run(`git checkout origin/${targetBranch}`);
+            } else {
+              throw error;
+            }
+          }
+        }
+      }
+    } else {
+      try {
+        await CloudRunnerSystem.Run(`git checkout ${targetBranch}`);
+      } catch (error) {
+        if ((targetBranch || '').startsWith('pull/')) {
+          await CloudRunnerSystem.Run(`git checkout origin/${targetBranch}`);
+        } else {
+          throw error;
+        }
+      }
+      RemoteClientLogger.log(`buildParameter Git Sha is empty`);
+    }

Also, to satisfy lines-around-comment, consider adding a blank line before the // Ensure refs exist... comment (after the preceding log call).


309-371: LFS pull strategies never fall back due to suppressError and no-op catch.

In pullLatestLFS, the multi-strategy LFS pull is conceptually great (plain pull → GIT_PRIVATE_TOKEN → GITHUB_TOKEN → warning), but as written it won’t actually fall back:

  • All git lfs pull calls use CloudRunnerSystem.Run(..., true), which sets suppressError = true, so non‑zero exit codes do not throw.
  • The first try block always logs success and returns, even if git lfs pull failed.
  • The catch (_error) { void 0; } is effectively unreachable for LFS failures and still trips ESLint (unused binding + void 0 no‑op).

This means token-based strategies and the final warning will never be exercised on LFS failures.

You can fix both correctness and style with something like:

   await CloudRunnerSystem.Run(`git config --global filter.lfs.process "git-lfs filter-process"`);
   if (CloudRunner.buildParameters.skipLfs) {
     RemoteClientLogger.log(`Skipping LFS pull (skipLfs=true)`);

     return;
   }

-    // Best effort: try plain pull first (works for public repos or pre-configured auth)
-    try {
-      await CloudRunnerSystem.Run(`git lfs pull`, true);
-      await CloudRunnerSystem.Run(`git lfs checkout || true`, true);
-      RemoteClientLogger.log(`Pulled LFS files without explicit token configuration`);
-
-      return;
-    } catch (_error) {
-      /* no-op: best-effort git lfs pull without tokens may fail */
-      void 0;
-    }
+    // Best effort: try plain pull first (works for public repos or pre-configured auth)
+    try {
+      await CloudRunnerSystem.Run(`git lfs pull`);
+      await CloudRunnerSystem.Run(`git lfs checkout || true`, true);
+      RemoteClientLogger.log(`Pulled LFS files without explicit token configuration`);
+
+      return;
+    } catch {
+      // no-op here: fall through to token-based strategies
+    }
...
-    // Try with GIT_PRIVATE_TOKEN
-    try {
+    // Try with GIT_PRIVATE_TOKEN
+    try {
       const gitPrivateToken = process.env.GIT_PRIVATE_TOKEN;
       if (gitPrivateToken) {
         RemoteClientLogger.log(`Attempting to pull LFS files with GIT_PRIVATE_TOKEN...`);
...
-        await CloudRunnerSystem.Run(`git lfs pull`, true);
-        await CloudRunnerSystem.Run(`git lfs checkout || true`, true);
+        await CloudRunnerSystem.Run(`git lfs pull`);
+        await CloudRunnerSystem.Run(`git lfs checkout || true`, true);
...
-    // Try with GITHUB_TOKEN
-    try {
+    // Try with GITHUB_TOKEN
+    try {
       const githubToken = process.env.GITHUB_TOKEN;
       if (githubToken) {
...
-        await CloudRunnerSystem.Run(`git lfs pull`, true);
-        await CloudRunnerSystem.Run(`git lfs checkout || true`, true);
+        await CloudRunnerSystem.Run(`git lfs pull`);
+        await CloudRunnerSystem.Run(`git lfs checkout || true`, true);

This way, failures in each strategy actually trigger the next path and, if all fail, you hit the final warning log and continue without crashing.
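
The same "try each strategy, fall through on failure" flow can be written as a small helper; a minimal sketch, illustrative only and not part of the codebase:

// Runs strategies in order; a strategy signals failure by throwing,
// so it must NOT be executed with suppressError enabled.
type LfsStrategy = { name: string; run: () => Promise<void> };

async function pullLfsWithFallback(strategies: LfsStrategy[], warn: (message: string) => void): Promise<void> {
  for (const strategy of strategies) {
    try {
      await strategy.run();

      return;
    } catch {
      warn(`LFS pull via ${strategy.name} failed; trying next strategy`);
    }
  }
  warn('All LFS pull strategies failed; continuing without LFS objects');
}

The key point matches the comment above: if each git lfs pull is run with errors suppressed, the first strategy always appears to succeed and the later strategies are never reached.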


377-408: Retained workspace SHA/branch fallback is good; clean up catch bindings.

The retained-workspace flow now:

  • Logs the cache key (helpful).
  • Fetches tags and PR refs.
  • Does reset --hard and checkout by SHA, with branch and origin/branch fallbacks.

Functionally this is solid, but the catch bindings repeat the earlier ESLint issues:

      try {
        await CloudRunnerSystem.Run(`git reset --hard "${sha}"`);
        await CloudRunnerSystem.Run(`git checkout ${sha}`);
      } catch (_error) {
        RemoteClientLogger.logWarning(...);
        try {
          await CloudRunnerSystem.Run(`git checkout ${branch}`);
        } catch (_error2) {
          if ((branch || '').startsWith('pull/')) {
            await CloudRunnerSystem.Run(`git checkout origin/${branch}`);
          } else {
            throw _error2;
          }
        }
      }

You can remove unused bindings and keep the rethrow behavior:

-      try {
-        await CloudRunnerSystem.Run(`git reset --hard "${sha}"`);
-        await CloudRunnerSystem.Run(`git checkout ${sha}`);
-      } catch (_error) {
+      try {
+        await CloudRunnerSystem.Run(`git reset --hard "${sha}"`);
+        await CloudRunnerSystem.Run(`git checkout ${sha}`);
+      } catch {
         RemoteClientLogger.logWarning(`Retained workspace: SHA not found, falling back to branch ${branch}`);
-        try {
-          await CloudRunnerSystem.Run(`git checkout ${branch}`);
-        } catch (_error2) {
-          if ((branch || '').startsWith('pull/')) {
-            await CloudRunnerSystem.Run(`git checkout origin/${branch}`);
-          } else {
-            throw _error2;
-          }
-        }
+        try {
+          await CloudRunnerSystem.Run(`git checkout ${branch}`);
+        } catch (error) {
+          if ((branch || '').startsWith('pull/')) {
+            await CloudRunnerSystem.Run(`git checkout origin/${branch}`);
+          } else {
+            throw error;
+          }
+        }
       }
🧹 Nitpick comments (2)
src/model/cloud-runner/remote-client/index.ts (2)

44-52: Ensure log file writes include newlines in k8s path.

In the k8s branch, fs.appendFileSync(logFile, element); (and the lingering line write) omit a newline, so the log file can end up with concatenated lines, unlike the stdout/logging behavior.

Consider appending \n to keep the file line-oriented and consistent:

-        if (CloudRunnerOptions.providerStrategy === 'k8s') {
-          fs.appendFileSync(logFile, element);
-          // Write to stdout so kubectl logs can capture it
-          process.stdout.write(`${element}\n`);
+        if (CloudRunnerOptions.providerStrategy === 'k8s') {
+          fs.appendFileSync(logFile, `${element}\n`);
+          // Write to stdout so kubectl logs can capture it
+          process.stdout.write(`${element}\n`);
...
-      if (CloudRunnerOptions.providerStrategy === 'k8s') {
-        if (lingeringLine) {
-          fs.appendFileSync(logFile, lingeringLine);
-          process.stdout.write(`${lingeringLine}\n`);
-        }
+      if (CloudRunnerOptions.providerStrategy === 'k8s') {
+        if (lingeringLine) {
+          fs.appendFileSync(logFile, `${lingeringLine}\n`);
+          process.stdout.write(`${lingeringLine}\n`);
+        }

Also applies to: 57-66


72-127: Post-build cache guards look good; minor robustness tweaks optional.

The new guards for Library/Build cache pushes and the retained-workspace cleanup check are solid: they avoid throwing on missing/empty folders and log why cache writes are skipped. Logging the cache key is also helpful for debugging.

If you want to harden this further (optional), you could:

  • Include libraryFolderHost / buildFolderHost in the skip messages to aid remote debugging (a sketch follows at the end of this comment).
  • Normalize the cache destination paths via CloudRunnerFolders.libraryCacheFolderFull rather than manual /Library and /build suffixes, to centralize path logic.

No blocking issues from this block; behaviorally this is an improvement.
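
For the first bullet above, a minimal sketch of what the richer skip message could look like, assuming a libraryFolderHost path variable and a warning logger like the RemoteClientLogger used in this file (the actual guard structure in index.ts may differ):

import * as fs from 'fs';

// Hypothetical guard: skip the Library cache push when the folder is missing or empty,
// and include the resolved host path in the message to aid remote debugging.
function shouldPushLibraryCache(libraryFolderHost: string, logWarning: (message: string) => void): boolean {
  if (!fs.existsSync(libraryFolderHost)) {
    logWarning(`Skipping Library cache push; folder does not exist: ${libraryFolderHost}`);
    return false;
  }
  if (fs.readdirSync(libraryFolderHost).length === 0) {
    logWarning(`Skipping Library cache push; folder is empty: ${libraryFolderHost}`);
    return false;
  }
  return true;
}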

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 939aa6b and 7c89090.

⛔ Files ignored due to path filters (2)
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (1)
  • src/model/cloud-runner/remote-client/index.ts (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/model/cloud-runner/remote-client/index.ts (8)
src/model/cloud-runner/remote-client/remote-client-logger.ts (1)
  • RemoteClientLogger (7-93)
src/model/cloud-runner/options/cloud-runner-folders.ts (1)
  • CloudRunnerFolders (6-90)
src/model/cloud-runner/remote-client/caching.ts (1)
  • Caching (15-290)
src/model/cloud-runner/services/core/cloud-runner-system.ts (1)
  • CloudRunnerSystem (4-69)
src/model/input.ts (2)
  • gitPrivateToken (202-204)
  • branch (56-64)
src/model/github.ts (1)
  • sha (28-30)
src/model/cloud-runner/options/cloud-runner-options.ts (1)
  • branch (88-98)
src/model/versioning.ts (1)
  • branch (19-21)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: WSAPlayer on 2023.2.2f1
  • GitHub Check: tvOS on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: StandaloneOSX on 6000.0.36f1
  • GitHub Check: iOS on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: tvOS on 2021.3.32f1
  • GitHub Check: iOS on 2021.3.45f1
  • GitHub Check: WSAPlayer on 2022.3.13f1
  • GitHub Check: StandaloneOSX on 2021.3.45f1
  • GitHub Check: Android on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: Android on 2023.2.2f1
  • GitHub Check: tvOS on 2022.3.13f1
  • GitHub Check: WSAPlayer on 2021.3.32f1
  • GitHub Check: Android on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
🔇 Additional comments (1)
src/model/cloud-runner/remote-client/index.ts (1)

134-141: Activation success signaling is clear and k8s-friendly.

Writing "Activation successful" to both stdout and CloudRunnerLogger aligns with the k8s log-streaming behavior and test assertions.

No changes requested here.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (5)
src/model/cloud-runner/workflows/build-automation-workflow.ts (5)

81-83: Use $LOG_FILE consistently instead of hardcoded /home/job-log.txt

You set LOG_FILE at line 82, but several places still hardcode /home/job-log.txt (log start in setup, game ci start logs, and both remote-cli-log-stream invocations). This splits log writes between two paths and makes changing the log location harder. It would be cleaner and less error‑prone to route all log writes and log‑stream invocations through $LOG_FILE (and ensure its directory exists in the containerized branch as you already do in the non‑container branch).

Also applies to: 117-121, 171-171, 197-199, 207-208
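
As a rough illustration of routing everything through a single variable (command names mirror the discussion above; the real command assembly in build-automation-workflow.ts is more involved):

// Hypothetical helper: define the log path once and interpolate it everywhere,
// instead of repeating the literal /home/job-log.txt across several command strings.
function logSetupCommands(builderPath: string, logFile = '/home/job-log.txt'): string {
  return `
    export LOG_FILE="${logFile}"
    mkdir -p "$(dirname "$LOG_FILE")"
    echo "game ci start" >> "$LOG_FILE"
    node ${builderPath} -m remote-cli-log-stream --logFile "$LOG_FILE"
  `;
}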


90-132: Non‑containerized setup assumes a prebuilt local dist/index.js without validation

In setupCommands, the clone/fallback logic only runs when isContainerized is true; the non‑container branch just sets up logging. Combined with builderPath = process.cwd()/dist/index.js for non‑container runs, this assumes the builder has already been built and is present in the current working directory. If that invariant isn’t strictly guaranteed (e.g., for local/local-system or future non‑container providers), node ${builderPath} calls will fail late and opaquely. Either reuse the clone logic (or a variant) for non‑container providers, or at least validate that ${builderPath} exists and fail fast with a clear message / documented requirement.
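
A minimal fail-fast sketch for the non-container path, assuming builderPath resolves to process.cwd()/dist/index.js as described above:

import * as fs from 'fs';
import * as path from 'path';

// Hypothetical pre-flight check: surface a clear error up front instead of letting
// `node ${builderPath}` fail later and opaquely inside the generated shell commands.
function assertBuilderPresent(builderPath: string): void {
  if (!fs.existsSync(builderPath)) {
    throw new Error(
      `Expected a prebuilt builder at ${builderPath} (dist/index.js). ` +
        `Build the action first or use a containerized provider strategy.`,
    );
  }
}

assertBuilderPresent(path.join(process.cwd(), 'dist', 'index.js'));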


162-170: Stub node/npm/n/yarn binaries can hide real dependency issues

Creating no-op stubs for node, npm, n, and yarn when they're missing will cause any step that truly depends on these tools (including node ${builderPath} ...) to silently do nothing yet still "succeed". If the local-docker image is misconfigured and lacks these tools, that's likely something you want to surface, not mask. Prefer failing fast with a clear error, or at minimum logging a loud warning and gating stub creation behind an explicit opt-in flag intended only for specific test scenarios.


179-180: Post‑build error handling currently masks remote-cli-post-build failures

Both containerized branches treat remote-cli-post-build as best‑effort: in the local‑docker path via || echo "Post-build command completed with warnings" and in the standard containerized path via ... || true. If remote-cli-post-build is responsible for artifact upload, cache persistence, or other critical steps, these constructs will make the overall job succeed despite incomplete outputs. Consider capturing the exit code and either:

  • Failing the job when post‑build is critical, or
  • Logging a detailed warning with the exit code (and ideally stderr) while explicitly documenting that post‑build is optional in these modes.

This keeps behavior intentional and debuggable instead of silently swallowing real failures.

Also applies to: 198-199


205-209: Non‑containerized BuildCommands no longer stages or runs the build entrypoint

In the non‑container branch you now only:

  • Echo "game ci start",
  • Run remote-cli-log-stream for 3 seconds, and
  • Invoke remote-cli-post-build.

There’s no longer any staging of the default build script or /entrypoint.sh, so local/non‑container providers won’t execute actual build steps via this workflow and may appear to “succeed” while doing nothing. If the intention is still to drive the Unity build from here, you likely need to reintroduce the entrypoint/steps setup and execution similar to the containerized path, or otherwise clearly document that the non‑container provider is expected to handle the build entirely outside this workflow.

🧹 Nitpick comments (1)
src/model/cloud-runner/workflows/build-automation-workflow.ts (1)

53-61: Make isContainerized derivation future‑proof for new provider strategies

Hard‑coding the containerized modes to 'aws' | 'k8s' | 'local-docker' means any new or plugin‑backed provider that still runs inside a container will silently fall into the non‑container path (different builderPath, no apt/tooling, different logging). Consider centralizing this decision (e.g., via a provider capability flag or helper like CloudRunner.isContainerizedStrategy(providerStrategy)) so new strategies don’t have to remember to update this switch.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c89090 and 4f18c9c.

⛔ Files ignored due to path filters (2)
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (1)
  • src/model/cloud-runner/workflows/build-automation-workflow.ts (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-01-30T18:01:50.339Z
Learnt from: MichaelBuhler
Repo: game-ci/unity-builder PR: 685
File: .github/workflows/build-tests-ubuntu.yml:146-146
Timestamp: 2025-01-30T18:01:50.339Z
Learning: The build steps duplication in the workflow is intentional to implement a retry mechanism with different delay intervals between attempts.

Applied to files:

  • src/model/cloud-runner/workflows/build-automation-workflow.ts
🧬 Code graph analysis (1)
src/model/cloud-runner/workflows/build-automation-workflow.ts (2)
src/model/index.ts (1)
  • CloudRunner (27-27)
src/model/cloud-runner/options/cloud-runner-folders.ts (1)
  • CloudRunnerFolders (6-90)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: WSAPlayer on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: tvOS on 2023.2.2f1
  • GitHub Check: WSAPlayer on 2022.3.13f1
  • GitHub Check: Android on 2023.2.2f1
  • GitHub Check: Android on 2021.3.32f1
  • GitHub Check: tvOS on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: WSAPlayer on 2021.3.32f1
  • GitHub Check: tvOS on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 2023.2.2f1
  • GitHub Check: iOS on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 2021.3.45f1
  • GitHub Check: iOS on 2021.3.45f1
  • GitHub Check: StandaloneOSX on 2022.3.13f1

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (3)
src/model/cloud-runner/remote-client/index.ts (3)

265-305: Unused catch bindings and contradictory LFS assertion still present.

The issues flagged in previous reviews remain unaddressed:

  1. Line 275: _error is unused — use parameterless catch
  2. Line 279: _error2 should be renamed to error per ESLint conventions
  3. Line 283: _error3 is unused — use parameterless catch
  4. Line 305: The assertion message says "should not exist" but the code asserts the folder does exist
-      } catch (_error) {
+      } catch {
         try {
           await CloudRunnerSystem.Run(`git fetch origin ${targetSha} || true`);
           await CloudRunnerSystem.Run(`git checkout ${targetSha}`);
-        } catch (_error2) {
+        } catch (error) {
           RemoteClientLogger.logWarning(`Falling back to branch checkout; SHA not found: ${targetSha}`);
           try {
             await CloudRunnerSystem.Run(`git checkout ${targetBranch}`);
-          } catch (_error3) {
+          } catch {
             if ((targetBranch || '').startsWith('pull/')) {
               await CloudRunnerSystem.Run(`git checkout origin/${targetBranch}`);
             } else {
-              throw _error2;
+              throw error;
             }
           }
         }
       }

And fix the assertion:

-    assert(fs.existsSync(path.join(`.git`, `lfs`)), 'LFS folder should not exist before caching');
+    assert(!fs.existsSync(path.join(`.git`, `lfs`)), 'LFS folder should not exist before caching');

339-342: Unused catch binding and unconventional no-op still present.

The previous review flagged this: the catch binding _error at line 339 is unused, and void 0; is an unconventional way to express a no-op.

-    } catch (_error) {
-      /* no-op: best-effort git lfs pull without tokens may fail */
-      void 0;
-    }
+    } catch {
+      // no-op: best-effort git lfs pull without tokens may fail
+    }

411-425: Unused catch bindings in retained workspace logic still present.

The previous review flagged these issues:

  1. Line 414: _error is unused — use parameterless catch
  2. Line 418: _error2 should be renamed to error per ESLint conventions
       try {
         await CloudRunnerSystem.Run(`git reset --hard "${sha}"`);
         await CloudRunnerSystem.Run(`git checkout ${sha}`);
-      } catch (_error) {
+      } catch {
         RemoteClientLogger.logWarning(`Retained workspace: SHA not found, falling back to branch ${branch}`);
         try {
           await CloudRunnerSystem.Run(`git checkout ${branch}`);
-        } catch (_error2) {
+        } catch (error) {
           if ((branch || '').startsWith('pull/')) {
             await CloudRunnerSystem.Run(`git checkout origin/${branch}`);
           } else {
-            throw _error2;
+            throw error;
           }
         }
       }
🧹 Nitpick comments (2)
src/model/cloud-runner/remote-client/index.ts (2)

34-38: Misleading comment: setDefaultEncoding doesn't control buffering.

The comment states this ensures stdout is "unbuffered," but setDefaultEncoding only sets the character encoding for write operations. Node.js stdout buffering is controlled by the stream's internal state and cannot be disabled this way. The encoding setting itself is harmless but the comment is incorrect.

Consider updating the comment to reflect what the code actually does:

-    // For K8s, ensure stdout is unbuffered so messages are captured immediately
-    if (CloudRunnerOptions.providerStrategy === 'k8s') {
-      process.stdout.setDefaultEncoding('utf8');
-    }
+    // For K8s, ensure consistent UTF-8 encoding for stdout
+    if (CloudRunnerOptions.providerStrategy === 'k8s') {
+      process.stdout.setDefaultEncoding('utf8');
+    }

152-155: Ineffective flush attempt.

Writing an empty string to stdout does not trigger a flush in Node.js. The isTTY check and empty write are essentially no-ops. If immediate flushing is critical, consider using a synchronous write approach or removing the dead code.

    // Write to stdout first so kubectl logs can capture it
    process.stdout.write(`${successMessage}\n`);
-    // Force flush stdout to ensure it's captured
-    if (process.stdout.isTTY === false) {
-      process.stdout.write(''); // Trigger flush
-    }
    // Also log via CloudRunnerLogger for GitHub Actions
    CloudRunnerLogger.log(successMessage);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f18c9c and c61c9f8.

📒 Files selected for processing (1)
  • src/model/cloud-runner/remote-client/index.ts (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/model/cloud-runner/remote-client/index.ts (3)
src/model/cloud-runner/remote-client/remote-client-logger.ts (1)
  • RemoteClientLogger (7-93)
src/model/cloud-runner/options/cloud-runner-folders.ts (1)
  • CloudRunnerFolders (6-90)
src/model/cloud-runner/services/core/cloud-runner-system.ts (1)
  • CloudRunnerSystem (4-69)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: WebGL on 2022.3.13f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 2022.3.13f1
  • GitHub Check: StandaloneOSX on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 2021.3.45f1
  • GitHub Check: WSAPlayer on 2021.3.32f1
  • GitHub Check: tvOS on 2022.3.13f1
  • GitHub Check: tvOS on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: tvOS on 2023.2.2f1
  • GitHub Check: Android on 2022.3.13f1
  • GitHub Check: WSAPlayer on 2023.2.2f1
  • GitHub Check: Android on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: Android on 2021.3.32f1
🔇 Additional comments (3)
src/model/cloud-runner/remote-client/index.ts (3)

88-107: LGTM: Guarded cache push with proper existence and content checks.

The defensive checks for folder existence and non-empty content before pushing to cache are well implemented. The error handling appropriately warns without failing the build.


109-128: LGTM: Consistent defensive pattern for build cache.

Same well-structured defensive checks as the Library cache push.


131-138: LGTM: Defensive cleanup with existence check.

Checking both path formats before attempting removal prevents unnecessary errors.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (23)
src/model/cloud-runner/tests/cloud-runner-rclone-steps.test.ts (2)

26-52: Duplicate: Environment gating logic already flagged.

A previous review already addressed stabilizing the gating logic to always probe rclone/bash availability regardless of CI status.


80-80: Duplicate: Excessive timeout already flagged.

A previous review recommended reducing the ~11.6 day timeout to a sane upper bound (15–30 minutes).

src/model/cloud-runner/tests/providers/provider-git-manager.test.ts (1)

1-5: Fix import order and remove unused mocks to avoid ESLint noise

  • import/order: put Node built-ins before relative imports and keep all imports before jest.mock calls, e.g. import * as fs from 'fs'; followed by import type { GitHubUrlInfo } ... and import { ProviderGitManager } ....
  • no-unused-vars: mockExecAsync (Line 18) and mockFs (Line 33) are never used; either wire them into the tests (e.g., stubbing fs.existsSync for getProviderModulePath scenarios) or remove them.

Example adjustment:

-import { GitHubUrlInfo } from '../../providers/provider-url-parser';
-import * as fs from 'fs';
+import * as fs from 'fs';
+import type { GitHubUrlInfo } from '../../providers/provider-url-parser';
@@
-const mockExecAsync = jest.fn();
@@
-const mockFs = fs as jest.Mocked<typeof fs>;

This should clear the earlier import/first and no-unused-vars issues.

Also applies to: 15-20, 33-34

src/model/cloud-runner/remote-client/index.ts (4)

312-312: Fix contradictory LFS folder assertion message.

The code asserts that the LFS folder EXISTS (fs.existsSync returns true), but the message says "should not exist before caching". This is contradictory.

Since git lfs install was called at line 268, the LFS folder should exist at this point. The assertion message should be corrected to match the check.

Apply this diff:

-    assert(fs.existsSync(path.join(`.git`, `lfs`)), 'LFS folder should not exist before caching');
+    assert(fs.existsSync(path.join(`.git`, `lfs`)), 'LFS folder exists after git lfs install');

340-349: Remove unconventional void 0 statement.

Line 348 contains an unconventional void 0; statement in the catch block. Since this is a no-op catch, the statement is unnecessary. The comment alone is sufficient to document the intent.

Apply this diff:

     } catch {
       /* no-op: best-effort git lfs pull without tokens may fail */
-      void 0;
     }

49-60: Add newline when writing to log file.

Line 51 appends element to the log file without a newline, while line 55 writes to stdout with a newline. This causes log entries in the file to be concatenated together without line breaks, making the file difficult to read.

Apply this diff:

-          fs.appendFileSync(logFile, element);
+          fs.appendFileSync(logFile, `${element}\n`);

64-75: Add newline when writing lingering line to log file.

Line 66 has the same issue as line 51: it appends lingeringLine to the log file without a newline, while line 69 writes to stdout with a newline.

Apply this diff:

-          fs.appendFileSync(logFile, lingeringLine);
+          fs.appendFileSync(logFile, `${lingeringLine}\n`);
src/model/cloud-runner/providers/k8s/kubernetes-task-runner.ts (1)

31-34: Missing -c flag for --previous logs can break multi-container pods.

The --previous branch omits the container name flag. For multi-container pods, this may fail or select the wrong container.

-      extraFlags += isRunning ? ` -f -c ${containerName} -n ${namespace}` : ` --previous -n ${namespace}`;
+      extraFlags += isRunning
+        ? ` -f -c ${containerName} -n ${namespace}`
+        : ` --previous -c ${containerName} -n ${namespace}`;
src/model/cloud-runner/remote-client/caching.ts (2)

97-117: Proactive cleanup scope may be broader than intended.

If cacheFolder is 'cache', then cacheParent becomes '.' (current directory), and find . -name "*.tar*" could match unrelated files. Consider restricting cleanup to the actual cache directory.
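
If cleanup should stay scoped to the cache tree, one possible shape (the cleanup policy itself is illustrative and not taken from caching.ts; the run callback stands in for the existing command runner):

import * as path from 'path';

// Hypothetical narrowed cleanup: search for stale archives only inside the resolved
// cache folder, never its parent, so unrelated *.tar* files are not touched.
async function cleanupCacheArchives(cacheFolder: string, run: (command: string) => Promise<string>): Promise<void> {
  const target = path.resolve(cacheFolder);
  await run(`find "${target}" -type f -name "*.tar*" -delete || true`);
}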


159-180: Dead code: retrySucceeded check is unreachable.

The conditional at lines 175-178 can never execute with !retrySucceeded being true:

  • If retry succeeds, retrySucceeded = true (line 166)
  • If retry throws, the catch block (lines 167-173) throws, so line 175 is never reached

This was flagged in a previous review as addressed, but the dead code remains.

               // Retry the tar operation once after cleanup
-              let retrySucceeded = false;
               try {
                 await CloudRunnerSystem.Run(
                   `tar -cf ${cacheArtifactName}.tar${compressionSuffix} "${path.basename(sourceFolder)}"`,
                 );
-                // If retry succeeds, mark it - we'll continue normally without throwing
-                retrySucceeded = true;
               } catch (retryError: any) {
                 throw new Error(
                   `Failed to create cache archive after cleanup. Original error: ${errorMessage}. Retry error: ${
                     retryError?.message || retryError
                   }`,
                 );
               }
-              // If retry succeeded, don't throw the original error - let execution continue after catch block
-              if (!retrySucceeded) {
-                throw error;
-              }
-              // If we get here, retry succeeded - execution will continue after the catch block
src/model/cloud-runner/providers/aws/services/task-service.ts (2)

109-111: DescribeTasksCommand has a 100-task limit.

DescribeTasksCommand accepts a maximum of 100 task ARNs per request. If a cluster has more than 100 running tasks, this call will fail.

       if (taskArns.length > 0) {
-        const describeInput = { tasks: taskArns, cluster: element };
-        const describeList = (await ecs.send(new DescribeTasksCommand(describeInput))).tasks || [];
+        const describeList: Task[] = [];
+        for (let i = 0; i < taskArns.length; i += 100) {
+          const batch = taskArns.slice(i, i + 100);
+          const response = await ecs.send(new DescribeTasksCommand({ tasks: batch, cluster: element }));
+          describeList.push(...(response.tasks || []));
+        }
         if (describeList.length === 0) {

201-218: Missing pagination for ListObjectsV2Command.

ListObjectsV2Command returns a maximum of 1000 objects. If the bucket contains more locks, results will be silently truncated.

Also, the type assertion at lines 204-206 is a workaround for listObjects being private. Consider making listObjects public in SharedWorkspaceLocking to avoid this cast.

src/model/cloud-runner/services/core/shared-workspace-locking.ts (2)

17-34: Lazy S3 client initialization via factory is good.

The factory pattern ensures consistent configuration (endpoints, path-style) across the codebase.

However, the rclone helper at lines 30-34 concatenates user-controlled values into a shell string, which is a command injection risk.

The command injection concern was previously flagged. Using execFile with an args array instead of exec with string interpolation would eliminate this risk.
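
A minimal sketch of the execFile approach (the helper name and subcommand are illustrative; rclone lsf is the operation mentioned elsewhere in this PR):

import { execFile } from 'child_process';
import { promisify } from 'util';

const execFileAsync = promisify(execFile);

// Hypothetical rclone wrapper: arguments are passed as an array, so user-controlled
// values (remote names, keys) are never interpolated into a shell string.
async function rcloneListFiles(remotePath: string): Promise<string[]> {
  const { stdout } = await execFileAsync('rclone', ['lsf', remotePath]);
  return stdout.split('\n').filter(Boolean);
}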


73-101: Missing pagination in S3 listObjects.

ListObjectsV2Command returns at most 1000 objects. Without ContinuationToken handling, results will be truncated for large lock sets.

-    const result = await SharedWorkspaceLocking.s3.send(
-      new ListObjectsV2Command({ Bucket: bucket, Prefix: prefix, Delimiter: '/' }),
-    );
     const entries: string[] = [];
-    for (const p of result.CommonPrefixes || []) {
-      if (p.Prefix) entries.push(p.Prefix.slice(prefix.length));
-    }
-    for (const c of result.Contents || []) {
-      if (c.Key && c.Key !== prefix) entries.push(c.Key.slice(prefix.length));
-    }
+    let token: string | undefined;
+    do {
+      const result = await SharedWorkspaceLocking.s3.send(
+        new ListObjectsV2Command({ Bucket: bucket, Prefix: prefix, Delimiter: '/', ContinuationToken: token }),
+      );
+      for (const p of result.CommonPrefixes || []) {
+        if (p.Prefix) entries.push(p.Prefix.slice(prefix.length));
+      }
+      for (const c of result.Contents || []) {
+        if (c.Key && c.Key !== prefix) entries.push(c.Key.slice(prefix.length));
+      }
+      token = result.NextContinuationToken;
+    } while (token);
src/model/cloud-runner/workflows/build-automation-workflow.ts (4)

62-83: Non‑container setup only initializes logging; no builder clone or pre‑build, and /home/job-log.txt is still hard‑coded.

In the non‑container path, setupCommands now only exports GIT_DISCOVERY_ACROSS_FILESYSTEM, creates the LOG_FILE directory, and logs CACHE_KEY, but it never clones the builder repo or runs remote-cli-pre-build. Together with the non‑container BuildCommands (see lines 205‑210), this means local/non‑container providers appear to do only log streaming/post‑build without ever staging or invoking the Unity build entrypoint, so local builds may effectively be no‑ops unless something external sets everything up.

If the intent is that non‑container providers have the builder and its dist/index.js pre‑installed and that they start the build outside this workflow, please document that expectation and/or guard this path with an explicit feature flag. Otherwise, consider aligning the non‑container setup with the containerized path (clone + pre‑build) so that it can stand alone.

Also, now that LOG_FILE is exported, you can simplify and de‑duplicate log handling by consistently using $LOG_FILE instead of hard‑coding /home/job-log.txt in the containerized setup.

Also applies to: 117-125, 128-131


162-170: Stub node/npm/n/yarn binaries can hide real dependency problems.

The local‑docker branch creates no‑op stubs for node, npm, n, and yarn that simply exit 0 when the commands are missing. This will cause any script that actually depends on these tools to appear to succeed while silently doing nothing, making misconfigurations hard to debug.

Instead, either:

  • Fail fast with a clear error when required tooling is missing (see the sketch after this list), or
  • If stubbing is needed for a very specific compatibility case, log prominent warnings (including which binary is missing and why it’s being stubbed) and gate this behavior behind an explicit opt‑in flag.
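
For the fail-fast option, a small sketch of what the generated shell could do instead of creating stubs (the tool list is illustrative):

// Hypothetical replacement for the no-op stubs: verify required tooling up front and
// exit with an explicit message instead of silently succeeding later.
const requiredToolsCheck = ['node', 'npm', 'yarn']
  .map(
    (tool) =>
      `command -v ${tool} >/dev/null 2>&1 || { echo "ERROR: required tool '${tool}' is missing from the image"; exit 1; }`,
  )
  .join('\n');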

179-180: remote-cli-post-build failures are being masked as warnings/success.

Both containerized branches treat post‑build failures as success:

  • Local‑docker: node ${builderPath} -m remote-cli-post-build || echo "Post-build command completed with warnings".
  • Other containerized: node ${builderPath} -m remote-cli-post-build | node ${builderPath} -m remote-cli-log-stream --logFile /home/job-log.txt || true.

If remote-cli-post-build is responsible for uploading artifacts, updating locks, or persisting cache, hiding its failures can leave the system in an inconsistent state while the job appears green.

Consider either:

  • Letting the command fail (removing the || / || true), or
  • Capturing the exit code and logging a clear warning that includes the exit code and context, and choosing explicitly whether to exit non‑zero.

Example (if failures should be surfaced):

-node ${builderPath} -m remote-cli-post-build || echo "Post-build command completed with warnings"
+node ${builderPath} -m remote-cli-post-build || {
+  EXIT_CODE=$?
+  echo "ERROR: Post-build command failed with exit code $EXIT_CODE"
+  exit "$EXIT_CODE"
+}

Apply a similar pattern to the piped variant if post‑build is meant to be critical.

Also applies to: 199-200


205-210: Non‑container providers appear to run only log/post‑build helpers and never execute the build entrypoint.

The non‑container BuildCommands branch now just:

  • Logs “game ci start” to $LOG_FILE.
  • Runs remote-cli-log-stream for 3 seconds.
  • Runs remote-cli-post-build.

Unlike the containerized branches, it never stages the default build script, does not copy or execute /entrypoint.sh, and has no pre‑build step. Unless some external process is starting the actual Unity build and wiring up the remote CLI, local/non‑container providers will effectively do no build work while still returning from the workflow.

If local (or other non‑container) providers are meant to execute real builds through this workflow, consider restoring the entrypoint/steps invocation pattern (or adding an explicit local build path) so that this branch actually runs the build, not just the remote‑cli helpers.

src/model/cloud-runner/services/hooks/container-hook-service.ts (5)

166-184: aws-s3-pull-cache: S3 path handling and object selection are brittle

The cache pull logic works but is fragile:

  • BUCKET1/BUCKET2 are defined without s3://, then cp prepends s3://$BUCKET... while ls uses $BUCKET directly. Mixed schemes are confusing and can be error-prone.
  • OBJECT1/OBJECT2 default to '', yet aws ... cp s3://$BUCKET$OBJECT still executes (errors are hidden by || '' / || true), so you silently skip cache without any guard.

A more robust pattern (from the earlier suggestion) would be:

-      aws $ENDPOINT_ARGS s3 ls ${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/ || true
-      aws $ENDPOINT_ARGS s3 ls ${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/$CACHE_KEY/ || true
-      BUCKET1="${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/$CACHE_KEY/Library/"
-      aws $ENDPOINT_ARGS s3 ls $BUCKET1 || true
-      OBJECT1="$(aws $ENDPOINT_ARGS s3 ls $BUCKET1 | sort | tail -n 1 | awk '{print $4}' || '')"
-      aws $ENDPOINT_ARGS s3 cp s3://$BUCKET1$OBJECT1 /data/cache/$CACHE_KEY/Library/ || true
-      BUCKET2="${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/$CACHE_KEY/lfs/"
-      aws $ENDPOINT_ARGS s3 ls $BUCKET2 || true
-      OBJECT2="$(aws $ENDPOINT_ARGS s3 ls $BUCKET2 | sort | tail -n 1 | awk '{print $4}' || '')"
-      aws $ENDPOINT_ARGS s3 cp s3://$BUCKET2$OBJECT2 /data/cache/$CACHE_KEY/lfs/ || true
+      aws $ENDPOINT_ARGS s3 ls s3://${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/ || true
+      aws $ENDPOINT_ARGS s3 ls s3://${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/$CACHE_KEY/ || true
+      BUCKET1="s3://${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/$CACHE_KEY/Library/"
+      aws $ENDPOINT_ARGS s3 ls "$BUCKET1" || true
+      OBJECT1="$(aws $ENDPOINT_ARGS s3 ls "$BUCKET1" | sort | tail -n 1 | awk '{print $4}')"
+      [ -n "$OBJECT1" ] && aws $ENDPOINT_ARGS s3 cp "${BUCKET1}${OBJECT1}" "/data/cache/$CACHE_KEY/Library/" || true
+      BUCKET2="s3://${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/$CACHE_KEY/lfs/"
+      aws $ENDPOINT_ARGS s3 ls "$BUCKET2" || true
+      OBJECT2="$(aws $ENDPOINT_ARGS s3 ls "$BUCKET2" | sort | tail -n 1 | awk '{print $4}')"
+      [ -n "$OBJECT2" ] && aws $ENDPOINT_ARGS s3 cp "${BUCKET2}${OBJECT2}" "/data/cache/$CACHE_KEY/lfs/" || true

This keeps S3 URIs consistent and avoids issuing cp calls with empty object names.


40-56: Avoid deleting build tar when S3 upload fails (aws-s3-upload-build)

The AWS CLI presence check and AWS_S3_ENDPOINT wiring look good, but the cleanup logic is risky:

  • aws $ENDPOINT_ARGS s3 cp ... || true swallows upload failures.
  • The subsequent rm still executes and is also || true, so a failed upload still deletes /data/cache/$CACHE_KEY/build/build-*.tar[.lz4], preventing any retry using the existing artifact.

To keep the behavior non-fatal while avoiding data loss, gate deletion on a successful copy, e.g.:

-      aws $ENDPOINT_ARGS s3 cp /data/cache/$CACHE_KEY/build/build-${CloudRunner.buildParameters.buildGuid}.tar${
-        CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
-      } s3://${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/$CACHE_KEY/build/build-$BUILD_GUID.tar${
-        CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
-      } || true
-      rm /data/cache/$CACHE_KEY/build/build-${CloudRunner.buildParameters.buildGuid}.tar${
-        CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
-      } || true
+      aws $ENDPOINT_ARGS s3 cp /data/cache/$CACHE_KEY/build/build-${CloudRunner.buildParameters.buildGuid}.tar${
+        CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
+      } s3://${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/$CACHE_KEY/build/build-$BUILD_GUID.tar${
+        CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
+      } && rm /data/cache/$CACHE_KEY/build/build-${CloudRunner.buildParameters.buildGuid}.tar${
+        CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
+      } || true

Same pattern applies to other upload hooks that rm immediately after a best-effort remote copy.

Also applies to: 64-65


70-87: aws-s3-pull-build: missing $ENDPOINT_ARGS on cp and inconsistent S3 URIs

This hook is close, but two details will break LocalStack/custom endpoints and make path handling brittle:

  • Both aws $ENDPOINT_ARGS s3 ls calls use ${CloudRunner.buildParameters.awsStackName}/... without s3://, while the cp uses s3://.... Consistent s3:// URIs are safer and match AWS CLI examples.
  • The aws s3 cp line omits $ENDPOINT_ARGS, so AWS_S3_ENDPOINT is honored for listing but not for the download itself.

Consider:

-      aws $ENDPOINT_ARGS s3 ls ${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/ || true
-      aws $ENDPOINT_ARGS s3 ls ${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/$CACHE_KEY/build || true
-      aws s3 cp s3://${
-        CloudRunner.buildParameters.awsStackName
-      }/cloud-runner-cache/$CACHE_KEY/build/build-$BUILD_GUID_TARGET.tar${
+      aws $ENDPOINT_ARGS s3 ls s3://${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/ || true
+      aws $ENDPOINT_ARGS s3 ls s3://${CloudRunner.buildParameters.awsStackName}/cloud-runner-cache/$CACHE_KEY/build || true
+      aws $ENDPOINT_ARGS s3 cp s3://${
+        CloudRunner.buildParameters.awsStackName
+      }/cloud-runner-cache/$CACHE_KEY/build/build-$BUILD_GUID_TARGET.tar${
         CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
       } /data/cache/$CACHE_KEY/build/build-$BUILD_GUID_TARGET.tar${
         CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
       } || true

This keeps endpoint handling consistent across both list and copy operations.

Also applies to: 93-93


185-201: Gate rclone cache deletions on successful copy (rclone-upload-build / rclone-upload-cache)

Both rclone upload hooks delete local data even when the remote copy fails:

  • rclone copy ... || true hides upload failures.
  • rm / rm -r always run and are also || true, so a transient error or bad remote can wipe the only local copy of the build or cache.

This is the same concern raised previously for these hooks. A safer pattern:

-      rclone copy /data/cache/$CACHE_KEY/build/build-${CloudRunner.buildParameters.buildGuid}.tar${
-        CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
-      } ${CloudRunner.buildParameters.rcloneRemote}/cloud-runner-cache/$CACHE_KEY/build/ || true
-      rm /data/cache/$CACHE_KEY/build/build-${CloudRunner.buildParameters.buildGuid}.tar${
-        CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
-      } || true
+      rclone copy /data/cache/$CACHE_KEY/build/build-${CloudRunner.buildParameters.buildGuid}.tar${
+        CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
+      } ${CloudRunner.buildParameters.rcloneRemote}/cloud-runner-cache/$CACHE_KEY/build/ && rm /data/cache/$CACHE_KEY/build/build-${CloudRunner.buildParameters.buildGuid}.tar${
+        CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
+      }
-      rclone copy /data/cache/$CACHE_KEY/lfs ${
-        CloudRunner.buildParameters.rcloneRemote
-      }/cloud-runner-cache/$CACHE_KEY/lfs || true
-      rm -r /data/cache/$CACHE_KEY/lfs || true
-      rclone copy /data/cache/$CACHE_KEY/Library ${
-        CloudRunner.buildParameters.rcloneRemote
-      }/cloud-runner-cache/$CACHE_KEY/Library || true
-      rm -r /data/cache/$CACHE_KEY/Library || true
+      rclone copy /data/cache/$CACHE_KEY/lfs ${
+        CloudRunner.buildParameters.rcloneRemote
+      }/cloud-runner-cache/$CACHE_KEY/lfs && rm -r /data/cache/$CACHE_KEY/lfs
+      rclone copy /data/cache/$CACHE_KEY/Library ${
+        CloudRunner.buildParameters.rcloneRemote
+      }/cloud-runner-cache/$CACHE_KEY/Library && rm -r /data/cache/$CACHE_KEY/Library

You may also want to skip these hooks entirely when CloudRunner.buildParameters.rcloneRemote is empty to avoid rclone invoking an invalid remote.

Also applies to: 221-239


203-213: rclone-pull-build: destination should be a directory (or use copyto)

The pull hook currently copies from a remote file path to what looks like a local file path:

rclone copy <remote>/build-$BUILD_GUID_TARGET.tar[.lz4] /data/cache/$CACHE_KEY/build/build-$BUILD_GUID_TARGET.tar[.lz4]

rclone copy expects the destination to be a directory; this form typically creates nested directories named like build-$BUILD_GUID_TARGET.tar[.lz4] rather than the single expected file, which will confuse downstream consumers looking for /data/cache/$CACHE_KEY/build/build-*.tar[.lz4].

Two options:

  • Treat the destination as a directory (simpler):
-      } /data/cache/$CACHE_KEY/build/build-$BUILD_GUID_TARGET.tar${
-        CloudRunner.buildParameters.useCompressionStrategy ? '.lz4' : ''
-      } || true
+      } /data/cache/$CACHE_KEY/build/ || true
  • Or switch to rclone copyto and pass the full file path as the destination.

Also applies to: 218-220

🧹 Nitpick comments (12)
src/model/cloud-runner/tests/cloud-runner-rclone-steps.test.ts (4)

11-17: Global state mutation in test helper.

CreateParameters mutates the global Cli.options as a side effect, which could cause test pollution if multiple tests run in the same process without cleanup. Consider adding teardown logic to reset Cli.options after tests or documenting this coupling.
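
One possible teardown shape, assuming Cli.options is directly assignable (the import path is illustrative; the test file already imports Cli):

import { Cli } from '../../cli/cli'; // path illustrative; reuse the test's existing import

// Hypothetical state guard: snapshot the mutable global before the suite runs and
// restore it afterwards so other suites in the same process are unaffected.
let savedOptions: unknown;

beforeAll(() => {
  savedOptions = { ...(Cli.options ?? {}) };
});

afterAll(() => {
  Cli.options = savedOptions as typeof Cli.options;
});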


60-60: Redundant Unity version resolution.

UnityVersioning.determineUnityVersion(path, UnityVersioning.read(path)) is redundant—read() returns the version string, and determineUnityVersion() returns it directly when it's not 'auto'. Use one or the other:

-          unityVersion: UnityVersioning.determineUnityVersion('test-project', UnityVersioning.read('test-project')),
+          unityVersion: UnityVersioning.read('test-project'),

67-67: Double type assertion bypasses type safety.

as unknown as OptionValues circumvents TypeScript's type checking. While acceptable in tests, consider defining a partial type or using a test utility that properly types the overrides to catch mismatches during refactoring.


75-79: Empty catch block silently swallows errors.

The "best-effort" verification completely hides failures, making debugging harder if the remote is misconfigured. At minimum, log the error:

        try {
          const lines = await CloudRunnerSystem.RunAndReadLines(`rclone lsf ${remote}`);
          CloudRunnerLogger.log(lines.join(','));
-        } catch {}
+        } catch (error) {
+          CloudRunnerLogger.log(`Best-effort remote verification failed: ${error}`);
+        }
src/model/cloud-runner/providers/local/index.ts (1)

70-82: Windows bash wrapping looks safe; clarify the “if available” wording

The Windows branch now:

  • Normalizes CRLF, flattens multi-line scripts into a single ;‑separated string, and
  • Uses bash -lc ${quote([inline])}, which correctly shell‑escapes the script and mitigates injection via commands.

Two minor points to consider:

  1. The comment says “Execute via bash if available,” but the code unconditionally invokes bash when process.platform === 'win32'. Either:
    • Add a lightweight availability check (e.g., fail fast with a clear error if bash is missing; see the sketch after this list), or
    • Reword the comment to state that bash is required on Windows for POSIX-style hooks.
  2. If you expect some consumers to supply native Windows cmd commands, a brief note in docs about the enforced bash-on-Windows behavior would help avoid surprises.
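
A small sketch of the availability check (the error message wording is illustrative):

import { execSync } from 'child_process';

// Hypothetical pre-check: on Windows, fail fast with a clear message when bash is not
// on PATH, rather than letting `bash -lc ...` surface a cryptic spawn error.
function assertBashAvailableOnWindows(): void {
  if (process.platform !== 'win32') return;
  try {
    execSync('bash --version', { stdio: 'ignore' });
  } catch {
    throw new Error('POSIX-style hooks on Windows require bash (for example Git Bash or WSL) on PATH.');
  }
}
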
src/model/cloud-runner/tests/providers/provider-git-manager.test.ts (2)

36-154: Tests only validate manual mocks, not the real ProviderGitManager behavior

All the cloneRepository, updateRepository, and getProviderModulePath tests stub the corresponding methods on mockProviderGitManager and then assert that the returned value matches the stub. This exercises Jest’s mocking rather than the actual implementation: the tests will still pass even if ProviderGitManager’s real logic is completely broken.

Consider instead:

  • Mocking dependencies (execAsync, fs) and importing the real ProviderGitManager, then asserting on:
    • Correct git commands, timeouts, and working directories for cloneRepository/updateRepository.
    • Cache directory behavior and branch handling.
    • Entry-point resolution and fallback behavior in getProviderModulePath.
  • Adding at least one test for ensureRepositoryAvailable (existing repo, update success/failure, fresh clone) and optionally cleanupOldRepositories with a fake cache directory structure.

That will give you regression protection for the provider loading flow instead of just type/shape checks.


36-155: Add coverage for high‑level flows like ensureRepositoryAvailable

ProviderGitManager.ensureRepositoryAvailable and cleanupOldRepositories are key to the provider plugin system and cache behavior, but they are not exercised here. A few targeted tests using mocked fs and execAsync would catch regressions in:

  • Update-then-clone fallback when updateRepository fails.
  • Behavior when the repo is missing vs already cloned.
  • Old cache eviction logic.

This can be incremental, but adding at least one “happy path” and one “error path” test for ensureRepositoryAvailable would materially increase confidence in the new plugin loader.

src/model/cloud-runner/remote-client/index.ts (1)

300-308: Rename catch binding to follow convention.

Line 302 uses _error but the error is referenced at line 306 when rethrown. The naming convention for catch bindings that are used should be error, not _error.

Apply this diff:

-      } catch (_error) {
+      } catch (error) {
         if ((targetBranch || '').startsWith('pull/')) {
           await CloudRunnerSystem.Run(`git checkout origin/${targetBranch}`);
         } else {
-          throw _error;
+          throw error;
         }
       }
src/model/cloud-runner/services/core/shared-workspace-locking.ts (1)

51-72: Bucket existence check and creation logic is correct.

The region handling for LocationConstraint (only for non-us-east-1) follows AWS requirements.

Consider using CreateBucketCommandInput type instead of any at line 66:

-      const createParameters: any = { Bucket: bucket };
+      const createParameters: { Bucket: string; CreateBucketConfiguration?: { LocationConstraint: string } } = { Bucket: bucket };
src/model/cloud-runner/providers/aws/aws-task-runner.ts (1)

229-254: Consider restoring specific types instead of any.

Multiple parameters and variables have been changed to any, losing type safety. Consider using SDK response types:

import type { DescribeStreamCommandOutput, GetRecordsCommandOutput } from '@aws-sdk/client-kinesis';
import type { Task } from '@aws-sdk/client-ecs';

// Line 229
private static checkStreamingShouldContinue(taskData: Task | undefined, ...)

// Line 249
private static logRecords(records: GetRecordsCommandOutput, ...)

// Line 280
private static async getLogIterator(stream: DescribeStreamCommandOutput)
src/model/cloud-runner/workflows/build-automation-workflow.ts (1)

53-61: Make containerization a provider capability, not a string whitelist.

isContainerized is derived from hard‑coded providerStrategy values. This works for aws/k8s/local-docker today but will drift as new strategies or plugin providers are introduced. Consider surfacing an isContainerized (or similar) flag from the provider itself and deriving builderPath from that instead of string comparisons.
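
One way this could look, sketched with a simple helper rather than a full provider capability flag (the strategy names are the ones already used in this file; the containerized builder path is illustrative):

// Hypothetical centralization: a single helper (or, better, a flag exposed by the
// provider instance) decides containerization instead of repeated string comparisons.
const containerizedStrategies = new Set(['aws', 'k8s', 'local-docker']);

function isContainerizedStrategy(providerStrategy: string): boolean {
  return containerizedStrategies.has(providerStrategy);
}

const builderPath = isContainerizedStrategy('local-docker')
  ? '/builder/dist/index.js' // containerized location, illustrative
  : `${process.cwd()}/dist/index.js`; // non-container default discussed above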

src/model/cloud-runner/workflows/custom-workflow.ts (1)

35-64: Per-step allowFailure handling looks correct; consider refining logging and catch scope

The step-level try/catch and allowFailure gating are wired sensibly and align with the new default set in ContainerHookService.ParseContainerHooks. Two minor refinements you might consider:

  • Use CloudRunnerLogger.error (with a contextual message) in the non-allowFailure branch so hard failures surface clearly in logs/annotations rather than as a plain log entry (see the sketch after this list).
  • The outer try/catch that immediately rethrows (Lines 27–70) is redundant; letting errors bubble up naturally will preserve behavior while simplifying the method.
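
A rough sketch of the suggested shape (HookStep and the logger interface are stand-ins for the real types; the actual method in custom-workflow.ts differs):

interface HookStep {
  name: string;
  allowFailure: boolean;
  run: () => Promise<void>;
}

interface StepLogger {
  log: (message: string) => void;
  error: (message: string) => void;
}

// Hypothetical per-step handling: soft-fail only when allowFailure is explicitly true,
// and route hard failures through the error-level logger before rethrowing.
async function runSteps(steps: HookStep[], logger: StepLogger): Promise<void> {
  for (const step of steps) {
    try {
      await step.run();
    } catch (error) {
      if (step.allowFailure) {
        logger.log(`Step "${step.name}" failed but allowFailure is set; continuing. ${error}`);
        continue;
      }
      logger.error(`Step "${step.name}" failed: ${error}`);
      throw error;
    }
  }
}
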
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c61c9f8 and a99defa.

⛔ Files ignored due to path filters (2)
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (16)
  • src/model/cloud-runner/providers/aws/aws-task-runner.ts (10 hunks)
  • src/model/cloud-runner/providers/aws/services/task-service.ts (6 hunks)
  • src/model/cloud-runner/providers/k8s/kubernetes-job-spec-factory.ts (5 hunks)
  • src/model/cloud-runner/providers/k8s/kubernetes-pods.ts (1 hunks)
  • src/model/cloud-runner/providers/k8s/kubernetes-task-runner.ts (2 hunks)
  • src/model/cloud-runner/providers/local/index.ts (2 hunks)
  • src/model/cloud-runner/remote-client/caching.ts (1 hunks)
  • src/model/cloud-runner/remote-client/index.ts (5 hunks)
  • src/model/cloud-runner/services/core/shared-workspace-locking.ts (8 hunks)
  • src/model/cloud-runner/services/core/task-parameter-serializer.ts (2 hunks)
  • src/model/cloud-runner/services/hooks/container-hook-service.ts (3 hunks)
  • src/model/cloud-runner/tests/cloud-runner-rclone-steps.test.ts (1 hunks)
  • src/model/cloud-runner/tests/e2e/cloud-runner-end2end-retaining.test.ts (4 hunks)
  • src/model/cloud-runner/tests/providers/provider-git-manager.test.ts (1 hunks)
  • src/model/cloud-runner/workflows/build-automation-workflow.ts (1 hunks)
  • src/model/cloud-runner/workflows/custom-workflow.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/model/cloud-runner/providers/k8s/kubernetes-job-spec-factory.ts
  • src/model/cloud-runner/services/core/task-parameter-serializer.ts
  • src/model/cloud-runner/providers/k8s/kubernetes-pods.ts
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-01-30T18:01:50.339Z
Learnt from: MichaelBuhler
Repo: game-ci/unity-builder PR: 685
File: .github/workflows/build-tests-ubuntu.yml:146-146
Timestamp: 2025-01-30T18:01:50.339Z
Learning: The build steps duplication in the workflow is intentional to implement a retry mechanism with different delay intervals between attempts.

Applied to files:

  • src/model/cloud-runner/workflows/custom-workflow.ts
  • src/model/cloud-runner/workflows/build-automation-workflow.ts
🧬 Code graph analysis (9)
src/model/cloud-runner/workflows/custom-workflow.ts (2)
src/model/cloud-runner/options/cloud-runner-folders.ts (1)
  • CloudRunnerFolders (6-90)
src/model/cloud-runner/services/core/cloud-runner-logger.ts (1)
  • error (24-26)
src/model/cloud-runner/tests/cloud-runner-rclone-steps.test.ts (3)
src/model/cli/cli.ts (1)
  • Cli (16-175)
src/model/unity-versioning.ts (1)
  • UnityVersioning (4-32)
src/model/cloud-runner/services/core/cloud-runner-system.ts (1)
  • CloudRunnerSystem (4-69)
src/model/cloud-runner/providers/local/index.ts (1)
src/model/cloud-runner/services/core/cloud-runner-system.ts (1)
  • CloudRunnerSystem (4-69)
src/model/cloud-runner/providers/aws/aws-task-runner.ts (2)
src/model/cloud-runner/providers/aws/aws-client-factory.ts (1)
  • AwsClientFactory (9-71)
src/model/cloud-runner/services/core/cloud-runner-logger.ts (1)
  • error (24-26)
src/model/cloud-runner/remote-client/caching.ts (3)
src/model/cloud-runner/services/core/cloud-runner-system.ts (1)
  • CloudRunnerSystem (4-69)
src/model/cloud-runner/options/cloud-runner-folders.ts (1)
  • cacheFolder (87-89)
src/model/cloud-runner/services/core/cloud-runner-logger.ts (1)
  • error (24-26)
src/model/cloud-runner/services/core/shared-workspace-locking.ts (2)
src/model/cloud-runner/providers/aws/aws-client-factory.ts (1)
  • AwsClientFactory (9-71)
src/model/input.ts (1)
  • region (48-50)
src/model/cloud-runner/workflows/build-automation-workflow.ts (1)
src/model/cloud-runner/options/cloud-runner-folders.ts (1)
  • CloudRunnerFolders (6-90)
src/model/cloud-runner/providers/aws/services/task-service.ts (1)
src/model/cloud-runner/services/core/shared-workspace-locking.ts (2)
  • SharedWorkspaceLocking (17-400)
  • s3 (19-26)
src/model/cloud-runner/tests/providers/provider-git-manager.test.ts (2)
src/model/cloud-runner/providers/provider-git-manager.ts (1)
  • ProviderGitManager (25-278)
src/model/cloud-runner/providers/provider-url-parser.ts (1)
  • GitHubUrlInfo (3-10)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: StandaloneLinux64 on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: tvOS on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: tvOS on 2022.3.13f1
  • GitHub Check: WSAPlayer on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: Android on 2023.2.2f1
  • GitHub Check: WSAPlayer on 2021.3.32f1
  • GitHub Check: tvOS on 2021.3.32f1
  • GitHub Check: Android on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 2023.2.2f1
  • GitHub Check: StandaloneOSX on 2021.3.45f1
  • GitHub Check: iOS on 2021.3.45f1
  • GitHub Check: StandaloneOSX on 2022.3.13f1
  • GitHub Check: iOS on 2022.3.13f1
🔇 Additional comments (30)
src/model/cloud-runner/tests/cloud-runner-rclone-steps.test.ts (3)

1-9: LGTM!

Imports are appropriate for the test's needs—CloudRunner, BuildParameters, ImageTag for build execution, UnityVersioning for version resolution, uuid for unique cache keys, and CloudRunnerSystem for remote verification.


20-24: LGTM!

Placeholder tests and setups() invocation provide basic smoke testing to ensure the file loads correctly.


81-85: LGTM!

Skipped test with informative logging is appropriate when rclone is not configured.

src/model/cloud-runner/providers/local/index.ts (1)

9-9: Good choice using shell-quote for argument escaping

Importing quote from shell-quote is a solid fix for the previous manual-escaping approach and should address the code scanning warning about incomplete escaping (including backslashes).

src/model/cloud-runner/tests/e2e/cloud-runner-end2end-retaining.test.ts (4)

27-27: Explicit debug flag is good for clarity.

Adding cloudRunnerDebug: true to the test parameters ensures the flag is explicitly passed to the build, even though the test is already guarded by CloudRunnerOptions.cloudRunnerDebug. This provides clear intent and proper configuration propagation.


38-40: Improved test assertions with boolean BuildSucceeded property.

The update from string-based assertions to the explicit BuildSucceeded boolean property makes the test more robust and less brittle. The comment on line 40 provides helpful context for future maintenance. The consistent use across both test runs (lines 38 and 75) ensures uniform verification.

Also applies to: 75-75


89-98: Good defensive cleanup with error handling.

The enhanced cleanup properly handles potential failures during workspace removal. The combination of || true in the command and try-catch wrapping provides robust error handling, ensuring cleanup failures don't break the test suite. The use of path.basename() adds safety by preventing directory traversal in the constructed path.


100-109: Cache cleanup is safe for CI environments due to sequential test execution.

The cache directory cleanup is good test hygiene and will prevent disk space accumulation. The concern about parallel test execution is mitigated by the Jest CI configuration (jest.ci.config.js), which explicitly sets maxWorkers: 1 to enforce sequential test execution in CI pipelines.

Multiple tests reference the cloud-runner-cache directory, but since they execute sequentially in CI and this cleanup runs at the end of the test suite, there is no conflict risk. The wildcard deletion (rm -rf ${cachePath}/*) removes all cache contents safely when tests complete.

Note: In local development with the default Jest configuration, tests could theoretically run in parallel; the cleanup happening at test completion limits any practical impact.

src/model/cloud-runner/remote-client/index.ts (8)

35-38: LGTM!

Setting stdout encoding for K8s ensures consistent output capture by kubectl logs.


82-110: LGTM!

The defensive guards prevent unnecessary cache operations and provide clear logging when skipping. The error handling ensures the build continues even if cache push fails.


112-136: LGTM!

Consistent defensive pattern for Build cache push, matching the Library cache logic.


139-146: LGTM!

The existence check prevents errors when attempting to clean up a non-existent folder.


154-164: LGTM!

Writing to stdout before CloudRunnerLogger ensures kubectl logs can capture the success marker for test assertions.


272-298: LGTM!

The multi-layered checkout fallback strategy properly handles SHA misses, PR refs, and origin refs. The catch block patterns are now correct, using parameterless catch where errors aren't referenced and naming the error when it's rethrown.


351-391: LGTM!

The fallback strategy for LFS pull tries multiple authentication methods (plain, GIT_PRIVATE_TOKEN, GITHUB_TOKEN) with proper error handling and logging at each step.


396-432: LGTM!

The retained workspace handling properly fetches refs (including PR refs), pulls LFS files, and uses the same robust SHA/branch checkout fallback strategy as the initial clone path.

src/model/cloud-runner/providers/k8s/kubernetes-task-runner.ts (2)

54-95: Fallback log retrieval logic looks reasonable.

The fallback from --previous to current container logs handles edge cases well. The retry logic increments correctly in both success and failure paths. The nested try-catch adds complexity but is justified for resilience.


104-114: Good error discrimination after retry exhaustion.

Rethrowing non-"previous container" errors while gracefully breaking on expected container lifecycle issues is the right approach.

src/model/cloud-runner/remote-client/caching.ts (2)

82-96: Pre-flight disk space check is a good addition.

The disk usage parsing is best-effort and gracefully handles failures. This proactive check helps prevent tar operations from failing mid-write.


201-207: Good defensive check: recreate cache folder if deleted.

Cleanup operations might delete the cache folder if it becomes empty. Recreating it before mv prevents failures.

src/model/cloud-runner/providers/aws/services/task-service.ts (2)

6-18: Good use of type-only imports and factory pattern.

Using import type for SDK types and centralizing client creation via AwsClientFactory improves type safety and consistency with the codebase's approach.


88-108: Pagination for ListClusters and ListTasks is well implemented.

The do-while pattern with nextToken correctly handles pagination for both cluster and task enumeration.

src/model/cloud-runner/services/core/shared-workspace-locking.ts (2)

297-321: CreateWorkspace correctly handles both backends.

The key generation and conditional backend selection (rclone touch vs S3 PutObject) is implemented consistently.
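
The conditional backend selection described here amounts to something like the sketch below (bucket name, rclone remote and key layout are placeholders, not the values used by shared-workspace-locking.ts):

import { execSync } from 'node:child_process';
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

// Sketch: create a zero-byte workspace/lock marker on whichever backend is configured.
async function createWorkspaceMarker(key: string, useRclone: boolean): Promise<void> {
  if (useRclone) {
    // rclone touch creates an empty object at the given remote path
    execSync(`rclone touch "my-remote:cloud-runner-cache/${key}"`);
  } else {
    const s3 = new S3Client({ region: process.env.AWS_REGION ?? 'eu-west-2' });
    await s3.send(new PutObjectCommand({ Bucket: 'my-cache-bucket', Key: key, Body: '' }));
  }
}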


392-399: ReadLines provides backward compatibility.

Parsing legacy command strings (aws s3 ls, rclone lsf) and routing to listObjects maintains compatibility with existing code paths.
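
In outline, the shim maps legacy command strings onto the backend-agnostic listing API, along these lines (signatures are assumptions, not the actual ReadLines implementation):

// Sketch: route legacy "aws s3 ls ..." / "rclone lsf ..." strings to a shared listObjects(prefix) helper.
async function readLinesSketch(
  command: string,
  listObjects: (prefix: string) => Promise<string[]>,
): Promise<string[]> {
  const trimmed = command.trim();
  if (trimmed.startsWith('aws s3 ls')) {
    const prefix = trimmed.replace(/^aws s3 ls\s*/, '').replace(/^s3:\/\/[^/]+\/?/, '');
    return listObjects(prefix);
  }
  if (trimmed.startsWith('rclone lsf')) {
    const prefix = trimmed.replace(/^rclone lsf\s*/, '').replace(/^[^:\s]+:/, '');
    return listObjects(prefix);
  }
  throw new Error(`Unsupported legacy listing command: ${trimmed}`);
}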

src/model/cloud-runner/providers/aws/aws-task-runner.ts (4)

1-14: Good migration to factory-based AWS clients.

Centralizing client creation via AwsClientFactory ensures consistent configuration across the codebase.


135-162: Well-implemented retry with exponential backoff and jitter.

The throttle detection, exponential backoff (doubling with 60s cap), and jitter follow AWS SDK best practices for handling rate limiting.
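
For reference, the overall shape of that retry loop is roughly the following (the throttle predicate, delays and attempt cap are illustrative, not the exact values in aws-task-runner.ts):

// Sketch: retry with exponential backoff (doubling, capped at 60s) and full jitter.
async function withBackoff<T>(operation: () => Promise<T>, maxAttempts = 8): Promise<T> {
  let delayMs = 1000;
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error: any) {
      const throttled = error?.name === 'ThrottlingException' || Boolean(error?.$retryable);
      if (!throttled || attempt >= maxAttempts) throw error;
      const jitteredDelay = Math.random() * delayMs; // full jitter spreads concurrent retries apart
      await new Promise((resolve) => setTimeout(resolve, jitteredDelay));
      delayMs = Math.min(delayMs * 2, 60_000); // double up to the 60s ceiling
    }
  }
}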


201-216: GetRecords throttle handling is correct.

Returning early with the same iterator preserves position while respecting rate limits. The outer loop will retry on the next iteration.


86-94: Good defensive handling for missing container data.

Checking for empty/undefined containers array before accessing prevents undefined access errors during task startup.

src/model/cloud-runner/services/hooks/container-hook-service.ts (2)

263-265: Debug-cache hook: safer apt usage and consistent secrets

The updated debug-cache hook is a nice improvement:

  • apt-get update and the optional tree installation/listing are now guarded with || true, so missing apt or tree no longer breaks the build for a purely diagnostic step.
  • Adding AWS_S3_ENDPOINT to the secrets keeps the secret schema aligned with other hooks even if this particular command block doesn’t currently use it.

No changes requested here.

Also applies to: 272-274


338-341: Good defaulting of allowFailure in ParseContainerHooks

Defaulting step.allowFailure to false when it’s undefined ensures:

  • Existing hooks without an explicit flag remain strict and will still fail the build on errors.
  • The new per-step handling in CustomWorkflow.runContainerJob can safely rely on a boolean and treat only explicit true as “soft-fail”.

This is a straightforward, low-risk change that complements the workflow error handling.
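
For illustration, the normalization boils down to something like this (types and names are sketched, not copied from ParseContainerHooks):

// Sketch: undefined allowFailure stays strict; only an explicit true soft-fails the step.
interface ContainerHookStepSketch {
  name: string;
  allowFailure?: boolean;
}

function normalizeStep(step: ContainerHookStepSketch): ContainerHookStepSketch & { allowFailure: boolean } {
  return { ...step, allowFailure: step.allowFailure ?? false };
}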

Comment on lines +277 to +293
// In local provider mode (non-container) or when AWS credentials are not present, skip AWS S3 hooks
const provider = CloudRunner.buildParameters?.providerStrategy;
const isContainerized = provider === 'aws' || provider === 'k8s' || provider === 'local-docker';
const hasAwsCreds =
  (process.env.AWS_ACCESS_KEY_ID && process.env.AWS_SECRET_ACCESS_KEY) ||
  (process.env.awsAccessKeyId && process.env.awsSecretAccessKey);

// Always include AWS hooks on the AWS provider (task role provides creds),
// otherwise require explicit creds for other containerized providers.
const shouldIncludeAwsHooks =
  isContainerized && !CloudRunner.buildParameters?.skipCache && (provider === 'aws' || Boolean(hasAwsCreds));
const filteredBuiltIns = shouldIncludeAwsHooks
  ? builtInContainerHooks
  : builtInContainerHooks.filter((x) => x.image !== 'amazon/aws-cli');

if (filteredBuiltIns.length > 0) {
  results.push(...filteredBuiltIns);

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "container-hook-service.ts" | head -5

Repository: game-ci/unity-builder

Length of output: 130


🏁 Script executed:

wc -l ./src/model/cloud-runner/services/hooks/container-hook-service.ts

Repository: game-ci/unity-builder

Length of output: 134


🏁 Script executed:

cat -n ./src/model/cloud-runner/services/hooks/container-hook-service.ts

Repository: game-ci/unity-builder

Length of output: 18940


🏁 Script executed:

rg "containerHookFiles" --type ts -A 3 -B 3

Repository: game-ci/unity-builder

Length of output: 7951


🏁 Script executed:

rg "skipCache" --type ts -B 3 -A 3

Repository: game-ci/unity-builder

Length of output: 3524


🏁 Script executed:

rg "skipCache.*true" --type ts -B 5 -A 5

Repository: game-ci/unity-builder

Length of output: 2024


🏁 Script executed:

rg "rcloneRemote" --type ts -B 2 -A 2 | head -40

Repository: game-ci/unity-builder

Length of output: 3538


🏁 Script executed:

rg "skipCache" --type ts | grep -i rclone

Repository: game-ci/unity-builder

Length of output: 47


Hook filtering: skipCache does not disable rclone caching

When skipCache is true, AWS hooks are correctly excluded. However, rclone hooks (rclone-pull-cache, rclone-upload-cache, rclone-pull-build, rclone-upload-build) are not gated by this flag. If these are included in containerHookFiles with skipCache=true, they will execute but fail silently (due to || true in their commands).

For consistency with skipCache's intent to disable all caching, consider extending the filtering logic to exclude rclone hooks when cache is disabled:

const shouldIncludeAwsHooks =
  isContainerized && !CloudRunner.buildParameters?.skipCache && (provider === 'aws' || Boolean(hasAwsCreds));
const shouldIncludeRcloneHooks = 
  !CloudRunner.buildParameters?.skipCache && Boolean(CloudRunner.buildParameters?.rcloneRemote);
const filteredBuiltIns = builtInContainerHooks.filter((x) => {
  if (x.image === 'amazon/aws-cli') return shouldIncludeAwsHooks;
  if (x.image === 'rclone/rclone') return shouldIncludeRcloneHooks;
  return true;
});
🤖 Prompt for AI Agents
In src/model/cloud-runner/services/hooks/container-hook-service.ts around lines
277 to 293, the current filter only removes AWS CLI hooks when skipCache is true
but still allows rclone hooks to run; update the filtering so rclone hooks are
also excluded when CloudRunner.buildParameters?.skipCache is true or when
rcloneRemote is not configured. Specifically, compute a shouldIncludeRcloneHooks
boolean (false if skipCache is true or rcloneRemote is missing) and change the
builtInContainerHooks filtering to return false for entries where x.image ===
'rclone/rclone' unless shouldIncludeRcloneHooks is true, while keeping the
existing logic for the AWS image; push the filtered results into results as
before.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (7)
src/model/cloud-runner/workflows/build-automation-workflow.ts (5)

119-119: Hardcoded log path should use $LOG_FILE variable.

Line 119 hardcodes /home/job-log.txt instead of using the $LOG_FILE variable defined on line 82, creating an inconsistency with lines 130, 172, 181, 183, 184, 201, 203, and 212 which correctly use the variable.

Based on learnings, this issue was already flagged in a previous review.

Apply this diff to use the variable consistently:

-echo "log start" >> /home/job-log.txt
+echo "log start" >> "$LOG_FILE"

128-131: Non-containerized setup omits builder repository clone.

The non-containerized code path only initializes logging but doesn't clone the Unity builder repository, while the containerized path (lines 109-115) does. Since builderPath for non-containerized mode points to process.cwd()/dist/index.js, downstream node ${builderPath} commands (lines 212-213) will fail unless the builder is already present.

Based on learnings, this critical issue was already flagged in a previous review.

Clarify whether non-containerized providers require the builder to be pre-installed in the working directory, and document this requirement. If the builder should be present, add clone logic similar to the containerized path.


167-170: Stub binaries may silently mask missing tool dependencies.

Creating no-op stubs for node, npm, n, and yarn when they're missing will cause scripts requiring these tools to fail silently. If local-docker images should provide these tools, their absence indicates a misconfiguration.

Based on learnings, this major issue was already flagged in a previous review.

Consider logging warnings before creating stubs or removing stubs entirely and letting builds fail explicitly when required tools are missing.


181-181: Post-build error suppression may hide critical failures.

The || echo "Post-build command completed with warnings" pattern converts all post-build failures into success, potentially hiding critical issues like failed artifact uploads or cache persistence.

Based on learnings, this major issue was already flagged in a previous review.

Consider either propagating the failure or capturing and logging the actual exit code before continuing:

-node ${builderPath} -m remote-cli-post-build | node ${builderPath} -m remote-cli-log-stream --logFile /home/job-log.txt || echo "Post-build command completed with warnings"
+node ${builderPath} -m remote-cli-post-build | node ${builderPath} -m remote-cli-log-stream --logFile /home/job-log.txt || { EXIT_CODE=$?; echo "WARNING: Post-build failed with exit code $EXIT_CODE"; exit $EXIT_CODE; }

208-214: Non-containerized path omits build execution logic.

The non-containerized code path (local/local-system providers) only logs markers and runs post-build commands but never copies build scripts or executes /entrypoint.sh. In contrast, both containerized paths (lines 172 and 201) execute the entrypoint which performs the actual Unity build. Without this execution, non-containerized builds will log success without performing any build operations.

Based on learnings, this critical issue was already flagged in a previous review.

Add the entrypoint execution to the non-containerized path:

 return `
     echo "game ci start"
     echo "game ci start" >> "$LOG_FILE"
+    cd ${CloudRunnerFolders.ToLinuxFolder(CloudRunnerFolders.projectPathAbsolute)}
+    cp -r "${CloudRunnerFolders.ToLinuxFolder(path.join(distFolder, 'default-build-script'))}" "/UnityBuilderAction"
+    cp -r "${CloudRunnerFolders.ToLinuxFolder(path.join(ubuntuPlatformsFolder, 'entrypoint.sh'))}" "/entrypoint.sh"
+    cp -r "${CloudRunnerFolders.ToLinuxFolder(path.join(ubuntuPlatformsFolder, 'steps'))}" "/steps"
+    chmod -R +x "/entrypoint.sh"
+    chmod -R +x "/steps"
+    /entrypoint.sh | node ${builderPath} -m remote-cli-log-stream --logFile "$LOG_FILE"
-    timeout 3s node ${builderPath} -m remote-cli-log-stream --logFile "$LOG_FILE" || true
     node ${builderPath} -m remote-cli-post-build`;
src/model/cloud-runner/remote-client/index.ts (2)

275-301: Fix LFS folder assertion inconsistency and clean up catch naming.

The expanded fetch/checkout logic (tags + PR refs, SHA‑first with branch/PR fallbacks) is a good improvement, but there are two issues:

  1. Line 315 currently asserts that the LFS folder exists while the message says it “should not exist before caching”:
assert(fs.existsSync(path.join(`.git`, `lfs`)), 'LFS folder should not exist before caching');

If the intent is to verify that no LFS objects are present before the cache pull (which matches the message and surrounding flow), the predicate should be inverted. As written, the assertion enforces the opposite of what the message says, which is misleading for anyone debugging this path.

  2. In the targetSha empty path, the catch binding _error is non‑idiomatic and will typically trip unicorn/catch-error-name lint rules. Using error (without the underscore) matches the convention used elsewhere in this file and prior review guidance. Based on learnings / past comments, this was previously flagged.

Suggested fix:

-    } else {
-      try {
-        await CloudRunnerSystem.Run(`git checkout ${targetBranch}`);
-      } catch (_error) {
-        if ((targetBranch || '').startsWith('pull/')) {
-          await CloudRunnerSystem.Run(`git checkout origin/${targetBranch}`);
-        } else {
-          throw _error;
-        }
-      }
+    } else {
+      try {
+        await CloudRunnerSystem.Run(`git checkout ${targetBranch}`);
+      } catch (error) {
+        if ((targetBranch || '').startsWith('pull/')) {
+          await CloudRunnerSystem.Run(`git checkout origin/${targetBranch}`);
+        } else {
+          throw error;
+        }
+      }
@@
-    assert(fs.existsSync(path.join(`.git`, `lfs`)), 'LFS folder should not exist before caching');
+    assert(!fs.existsSync(path.join(`.git`, `lfs`)), 'LFS folder should not exist before caching');

Also applies to: 303-315


336-398: pullLatestLFS fallbacks are effectively disabled due to suppressError=true.

The multi‑strategy LFS pull logic (plain, then GIT_PRIVATE_TOKEN, then GITHUB_TOKEN, finally warn and continue) is a good design, but with the current use of CloudRunnerSystem.Run it doesn’t behave as intended:

  • In the “plain pull” block you call:

    await CloudRunnerSystem.Run(`git lfs pull`, true);
    await CloudRunnerSystem.Run(`git lfs checkout || true`, true);

    With suppressError = true, CloudRunnerSystem.Run never rejects even when git lfs pull fails (see cloud-runner-system.ts: non‑zero exit only rejects when suppressError is false). As a result, the try block always “succeeds”, you log:

    RemoteClientLogger.log(`Pulled LFS files without explicit token configuration`);

    and return, skipping all token‑based fallbacks and the final warning.

  • The GIT_PRIVATE_TOKEN and GITHUB_TOKEN blocks similarly call git lfs pull and git lfs checkout with suppressError = true, so their catch (error) blocks won’t run for normal command failures either, and the “Successfully pulled…” logs may be emitted even when the commands fail.

  • The final RemoteClientLogger.logWarning('Proceeding without LFS files...') will only execute if something other than the git commands throws, which defeats the safety net you’re aiming for.

This is a functional bug: builds can silently proceed without LFS content while logs claim success, and the fallback token strategies are practically unreachable. Based on learnings and the provided CloudRunnerSystem.Run implementation, error handling hinges on suppressError.

A safer implementation is to let git lfs pull throw (no suppressError) and catch at the TS level, while still keeping the overall operation best‑effort:

   private static async pullLatestLFS() {
@@
-    if (CloudRunner.buildParameters.skipLfs) {
-      RemoteClientLogger.log(`Skipping LFS pull (skipLfs=true)`);
-
-      return;
-    }
-
-    // Best effort: try plain pull first (works for public repos or pre-configured auth)
-    try {
-      await CloudRunnerSystem.Run(`git lfs pull`, true);
-      await CloudRunnerSystem.Run(`git lfs checkout || true`, true);
-      RemoteClientLogger.log(`Pulled LFS files without explicit token configuration`);
-
-      return;
-    } catch {
-      /* no-op: best-effort git lfs pull without tokens may fail */
-      void 0;
-    }
+    if (CloudRunner.buildParameters.skipLfs) {
+      RemoteClientLogger.log(`Skipping LFS pull (skipLfs=true)`);
+      return;
+    }
+
+    // Best effort: try plain pull first (works for public repos or pre-configured auth)
+    try {
+      await CloudRunnerSystem.Run(`git lfs pull`); // allow failure to throw
+      await CloudRunnerSystem.Run(`git lfs checkout || true`, true);
+      RemoteClientLogger.log(`Pulled LFS files without explicit token configuration`);
+      return;
+    } catch (error: any) {
+      RemoteClientLogger.logCliDiagnostic(
+        `git lfs pull without explicit token configuration failed: ${error?.message ?? String(error)}`,
+      );
+    }
@@
-    // Try with GIT_PRIVATE_TOKEN
-    try {
+    // Try with GIT_PRIVATE_TOKEN
+    try {
@@
-        await CloudRunnerSystem.Run(`git lfs pull`, true);
-        await CloudRunnerSystem.Run(`git lfs checkout || true`, true);
+        await CloudRunnerSystem.Run(`git lfs pull`);
+        await CloudRunnerSystem.Run(`git lfs checkout || true`, true);
         RemoteClientLogger.log(`Successfully pulled LFS files with GIT_PRIVATE_TOKEN`);
@@
-    } catch (error: any) {
-      RemoteClientLogger.logCliError(`Failed with GIT_PRIVATE_TOKEN: ${error.message}`);
+    } catch (error: any) {
+      RemoteClientLogger.logCliError(`Failed with GIT_PRIVATE_TOKEN: ${error?.message ?? String(error)}`);
     }
@@
-    // Try with GITHUB_TOKEN
-    try {
+    // Try with GITHUB_TOKEN
+    try {
@@
-        await CloudRunnerSystem.Run(`git lfs pull`, true);
-        await CloudRunnerSystem.Run(`git lfs checkout || true`, true);
+        await CloudRunnerSystem.Run(`git lfs pull`);
+        await CloudRunnerSystem.Run(`git lfs checkout || true`, true);
         RemoteClientLogger.log(`Successfully pulled LFS files with GITHUB_TOKEN`);
@@
-    } catch (error: any) {
-      RemoteClientLogger.logCliError(`Failed with GITHUB_TOKEN: ${error.message}`);
+    } catch (error: any) {
+      RemoteClientLogger.logCliError(`Failed with GITHUB_TOKEN: ${error?.message ?? String(error)}`);
     }
@@
-    // If we get here, all strategies failed; continue without failing the build
-    RemoteClientLogger.logWarning(`Proceeding without LFS files (no tokens or pull failed)`);
+    // If we get here, all strategies failed; continue without failing the build
+    RemoteClientLogger.logWarning(`Proceeding without LFS files (no tokens or pull failed)`);

This preserves “best effort” semantics while actually exercising the token fallbacks and the final warning. It also removes the now‑redundant void 0; no‑op that was previously called out.

🧹 Nitpick comments (4)
src/model/cloud-runner/workflows/build-automation-workflow.ts (1)

172-184: Consider using $LOG_FILE variable consistently throughout containerized paths.

Lines 172, 181, 183, 184, 201, and 203 hardcode /home/job-log.txt, which matches the containerized value set on line 82. While functionally correct, using the $LOG_FILE variable would improve consistency and make the code more maintainable if the log path ever needs to change.

Example for line 172:

-{ echo "game ci start"; echo "game ci start" >> /home/job-log.txt; echo "CACHE_KEY=$CACHE_KEY"; echo "$CACHE_KEY"; if [ -n "$LOCKED_WORKSPACE" ]; then echo "Retained Workspace: true"; fi; if [ -n "$LOCKED_WORKSPACE" ] && [ -d "$GITHUB_WORKSPACE/.git" ]; then echo "Retained Workspace Already Exists!"; fi; /entrypoint.sh; } | node ${builderPath} -m remote-cli-log-stream --logFile /home/job-log.txt
+{ echo "game ci start"; echo "game ci start" >> "$LOG_FILE"; echo "CACHE_KEY=$CACHE_KEY"; echo "$CACHE_KEY"; if [ -n "$LOCKED_WORKSPACE" ]; then echo "Retained Workspace: true"; fi; if [ -n "$LOCKED_WORKSPACE" ] && [ -d "$GITHUB_WORKSPACE/.git" ]; then echo "Retained Workspace Already Exists!"; fi; /entrypoint.sh; } | node ${builderPath} -m remote-cli-log-stream --logFile "$LOG_FILE"

Apply similar changes to lines 181, 183, 184, 201, and 203.

Also applies to: 201-203

src/model/cloud-runner/remote-client/index.ts (3)

35-78: K8s log streaming changes look solid; only minor whitespace/logging nits.

The real‑time streaming behavior and separation of concerns (file, stdout for k8s, CloudRunnerLogger) look good.

Two minor consistency tweaks you might consider:

  • The log file skips lines where element.trim() is falsy, but CloudRunnerLogger.log(element) and the k8s stdout path still emit them. If you ever assert on exact line counts between file/stdout, this divergence could surprise you.
  • On stdin end, CloudRunnerLogger.log(lingeringLine) runs even when lingeringLine is empty; guarding that call would avoid an extra blank client log entry.

Example tweak if you want to align behavior:

-      for (const element of lines) {
-        // Always write to log file so output can be collected by providers
-        if (element.trim()) {
-          fs.appendFileSync(logFile, `${element}\n`);
-        }
+      for (const element of lines) {
+        const trimmed = element.trimEnd();
+        // Always write to log file so output can be collected by providers
+        if (trimmed) {
+          fs.appendFileSync(logFile, `${trimmed}\n`);
+        }
@@
-      if (lingeringLine) {
+      if (lingeringLine) {
         // Always write to log file so output can be collected by providers
         fs.appendFileSync(logFile, `${lingeringLine}\n`);
@@
-      CloudRunnerLogger.log(lingeringLine);
+      if (lingeringLine) {
+        CloudRunnerLogger.log(lingeringLine);
+      }

85-139: Post‑build cache guards and activation logging are good; consider small DRY/error‑logging improvements.

The guarded Library/Build cache pushes and best‑effort error handling are clear and match the intent to avoid failing builds on cache issues. The activation success marker to both stdout and CloudRunnerLogger is also a nice touch for tests and diagnostics.

Two non‑blocking suggestions:

  • The Library and Build blocks are almost identical; extracting a small helper like pushFolderToCacheIfNonEmpty({source, target, key, label}) would reduce duplication and keep future behavior changes in sync (a rough sketch follows at the end of this comment).
  • In both cache catch blocks you use error: any and log error.message. If a non‑Error is thrown, message will be undefined. Using String(error) as a fallback is a bit more robust.

Illustrative change for the logging part:

-    } catch (error: any) {
-      RemoteClientLogger.logWarning(`Library cache push skipped with error: ${error.message}`);
+    } catch (error: any) {
+      const msg = error?.message ?? String(error);
+      RemoteClientLogger.logWarning(`Library cache push skipped with error: ${msg}`);
@@
-    } catch (error: any) {
-      RemoteClientLogger.logWarning(`Build cache push skipped with error: ${error.message}`);
+    } catch (error: any) {
+      const msg = error?.message ?? String(error);
+      RemoteClientLogger.logWarning(`Build cache push skipped with error: ${msg}`);

Also applies to: 142-149, 157-166
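
To make that helper suggestion concrete, one possible shape (with the cache-push call and logger injected, since the exact Caching/RemoteClientLogger signatures aren't restated here) would be:

import fs from 'node:fs';

// Sketch of a shared "push folder to cache unless missing/empty" helper; all names are illustrative.
interface CachePushArgs {
  source: string;
  cacheFolder: string;
  key: string;
  label: string; // e.g. 'Library' or 'Build', used only in log messages
}

async function pushFolderToCacheIfNonEmpty(
  { source, cacheFolder, key, label }: CachePushArgs,
  pushToCache: (source: string, cacheFolder: string, key: string) => Promise<void>,
  logWarning: (message: string) => void,
): Promise<void> {
  try {
    if (!fs.existsSync(source) || fs.readdirSync(source).length === 0) {
      logWarning(`${label} cache push skipped: ${source} is missing or empty`);
      return;
    }
    await pushToCache(source, cacheFolder, key);
  } catch (error: any) {
    logWarning(`${label} cache push skipped with error: ${error?.message ?? String(error)}`);
  }
}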


404-435: Retained workspace fetch/checkout flow is solid; consider reusing pullLatestLFS and guarding empty SHA/branch.

The retained workspace path now:

  • Logs the cache key,
  • Fetches all refs + tags (and PR refs when needed),
  • Updates LFS,
  • Then does a SHA‑first checkout with branch/PR fallbacks.

That’s a good, predictable resolution strategy.

Two follow‑ups to consider:

  • Once pullLatestLFS is fixed, you can likely reuse it here instead of git lfs pull + git lfs checkout || true to keep LFS behavior (incl. token fallbacks and “proceed without LFS” warning) consistent between fresh clones and retained workspaces.
  • sha and branch are read without guards; if either is unexpectedly empty, you’ll end up running git reset --hard "" or git checkout undefined. Even though upstream usually supplies them, adding a small guard would make this path more defensive.

Illustrative change:

-      await CloudRunnerSystem.Run(`git lfs pull`);
-      await CloudRunnerSystem.Run(`git lfs checkout || true`);
-      const sha = CloudRunner.buildParameters.gitSha;
-      const branch = CloudRunner.buildParameters.branch;
+      await RemoteClient.pullLatestLFS();
+      const sha = CloudRunner.buildParameters.gitSha;
+      const branch = CloudRunner.buildParameters.branch;
+      if (!sha && !branch) {
+        RemoteClientLogger.logWarning(
+          'Retained workspace: no gitSha or branch provided; skipping reset/checkout step',
+        );
+        return true;
+      }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a99defa and 9bfb4df.

⛔ Files ignored due to path filters (2)
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (2)
  • src/model/cloud-runner/remote-client/index.ts (4 hunks)
  • src/model/cloud-runner/workflows/build-automation-workflow.ts (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-01-30T18:01:50.339Z
Learnt from: MichaelBuhler
Repo: game-ci/unity-builder PR: 685
File: .github/workflows/build-tests-ubuntu.yml:146-146
Timestamp: 2025-01-30T18:01:50.339Z
Learning: The build steps duplication in the workflow is intentional to implement a retry mechanism with different delay intervals between attempts.

Applied to files:

  • src/model/cloud-runner/workflows/build-automation-workflow.ts
🧬 Code graph analysis (2)
src/model/cloud-runner/remote-client/index.ts (2)
src/model/cloud-runner/remote-client/remote-client-logger.ts (1)
  • RemoteClientLogger (7-93)
src/model/cloud-runner/services/core/cloud-runner-system.ts (1)
  • CloudRunnerSystem (4-69)
src/model/cloud-runner/workflows/build-automation-workflow.ts (2)
src/model/index.ts (1)
  • CloudRunner (27-27)
src/model/cloud-runner/options/cloud-runner-folders.ts (1)
  • CloudRunnerFolders (6-90)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: WSAPlayer on 2021.3.32f1
  • GitHub Check: StandaloneOSX on 6000.0.36f1
  • GitHub Check: iOS on 2023.2.2f1
  • GitHub Check: iOS on 2022.3.13f1
  • GitHub Check: StandaloneOSX on 2022.3.13f1
  • GitHub Check: StandaloneOSX on 2021.3.45f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 6000.0.36f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: Android on 2023.2.2f1
  • GitHub Check: tvOS on 2023.2.2f1
  • GitHub Check: StandaloneWindows64 on 2023.2.2f1
  • GitHub Check: tvOS on 2021.3.32f1
  • GitHub Check: StandaloneWindows64 on 2022.3.13f1
  • GitHub Check: tvOS on 2022.3.13f1
  • GitHub Check: WSAPlayer on 2023.2.2f1
  • GitHub Check: Android on 2021.3.32f1
  • GitHub Check: WSAPlayer on 2022.3.13f1
  • GitHub Check: Android on 2022.3.13f1
  • GitHub Check: StandaloneWindows64 on 2021.3.32f1
🔇 Additional comments (9)
src/model/cloud-runner/workflows/build-automation-workflow.ts (9)

53-60: Container-awareness logic is well-structured.

The isContainerized flag correctly distinguishes between containerized providers (aws, k8s, local-docker) and non-containerized providers, and the builderPath selection appropriately reflects where the builder will be located for each scenario.


64-73: Conditional toolchain setup appropriately handles provider differences.

Skipping apt-get and toolchain setup for local-docker and non-containerized providers is correct, as local-docker uses pre-built images and non-containerized environments are expected to have tools pre-installed.


75-82: Workspace and log file path handling is environment-aware.

The conditional GITHUB_WORKSPACE setup correctly handles local-docker's special workspace requirements, and the LOG_FILE path appropriately differentiates between containerized and non-containerized environments.


95-106: Robust builder repository clone with branch fallback.

The enhanced clone logic correctly checks for the specified branch using git ls-remote before cloning, and provides a sensible fallback chain (cloud-runner-develop → main → default branch). This prevents failures when the requested branch doesn't exist.
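
The described fallback chain is roughly equivalent to the following sketch (repository URL, branch names and clone flags are placeholders; the actual workflow builds this logic as shell inside a template string):

import { execSync } from 'node:child_process';

// Sketch: clone the first branch that exists on the remote, falling back to main,
// and finally to whatever default branch the remote reports.
function cloneWithBranchFallback(repoUrl: string, requestedBranch: string, destination: string): void {
  const branchExists = (branch: string): boolean =>
    execSync(`git ls-remote --heads "${repoUrl}" "${branch}"`, { encoding: 'utf8' }).trim().length > 0;

  for (const branch of [requestedBranch, 'main']) {
    if (branchExists(branch)) {
      execSync(`git clone --depth 1 --branch "${branch}" "${repoUrl}" "${destination}"`);
      return;
    }
  }
  // Neither candidate exists: let git pick the remote's default branch
  execSync(`git clone --depth 1 "${repoUrl}" "${destination}"`);
}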


150-161: Defensive Git LFS handling for local-docker environments.

The Git LFS pull logic correctly checks for the .git directory before attempting operations and uses || true for graceful degradation. This ensures LFS content is available when possible without failing builds in non-git environments.


162-166: CRLF normalization prevents script execution failures.

Converting potential CRLF line endings to LF in shell scripts is essential for local-docker scenarios where scripts may originate from Windows environments. The use of || true ensures the build continues even if sed is unavailable.


173-179: Empty tar creation prevents downstream cache errors.

Creating empty tar files when cache doesn't exist is a good defensive practice that prevents errors in subsequent operations expecting these files to be present.


185-189: Cache mirroring enables test assertions in local-docker mode.

Copying cache contents back to the workspace allows test workflows to verify cache behavior. The || true ensures build success even if cache directories are empty or inaccessible.


192-206: General containerized path executes build correctly.

The AWS/K8s containerized path appropriately copies build scripts, executes the entrypoint, and streams logs. The structure is simpler than local-docker since these providers run in controlled environments without Windows compatibility concerns.

Note: Line 203 uses || true to suppress post-build errors, which is more transparent than the misleading message in line 181 but still prevents error propagation.
