feat: add git sparse checkout mode for batch scanning #1
Open
SecKatie wants to merge 10 commits into north-echo:main from
…riage

Four new detection capabilities based on Red Hat triage analysis:
- Gap 1: Actor guards: detect github.actor == 'bot[bot]' gates (→ info) and human actor restrictions (→ downgrade by 1)
- Gap 2: Action-based permission gates: recognize actions-cool/check-user-permission and similar third-party permission-checking actions as maintainer checks
- Gap 3: Cross-job needs: gating: follow needs: chains to detect environment approval gates on upstream authorize jobs (→ downgrade by 1)
- Gap 4: Path isolation: detect fork code checked out to subdirectory with no direct execution, downgrade confidence to pattern-only

Also adds fork guard to Fluxgate's own CI workflow (fixes FG-006 self-finding).

Validated against 11 triage findings: 5 false criticals corrected automatically, 2 false positives eliminated, 2 clean criticals unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ulk scanning

New GitHub Actions rules:
- FG-008: OIDC misconfiguration (id-token:write on fork-accessible triggers)
- FG-009: Self-hosted runner exposure on PR/PRT workflows
- FG-010: Cache poisoning via actions/cache on external triggers

Cross-platform CI/CD support:
- Platform-agnostic Pipeline interface (internal/cicd)
- GitLab CI parser with 4 rules (GL-001 MR secrets, GL-002 script injection, GL-003 unsafe includes, GL-009 self-hosted MR runner)

Bulk scanning infrastructure:
- BigQuery ingest command for large-scale workflow analysis
- Gato-X import command for converting discovery output to repo lists
- Analysis SQL queries for scan campaign reporting

Fixes:
- Route all warning output to stderr (fixes JSON output corruption)
- Fix runs-on group+labels parser to prefer labels over group name
- Version bump to v0.5.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add cross-platform Azure DevOps Pipelines support:
- AZ-001: PR builds with secret/variable group exposure to forks
- AZ-002: Script injection via predefined variables (Build.SourceBranchName, etc.)
- AZ-003: Unpinned template extends and repository resources
- AZ-009: Self-hosted agent pools on PR-triggered pipelines

Parser handles all Azure Pipelines YAML structures: stages, jobs, single-job (root steps), deployment jobs, pool inheritance, resources, and extends templates. Environment protection reduces AZ-009 severity from high to medium.

Wired into ScanDirectory for automatic detection of azure-pipelines.yml alongside GitHub Actions and GitLab CI. 8 tests, 5 fixtures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…g nodes

YAML parses unquoted script lines containing ": " (e.g. `echo "MR title: $CI_MERGE_REQUEST_TITLE"`) as mapping nodes instead of scalars. This caused GL-002 script injection detection to miss these lines entirely.

Fix: extractScriptSteps now reconstructs the original command string from mapping node key-value pairs when sequence items are parsed as mappings.

Found during v0.5.0 cross-platform test corpus validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The path isolation mitigation incorrectly classified workflows as safe when fork code was referenced via shell variable aliases (e.g., PR="$GITHUB_WORKSPACE/pr" followed by python script.py "$PR").

The referencesForkPath function now:
- Matches $GITHUB_WORKSPACE/<path> references directly
- Detects shell variable assignments aliasing the checkout path
- Tracks alias variables and checks their usage in non-data commands

This fixes a confidence tiering regression on tinygrad/szdiff.yml, which was classified as pattern-only instead of confirmed. The pip install and python execution of fork code via $PR variable is now correctly detected.

Found during v0.5.0 ground truth validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
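The alias tracking described in this commit can be illustrated with a minimal sketch. It is not the project's referencesForkPath implementation: the names `runsForkCode` and `usesVar`, the regular expressions, and the list of data-only commands are all assumptions made for illustration, and real shell parsing has many cases this ignores.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// assignRe matches simple shell assignments such as PR="$GITHUB_WORKSPACE/pr".
var assignRe = regexp.MustCompile(`^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=["']?([^"'\s]+)["']?\s*$`)

// dataOnly lists commands assumed (for this sketch only) not to execute their arguments.
var dataOnly = map[string]bool{"echo": true, "ls": true, "cat": true, "diff": true, "wc": true}

// usesVar reports whether line expands the shell variable name as $name or ${name}.
func usesVar(line, name string) bool {
	q := regexp.QuoteMeta(name)
	return regexp.MustCompile(`\$(?:\{` + q + `\}|` + q + `\b)`).MatchString(line)
}

// runsForkCode reports whether any line of script hands the fork checkout
// path under $GITHUB_WORKSPACE, or a variable aliasing it, to a command
// that is not in the data-only list.
func runsForkCode(script, checkoutPath string) bool {
	direct := "$GITHUB_WORKSPACE/" + checkoutPath
	var aliases []string
	for _, line := range strings.Split(script, "\n") {
		// Record variables whose value starts with the checkout path.
		if m := assignRe.FindStringSubmatch(line); m != nil {
			if strings.HasPrefix(m[2], direct) {
				aliases = append(aliases, m[1])
			}
			continue
		}
		fields := strings.Fields(line)
		if len(fields) == 0 || dataOnly[fields[0]] {
			continue
		}
		if strings.Contains(line, direct) {
			return true
		}
		for _, a := range aliases {
			if usesVar(line, a) {
				return true
			}
		}
	}
	return false
}

func main() {
	exec := "PR=\"$GITHUB_WORKSPACE/pr\"\npython script.py \"$PR\""
	read := "PR=\"$GITHUB_WORKSPACE/pr\"\ncat \"$PR/README.md\""
	fmt.Println(runsForkCode(exec, "pr")) // true
	fmt.Println(runsForkCode(read, "pr")) // false
}
```

The first script mirrors the example in the commit message and is flagged; the second only reads from the checkout and is not.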
…ack correlation

FG-011: New rule detects bot actor guard TOCTOU bypass risk on pull_request_target and workflow_run triggers. Bot actor guards (dependabot[bot], renovate[bot]) no longer suppress FG-001 findings to info; findings are capped at high to reflect TOCTOU bypassability.

FG-002 extended: workflow_dispatch inputs and workflow_call inputs now detected as injectable expressions (github.event.inputs.*, inputs.*).

FG-001+FG-002 correlation: post-scan pass merges co-occurring pwn request and script injection findings into a single enhanced finding referencing the Ultralytics attack pattern.

Triage prompts added with BoostSecurity attack taxonomy (pipeline parasitism, transitive action compromise, bot TOCTOU, Shai-Hulud, Ultralytics chain).

21 rules across 3 platforms, 69 tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…repo

Add SECURITY-BOUNDARIES.md defining the public/private boundary for this project. Add CLAUDE.md with CC instructions referencing it.

Remove prompts/ from git tracking: triage agent prompts encode assessment methodology and must not be public (rule 1).

Update .gitignore to exclude prompts/, queries/, scans/, findings/, reports/, and .sql files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove reference to committed security boundaries file from CLAUDE.md. Add SECURITY-BOUNDARIES.md to .gitignore to prevent future accidental commits.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds --clone flag to batch and discover commands, scanning repos via local git sparse checkout instead of the GitHub API. This avoids API rate limits when scanning large numbers of repos.

Key changes:
- internal/git: sparse clone package with concurrent clone-and-scan
- Sliding star-count windows in FetchTopRepos to paginate beyond GitHub's 1,000-result search limit
- Repo list caching in SQLite for --resume with --top N
- SQLite hardening: single-conn serialization + busy_timeout for concurrent goroutine writes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
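For readers unfamiliar with sparse checkout, the sketch below shows one plausible pair of git invocations for fetching only workflow files. It is an assumption about how such a clone could be driven, not the contents of internal/git: the function name `sparseCloneCommands`, the example URL, the destination directory, and the choice of `.github/workflows` as the only path are all hypothetical. The git options themselves (`--depth`, `--filter=blob:none`, `--sparse`, and `sparse-checkout set`) are standard in recent git releases.

```go
package main

import "fmt"

// sparseCloneCommands returns the git invocations for a shallow, blob-less,
// sparse clone of repoURL into dest that materializes only the given paths
// (plus top-level files, which a --sparse clone checks out by default).
func sparseCloneCommands(repoURL, dest string, paths []string) [][]string {
	clone := []string{"git", "clone", "--depth", "1", "--filter=blob:none", "--sparse", repoURL, dest}
	set := append([]string{"git", "-C", dest, "sparse-checkout", "set"}, paths...)
	return [][]string{clone, set}
}

func main() {
	// Hypothetical repository and destination, for illustration only.
	cmds := sparseCloneCommands("https://github.com/example/repo.git", "/tmp/repo", []string{".github/workflows"})
	for _, c := range cmds {
		fmt.Println(c)
	}
}
```

Each returned slice can be passed to `exec.Command(c[0], c[1:]...)`. Keeping command construction separate from execution makes the arguments easy to test without network access.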
Author
Test plan results
All items verified:
Owner
Hey @SecKatie, could you rebase this onto the current …
Summary
- Adds `--clone` flag to `batch` and `discover` commands to scan repos via local git sparse checkout instead of the GitHub API, avoiding rate limits at scale
- Sliding star-count windows in `FetchTopRepos` to paginate beyond GitHub's 1,000-result search limit
- Repo list caching in SQLite so `--resume` with `--top N` skips re-fetching
- SQLite hardening (single-conn serialization + `busy_timeout`)

Test plan
- `go test -short ./...` passes
- `fluxgate batch --top 500 --clone --resume` (389/500 scanned, 382 with findings)
- `--resume` re-run skips already-scanned repos
- `--keep` flag preserves cloned directories
- `discover --clone` path

🤖 Generated with Claude Code
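The sliding star-count windows mentioned in the summary can be sketched as follows. This is a guess at the general technique, not the actual FetchTopRepos code: `fetchTop`, `searchFunc`, `fakeSearch`, and `makeRepos` are hypothetical names, and the search backend is an in-memory stand-in. Against the real API, each window would presumably be expressed with a `stars:` range qualifier in the search query.

```go
package main

import (
	"fmt"
	"sort"
)

type Repo struct {
	Name  string
	Stars int
}

// searchCap mirrors the 1,000-result ceiling on a single search query.
const searchCap = 1000

// searchFunc stands in for one search call: repos with Stars <= maxStars,
// sorted by stars descending, truncated to searchCap results.
type searchFunc func(maxStars int) []Repo

// fetchTop collects up to n repos by repeatedly lowering the upper star
// bound to the lowest star count seen in the previous window.
func fetchTop(n, startMax int, search searchFunc) []Repo {
	if n <= 0 {
		return nil
	}
	seen := make(map[string]bool)
	var out []Repo
	hi := startMax
	for hi >= 0 {
		page := search(hi)
		for _, r := range page {
			if seen[r.Name] {
				continue // windows overlap at the boundary star count
			}
			seen[r.Name] = true
			out = append(out, r)
			if len(out) == n {
				return out
			}
		}
		if len(page) < searchCap {
			break // window was not truncated, nothing left below it
		}
		lowest := page[len(page)-1].Stars
		if lowest == hi {
			hi-- // a full page at one star count: step past it (may skip repos)
		} else {
			hi = lowest
		}
	}
	return out
}

// fakeSearch builds an in-memory searchFunc over all, for demonstration.
func fakeSearch(all []Repo) searchFunc {
	return func(maxStars int) []Repo {
		var hits []Repo
		for _, r := range all {
			if r.Stars <= maxStars {
				hits = append(hits, r)
			}
		}
		sort.Slice(hits, func(i, j int) bool { return hits[i].Stars > hits[j].Stars })
		if len(hits) > searchCap {
			hits = hits[:searchCap]
		}
		return hits
	}
}

// makeRepos returns count repos where repo-i has exactly i stars.
func makeRepos(count int) []Repo {
	all := make([]Repo, 0, count)
	for i := 1; i <= count; i++ {
		all = append(all, Repo{Name: fmt.Sprintf("repo-%d", i), Stars: i})
	}
	return all
}

func main() {
	top := fetchTop(2500, 2500, fakeSearch(makeRepos(2500)))
	fmt.Println(len(top), top[0].Stars, top[len(top)-1].Stars) // 2500 2500 1
}
```

Deduplicating by name matters because consecutive windows share their boundary star count; without it, repos at that count would be returned twice.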