Skip to content

Comments

Refactor: upgrade deps, improve reliability, update docs and tooling#4

Merged
cbullinger merged 46 commits intomainfrom
refactor/cursor-recs
Feb 20, 2026
Merged

Refactor: upgrade deps, improve reliability, update docs and tooling#4
cbullinger merged 46 commits intomainfrom
refactor/cursor-recs

Conversation

@cbullinger
Copy link
Collaborator

@cbullinger cbullinger commented Feb 14, 2026

Summary

Major refactor covering dependency upgrades, reliability improvements, documentation overhaul, and tooling updates.

Dependency Upgrades

  • Go 1.24.0 → 1.26.0
  • google/go-github v48 → v82
  • mongo-driver v1 → v2

Reliability & Features

  • Rate limit handling: RateLimitTransport with auto-retry on 403/429
  • Webhook idempotency: DeliveryTracker deduplicates via X-GitHub-Delivery header
  • Structured logging: migrated to log/slog with JSON output for Cloud Logging
  • Sentinel errors: ErrRateLimited, ErrNotFound, etc. in errors.go
  • Token manager: thread-safe token state with sync.RWMutex
  • Readiness endpoint: /ready probe separate from /health liveness
  • Dry-run enforcement: DryRun flag now actually prevents writes (was cosmetic-only)
  • Added backward-compatible fields to DeprecatedFileEntry:
    • source_path - Original source file path (for traceability)
    • pr_number - PR that caused the deletion (for auditing)
  • Slack notifications for deprecations: Automatic Slack notification when files are deprecated
  • Slack added to startup banner: Added Slack: true/false to the startup banner to show whether Slack notifications are enabled
image

Docs & Config

  • Deleted stale/duplicative docs (QUICK-REFERENCE.md, DEBUG-LOGGING.md, RECOMMENDATIONS.md)
  • Updated all docs for Cloud Run (removed App Engine references), /events webhook path, slog logging
  • Deleted legacy app.yaml (App Engine Flex config)
  • Added github-app-manifest.yml documenting required permissions and events
  • Updated LOCAL-TESTING.md with GitHub App auth setup (SKIP_SECRET_MANAGER flow)
  • Fixed default WebserverPath from /webhook to /events

Scripts

  • Deleted obsolete scripts (convert-env-format.sh, convert-env-to-yaml.sh, validate-config-detailed.py)
  • Added ci-local.sh (mirrors CI pipeline locally)
  • Updated all scripts for Cloud Run, correct binary name, /events path
  • Added release.sh to create a tag-based release system

CLI Tools (cmd/)

  • test-webhook: added X-GitHub-Delivery header, fixed stale URLs
  • config-validator: init now supports basic/glob/regex templates, fixed type: "pr""pull_request"
  • test-pem: rewritten with proper error handling, added README

CI

  • Updated ci.yml for Go 1.26, added security scanning (gosec, Trivy)
  • Kept SERVICE_NAME: "examples-copier" to deploy to existing Cloud Run service
  • Dockerfile binary renamed internally to github-copier
  • Removed release deployment on merge to main. Instead, it now requires a manual release via release.sh script

Bug Fixes

Deprecation File Location Fix

  • Issue: Deprecation file (deprecated_examples.json) was being read/written to the config repo instead of the source repo
  • Fix: Now correctly reads from and writes to the source repository where files are being deleted
  • Bonus: Creates the deprecation file automatically if it doesn't exist (handles 404 gracefully)
  • Bonus: Detects and skips duplicate deprecation entries on webhook redelivery

Duplicate Prevention on Webhook Redelivery

  • Issue: Webhook redelivery could create duplicate entries in the deprecation file
  • Fix: UpdateDeprecationFile() now checks for existing entries before appending
  • Key matching: Uses filename|repo|branch composite key for deduplication

Deprecation Config Enabled Check

  • Issue: DeprecationConfig.Enabled field was never checked - deprecation tracking happened for ALL deleted files regardless of config
  • Fix: addToDeprecationMap() now only tracks deprecations when workflow.DeprecationCheck.Enabled == true
  • Impact: Prevents unintended deprecation tracking for workflows that don't opt-in

Exclude Patterns Fix

  • Issue: Workflow.Exclude patterns weren't working because the code used glob matching but the documentation (and configs) specified regex patterns
  • Fix: Changed isExcluded() to use regex matching, consistent with documentation and SourcePattern.ExcludePatterns
  • Files like *_test.ts, *.spec.ts, and .eslintrc.* are now properly excluded

Auto-Merge Batching Warning

  • Issue: When multiple workflows target the same repo with conflicting auto_merge settings, there was no indication of the conflict
  • Fix: Added warning log when workflows conflict, with AND logic for safety (any false disables auto-merge for the batch)

Deprecation Feature Enhancements (Latest Session)

Test plan

  • go build ./... passes
  • go test -race ./... passes
  • golangci-lint run ./... passes
  • Local dry-run test with smee webhook forwarding (confirmed dry-run prevents writes)
  • Local test against production config (grove-platform/github-copier.copier/main.yaml)
  • CI pipeline passes on this PR
  • Exclude patterns correctly filter out *_test.ts, *.spec.ts, .eslintrc.* files
  • Deprecation file created/updated in source repo (not config repo)
  • Duplicate deprecation entries skipped on webhook redelivery
  • Auto-merge conflict warning logged when workflows have different settings
  • Deprecation only tracked when DeprecationCheck.Enabled: true
  • Deprecation entries include source_path and pr_number fields
  • Slack notification sent when files are deprecated
  • Deploy to Cloud Run and verify /health, /ready, /metrics endpoints
  • Verify webhook processing on production with a real merged PR

Made with Cursor

cbullinger and others added 30 commits January 13, 2026 16:13
The workflow processor was creating UploadKey with BranchPath set to just
the branch name (e.g., 'main') instead of the full ref path (e.g.,
'refs/heads/main'). This caused GitHub API calls to fail with 404 errors
when trying to access the branch ref.

This fix ensures BranchPath is always set with the 'refs/heads/' prefix,
consistent with how it's used throughout the rest of the codebase.
- Add GetRestClientForOrg() to get installation-specific tokens
- Fix GraphQL query to use node(id:) instead of repository(owner:)
- Update RetrieveFileContentsWithConfigAndBranch to use org-specific client
- Remove refs/heads/ prefix duplication in workflow processor
- Fixes 404 errors when accessing repos in different orgs
- Add explicit 'GITHUB APP AUTHENTICATION FAILED' message for 401 errors
- Point users to check CODE_COPIER_PEM secret in GCP Secret Manager
- Add detection in getInstallationIDForOrg, getInstallationAccessToken
- Add detection in config_loader and main_config_loader when fetching configs

This makes it immediately obvious when the PEM key is invalid/expired
instead of showing misleading 'failed to load config' errors.
golangci-lint v1.x is built with Go 1.24 and can't analyze Go 1.26 code.
Pin to v2.9.0 which supports Go 1.26. Similarly, the gosec Docker action
bundles Go 1.25.7 — switch to `go install` so it uses the Go 1.26 from
setup-go.

Co-authored-by: Cursor <cursoragent@cursor.com>
golangci-lint-action v6 doesn't support golangci-lint v2. Switch to
action v7 which does.

gosec @latest includes new taint analysis rules (G703-G706) that flag
all http.Client.Do() calls as SSRF and CLI os.ReadFile as path traversal.
These are false positives for this codebase — exclude globally alongside
the existing G115 exclusion.

Co-authored-by: Cursor <cursoragent@cursor.com>
- Wrap deferred Close() calls to satisfy errcheck
- Replace deprecated github.String/Int/Bool with github.Ptr (SA1019)
- Apply De Morgan's law to simplify boolean expressions (QF1001)
- Remove unused parseIntWithDefault function
- Use t.Setenv in tests instead of unchecked os.Setenv (errcheck)
- Use system golangci-lint v2.9.0 in pre-commit for full coverage

Co-authored-by: Cursor <cursoragent@cursor.com>
Add EffectiveConfigFile() method to Config that returns the actual
config file in use (MainConfigFile when USE_MAIN_CONFIG=true, otherwise
ConfigFile). Use it in the startup banner and main_config_loader.

Co-authored-by: Cursor <cursoragent@cursor.com>
…rget

UploadKey now includes CommitStrategy, so workflows targeting the same
repo/branch with different strategies (direct vs pull_request) produce
independent write operations instead of silently merging into one batch.

Also adds a config-load-time warning for mixed strategies, enriches
write-phase log messages with strategy_source and file_count, and
includes a test verifying the new behavior.

Co-authored-by: Cursor <cursoragent@cursor.com>
…ier PRs

#3: workflow_processor now logs when a subsequent workflow in the same
batch overwrites the commit message or PR title, showing both old and
new values so the "last wins" behavior is transparent.

#5: addFilesViaPR now checks for an existing open PR from a copier/*
branch before creating a new one. If found, pushes to the existing
branch and updates the PR title/body instead of creating a duplicate.

Co-authored-by: Cursor <cursoragent@cursor.com>
…llel

- Add CachedConfigLoader with configurable TTL (default 5min) that
  wraps ConfigLoader to avoid re-resolving YAML on every webhook
- Refactor ProcessWorkflow into match/fetch/queue phases, fetching
  file contents concurrently via errgroup (max 5 parallel fetches)
- Mark RECOMMENDATIONS.md items #13 and #14 as resolved

Co-authored-by: Cursor <cursoragent@cursor.com>
cbullinger and others added 15 commits February 15, 2026 13:39
Before creating a commit, compare the new tree SHA against the base
commit's tree SHA. If identical (e.g., duplicate webhook processing
where changes are already at HEAD), skip the commit and log a message
instead of creating an empty 0-change commit.

Applies to both the direct-commit and PR-via-temp-branch code paths.
Marks RECOMMENDATIONS.md item #2 as resolved.

Co-authored-by: Cursor <cursoragent@cursor.com>
- #9: Wrap background webhook processing in context.WithTimeout
  (configurable via WEBHOOK_PROCESSING_TIMEOUT_SECONDS, default 5min)
- #7: Add exponential-backoff retry via processWebhookWithRetry
  (configurable via WEBHOOK_MAX_RETRIES and WEBHOOK_RETRY_INITIAL_DELAY);
  panics are converted to errors; Slack alert sent after exhaustion
- #6: processFilesWithWorkflows now returns per-workflow error map so
  one failing workflow doesn't block others; aggregate error enables
  retry of the full processing attempt

Co-authored-by: Cursor <cursoragent@cursor.com>
- #10: Rotate test PEM key in .env.test to a purpose-generated
  test-only key; update .gitleaksignore with new fingerprint
- #16: Add .golangci.yml (v2 format) pinning linters, formatters,
  and documenting suppressed rules; auto-fix pre-existing gofmt issues
- #23: Add DeliveryID and Attempts fields to ErrorEvent; thread
  delivery ID through processWebhookWithRetry to Slack alerts
- #19: Add TestIntegration_TargetRepoBatching_MixedStrategies
  verifying 2 direct workflows batch into 1 commit while a PR
  workflow produces a separate PR; fix existing integration test
  to mock GetCommit for empty-commit detection

Co-authored-by: Cursor <cursoragent@cursor.com>
- golangci.yml: move exclude-rules from issues to exclusions (v2 schema)
- Add #nosec annotations for gosec taint analysis rules (G703-G706)
  - G704 (SSRF): hardcoded api.github.com URLs in auth and CLI tools
  - G703 (path traversal): CLI tool reads user-provided file path
  - G705 (XSS): CLI tool writes to stderr, not web output
  - G706 (log injection): structured slog with key-value pairs

Co-authored-by: Cursor <cursoragent@cursor.com>
- golangci.yml: move exclusions under linters (v2 schema nests it there)
- slack_notifier.go: add #nosec G704 for Slack webhook URL

Co-authored-by: Cursor <cursoragent@cursor.com>
…egex

Two bug fixes:

1. Deprecation file location: The deprecation file (deprecated_examples.json)
   is now correctly read from and written to the source repository where files
   are being deleted, instead of the config repository. Also handles creating
   the file if it doesn't exist.

2. Exclude patterns: Changed workflow exclude patterns to use regex matching
   (as documented) instead of glob matching. This fixes exclude patterns like
   ".*_test\.ts$" not working.

Co-authored-by: Cursor <cursoragent@cursor.com>
- Add .cursorignore to exclude binaries, secrets, IDE files, and test
  fixtures from Cursor indexing
- Update AGENT.md with missing files (config_cache.go, utility scripts)
- Add release process documentation and Quick Reference section
- Add Key Documentation table for discoverability

Co-authored-by: Cursor <cursoragent@cursor.com>
…lows

When multiple workflows target the same repo with different auto_merge
settings, log a warning and use AND logic (any false wins) for safety.
Copy link
Collaborator

@MongoCaleb MongoCaleb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of Qs. LTGM.

# PEM_NAME=CODE_COPIER_PEM
# GOOGLE_CLOUD_PROJECT_ID=github-copy-code-examples

# Option B: Provide the PEM key directly (no GCP access needed)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edification question: It's been a long time since I visited this; do we now have a safe way to store a PEM?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it'd be only if you're testing locally, so it'd be stored in an uncommitted .env.local

# Only set these if you want to test with actual GCP
# GCP_PROJECT_ID=your-project-id
# PEM_KEY_NAME=projects/123/secrets/CODE_COPIER_PEM/versions/latest
# A PAT is NOT used by the app itself — only by the test-webhook CLI tool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to spell out/define Personal Access Token here?

@cbullinger cbullinger merged commit 522d185 into main Feb 20, 2026
7 checks passed
@cbullinger cbullinger deleted the refactor/cursor-recs branch February 20, 2026 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants