git-copy

Scrubbed one-way replication from private Git repos to public targets.

Overview

git-copy is a CLI tool that safely synchronizes Git repositories from private sources to public targets (GitHub, GitLab, Gitea, etc.) while automatically scrubbing sensitive information. It rewrites Git history to replace private usernames, exclude sensitive files, and apply custom text replacements.

Features

  • Automatic Scrubbing: Replaces private usernames and sensitive strings throughout Git history
  • File Exclusion: Exclude files by pattern (e.g., .env, secrets/**)
  • Opt-In Override: Selectively include files that would otherwise be excluded
  • History Replacement: Replace file contents throughout history (e.g., retroactively change LICENSE)
  • Author Rewriting: Replace commit author information with public identities
  • Empty Commit Pruning: Automatically drops commits that become empty after filtering
  • Multi-Target: Sync to multiple destinations (GitHub, GitLab, Gitea)
  • Multi-Account Support: Automatically uses correct credentials for different GitHub accounts
  • Auto-Sync Daemon: Background service auto-discovers and syncs repos
  • Topics/Tags: Copy repository topics from source to target
  • Safe by Default: Validates scrubbed repos before pushing (blocks .env, CLAUDE.md, etc.)
  • Efficient: Uses git fast-export/fast-import for fast history rewriting

Installation

go install github.com/obinnaokechukwu/git-copy/cmd/git-copy@latest

Or build from source:

git clone https://github.com/obinnaokechukwu/git-copy
cd git-copy
go build -o git-copy ./cmd/git-copy

Quick Start

1. Initialize a Repository

cd /path/to/your/private/repo
git-copy init

This creates a .git-copy/config.json file with your scrubbing rules. The private_username is automatically detected from your origin remote URL (e.g., github.com/your-username/repo → your-username).

2. Add a Sync Target

git-copy add-target

Follow the interactive prompts to configure:

  • Target label (e.g., "github-public")
  • Provider (github, gitlab, gitea)
  • Account/organization name
  • Repository name
  • Authentication credentials

3. Sync to Target

git-copy sync

This will:

  1. Export your Git history
  2. Apply scrubbing rules (replace usernames, exclude files)
  3. Validate the scrubbed repo
  4. Push to the configured target(s)

Configuration

The .git-copy/config.json file controls scrubbing behavior:

{
  "private_username": "myPrivateUsername",
  "defaults": {
    "exclude": [
      ".env",
      "secrets/**",
      "*.key"
    ],
    "opt_in": [],
    "replace_history_with_current": [
      "LICENSE",
      "README.md"
    ],
    "extra_replacements": {
      "company-internal.example.com": "public.example.com"
    }
  },
  "targets": [
    {
      "label": "github-public",
      "provider": "github",
      "account": "my-public-account",
      "repo_name": "my-public-repo",
      "replacement": "PublicName",
      "public_author_name": "Public Name",
      "public_author_email": "public@example.com"
    }
  ]
}

Configuration Fields

  • private_username: Your private username to be replaced in all text/commits
  • defaults.exclude: File patterns to exclude (glob syntax, ** supported)
  • defaults.opt_in: Override exclusions for specific files
  • defaults.replace_history_with_current: Files to replace with current content throughout history (see below)
  • defaults.extra_replacements: Additional string replacements (old → new)
  • targets[].label: Unique identifier for this sync target
  • targets[].provider: github, gitlab, or gitea
  • targets[].account: Target account/organization
  • targets[].repo_name: Target repository name
  • targets[].replacement: String to replace private_username with
  • targets[].public_author_name: Name for rewritten commits
  • targets[].public_author_email: Email for rewritten commits
  • targets[].replace_history_with_current: Target-specific files to replace (merged with defaults)
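
As an illustration of how these fields fit together, here is a minimal Go sketch that decodes a config of this shape. The type names (Config, Defaults, Target) and the helper functions are hypothetical and are not taken from the git-copy source; only the JSON keys come from the sample above. The merge helper assumes that "merged with defaults" means defaults first, then target-specific entries, with duplicates skipped.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Defaults mirrors the "defaults" object (hypothetical type name).
type Defaults struct {
	Exclude                   []string          `json:"exclude"`
	OptIn                     []string          `json:"opt_in"`
	ReplaceHistoryWithCurrent []string          `json:"replace_history_with_current"`
	ExtraReplacements         map[string]string `json:"extra_replacements"`
}

// Target mirrors one entry of the "targets" array (hypothetical type name).
type Target struct {
	Label                     string   `json:"label"`
	Provider                  string   `json:"provider"`
	Account                   string   `json:"account"`
	RepoName                  string   `json:"repo_name"`
	Replacement               string   `json:"replacement"`
	PublicAuthorName          string   `json:"public_author_name"`
	PublicAuthorEmail         string   `json:"public_author_email"`
	ReplaceHistoryWithCurrent []string `json:"replace_history_with_current"`
}

// Config mirrors the top-level object of .git-copy/config.json.
type Config struct {
	PrivateUsername string   `json:"private_username"`
	Defaults        Defaults `json:"defaults"`
	Targets         []Target `json:"targets"`
}

// parseConfig decodes raw JSON into a Config.
func parseConfig(data []byte) (Config, error) {
	var c Config
	err := json.Unmarshal(data, &c)
	return c, err
}

// mergedReplaceList returns the defaults list followed by the target's own
// entries, skipping duplicates (the ordering and de-duplication are assumptions).
func mergedReplaceList(d Defaults, t Target) []string {
	seen := map[string]bool{}
	var out []string
	for _, list := range [][]string{d.ReplaceHistoryWithCurrent, t.ReplaceHistoryWithCurrent} {
		for _, p := range list {
			if !seen[p] {
				seen[p] = true
				out = append(out, p)
			}
		}
	}
	return out
}

func main() {
	sample := []byte(`{
	  "private_username": "myPrivateUsername",
	  "defaults": {"replace_history_with_current": ["LICENSE", "README.md"]},
	  "targets": [{"label": "github-public", "replace_history_with_current": ["NOTICE"]}]
	}`)
	cfg, err := parseConfig(sample)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println(cfg.Targets[0].Label, mergedReplaceList(cfg.Defaults, cfg.Targets[0]))
}
```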

Replace History With Current

The replace_history_with_current feature retroactively replaces file contents throughout your entire Git history. This is useful when a file should appear as if it had always had its current content.

Common use cases:

  • Changing LICENSE from MIT to Apache 2.0 retroactively
  • Updating README to reflect current branding from the start
  • Fixing configuration files that contained wrong values historically

How it works:

Source history:              Public history (after sync):
─────────────────            ────────────────────────────
commit A: add files          commit A: add files
commit B: add LICENSE (MIT)  commit B: add LICENSE (Apache) ← replaced
commit C: fix bug            commit C: fix bug
commit D: update LICENSE     [DROPPED - became empty]
commit E: new feature        commit E: new feature

  1. The current (HEAD) content of specified files is used
  2. When the file first appears in history, it gets the current content instead
  3. All subsequent commits that only modify these files are automatically dropped (they become empty)
  4. Commits that modify both these files and other files keep only the changes to the other files

Example:

{
  "defaults": {
    "replace_history_with_current": ["LICENSE", "NOTICE"]
  }
}

This makes LICENSE and NOTICE appear unchanged throughout history, using their current content from the first commit where they appear.

Important notes:

  • Files are replaced at their first occurrence in history, not injected into commits that never had them
  • The file content is scrubbed (private username replacement still applies)
  • Empty commits are pruned automatically - no trace of intermediate changes
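
The replacement and pruning rules described in this section can be modelled on a simplified linear history. The Go sketch below is an illustration only: the Commit type and the rewrite function are hypothetical, each commit is reduced to a map of changed paths, and deletions, renames, and branches are ignored. It is not how git-copy processes the fast-export stream.

```go
package main

import "fmt"

// Commit is a simplified commit: an ID plus the paths it changes and their
// new contents (hypothetical model, not git-copy's internal representation).
type Commit struct {
	ID      string
	Changes map[string]string
}

// rewrite applies the replace-history rule to a linear history.
// current maps each pinned path to its HEAD content.
func rewrite(history []Commit, current map[string]string) []Commit {
	seen := map[string]bool{} // pinned paths that have already appeared
	var out []Commit
	for _, c := range history {
		kept := map[string]string{}
		for p, content := range c.Changes {
			cur, pinned := current[p]
			switch {
			case !pinned:
				// Ordinary file: keep the change as-is.
				kept[p] = content
			case !seen[p]:
				// First appearance of a pinned file: use HEAD content.
				seen[p] = true
				kept[p] = cur
			}
			// Later changes to a pinned file are dropped.
		}
		if len(kept) > 0 {
			out = append(out, Commit{ID: c.ID, Changes: kept})
		}
		// A commit with no remaining changes is pruned.
	}
	return out
}

func main() {
	history := []Commit{
		{ID: "A", Changes: map[string]string{"main.go": "v1"}},
		{ID: "B", Changes: map[string]string{"LICENSE": "MIT"}},
		{ID: "C", Changes: map[string]string{"main.go": "v2"}},
		{ID: "D", Changes: map[string]string{"LICENSE": "MIT, revised"}},
		{ID: "E", Changes: map[string]string{"feature.go": "v1"}},
	}
	current := map[string]string{"LICENSE": "Apache"}
	for _, c := range rewrite(history, current) {
		fmt.Println(c.ID, c.Changes)
	}
}
```

With the history from the diagram, commit B carries the current LICENSE content and commit D disappears because nothing remains in it.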

Multi-Account GitHub Support

If you have multiple GitHub accounts authenticated with gh auth login, git-copy automatically uses the correct credentials for each target:

# Check your authenticated accounts
gh auth status

# git-copy will use the right token based on target account
git-copy sync  # Uses obinnaokechukwu's token for obinnaokechukwu/repo

This works for both repo creation and pushing. No manual token switching needed.
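
One way a per-account token lookup could work is to ask the GitHub CLI for the token of a named account. The Go sketch below assumes a gh version with multi-account support, where gh auth token accepts --hostname and --user. The function names are hypothetical, and the README does not state that git-copy retrieves tokens this way.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// ghTokenArgs builds the arguments for `gh auth token` for one account on
// github.com (hypothetical helper).
func ghTokenArgs(account string) []string {
	return []string{"auth", "token", "--hostname", "github.com", "--user", account}
}

// tokenFor runs gh and returns the trimmed token for the given account.
func tokenFor(account string) (string, error) {
	out, err := exec.Command("gh", ghTokenArgs(account)...).Output()
	if err != nil {
		return "", fmt.Errorf("gh auth token for %s: %w", account, err)
	}
	return strings.TrimSpace(string(out)), nil
}

func main() {
	// "my-public-account" is the account name from the sample config.
	tok, err := tokenFor("my-public-account")
	if err != nil {
		fmt.Println("could not get token:", err)
		return
	}
	// Never print the token itself; its length is enough for a smoke test.
	fmt.Println("token length:", len(tok))
}
```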

Commands

Repository Commands

# Initialize git-copy in current repo
git-copy init [--repo PATH]

# Add a new sync target interactively
git-copy add-target [--repo PATH]

# Remove a sync target
git-copy remove-target <label> [--repo PATH]

# List configured targets
git-copy list-targets [--repo PATH]

# Sync to all targets (or a specific target). Audits the scrubbed output by default.
git-copy sync [--repo PATH] [--target LABEL] [--audit] [--audit-remote]

# Disable post-sync audit (faster, less safe)
git-copy sync --audit=false

# Audit without syncing (local cache and/or remote mirror)
git-copy audit [--repo PATH] --target LABEL [--remote] [--string S ...]

# Show sync status
git-copy status [--repo PATH]

Daemon Commands

The daemon automatically discovers and syncs git-copy enabled repositories:

# Start the daemon manually
git-copy serve

# Install daemon to run at system startup (Linux/macOS)
git-copy install

# Uninstall daemon service
git-copy install --uninstall

# Check daemon status (Linux)
systemctl --user status git-copy

# View daemon logs (Linux)
journalctl --user -u git-copy -f

The daemon:

  • Auto-discovers repos under your home directory that contain .git-copy/config.json
  • Polls every 30 seconds for changes
  • Logs sync activity with commit hashes and target URLs
  • Reloads config each cycle to pick up new repos
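
A discovery pass of this kind can be sketched as a directory walk that looks for .git-copy/config.json. The Go code below is a minimal illustration under that assumption. The function names are hypothetical, it performs a single scan rather than a 30-second polling loop, and the daemon's real traversal rules (depth limits, skipped directories) are not documented here.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// repoRootFor returns the repository directory if p points at a
// .git-copy/config.json file, and false otherwise.
func repoRootFor(p string) (string, bool) {
	if filepath.Base(p) != "config.json" {
		return "", false
	}
	dir := filepath.Dir(p)
	if filepath.Base(dir) != ".git-copy" {
		return "", false
	}
	return filepath.Dir(dir), true
}

// discover walks root and collects every directory that contains
// .git-copy/config.json. Unreadable entries are skipped.
func discover(root string) ([]string, error) {
	var repos []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return nil // skip entries that cannot be read
		}
		if d.IsDir() {
			return nil
		}
		if repo, ok := repoRootFor(p); ok {
			repos = append(repos, repo)
		}
		return nil
	})
	return repos, err
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("no home directory:", err)
		return
	}
	repos, err := discover(home)
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	for _, r := range repos {
		fmt.Println(r)
	}
}
```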

The install command automatically sets up:

  • Linux: systemd user service (~/.config/systemd/user/git-copy.service)
  • macOS: launchd agent (~/Library/LaunchAgents/com.obinnaokechukwu.git-copy.plist)

After git-copy init, you'll be prompted to install the daemon for auto-sync.

How It Works

  1. Fast Export: Uses git fast-export to stream the entire Git history
  2. Streaming Filter: Processes each commit, blob, and ref in the stream
  3. Scrubbing:
    • Replaces private_username with replacement in all text
    • Applies extra_replacements
    • Excludes files matching exclude patterns (unless in opt_in)
    • Replaces replace_history_with_current files with HEAD content
    • Rewrites author/committer information
  4. Empty Commit Pruning: Commits with no remaining file operations are automatically dropped
  5. Fast Import: Imports the scrubbed stream into a temporary bare repo
  6. Validation: Checks for leaked private username or forbidden files
  7. Push Mirror: Force-pushes all refs to the target repository
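
The text replacement in step 3 and the leak check in step 6 can be illustrated with a small Go sketch. It assumes plain, case-sensitive substring replacement, which the README does not confirm, and the function names are hypothetical. Blob, commit message, and ref handling in the real stream filter are not shown.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildScrubber returns a replacer that maps the private username to its
// public replacement and applies the extra replacements as well.
func buildScrubber(privateUser, replacement string, extra map[string]string) *strings.Replacer {
	pairs := []string{privateUser, replacement}
	keys := make([]string, 0, len(extra))
	for k := range extra {
		keys = append(keys, k)
	}
	sort.Strings(keys) // fixed order so results do not depend on map iteration
	for _, k := range keys {
		pairs = append(pairs, k, extra[k])
	}
	return strings.NewReplacer(pairs...)
}

// leaks reports whether the private username still appears in the text.
func leaks(text, privateUser string) bool {
	return strings.Contains(text, privateUser)
}

func main() {
	scrub := buildScrubber("myPrivateUsername", "PublicName", map[string]string{
		"company-internal.example.com": "public.example.com",
	})
	in := "Maintained by myPrivateUsername at company-internal.example.com"
	out := scrub.Replace(in)
	fmt.Println(out)
	fmt.Println("leak detected:", leaks(out, "myPrivateUsername"))
}
```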

Safety Features

  • Validation: Automatically validates scrubbed repos for:
    • Presence of private username in any file
    • Forbidden files (.env, CLAUDE.md by default)
  • Audit (history-aware): git-copy sync audits the scrubbed mirror by default to detect forbidden paths/strings anywhere in reachable history (and can optionally audit the remote mirror)
  • Non-negotiable Exclusions: .git-copy/** and .claude/** are always excluded
  • Opt-In Override: Files in opt_in bypass exclude patterns
  • Author Protection: Rewrites commit authors to prevent identity leakage
  • Atomic Updates: Uses temporary repos and atomic rename for safe caching
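
The interaction between the always-excluded paths, exclude patterns, and opt_in can be sketched as a single decision function. The Go code below is an illustration under stated assumptions: a pattern without a slash is compared with the file's base name, ** spans zero or more path segments, and opt_in never overrides the non-negotiable exclusions. The function names are hypothetical and git-copy's actual matcher may differ.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// alwaysExcluded lists the patterns the README describes as non-negotiable.
var alwaysExcluded = []string{".git-copy/**", ".claude/**"}

// matchSegments matches pattern segments against path segments.
// A "**" segment spans zero or more path segments.
func matchSegments(pat, segs []string) bool {
	if len(pat) == 0 {
		return len(segs) == 0
	}
	if pat[0] == "**" {
		for i := 0; i <= len(segs); i++ {
			if matchSegments(pat[1:], segs[i:]) {
				return true
			}
		}
		return false
	}
	if len(segs) == 0 {
		return false
	}
	ok, err := path.Match(pat[0], segs[0])
	if err != nil || !ok {
		return false
	}
	return matchSegments(pat[1:], segs[1:])
}

// matches reports whether a slash-separated repo path matches one pattern.
func matches(pattern, p string) bool {
	if !strings.Contains(pattern, "/") {
		// Assumption: slash-free patterns apply to the base name at any depth.
		ok, err := path.Match(pattern, path.Base(p))
		return err == nil && ok
	}
	return matchSegments(strings.Split(pattern, "/"), strings.Split(p, "/"))
}

// matchesAny reports whether p matches at least one pattern.
func matchesAny(patterns []string, p string) bool {
	for _, pat := range patterns {
		if matches(pat, p) {
			return true
		}
	}
	return false
}

// included decides whether a path survives filtering.
func included(p string, exclude, optIn []string) bool {
	if matchesAny(alwaysExcluded, p) {
		return false // never published, even if listed in opt_in
	}
	if matchesAny(optIn, p) {
		return true // opt_in bypasses exclude patterns
	}
	return !matchesAny(exclude, p)
}

func main() {
	exclude := []string{".env", "secrets/**", "*.key"}
	optIn := []string{"secrets/README.md"}
	for _, p := range []string{"main.go", ".env", "secrets/prod/db.txt", "secrets/README.md", ".git-copy/config.json"} {
		fmt.Println(p, included(p, exclude, optIn))
	}
}
```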

Use Cases

  • Open-sourcing Private Repos: Scrub internal references before making code public
  • Multi-Account Publishing: Maintain one private repo, sync to multiple public accounts
  • Compliance: Ensure sensitive files never reach public repositories
  • Brand Consistency: Replace internal names with public branding
  • Personal Privacy: Separate work identity from public contributions
  • License Changes: Retroactively change LICENSE to appear as if it was always Apache/MIT/etc.
  • Pristine History: Make the public repo look like it was always public-ready

Development

# Build
go build -o git-copy ./cmd/git-copy

# Run tests
go test ./...

# Install locally
go install ./cmd/git-copy

License

Licensed under the Apache License, Version 2.0. See LICENSE for details.

Copyright 2026 obinnaokechukwu

Contributing

Contributions welcome! Please open an issue or submit a pull request.

Security

Important: Always review the scrubbed repository before syncing to ensure no sensitive data leaks. While git-copy includes validation, it's not foolproof.

  • Test with the --repo flag on a copy of the repository first
  • Review .git-copy/config.json carefully
  • Check excluded patterns cover all sensitive paths
  • Verify extra_replacements catches domain-specific secrets
