From ad4389d7a082b5deac48fd3965f86075f4d24b82 Mon Sep 17 00:00:00 2001 From: Peyton Montei Date: Tue, 17 Feb 2026 17:31:02 -0800 Subject: [PATCH 1/3] Add security and privacy documentation --- README.md | 7 ++++++ docs/security-and-privacy.md | 43 ++++++++++++++++++++++++++++++++++++ 2 files changed, 50 insertions(+) create mode 100644 docs/security-and-privacy.md diff --git a/README.md b/README.md index 737fcf450..1bcef1d6b 100644 --- a/README.md +++ b/README.md @@ -19,6 +19,7 @@ With Entire, you can: - [Strategies](#strategies) - [Commands Reference](#commands-reference) - [Configuration](#configuration) +- [Security & Privacy](#security--privacy) - [Troubleshooting](#troubleshooting) - [Development](#development) - [Getting Help](#getting-help) @@ -276,6 +277,12 @@ All commands (`rewind`, `status`, `doctor`, etc.) work the same regardless of wh If you run into any issues with Gemini CLI integration, please [open an issue](https://github.com/entireio/cli/issues). +## Security & Privacy + +**Your session transcripts are stored in your git repository** on the `entire/checkpoints/v1` branch. If your repository is public, this data is visible to anyone. + +Entire automatically redacts detected secrets (API keys, tokens, credentials) before committing, but redaction is best-effort. See [docs/security-and-privacy.md](docs/security-and-privacy.md) for details. + ## Troubleshooting ### Common Issues diff --git a/docs/security-and-privacy.md b/docs/security-and-privacy.md new file mode 100644 index 000000000..065cf0280 --- /dev/null +++ b/docs/security-and-privacy.md @@ -0,0 +1,43 @@ +# Security & Privacy + +Entire stores AI session transcripts and metadata in your git repository. This document explains what data is stored, how sensitive content is protected, and how to configure additional safeguards. 
+ +## Transcript Storage & Git History + +### Where data is stored + +When you use Entire with an AI agent (Claude Code, Gemini CLI), session transcripts, user prompts, and checkpoint metadata are committed to a dedicated branch in your git repository (`entire/checkpoints/v1`). This branch is separate from your working branches — your code commits stay clean — but it lives in the same repository. + +If your repository is **public**, anyone with access can view the transcript data on the `entire/checkpoints/v1` branch. This includes the full prompt/response history and session metadata. Note that transcripts may contain file content quoted by the AI agent during the session. + +### What Entire redacts automatically + +Entire automatically scans all transcript and metadata content **before** committing it to git. Two detection methods run on every write: + +1. **Entropy scoring** — Identifies high-entropy strings (Shannon entropy > 4.5) that look like randomly generated secrets, even if they don't match a known pattern. +2. **Pattern matching** — Uses [gitleaks](https://github.com/gitleaks/gitleaks) built-in rules (220+ patterns) to detect known secret formats. + +Detected secrets are replaced with `REDACTED` before the data is ever written to a git object. This is **always on** and cannot be disabled. + +### Recommendations + +If your AI sessions will touch sensitive data: + +- **Use a private repository.** This is the simplest and most complete protection. Transcripts on `entire/checkpoints/v1` are only visible to collaborators. +- **Avoid passing sensitive files to your agent.** Content that never enters the agent conversation never appears in transcripts. +- **Review before pushing.** You can inspect the `entire/checkpoints/v1` branch locally before pushing it to a remote. 
+ +## What Gets Redacted + +### Secrets (always on) + +Gitleaks pattern matching covers cloud providers (AWS, GCP, Azure), version control platforms (GitHub, GitLab, Bitbucket), payment processors (Stripe, Square), communication tools (Slack, Discord, Twilio), database connection strings, private key blocks (RSA, DSA, EC, PGP), and generic credentials (bearer tokens, basic auth, JWTs). Entropy scoring catches secrets that don't match any known pattern. + +All detected secrets are replaced with `REDACTED`. + +## Limitations + +- **Best-effort.** Novel or low-entropy secrets (short passwords, predictable tokens) may not be caught. +- **Filenames and binary data.** Secrets in filenames, binary files, or deeply nested structures may not be detected. +- **JSONL skip rules.** Entire skips scanning fields named `signature` or ending in `id`/`ids`, and objects whose `type` starts with `image` or equals `base64`, to avoid false positives. +- **Users are ultimately responsible** for reviewing what they commit and push. Redaction is a safety net, not a guarantee. From 1f4703a7e67ea2e4cfe248319ac3e930937ad0f4 Mon Sep 17 00:00:00 2001 From: Peyton Montei Date: Tue, 17 Feb 2026 18:31:48 -0800 Subject: [PATCH 2/3] Fix shadow branch redaction accuracy and address PR feedback --- README.md | 2 +- docs/security-and-privacy.md | 8 +++++--- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 1bcef1d6b..0b2274521 100644 --- a/README.md +++ b/README.md @@ -281,7 +281,7 @@ If you run into any issues with Gemini CLI integration, please [open an issue](h **Your session transcripts are stored in your git repository** on the `entire/checkpoints/v1` branch. If your repository is public, this data is visible to anyone. -Entire automatically redacts detected secrets (API keys, tokens, credentials) before committing, but redaction is best-effort. See [docs/security-and-privacy.md](docs/security-and-privacy.md) for details. 
+Entire automatically redacts detected secrets (API keys, tokens, credentials) when writing to `entire/checkpoints/v1`, but redaction is best-effort. Temporary shadow branches used during a session may contain unredacted data and should not be pushed. See [docs/security-and-privacy.md](docs/security-and-privacy.md) for details. ## Troubleshooting diff --git a/docs/security-and-privacy.md b/docs/security-and-privacy.md index 065cf0280..3920878bf 100644 --- a/docs/security-and-privacy.md +++ b/docs/security-and-privacy.md @@ -6,16 +6,18 @@ Entire stores AI session transcripts and metadata in your git repository. This d ### Where data is stored -When you use Entire with an AI agent (Claude Code, Gemini CLI), session transcripts, user prompts, and checkpoint metadata are committed to a dedicated branch in your git repository (`entire/checkpoints/v1`). This branch is separate from your working branches — your code commits stay clean — but it lives in the same repository. +When you use Entire with an AI agent (Claude Code, Gemini CLI), session transcripts, user prompts, and checkpoint metadata are committed to a dedicated branch in your git repository (`entire/checkpoints/v1`). This branch is separate from your working branches, so your code commits stay clean, but it lives in the same repository. + +Entire also creates temporary local branches (e.g., `entire/`) as working storage during a session. These shadow branches store file snapshots and transcripts **without redaction**. They are cleaned up when session data is condensed (with redaction) into `entire/checkpoints/v1` at commit time. Shadow branches are **not** pushed by Entire — do not push them manually, as unredacted content would be visible on the remote. If your repository is **public**, anyone with access can view the transcript data on the `entire/checkpoints/v1` branch. This includes the full prompt/response history and session metadata.
Note that transcripts may contain file content quoted by the AI agent during the session. ### What Entire redacts automatically -Entire automatically scans all transcript and metadata content **before** committing it to git. Two detection methods run on every write: +Entire automatically scans transcript and metadata content before writing it to the `entire/checkpoints/v1` branch. Two detection methods run during condensation: 1. **Entropy scoring** — Identifies high-entropy strings (Shannon entropy > 4.5) that look like randomly generated secrets, even if they don't match a known pattern. -2. **Pattern matching** — Uses [gitleaks](https://github.com/gitleaks/gitleaks) built-in rules (220+ patterns) to detect known secret formats. +2. **Pattern matching** — Uses [gitleaks](https://github.com/gitleaks/gitleaks) built-in rules to detect known secret formats. Detected secrets are replaced with `REDACTED` before the data is ever written to a git object. This is **always on** and cannot be disabled. From f387db3bbe2c209f62dfdd8057f7163630cad67e Mon Sep 17 00:00:00 2001 From: peyton-alt Date: Wed, 18 Feb 2026 01:18:55 -0500 Subject: [PATCH 3/3] Update docs/security-and-privacy.md Co-authored-by: Alex Ong --- docs/security-and-privacy.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/security-and-privacy.md b/docs/security-and-privacy.md index 3920878bf..a6e8cf1bf 100644 --- a/docs/security-and-privacy.md +++ b/docs/security-and-privacy.md @@ -10,7 +10,9 @@ When you use Entire with an AI agent (Claude Code, Gemini CLI), session transcri Entire also creates temporary local branches (e.g., `entire/`) as working storage during a session. These shadow branches store file snapshots and transcripts **without redaction**. They are cleaned up when session data is condensed (with redaction) into `entire/checkpoints/v1` at commit time. 
Shadow branches are **not** pushed by Entire — do not push them manually, as unredacted content would be visible on the remote. -If your repository is **public**, anyone with access can view the transcript data on the `entire/checkpoints/v1` branch. This includes the full prompt/response history and session metadata. Note that transcripts may contain file content quoted by the AI agent during the session. +Anyone with access to your repository can view the transcript data on the `entire/checkpoints/v1` branch. This includes the full prompt/response history and session metadata. Note that transcripts capture all tool interactions — including file contents, MCP server calls, and other data exchanged during the session. + +If your repository is **public**, this data is visible to the entire internet. ### What Entire redacts automatically
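The entropy-scoring step that this documentation describes (Shannon entropy > 4.5, with matches replaced by `REDACTED`) can be illustrated with a minimal sketch. This is not Entire's implementation: the function names, the `CANDIDATE` tokenizer, and the 10-character minimum are assumptions made for illustration, since the docs state only the threshold and the replacement string.

```python
import math
import re
from collections import Counter

# Threshold quoted in the documentation (bits per character).
ENTROPY_THRESHOLD = 4.5

# Hypothetical tokenizer: the docs do not say how candidate strings are
# selected, so this simply picks runs of 10+ base64/URL-safe characters.
CANDIDATE = re.compile(r"[A-Za-z0-9+/=_-]{10,}")


def shannon_entropy(s: str) -> float:
    """Return the Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    counts = Counter(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def redact_high_entropy(text: str) -> str:
    """Replace candidate tokens whose entropy exceeds the threshold."""
    def replace(match: re.Match) -> str:
        token = match.group(0)
        if shannon_entropy(token) > ENTROPY_THRESHOLD:
            return "REDACTED"
        return token

    return CANDIDATE.sub(replace, text)
```

Under this definition a string with k distinct characters has entropy of at most log2(k), so a token needs at least 23 distinct characters to exceed 4.5. That is one concrete reason short passwords and predictable tokens can slip past an entropy check of this kind.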