
Commit e16044c

Author: Shane Wall
Documented the new UX loop, drop-in action, and demo.
README.md now walks through record ➜ timeline ➜ export, highlights the composite GitHub Action usage, and points to the flaky CI demo flow. docs/quickstart.md now uses the built-in flaky demo and adds a timeline step before export. CHANGELOG.md and docs/releases/release-notes.md each add a v2.1 entry covering the timeline command, the GitHub Action, and the demo.
1 parent 8b50650 commit e16044c

4 files changed

Lines changed: 89 additions & 42 deletions

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,10 @@
11
# Changelog
22

3+
## v2.1 - Timeline UX + CI Action (2025-11-28)
4+
- New `timeline` command to list all file events with relative timestamps (no more guessing export times).
5+
- Added composite GitHub Action (`action.yml`) for one-line CI adoption: `uses: saworbit/diffkeeper@v1`.
6+
- Included flaky CI demo (`demo/flaky-ci-test`) and updated docs/readme/quickstart to show the full record ➜ timeline ➜ export loop.
7+
38
## v2.0 - Time Machine Preview (2025-11-28)
49
- Pivoted from BoltDB persistence to Pebble-based flight recorder with key prefixes (`l:/c:/m:`).
510
- Added journal ingestion + async worker that hashes/compresses into CAS and metadata.

README.md

Lines changed: 49 additions & 15 deletions
@@ -3,11 +3,11 @@
33
[![CI](https://github.com/saworbit/diffkeeper/actions/workflows/ci.yml/badge.svg)](https://github.com/saworbit/diffkeeper/actions/workflows/ci.yml)
44
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
55

6-
**DiffKeeper is a "Black Box Flight Recorder" for your containers.**
6+
**DiffKeeper is a Black Box Flight Recorder for your containers.**
77

8-
It watches your application's filesystem in real-time and records every change. When a container crashes—or a CI test flakes—you can rewind the state to any exact moment in time to see exactly what happened.
8+
It watches your application's filesystem in real-time and records every change. When a container crashes—or a CI test flakes—you can rewind the state to any exact moment and see exactly what happened.
99

10-
> **Note:** This project previously focused on "Stateful Containers" and database persistence. That architecture has been archived. See [Genesis & Pivot](docs/GENESIS_AND_PIVOT.md) for the story.
10+
> **Note:** The earlier "stateful containers" design is archived. See [Genesis & Pivot](docs/GENESIS_AND_PIVOT.md) for the story.
1111
1212
---
1313

@@ -17,33 +17,67 @@ You have a flaky test in CI. It fails 1 out of 50 times. You re-run the job, and
1717
* They don't show you that a config file was corrupted, a temp file was locked, or a binary was overwritten.
1818

1919
## The Solution: Instant Replay
20-
DiffKeeper uses eBPF to capture filesystem writes at line-rate and stores them in a high-speed log (Pebble).
20+
DiffKeeper uses eBPF to capture filesystem writes at line-rate and stores them in Pebble. Then it gives you a timeline so you never guess timestamps again.
2121

22-
### 1. Record a Session
23-
Wrap your flaky test (or command) with `diffkeeper record`. It adds minimal overhead.
22+
### 1) Record a Session
23+
Wrap your flaky test (or any command). Minimal overhead.
2424

2525
```bash
26-
# In your local terminal or CI pipeline
2726
diffkeeper record --state-dir=/tmp/trace -- go test ./...
2827
```
2928

30-
### 2. Export the "Crash Site"
31-
The test failed! But the container is gone. No problem—DiffKeeper saved the history. Restore the filesystem to exactly 2 minutes and 14 seconds into the run:
29+
### 2) See the Timeline (no blindfolds)
30+
List every write in order to pick the exact second to rewind:
31+
32+
```bash
33+
diffkeeper timeline --state-dir=/tmp/trace
34+
[00m:01s] WRITE status.log (13B)
35+
[00m:05s] WRITE db.lock (6B)
36+
[02m:14s] WRITE status.log (22B) <-- the failure
37+
```
38+
39+
### 3) Export the Crash Site
40+
Restore the filesystem to the moment of failure:
3241

3342
```bash
3443
diffkeeper export --state-dir=/tmp/trace --out=./debug_fs --time="2m14s"
3544
```
3645

37-
Now `cd ./debug_fs` and explore the files exactly as they existed at that moment.
46+
`cd ./debug_fs` and inspect files exactly as they existed at that moment.
47+
48+
## Drop-in GitHub Action
49+
No curl | sh snippets needed—use the composite action directly:
50+
51+
```yaml
52+
steps:
53+
- uses: actions/checkout@v4
54+
- name: Record flaky test
55+
  uses: saworbit/diffkeeper@v1
56+
  with:
57+
    command: go test ./...
58+
    state-dir: diffkeeper-trace
59+
```
60+
61+
On failure the trace uploads as an artifact; you can run `diffkeeper timeline` to find the culprit write, then `diffkeeper export` to reconstruct it locally.
62+
63+
## The "Flaky CI" Demo
64+
Run the built-in demo to see the loop end-to-end:
65+
66+
```bash
67+
diffkeeper record --state-dir=./trace -- go run ./demo/flaky-ci-test
68+
diffkeeper timeline --state-dir=./trace
69+
diffkeeper export --state-dir=./trace --out=./restored --time="2s"
70+
cat ./restored/status.log # ERROR: Connection Lost
71+
```
3872

3973
## Architecture
4074
- **Engine:** Pure Go + eBPF (CO-RE)
41-
- **Storage:** Pebble (LSM Tree) for high-speed ingestion.
42-
- **Diffing:** bsdiff (Binary patches) for efficient storage.
75+
- **Storage:** Pebble (LSM) for high-speed ingestion.
76+
- **Diffing:** bsdiff (binary patches) for efficient storage.
4377

4478
## CI / Dogfooding
45-
- GitHub Actions (`.github/workflows/ci.yml`) runs unit/race tests, cross-platform builds, and a functional "time machine" test that records a flaky script and verifies exports at multiple timestamps.
46-
- The BoltDB era workflows are archived under `docs/archive/v1-legacy/workflows/`.
79+
- GitHub Actions (`.github/workflows/ci.yml`) runs unit/race tests, cross-platform builds, and a functional time-machine test that records a flaky script and verifies exports.
80+
- BoltDB-era workflows remain archived under `docs/archive/v1-legacy/workflows/`.
4781

4882
## Getting Started
49-
See the [Quickstart](docs/quickstart.md) to record your first trace.
83+
See the [Quickstart](docs/quickstart.md) to record, view the timeline, and export your first trace.

docs/quickstart.md

Lines changed: 20 additions & 27 deletions
@@ -1,49 +1,42 @@
11
# Quickstart
22

3-
This guide shows you how to use DiffKeeper to debug a "flaky" script.
3+
Debug a flaky script end-to-end: record, inspect the timeline, and rewind to the exact moment it broke.
44

5-
## 1. Installation
5+
## 1) Build DiffKeeper (or install via the GitHub Action)
66

77
```bash
8-
# Build from source (requires Go 1.23+)
8+
# Build from source (Go 1.23+)
99
go build -o diffkeeper .
1010
```
1111

12-
## 2. The "Flaky" Application
13-
Create a script `flaky.sh` that simulates a bug where a file gets corrupted halfway through execution:
12+
CI users can skip this and reference `uses: saworbit/diffkeeper@v1` in a workflow.
1413

15-
```bash
16-
#!/bin/bash
17-
echo "All systems operational" > status.txt
18-
sleep 2
19-
echo "CRITICAL FAILURE" > status.txt # <--- The Bug
20-
sleep 1
21-
```
22-
23-
## 3. Record the Crash
24-
Run the script wrapped in DiffKeeper:
14+
## 2) Run the Flaky Demo Under DiffKeeper
15+
The repo ships with a tiny flaky test that silently corrupts `status.log` after 2 seconds.
2516

2617
```bash
27-
./diffkeeper record --state-dir=./trace -- ./flaky.sh
18+
./diffkeeper record --state-dir=./trace -- go run ./demo/flaky-ci-test
2819
```
2920

30-
## 4. Time Travel
31-
Investigate what the file looked like before the crash (at 1 second) and after (at 3 seconds).
21+
The process exits with a failure after a few seconds (expected).
3222

33-
**At 1 Second:**
23+
## 3) Read the Timeline (no more guesswork)
3424

3525
```bash
36-
./diffkeeper export --state-dir=./trace --out=./restore_1s --time="1s"
37-
cat ./restore_1s/status.txt
38-
# Output: All systems operational
26+
./diffkeeper timeline --state-dir=./trace
27+
[00m:00s] WRITE status.log (13B)
28+
[00m:01s] WRITE db.lock (6B)
29+
[00m:02s] WRITE status.log (22B) <-- corruption point
3930
```
4031

41-
**At 3 Seconds:**
32+
Now you know the precise timestamp to rewind to.
33+
34+
## 4) Export the Crash Site
4235

4336
```bash
44-
./diffkeeper export --state-dir=./trace --out=./restore_3s --time="3s"
45-
cat ./restore_3s/status.txt
46-
# Output: CRITICAL FAILURE
37+
./diffkeeper export --state-dir=./trace --out=./restored --time="2s"
38+
cat ./restored/status.log
39+
# Output: ERROR: Connection Lost
4740
```
4841

49-
You have now successfully captured and rewound a filesystem state!
42+
You have successfully captured the filesystem history, located the offending write, and restored the exact failing state.
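
The quickstart reads the timestamp off the timeline by eye and retypes it for `--time`. A minimal sketch of automating that step follows. It assumes the timeline prints lines in the `[MMm:SSs] WRITE <path> (<size>)` layout shown in step 3 and that `--time` accepts values such as `2s` or `2m14s`, as in the examples. The helper names `stamp_to_duration` and `last_write_time` are hypothetical and not part of DiffKeeper.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not shipped with DiffKeeper.

# Convert a timeline stamp such as "[02m:14s]" into the "2m14s" form
# used with `diffkeeper export --time` in the examples.
stamp_to_duration() {
  local stamp=$1 min sec
  stamp=${stamp#\[}          # drop leading "["
  stamp=${stamp%\]}          # drop trailing "]"
  min=${stamp%%m:*}          # text before "m:"
  sec=${stamp##*:}           # text after ":"
  sec=${sec%s}               # drop trailing "s"
  min=$((10#$min))           # force base 10 so "08" is not read as octal
  sec=$((10#$sec))
  if (( min > 0 )); then
    printf '%dm%ds\n' "$min" "$sec"
  else
    printf '%ds\n' "$sec"
  fi
}

# Read timeline text on stdin and print the export time of the last
# WRITE to the given path. Assumes paths contain no whitespace.
last_write_time() {
  local target=$1 stamp op path rest last=""
  while read -r stamp op path rest; do
    [[ $op == WRITE && $path == "$target" ]] && last=$stamp
  done
  [[ -n $last ]] && stamp_to_duration "$last"
}
```

If the timeline is written to stdout in that layout, `./diffkeeper timeline --state-dir=./trace | last_write_time status.log` would yield the value to pass as `--time` in step 4. The sketch picks the last write to the file, which is a simple heuristic and may not be the offending write in a longer trace.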

docs/releases/release-notes.md

Lines changed: 15 additions & 0 deletions
@@ -1,5 +1,20 @@
11
# DiffKeeper Release Notes
22

3+
## v2.1 "Timeline UX" - November 28, 2025
4+
5+
**Status:** Preview (ready for CI adoption)
6+
7+
### Highlights
8+
- **Timeline CLI:** `diffkeeper timeline` streams a chronological feed of filesystem writes with relative timestamps, so export targets are never a guess.
9+
- **GitHub Action:** Composite `action.yml` enables `uses: saworbit/diffkeeper@v1` with automatic artifact upload of traces on failure.
10+
- **Flaky Demo:** Added `demo/flaky-ci-test` and doc updates (README/quickstart) to show record ➜ timeline ➜ export in a few seconds.
11+
12+
### Notes
13+
- Timeline reads Pebble metadata in read-only mode; no impact on recorded traces.
14+
- Action installs via release installer and runs `diffkeeper record` with sudo to attach eBPF on Linux runners.
15+
16+
---
17+
318
## v2.0 "Time Machine" - November 28, 2025
419

520
**Status:** Preview (CI/CD flight recorder)
