doordash-oss · alex-reilly-dd · Mar 26, 2026 · Mar 27, 2026 · Mar 27, 2026 · Mar 27, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -4,15 +4,13 @@
 - `nonisolated(unsafe)` should not be used.
 - Force unwrapping should not be used.
 - If you believe something about the development environment (OS, tools, compiler, etc.) is blocking you, then double check that assumption. Explain what the issue is and why you believe that to be the case.
-- Our eventual goal is to have a functioning library that developers can use to write property-based tests. Anything that prevents us from achieving this goal should be addressed. We are not satisfied when things block us from achieving this goal. The project needs to be operational from both the command line and Xcode.
-- Backwards compatibility is not important.
+- The project needs to be operational from both the command line and Xcode.
 
 ## Building
 - Do not build this project with system swift. Use the build script found in the scripts directory. It builds this project using a patched swift toolchain that fixes issues with parameter packs.
 - You can build with `./scripts/build-local-toolchain.sh`
 
 ## Debugging
-- Use LLDB interactively instead of print debugging when it will speed up the process.
 - If you're debugging a crash, you will not be able to do so without identifying the stack trace. Use `lldb` to get the stack trace and then use `bt` to print it.
 - Read DEBUGGING.md
 
@@ -22,17 +20,20 @@
 - You can test with `./scripts/build-local-toolchain.sh test` and if you want to run the main test suite, use `./scripts/build-local-toolchain.sh --filter "PropertyTestingKitTests"`
 - You can try to find flaky tests by running `./scripts/test-until-failure.sh PropertyTestingKitTests 100` which will run the `PropertyTestingKitTests` target 100 times until it fails.
   - The test-until-failure script places output in `/tmp/test-failure-run{N}.log`. Look for failures there.
-- When targeting 100% coverage, target 100% branch coverage. If branches are difficult or impossible to reach, either rework code to remove the need for them, or use dependency injection to achieve the necessary state.
 - The test filter uses the method name, not the human readable name.
 
+### TDD Workflow
+- Always write failing tests BEFORE implementation
+- Use AAA pattern: Arrange-Act-Assert
+- One assertion per test when possible
+- Test names describe behavior: "should_return_empty_when_no_items"
+
+### Test-First Rules
+- When I ask for a feature, write tests first
+- Tests should FAIL initially (no implementation exists)
+- Only after tests are written, implement minimal code to pass
+
 ### Benchmarks
 - To benchmark, run `./scripts/run-benchmarks.sh`.
 - The filter flag for benchmarks requires that you match the entire name of the benchmark you want to run. Partial matches will not work, and may appear to hang.
 - You can analyze calltrees using `./scripts/parse-call-tree.py`.
-
-## Scripts
-- If you find yourself performing operations frequently, add a script to the `scripts` directory.
-- If one of those scripts stops working, fix it.
-
-## Plugin Architecture
-- Do not skip plugin events as an optimization. Plugins like plateau detectors need to see every iteration to track statistics correctly.
diff --git a/DEBUGGING.md b/DEBUGGING.md
@@ -59,6 +59,133 @@ This typically returns something like:
 /Applications/Xcode.app/Contents/Developer/usr/bin/xctest
 ```
 
+## Debugging a Specific Swift Testing Test
+
+The `xctest` binary doesn't support `--filter` for Swift Testing. To debug a specific test,
+use `swiftpm-testing-helper` directly with the correct library paths.
+
+### Steps
+
+1. Build test targets first:
+   ```
+   ./scripts/build-local-toolchain.sh build --build-tests
+   ```
+
+2. Start an LLDB session and load `swiftpm-testing-helper`. LLDB does not expand shell variables, so substitute the real path for `$BUILD_ROOT`:
+   ```
+   file $BUILD_ROOT/swiftpm-macosx-arm64/arm64-apple-macosx/release/swiftpm-testing-helper
+   ```
+
+3. Set run arguments with `--filter`:
+   ```
+   settings set -- target.run-args "--test-bundle-path" "/path/to/.build/arm64-apple-macosx/debug/PropertyTestingKitPackageTests.xctest/Contents/MacOS/PropertyTestingKitPackageTests" "--filter" "yourTestMethodName" "/path/to/.build/arm64-apple-macosx/debug/PropertyTestingKitPackageTests.xctest/Contents/MacOS/PropertyTestingKitPackageTests" "--testing-library" "swift-testing"
+   ```
+
+4. Set environment variables for library loading:
+   ```
+   env DYLD_LIBRARY_PATH=$BUILD_ROOT/swift-macosx-arm64/lib/swift/macosx:/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/usr/lib
+   env DYLD_FRAMEWORK_PATH=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/Library/Frameworks:/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/Library/PrivateFrameworks
+   ```
+
+5. Set breakpoints and launch:
+   ```
+   breakpoint set --file MyFile.swift --line 42
+   process launch -X false
+   ```
+
+### Why swiftpm-testing-helper?
+
+`swift-test` passes `--filter` to `swiftpm-testing-helper`, which loads the test bundle via
+`dlopen` and passes the filter to Swift Testing's `CommandLine.arguments` parser. The `xctest`
+binary parses arguments itself and rejects unknown flags like `--filter`.
+
+### Required Library Paths
+
+`swiftpm-testing-helper` needs:
+- **DYLD_LIBRARY_PATH**: Local Swift runtime + `libXCTestSwiftSupport.dylib`
+- **DYLD_FRAMEWORK_PATH**: `XCTest.framework` + `XCTestCore.framework` (private)
+
+Without these, `dlopen` fails with "Library not loaded" errors.
+
+### Alternative: Attach to Running Process
+
+For quick debugging without setting up library paths:
+```bash
+# Terminal 1: Launch test
+./scripts/build-local-toolchain.sh test --filter "testName" --skip-build &
+# Terminal 2: Find PID and attach
+pgrep -f "swiftpm-testing-helper"
+# In LLDB: process attach -p <PID>
+```
+
+This works but has a race condition — the test may complete before you attach.
+
+## Catching an Intermittent Crash (Loop-Until-Crash Recipe)
+
+For an intermittent crash, write the lldb commands to a file and run lldb in batch
+mode (`-b`) inside a shell loop until the process stops on a fatal signal. This recipe
+captured a real `EXC_BAD_ACCESS` from `test16ParallelFuzzTiming` within 4 attempts.
+
+### Pattern
+
+1. Write a one-shot lldb script that:
+   - Loads `swiftpm-testing-helper` and sets the `--filter` run-args (see "Debugging a Specific Swift Testing Test").
+   - Tells lldb to stop on `SIGSEGV` / `SIGBUS` instead of forwarding them to the process:
+     ```
+     process handle SIGSEGV --stop true --pass false --notify true
+     process handle SIGBUS  --stop true --pass false --notify true
+     ```
+   - Launches with `process launch -X false`.
+   - On stop, runs `process status` and `bt all`, then `quit`.
+2. Wrap that lldb script in a shell loop:
+   - Run `lldb -b -s script.lldb > run-N.log 2>&1` per attempt.
+   - Detect a real crash by grepping the log for `stop reason = signal` or
+     `EXC_BAD_ACCESS`.
+   - Detect a clean exit by `exited with status = 0`.
+   - Stop on first crash; report the log path.
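Step 1 of the pattern can be sketched as a small shell function that writes the one-shot script. This is only a sketch: the function name `write_lldb_script` and its six positional parameters are hypothetical, and every path is supplied by the caller. The lldb commands it writes are the ones listed in this document, in the same order.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a one-shot lldb script for a single filtered test.
# Arguments (all caller-supplied): OUT HELPER BUNDLE FILTER LIBRARY_PATH FRAMEWORK_PATH
write_lldb_script() {
  local out="$1" helper="$2" bundle="$3" filter="$4" lib_path="$5" fw_path="$6"
  # The heredoc is unquoted so the shell fills in the paths before lldb sees them.
  cat > "$out" <<EOF
file $helper
settings set -- target.run-args "--test-bundle-path" "$bundle" "--filter" "$filter" "$bundle" "--testing-library" "swift-testing"
env DYLD_LIBRARY_PATH=$lib_path
env DYLD_FRAMEWORK_PATH=$fw_path
process handle SIGSEGV --stop true --pass false --notify true
process handle SIGBUS  --stop true --pass false --notify true
process launch -X false
process status
bt all
quit
EOF
}
```

A call might look like `write_lldb_script script.lldb "$HELPER" "$BUNDLE" yourTestMethodName "$LIB" "$FW"`, where the four variables are placeholders for the paths described in "Debugging a Specific Swift Testing Test".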
+
+### Example
+
+Past runs kept a working example at `/tmp/lldb-loop.sh`. That path is temporary and not
+under version control, so regenerate the script from this recipe when needed. Each attempt
+at `test16ParallelFuzzTiming` takes ~5 seconds, so 50 attempts take roughly 4 to 5 minutes wall-clock.
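Step 2 of the pattern might look like the sketch below. It is not the script from past runs: the function names `classify_log` and `run_until_crash` are hypothetical. The grep strings are the ones named in the pattern, and redirecting stdin from `/dev/null` is an added precaution, assumed rather than taken from the recipe, so that an attempt cannot wait on interactive input.

```bash
#!/usr/bin/env bash
# Classify one lldb log using the strings named in the recipe.
# Prints "crash", "clean", or "unknown".
classify_log() {
  if grep -qE 'stop reason = signal|EXC_BAD_ACCESS' "$1"; then
    echo crash
  elif grep -q 'exited with status = 0' "$1"; then
    echo clean
  else
    echo unknown
  fi
}

# Run CMD... up to MAX times, logging each attempt to DIR/run-N.log.
# Prints the log path and returns 0 on the first crash; returns 1 if no crash occurred.
run_until_crash() {
  local max="$1" dir="$2" n=1 log
  shift 2
  while [ "$n" -le "$max" ]; do
    log="$dir/run-$n.log"
    # stdin comes from /dev/null so the command can never block on a prompt.
    "$@" > "$log" 2>&1 < /dev/null || true
    if [ "$(classify_log "$log")" = crash ]; then
      echo "$log"
      return 0
    fi
    n=$((n + 1))
  done
  return 1
}
```

With these definitions, `run_until_crash 50 . lldb -b -s script.lldb` would make up to 50 attempts in the current directory and print the first log that contains a crash.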
+
+### Why batch mode
+
+`lldb -b -s script.lldb` runs the commands in order, stops when the process receives a
+signal configured with `--stop true`, and lets the script handle the resulting
+stop deterministically. An interactive session is unnecessary and harder to
+script around. Batch mode lets you `quit` cleanly after the trace is captured
+and relaunch on the next loop iteration with no state carried over.
+
+### Reading the trace
+
+After capture, the log contains many `AST validation error` and module-import
+warnings (toolchain SDK / lldb compiler skew). These do **not** prevent `bt all`
+from rendering the C/Swift frames — they only suppress some Swift-side
+expression evaluation. To extract just the frame list:
+
+```bash
+# lldb marks the selected frame as "  * frame #0:", so the pattern allows an optional "* ".
+grep -A 200 "thread #N" run-NNN.log | grep -E "^ +(\* )?frame #|^\* thread"
+```
+
+Here `N` is the thread number reported on the `* thread #N, … stop reason = …` line.
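If looking up `N` by hand becomes tedious, the extraction can be sketched as a small awk filter that follows the `*` marker instead. The function name `crashed_thread_frames` is hypothetical, and the sketch assumes lldb's usual layout, in which the stopped thread's header starts with `* thread #` at column 0 and other thread headers are indented by two spaces. Because `process status` also prints the stopped thread, its header may appear more than once in the output.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the header and frames of the stopped thread from an lldb log.
crashed_thread_frames() {
  awk '
    /^\* thread #/ { in_block = 1; print; next }  # header of the stopped (selected) thread
    /^  thread #/  { in_block = 0 }               # any other thread header ends the block
    in_block && /frame #[0-9]+:/ { print }        # frame lines, including "  * frame #0:"
  ' "$1"
}
```

For example, `crashed_thread_frames run-NNN.log` prints only the stopped thread's frames, skipping the `AST validation error` noise and the other threads.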
+
+### What this is good for
+
+- Capturing the actual crash subsystem (which thread, which call site) without
+  guessing from test output.
+- Distinguishing a fix from a masking effect — if the fix is real, the crash
+  rate drops; if it's masking via timing/memory layout, the same `bt` may still
+  appear, just less often. Counter-instrumentation cannot distinguish those.
+
+### What it is not good for
+
+- Capturing the *interleaving* that produced a race. The trace tells you where
+  the crash landed, not what other threads were doing at the moment of the
+  collision. For interleaving you need watchpoints, recorded execution, or
+  reasoning from static analysis of the stack you do have.
+
 ## Console Log Debugging
 
 If you encounter attach failures, check the system logs: