Closed
33 commits
16f4700
Let `claude` use `grep`
goodboy Mar 12, 2026
739f4ef
Move `/commit-msg` output to `msgs/` subdir
goodboy Mar 12, 2026
0f19a87
Let claudy access `tests/`
goodboy Mar 13, 2026
81a2587
Add `ai_notes/docs_todos.md` for `literalinclude` idea
goodboy Mar 14, 2026
a69b6a4
Add more `Bash` allow-rules to `claude` local settings
goodboy Mar 23, 2026
79e971d
Add `/run-tests` `claude` skill for `pytest` suite runs
goodboy Mar 23, 2026
0f828f9
Extend `/run-tests` skill with dev-workflow helpers
goodboy Mar 23, 2026
82e1dc4
Add worktree detection to `/commit-msg` skill
goodboy Mar 24, 2026
d72c2fc
Add `/pr-msg` skill for cross-service PR descr
goodboy Mar 25, 2026
933516f
Update `/pr-msg` skill with cross-service PR refs
goodboy Mar 25, 2026
e2e1e81
Ignore skill `msgs/` dirs, keep pr-msg LATEST
goodboy Mar 25, 2026
0ecd04f
Allow `git remote` in local `claude` settings
goodboy Mar 25, 2026
20a9c27
Add "never auto-commit" rule to `/run-tests` skill
goodboy Mar 25, 2026
152efdf
Add regression-fix context to `/commit-msg` skill
goodboy Mar 25, 2026
11d25cf
Add worktree venv detection to `/run-tests` skill
goodboy Mar 25, 2026
6c09af8
Allow `git stash`, `ln`, and `uv sync` in local settings
goodboy Mar 25, 2026
9b35c0b
Add review-context trailer to `/commit-msg` skill
goodboy Mar 25, 2026
4db4b25
Add auto-PATCH of review reply placeholders
goodboy Mar 25, 2026
7b0ca29
Allow `gh pr`, `cat`, and `gh api` in local settings
goodboy Mar 25, 2026
7c2774c
Allow `python3` and `Skill(run-tests)`
goodboy Mar 25, 2026
d0062b1
Bump `/pr-msg` line-length limit 67 -> 72
goodboy Mar 25, 2026
91142f0
Add `/open-wkt` + `/close-wkt` worktree lifecycle skills
goodboy Mar 25, 2026
63373cd
Ignore `claude` worktree dirs in `.gitignore`
goodboy Mar 25, 2026
4d4cad7
Clarify `/pr-msg` wrap rule: fill *to* 72 chars
goodboy Mar 25, 2026
acebbe0
Allow claude to do discovery sys research re `multiaddrs`
goodboy Mar 25, 2026
7712fa5
Address Copilot review on PR #428
goodboy Mar 26, 2026
4447dd3
Symlink `/commit-msg` skill, tidy local settings
goodboy Mar 27, 2026
060a21f
claude: add `conc-anal` skill
goodboy Apr 7, 2026
95af10d
Symlink skills to central `ai.skillz` repo
goodboy Apr 7, 2026
b36a43e
Gitignore `ai.skillz` symlinks, drop from tracking
goodboy Apr 8, 2026
78b4035
Re-filter/org `claude` perms
goodboy Apr 9, 2026
eea39b8
Gitignore review-skill ephemeral ctx files
goodboy Apr 9, 2026
0c3d99f
Permit `gh api` CLI eps usage
goodboy Apr 9, 2026
38 changes: 38 additions & 0 deletions .claude/ai_notes/docs_todos.md
@@ -0,0 +1,38 @@
# Docs TODOs

## Auto-sync README code examples with source

The `docs/README.rst` has inline code blocks that
duplicate actual example files (e.g.
`examples/infected_asyncio_echo_server.py`). Every time
the public API changes we have to manually sync both.

Sphinx's `literalinclude` directive can pull code directly
from source files:

```rst
.. literalinclude:: ../examples/infected_asyncio_echo_server.py
   :language: python
   :caption: examples/infected_asyncio_echo_server.py
```

Or to include only a specific function/section:

```rst
.. literalinclude:: ../examples/infected_asyncio_echo_server.py
   :language: python
   :pyobject: aio_echo_server
```

This way the docs always reflect the actual code without
manual syncing.

### Considerations
- `README.rst` is also rendered on GitHub/PyPI which do
NOT support `literalinclude` - so we'd need a build
step or a separate `_sphinx_readme.rst` (which already
exists at `docs/github_readme/_sphinx_readme.rst`).
- Could use a pre-commit hook or CI step to extract code
from examples into the README for GitHub rendering.
- Another option: a `sphinx.ext.autodoc`-style approach
where docstrings from the actual module are pulled in.
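
A minimal sketch of the pre-commit/CI extraction idea, assuming the README wraps each example in a pair of marker comments. The `BEGIN-EXAMPLE`/`END-EXAMPLE` markers and the `sync_examples` helper are hypothetical names, not something that exists in the repo today:

```python
import re
import textwrap
from typing import Callable

# Hypothetical marker lines; they are not directives, so reStructuredText
# should treat them as comments and GitHub/PyPI will not render them.
BLOCK = re.compile(
    r"(?P<begin>^\.\. BEGIN-EXAMPLE: (?P<path>\S+)\n)"
    r".*?"
    r"(?P<end>^\.\. END-EXAMPLE\n)",
    flags=re.DOTALL | re.MULTILINE,
)


def sync_examples(readme: str, read_source: Callable[[str], str]) -> str:
    """Replace the body of every marked block with the current source."""

    def render(match: re.Match) -> str:
        source = read_source(match["path"]).rstrip() + "\n"
        code = textwrap.indent(source, "    ")
        return f"{match['begin']}\n.. code:: python\n\n{code}\n{match['end']}"

    return BLOCK.sub(render, readme)
```

A CI step could call `sync_examples(text, lambda p: pathlib.Path(p).read_text())` and fail when the result differs from the committed README, which keeps the GitHub-rendered copy honest without requiring `literalinclude` support.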
25 changes: 22 additions & 3 deletions .claude/settings.local.json
@@ -1,13 +1,32 @@
 {
   "permissions": {
     "allow": [
-      "Write(.claude/*commit_msg*)",
-      "Write(.claude/git_commit_msg_LATEST.md)",
       "Bash(date *)",
       "Bash(cp .claude/*)",
       "Bash(git diff *)",
       "Bash(git log *)",
-      "Bash(git status)"
+      "Bash(git status)",
+      "Bash(git remote:*)",
+      "Bash(git stash:*)",
+      "Bash(git mv:*)",
+      "Bash(test:*)",
+      "Bash(ls:*)",
+      "Bash(grep:*)",
+      "Bash(find:*)",
+      "Bash(ln:*)",
+      "Bash(cat:*)",
+      "Bash(mkdir:*)",
+      "Bash(gh pr:*)",
+      "Bash(gh api:*)",
+      "Bash(UV_PROJECT_ENVIRONMENT=py313 uv sync:*)",
+      "Bash(UV_PROJECT_ENVIRONMENT=py313 uv run:*)",
+      "Write(.claude/*commit_msg*)",
+      "Write(.claude/git_commit_msg_LATEST.md)",
+      "Skill(run-tests)",
+      "Bash(echo EXIT:$?:*)",
+      "Bash(gh api:*)",
+      "Bash(gh pr:*)",
+      "Bash(gh issue:*)"
     ],
     "deny": [],
     "ask": []
86 changes: 0 additions & 86 deletions .claude/skills/commit-msg/SKILL.md

This file was deleted.

231 changes: 231 additions & 0 deletions .claude/skills/conc-anal/SKILL.md
@@ -0,0 +1,231 @@
---
name: conc-anal
description: >
  Concurrency analysis for tractor's trio-based
  async primitives. Trace task scheduling across
  checkpoint boundaries, identify race windows in
  shared mutable state, and verify synchronization
  correctness. Invoke on code segments the user
  points at, OR proactively when reviewing/writing
  concurrent cache, lock, or multi-task acm code.
argument-hint: "[file:line-range or function name]"
allowed-tools:
  - Read
  - Grep
  - Glob
  - Task
---

Perform a structured concurrency analysis on the
target code. This skill should be invoked:

- **On demand**: user points at a code segment
(file:lines, function name, or pastes a snippet)
- **Proactively**: when writing or reviewing code
that touches shared mutable state across trio
tasks — especially `_Cache`, locks, events, or
multi-task `@acm` lifecycle management

## 0. Identify the target

If the user provides a file:line-range or function
name, read that code. If not explicitly provided,
identify the relevant concurrent code from context
(e.g. the current diff, a failing test, or the
function under discussion).

## 1. Inventory shared mutable state

List every piece of state that is accessed by
multiple tasks. For each, note:

- **What**: the variable/dict/attr (e.g.
`_Cache.values`, `_Cache.resources`,
`_Cache.users`)
- **Scope**: class-level, module-level, or
closure-captured
- **Writers**: which tasks/code-paths mutate it
- **Readers**: which tasks/code-paths read it
- **Guarded by**: which lock/event/ordering
protects it (or "UNGUARDED" if none)

Format as a table:

```
| State | Writers | Readers | Guard |
|---------------------|-----------------|-----------------|----------------|
| _Cache.values | run_ctx, moc¹ | moc | ctx_key lock |
| _Cache.resources | run_ctx, moc | moc, run_ctx | UNGUARDED |
```

¹ `moc` = `maybe_open_context`
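
As a rough illustration, the inventory can be kept as structured records and rendered into the table above. `SharedState` and `render_inventory` are hypothetical helper names for this sketch, not part of tractor:

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class SharedState:
    what: str
    writers: Tuple[str, ...]
    readers: Tuple[str, ...]
    guard: Optional[str] = None  # None is rendered as UNGUARDED


def render_inventory(rows: Sequence[SharedState]) -> str:
    """Render the inventory as a fixed-width pipe table."""
    header = ("State", "Writers", "Readers", "Guard")
    body = [
        (r.what, ", ".join(r.writers), ", ".join(r.readers), r.guard or "UNGUARDED")
        for r in rows
    ]
    # Column width = widest cell in that column, header included.
    widths = [max(len(row[i]) for row in [header, *body]) for i in range(4)]

    def fmt(cells: Sequence[str]) -> str:
        return "| " + " | ".join(c.ljust(w) for c, w in zip(cells, widths)) + " |"

    rule = "|" + "|".join("-" * (w + 2) for w in widths) + "|"
    return "\n".join([fmt(header), rule, *map(fmt, body)])
```

Leaving `guard` unset makes the missing protection explicit in the output instead of relying on someone remembering to type "UNGUARDED".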

## 2. Map checkpoint boundaries

For each code path through the target, mark every
**checkpoint** — any `await` expression where trio
can switch to another task. Use line numbers:

```
L325: await lock.acquire() ← CHECKPOINT
L395: await service_tn.start(...) ← CHECKPOINT
L411: lock.release() ← (not a checkpoint, but changes lock state)
L414: yield (False, yielded) ← SUSPEND (caller runs)
L485: no_more_users.set() ← (wakes run_ctx, no switch yet)
```

**Key trio scheduling rules to apply:**
- `Event.set()` makes waiters *ready* but does NOT
switch immediately
- `lock.release()` is not a checkpoint
- `await sleep(0)` IS a checkpoint
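- Code in `finally` blocks CAN have checkpoints;
inside a cancelled scope each one raises
`trio.Cancelled` again unless it is shielded
- An `await` inside an `except` block can raise
`trio.Cancelled`, which then masks the exception
being handled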
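
A first pass over the checkpoint map can be mechanised with the standard-library `ast` module. This is only a sketch: `list_await_sites` is a hypothetical helper, and it reports *candidate* checkpoints, since whether a given `await` actually yields to the scheduler depends on the callee:

```python
import ast
from typing import List, Tuple


def list_await_sites(source: str) -> List[Tuple[int, str]]:
    """Return (line, description) for each await-like construct."""
    sites = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Await):
            sites.append((node.lineno, "await " + ast.unparse(node.value)))
        elif isinstance(node, ast.AsyncWith):
            sites.append((node.lineno, "async with"))
        elif isinstance(node, ast.AsyncFor):
            sites.append((node.lineno, "async for"))
    return sorted(sites)


SAMPLE = '''\
async def moc(lock, event):
    await lock.acquire()
    try:
        event.set()
    finally:
        lock.release()
    await event.wait()
'''

for lineno, what in list_await_sites(SAMPLE):
    print(f"L{lineno}: {what}")
```

Synchronous calls such as `event.set()` and `lock.release()` are deliberately absent from the output; they still change state and need to be annotated by hand, as in the listing above.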

## 3. Trace concurrent task schedules

Write out the **interleaved execution trace** for
the problematic scenario. Number each step and tag
which task executes it:

```
[Task A] 1. acquires lock
[Task A] 2. cache miss → allocates resources
[Task A] 3. releases lock
[Task A] 4. yields to caller
[Task A] 5. caller exits → finally runs
[Task A] 6. users-- → 0, sets no_more_users
[Task A] 7. pops lock from _Cache.locks
[run_ctx] 8. wakes from no_more_users.wait()
[run_ctx] 9. values.pop(ctx_key)
[run_ctx] 10. acm __aexit__ → CHECKPOINT
[Task B] 11. creates NEW lock (old one popped)
[Task B] 12. acquires immediately
[Task B] 13. values[ctx_key] → KeyError
[Task B] 14. resources[ctx_key] → STILL EXISTS
[Task B] 15. 💥 RuntimeError
```

Identify the **race window**: the range of steps
where state is inconsistent. In the example above,
steps 9–10 are the window (values gone, resources
still alive).
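
One way to double-check a hand-written trace is to replay it against a toy model and report the steps after which an invariant is broken. The sketch below assumes the invariant "a key is in `values` exactly when it is in `resources`"; `CacheModel`, `race_window`, and step 16 (the eventual `resources.pop()`) are illustrative additions, not taken from tractor:

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

KEY = "ctx_key"


@dataclass
class CacheModel:
    values: Set[str] = field(default_factory=set)
    resources: Set[str] = field(default_factory=set)

    def consistent(self) -> bool:
        return self.values == self.resources


def allocate(cache: CacheModel) -> None:
    cache.values.add(KEY)
    cache.resources.add(KEY)


Step = Tuple[int, str, Callable[[CacheModel], None]]

# Only the state-changing steps of the trace are modelled.
TRACE: List[Step] = [
    (2, "Task A", allocate),
    (9, "run_ctx", lambda c: c.values.discard(KEY)),
    (10, "run_ctx", lambda c: None),  # acm __aexit__ checkpoint, no mutation
    (16, "run_ctx", lambda c: c.resources.discard(KEY)),  # hypothetical step
]


def race_window(trace: List[Step]) -> List[int]:
    """Return the step numbers after which the invariant does not hold."""
    cache = CacheModel()
    broken = []
    for number, _task, mutate in trace:
        mutate(cache)
        if not cache.consistent():
            broken.append(number)
    return broken


print(race_window(TRACE))  # → [9, 10]
```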

## 4. Classify the bug

Categorize what kind of concurrency issue this is:

- **TOCTOU** (time-of-check-to-time-of-use): state
changes between a check and the action based on it
- **Stale reference**: a task holds a reference to
state that another task has invalidated
- **Lifetime mismatch**: a synchronization primitive
(lock, event) has a shorter lifetime than the
state it's supposed to protect
- **Missing guard**: shared state is accessed
without any synchronization
- **Atomicity gap**: two operations that should be
atomic have a checkpoint between them

## 5. Propose fixes

For each proposed fix, provide:

- **Sketch**: pseudocode or diff showing the change
- **How it closes the window**: which step(s) from
the trace it eliminates or reorders
- **Tradeoffs**: complexity, perf, new edge cases,
impact on other code paths
- **Risk**: what could go wrong (deadlocks, new
races, cancellation issues)

Rate each fix: `[simple|moderate|complex]` impl
effort.

## 6. Output format

Structure the full analysis as:

```markdown
## Concurrency analysis: `<target>`

### Shared state
<table from step 1>

### Checkpoints
<list from step 2>

### Race trace
<interleaved trace from step 3>

### Classification
<bug type from step 4>

### Fixes
<proposals from step 5>
```

## Tractor-specific patterns to watch

These are known problem areas in tractor's
concurrency model. Flag them when encountered:

### `_Cache` lock vs `run_ctx` lifetime

The `_Cache.locks` entry is managed by
`maybe_open_context` callers, but `run_ctx` runs
in `service_tn` — a different task tree. Lock
pop/release in the caller's `finally` does NOT
wait for `run_ctx` to finish tearing down. Any
state that `run_ctx` cleans up in its `finally`
(e.g. `resources.pop()`) is vulnerable to
re-entry races after the lock is popped.

### `values.pop()` → acm `__aexit__` → `resources.pop()` gap

In `_Cache.run_ctx`, the inner `finally` pops
`values`, then the acm's `__aexit__` runs (which
has checkpoints), then the outer `finally` pops
`resources`. This creates a window where `values`
is gone but `resources` still exists — a classic
atomicity gap.

### Global vs per-key counters

`_Cache.users` as a single `int` (pre-fix) meant
that users of different `ctx_key`s inflated each
other's counts, preventing teardown when one key's
users hit zero. Always verify that per-key state
(`users`, `locks`) is actually keyed on `ctx_key`
and not on `fid` or some broader key.
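
The difference can be shown with two toy counters. `GlobalUsers` and `PerKeyUsers` are hypothetical classes for illustration only; `exit()` returns whether teardown may proceed for that key:

```python
from collections import Counter


class GlobalUsers:
    """Single int shared by every ctx_key (the pre-fix shape)."""

    def __init__(self) -> None:
        self.users = 0

    def enter(self, ctx_key: str) -> None:
        self.users += 1

    def exit(self, ctx_key: str) -> bool:
        self.users -= 1
        return self.users == 0


class PerKeyUsers:
    """One count per ctx_key."""

    def __init__(self) -> None:
        self.users: Counter = Counter()

    def enter(self, ctx_key: str) -> None:
        self.users[ctx_key] += 1

    def exit(self, ctx_key: str) -> bool:
        self.users[ctx_key] -= 1
        return self.users[ctx_key] == 0


for counter in (GlobalUsers(), PerKeyUsers()):
    counter.enter("a")
    counter.enter("b")
    # The only user of "a" leaves while "b" is still in use.
    print(type(counter).__name__, counter.exit("a"))
```

With the global counter the last user of `"a"` leaving does not trigger teardown because `"b"` still holds the count above zero; the per-key counter reports that `"a"` can be torn down.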

### `Event.set()` wakes but doesn't switch

`trio.Event.set()` makes waiting tasks *ready* but
the current task continues executing until its next
checkpoint. Code between `.set()` and the next
`await` runs atomically from the scheduler's
perspective. Use this to your advantage (or watch
for bugs where code assumes the woken task runs
immediately).

### `except` block checkpoint masking

An `await` inside an `except` handler is a
checkpoint like any other: if the enclosing scope
is cancelled it raises `trio.Cancelled`, which
masks the exception being handled and skips the
rest of the handler. A synchronous
`lock.release()` placed in a `finally` nested
inside the `except` still runs, but any code
after the cancelled `await` does not. This is why
`maybe_open_context`'s cache-miss path does
`lock.release()` in a `finally` inside the
`except KeyError`.

### Cancellation in `finally`

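Trio allows checkpoints in `finally` blocks, and
its cancellation is level-triggered: once the
enclosing scope is cancelled, every `await` in the
cleanup raises `trio.Cancelled` again unless it
runs under a shielded cancel scope. This means
`finally` cleanup that does `await` can itself be
cancelled (e.g. by nursery shutdown). Watch for
cleanup code that assumes it will run to
completion.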