61 changes: 61 additions & 0 deletions constraints.md
@@ -0,0 +1,61 @@
# second-brain — Constraints

---

## Must Do

- Load and confirm context (`AGENTS.md`, `intent.md`, `constraints.md`) before every session.
- Write a failing test before any production code — no exceptions (RED → GREEN → REFACTOR).
- Run the full backend quality-check pass before any commit: pytest + ruff + mypy + bandit.
- Sanitize all Tiptap/rich-text editor output before storing to the database.
- Use `async/await` for every database operation — no synchronous SQLAlchemy calls.
- Write three verifiable acceptance criteria before delegating any significant subtask.
- Add a `# VERIFY:` comment rather than guess a function signature, API behavior, or SQLAlchemy idiom.
- Confirm understanding before any destructive migration (column drop, rename, table drop).

---

## Must NOT Do

- Do not write production code without a failing test. RED phase is local only — never committed.
- Do not use synchronous SQLAlchemy anywhere in the application code.
- Do not add a repository layer unless two or more services share identical query logic.
- Do not exceed 15 lines in a route handler — delegate to `services/`.
- Do not reuse a Pydantic schema across semantically different use cases.
- Do not hardcode secrets, tokens, or `DATABASE_URL` values — use environment variables.
- Do not commit `backend/basb.db`, `.env`, `.env.*`, or `__pycache__/`.
- Do not re-litigate decisions logged in the Persistent Decisions tables without surfacing the question first.
- Do not implement AI/LLM features without first resolving the Open Loop on AI augmentation strategy.
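
The secrets rule reduces to reading configuration from the environment — a stdlib sketch; the fallback URL is illustrative, and the real app would likely centralize this in a settings module:

```python
import os
from collections.abc import Mapping


def database_url(env: Mapping[str, str] = os.environ) -> str:
    """Resolve DATABASE_URL from the environment, never from source code."""
    url = env.get("DATABASE_URL")
    if url is None:
        # Local dev fallback only — production values come from the env.
        return "sqlite+aiosqlite:///./basb.db"
    return url
```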

---

## Preferences

- Prefer brevity over completeness unless depth is explicitly requested.
- Prefer editing an existing file over creating a new one.
- Prefer `grounded-code-mcp` knowledge base over training data for FastAPI, Vue 3, SQLAlchemy, and Pydantic v2 idioms.
- Prefer a single focused service-layer unit test over a broad integration test when testing business logic.
- Prefer `Mapped[Optional[T]]` with an explicit `None` default over omitting the default for nullable columns.
- Prefer inline `# VERIFY:` annotations over guessing async patterns or SQLAlchemy 2.0 syntax.

---

## Escalate Rather Than Decide

- Any AI/LLM feature — confirm provider, model name, cost, and integration point before implementing.
- Any DB schema migration that drops or renames a column (hard to reverse).
- Any CORS policy change that broadens allowed origins beyond `localhost`.
- Any change that moves business logic out of `services/` (violates the layered architecture decision).
- Any security-relevant decision not explicitly covered by these constraints.

---

## Code Quality Gates

- **Test coverage (backend business logic):** ≥ 80% — `cd backend && .venv/bin/pytest --cov=app --cov-report=term-missing`
- **Test coverage (frontend):** ≥ 70% — `cd frontend && npm run test:run`
- **Test coverage (security-critical paths):** ≥ 95%
- **Cyclomatic complexity (per method):** < 10
- **Code duplication:** ≤ 3%
- **Commit format:** Conventional Commits — `feat:`, `fix:`, `refactor:`, `chore:`, `test:`, `docs:`
- **Commit scope:** Atomic — one logical change per commit; RED phase never committed
91 changes: 91 additions & 0 deletions evals.md
@@ -0,0 +1,91 @@
# second-brain — Evals

---

## Eval Philosophy

Evals are safety infrastructure, not a finishing step. Write them before the agent starts.
A passing test suite ≠ done: tests verify code correctness; evals verify that the
output is actually good relative to BASB intent.

A passing eval is measurable, repeatable, and would survive scrutiny from a developer who
understands Tiago Forte's BASB methodology and expects the domain semantics to be respected.

---

## Test Cases

### Test Case 1: Note Capture and PARA Move

- **Input / Prompt:** "Implement the endpoint to move a note to a PARA container and advance its CODE stage."
- **Known-Good Output:** `PATCH /api/v1/notes/{id}/move` sets `container_id` on the note AND advances `code_stage` from `capture` to `organize`. Route handler ≤15 lines. Business logic in `note_service.py`. Integration test covers happy path and 404 on missing note.
- **Pass Criteria:**
- [ ] `code_stage` transitions from `capture` → `organize` on move (BASB semantic — not just a field update)
- [ ] `container_id` is set to the target container
- [ ] Route handler is ≤15 lines; logic is in `services/note_service.py`
- [ ] Integration test in `tests/integration/test_notes_api.py` covers: success (200), missing note (404), missing container (404)
- [ ] `pytest` passes with coverage ≥ 80%
- **Last Run:** — | **Result:** —
- **Notes:** —
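
The BASB semantics this test case checks can be sketched without the HTTP layer — plain dicts stand in for the ORM model, and leaving later stages untouched on a move is an assumption here (the spec above only requires `capture` → `organize`):

```python
from enum import Enum


class CodeStage(str, Enum):
    CAPTURE = "capture"
    ORGANIZE = "organize"
    DISTILL = "distill"
    EXPRESS = "express"


def move_note(note: dict, container_id: int) -> dict:
    """Move a note into a PARA container and advance its CODE stage.

    A move is an organize action in BASB terms, so a freshly captured
    note advances to ORGANIZE; later stages are not regressed.
    """
    note["container_id"] = container_id
    if note["code_stage"] == CodeStage.CAPTURE:
        note["code_stage"] = CodeStage.ORGANIZE
    return note
```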

---

### Test Case 2: Progressive Summarization Highlight Update

- **Input / Prompt:** "Implement `PATCH /api/v1/notes/{id}/highlights` to update L2/L3 highlight ranges."
- **Known-Good Output:** Endpoint accepts `{"highlights": [{"start": int, "end": int, "layer": 2|3}]}`. Validates layer is 2 or 3 only. Replaces the full highlight list (not appends). Service validates that ranges don't exceed content length. Unit test in `tests/unit/` covers validation; integration test covers the HTTP contract.
- **Pass Criteria:**
- [ ] Layer values other than 2 or 3 return 422 (Pydantic validation, not a manual check)
- [ ] L4 (executive summary) is NOT updated by this endpoint — separate field/endpoint
- [ ] Highlight ranges are replaced atomically (not merged with existing)
- [ ] Service-layer unit test covers: valid payload, invalid layer, empty list (clears all)
- [ ] `pytest` passes; no ruff or mypy errors introduced
- **Last Run:** — | **Result:** —
- **Notes:** Highlight offset drift (content edited after highlights set) is a known open issue — not in scope here.
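
The validation and replace-not-merge rules can be sketched with the stdlib — in the real API this would be a Pydantic v2 schema so FastAPI returns 422 automatically; the dataclass below only mirrors the rules:

```python
from dataclasses import dataclass


@dataclass
class Highlight:
    start: int
    end: int
    layer: int  # progressive-summarization layer: 2 or 3 only

    def __post_init__(self) -> None:
        if self.layer not in (2, 3):
            # In the real API this is Pydantic validation -> HTTP 422.
            raise ValueError("layer must be 2 or 3; L4 has its own endpoint")
        if self.start < 0 or self.end < self.start:
            raise ValueError("invalid highlight range")


def replace_highlights(current: list, incoming: list) -> list:
    # Atomic replacement: incoming becomes the full new state, so an
    # empty list clears all highlights; nothing is merged with current.
    return list(incoming)
```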

---

### Test Case 3: Full-Text Search

- **Input / Prompt:** "Implement `GET /api/v1/search?q=` for full-text search across note titles and content."
- **Known-Good Output:** Returns notes where title OR content contains the query string (case-insensitive). Empty `q` returns 422. Results include `code_stage` and `container_id`. Implemented in `search_service.py`. Integration test covers: match in title, match in content, no match, empty query.
- **Pass Criteria:**
- [ ] Search is case-insensitive
- [ ] Empty `q` returns 422 (not an empty list)
- [ ] Response schema includes `code_stage` and `container_id` for each result
- [ ] Logic is in `search_service.py`, not in the route handler
- [ ] Integration tests cover all four cases above
- **Last Run:** — | **Result:** —
- **Notes:** SQLite `LIKE` is acceptable for now; defer FTS5 until the Open Loop on deployment target is resolved.
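
The LIKE-based approach the notes endorse can be sketched with stdlib `sqlite3`; table and column names follow the assumed schema, and the empty-`q` → 422 rule belongs to the API layer, not shown here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT, content TEXT, "
    "code_stage TEXT, container_id INTEGER)"
)
conn.executemany(
    "INSERT INTO notes (title, content, code_stage, container_id) "
    "VALUES (?, ?, ?, ?)",
    [
        ("BASB Overview", "Capture everything that resonates", "capture", None),
        ("Reading List", "Forte on progressive summarization", "organize", 2),
    ],
)


def search(q: str) -> list:
    # SQLite LIKE is case-insensitive for ASCII, which covers the
    # "case-insensitive" pass criterion; FTS5 remains deferred.
    pattern = f"%{q}%"
    return conn.execute(
        "SELECT id, title, code_stage, container_id FROM notes "
        "WHERE title LIKE ? OR content LIKE ?",
        (pattern, pattern),
    ).fetchall()
```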

---

## Taste Rules (Encoded Rejections)

| # | Pattern to Reject | Why It Fails | Rule |
|---|---|---|---|
| 1 | Moving a note to a container without updating `code_stage` | Technically correct HTTP but wrong BASB semantics — capture stays capture forever | A move operation MUST advance `code_stage` to `organize` |
| 2 | Putting query logic directly in route handlers | Defeats the layered architecture; untestable in isolation | All DB access goes through `services/`; routes call services only |
| 3 | Reusing `NoteResponse` schema as the input schema for update | Semantically wrong; leaks read-only fields into writes | One schema per use case: `CreateNote`, `UpdateNote`, `NoteResponse` are all distinct |
| 4 | AI feature implemented without surfacing provider/cost question | Binds the project to an unconfirmed external dependency | Escalate AI integration to human before writing any model call code |

---

## CI Gate

The agent must not declare a task complete if any gate below fails.

- **Backend tests:** `cd backend && .venv/bin/pytest -v --cov=app --cov-report=term-missing` — all pass, coverage ≥ 80%
- **Backend lint:** `.venv/bin/ruff check app/ tests/` — zero errors
- **Backend format:** `.venv/bin/ruff format --check app/ tests/` — clean
- **Backend types:** `.venv/bin/mypy app/` — zero errors
- **Backend security:** `.venv/bin/bandit -r app/ -c pyproject.toml` — zero high/critical
- **Dependency audit:** `.venv/bin/pip-audit --skip-editable` — zero known vulnerabilities
- **Frontend tests:** `cd frontend && npm run test:run` — all pass
- **Frontend build:** `npm run build` — zero errors

---

## Rejection Log

*(Append entries here as outputs are rejected. Never delete entries.)*
90 changes: 90 additions & 0 deletions intent.md
@@ -0,0 +1,90 @@
# second-brain — Intent

---

## Agent Architecture

**This project uses:** Coding harness

**Reason:** Solo developer with human review at every step; task-level features and bug fixes do not require autonomous multi-session loops.

---

## Primary Goal

A fully functional AI-augmented personal KMS: the user captures raw notes, organizes them into the PARA taxonomy, and progressively distills them from L1 raw text to L4 executive summary — with AI assistance accelerating the distill and express stages of the CODE workflow.

---

## Values (What We Optimize For)

1. **Correctness** — code accurately implements BASB semantics; no data loss or corruption
2. **Security** — user content is sanitized, stored safely, and never leaked
3. **Maintainability** — readable, tested code a solo developer can return to after weeks away
4. **Performance** — async throughout; UI responses feel immediate
5. **Speed of delivery** — last priority; correctness is never sacrificed for pace

---

## Tradeoff Rules

| Conflict | Resolution |
|---|---|
| Speed vs. correctness | Default to correctness. Flag explicitly if timeline requires compromise. |
| Completeness vs. brevity | Prefer brevity unless depth is explicitly requested. |
| New abstraction vs. duplication | Tolerate duplication until the third occurrence; then extract. |
| AI feature richness vs. scope creep | Confirm AI integration points before implementing; see Open Loops in AGENTS.md. |

---

## Decision Boundaries

### Decide Autonomously

- Formatting, structure, naming within established project conventions
- Tool selection for read-only exploration
- Refactoring within an approved, scoped task
- Choosing between two equivalent async SQLAlchemy patterns
- Adding a test for an untested code path discovered during a task

### Escalate to Human

- Any AI feature that touches external APIs — confirm provider, model name, and cost before implementing
- Any DB schema migration that drops or renames a column
- Any CORS policy change that broadens allowed origins beyond `localhost`
- Any change that moves business logic out of `services/` and into routes or models
- Any output intended for external distribution
- Any irreversible action (delete, force-push, send)
- Scope changes beyond the stated task
- When acceptance criteria cannot be met within stated constraints

---

## What "Good" Looks Like

A good output for this project:

- Implements the BASB concept correctly (not just the literal endpoint spec) — e.g., a "move" operation correctly advances the `code_stage`
- Produces working, tested code on the first attempt within the defined scope
- Stays thin at the route layer and puts logic in services — verifiable by line count
- Uses the domain vocabulary (`CodeStage`, `ContainerType`, `highlights`) consistently
- Flags risks (schema changes, async pitfalls, highlight offset drift) proactively

---

## Anti-Patterns (What Bad Looks Like)

- Implementing the literal request while missing the BASB intent (e.g., moving a note to a container without updating `code_stage`)
- Adding a repository layer or other abstraction "for future flexibility" — YAGNI
- Synchronous SQLAlchemy calls that block the event loop
- Reusing a Pydantic schema across semantically different operations to save lines
- Recommending an AI feature without surfacing the provider/cost/integration question first

---

## Persistent Decisions

| Date | Decision | Rationale |
|---|---|---|
| [VERIFY: date] | L4 executive summary is always user-authored | AI may suggest, but the user's own words are the point of the express stage |
| [VERIFY: date] | Inbox = notes with no `container_id` | Simple; avoids a separate inbox table |
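
The inbox decision is cheap to encode — a sketch with plain dicts standing in for notes; the real implementation would be a `WHERE container_id IS NULL` query:

```python
def inbox(notes: list) -> list:
    # Inbox = not yet organized into any PARA container; there is no
    # separate inbox table, per the persistent decision above.
    return [n for n in notes if n.get("container_id") is None]
```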