|
| 1 | +# AI Agent — LLM Programming Guidelines |
| 2 | + |
| 3 | +Supplement to `CCTBX_LLM_PROGRAMMING_GUIDELINES.md`. |
| 4 | +This document covers patterns and pitfalls specific to |
| 5 | +the AI Agent codebase (`agent/`, `knowledge/`, |
| 6 | +`programs/ai_agent.py`). Read the CCTBX guide first — |
| 7 | +everything there applies here too. |
| 8 | + |
| 9 | +--- |
| 10 | + |
| 11 | +## 1. Parameter Verification Against programs.yaml |
| 12 | + |
| 13 | +LLMs frequently hallucinate PHENIX command-line |
| 14 | +arguments. The agent has a specific defense: the |
| 15 | +command sanitizer strips any parameter not in the |
| 16 | +program's `strategy_flags` allowlist in |
| 17 | +`knowledge/programs.yaml`. |
| 18 | + |
| 19 | +When generating or modifying agent code that builds |
| 20 | +commands: |
| 21 | + |
| 22 | +- Check `programs.yaml` for the program's |
| 23 | + `strategy_flags` section. If a parameter is not |
| 24 | + listed, the sanitizer will strip it — your code |
| 25 | + will silently have no effect. |
| 26 | +- If you need a new strategy flag, add it to the |
| 27 | + YAML first, then use it in code. |
| 28 | +- The YAML also defines `inputs` (required/optional |
| 29 | + file slots), `invariants` (auto-fill rules), and |
| 30 | + `defaults` (always appended). Understand all four |
| 31 | + sections before modifying command building. |
| 32 | + |
| 33 | +Example: to add `rebuild_in_place=False` for |
| 34 | +autobuild, you must first verify that |
| 35 | +`rebuild_in_place` is in autobuild's `strategy_flags`. |
| 36 | +If it isn't, the sanitizer strips it and the parameter |
| 37 | +never reaches PHENIX. |
| 38 | + |
| 39 | +--- |
| 40 | + |
| 41 | +## 2. Logging Conventions |
| 42 | + |
| 43 | +The codebase uses three logging patterns in different |
| 44 | +contexts. Don't mix them. |
| 45 | + |
| 46 | +**`agent/` and `knowledge/` modules:** |
| 47 | +Use `logging.getLogger(__name__)`: |
| 48 | + |
| 49 | +```python |
| 50 | +logger = logging.getLogger(__name__) |
| 51 | +logger.debug("update failed", exc_info=True) |
| 52 | +``` |
| 53 | + |
| 54 | +**`programs/ai_agent.py`:** |
| 55 | +Use `self.vlog` at the appropriate verbosity level: |
| 56 | + |
| 57 | +```python |
| 58 | +self.vlog.quiet("Fatal: %s" % error) # always |
| 59 | +self.vlog.normal("Decision: %s" % prog) # default |
| 60 | +self.vlog.verbose("File list: %s" % f) # detail |
| 61 | +``` |
| 62 | + |
| 63 | +**Test files:** |
| 64 | +Use `print()`: |
| 65 | + |
| 66 | +```python |
| 67 | +print(" PASS: test_name") |
| 68 | +print(" SKIP (ai_agent.py not found)") |
| 69 | +``` |
| 70 | + |
| 71 | +No `print()` in `agent/` modules. No `logger` in |
| 72 | +test files. |
| 73 | + |
| 74 | +--- |
| 75 | + |
| 76 | +## 3. Import Fallbacks |
| 77 | + |
| 78 | +The agent code runs in two contexts: inside PHENIX |
| 79 | +(where modules live under `libtbx.langchain`) and |
| 80 | +standalone (where they're imported directly). All |
| 81 | +cross-module imports in `agent/` and `knowledge/` |
| 82 | +must have a fallback: |
| 83 | + |
| 84 | +```python |
| 85 | +try: |
| 86 | + from libtbx.langchain.agent.structure_model \ |
| 87 | + import StructureModel |
| 88 | +except ImportError: |
| 89 | + from agent.structure_model import StructureModel |
| 90 | +``` |
| 91 | + |
| 92 | +Never use bare `from libtbx.langchain.X import Y` |
| 93 | +without a fallback `except ImportError` block. |
| 94 | + |
| 95 | +This is different from the general cctbx pattern |
| 96 | +(where optional imports set the name to `None`). |
| 97 | +In the agent, both paths must resolve to the same |
| 98 | +class — the code that follows uses it unconditionally. |
| 99 | + |
| 100 | +--- |
| 101 | + |
| 102 | +## 4. Analysis Mode Routing |
| 103 | + |
| 104 | +`ai_analysis.py` has five `analysis_mode` values, but |
| 105 | +the AI Agent only uses four of them. Understanding |
| 106 | +which modes need the server is critical: |
| 107 | + |
| 108 | +| Mode | Used by | Needs server | |
| 109 | +|------|---------|:------------:| |
| 110 | +| `standard` | `phenix.ai_analysis` (standalone) | **Yes** (RAG DB) | |
| 111 | +| `directive_extraction` | Agent (session start) | No | |
| 112 | +| `advice_preprocessing` | Agent (session start) | No | |
| 113 | +| `failure_diagnosis` | Agent (on terminal error) | No | |
| 114 | +| `agent_session` | Agent (session end) | No | |
| 115 | + |
| 116 | +The agent **never uses `standard` mode**. That mode is |
| 117 | +the standalone `phenix.ai_analysis` program which does |
| 118 | +retrieval-augmented generation against the Phenix |
| 119 | +knowledge base. |
| 120 | + |
| 121 | +The four agent modes are pure LLM calls routed locally |
| 122 | +when `run_on_server=False` or `provider=ollama`. If you |
| 123 | +add a new analysis mode, decide whether it needs the |
| 124 | +RAG database and add it to `_LLM_ONLY_MODES` in |
| 125 | +`run_job_on_server_or_locally()` if it doesn't. |
| 126 | + |
| 127 | +The agent's THINK node does its own log analysis via |
| 128 | +`thinking_prompts.py` and the expert knowledge base — |
| 129 | +it does not go through `ai_analysis.py` at all. |
| 130 | + |
| 131 | +--- |
| 132 | + |
| 133 | +## 5. Three Error Classification Systems |
| 134 | + |
| 135 | +Error handling is split across three independent |
| 136 | +systems. When adding a new error pattern, you must |
| 137 | +check all three to ensure they agree: |
| 138 | + |
| 139 | +1. **`agent/error_classifier.py`** — called by |
| 140 | + PERCEIVE at the start of the next cycle. Five |
| 141 | + categories: TERMINAL, PHIL_ERROR, AMBIGUOUS_PHIL, |
| 142 | + LABEL_ERROR, RETRYABLE. Feeds `should_pivot()`. |
| 143 | + |
| 144 | +2. **`agent/error_analyzer.py`** — YAML-driven. |
| 145 | + `recoverable_errors.yaml` for auto-fixable errors |
| 146 | + (ambiguous labels). `diagnosable_errors.yaml` for |
| 147 | + terminal errors needing LLM diagnosis. |
| 148 | + |
| 149 | +3. **`programs/ai_agent.py::_classify_error()`** — |
| 150 | + oldest classifier. Maps to INPUT_ERROR (agent's |
| 151 | + fault, don't count) or REAL_FAILURE (count it). |
| 152 | + Used only for cycle history bookkeeping. |
| 153 | + |
| 154 | +See ARCHITECTURE.md "Design tensions" for the full |
| 155 | +overlap analysis and recommended consolidation path. |
| 156 | + |
| 157 | +--- |
| 158 | + |
| 159 | +## 6. Client-Server Code Path Awareness |
| 160 | + |
| 161 | +Changes to files in `agent/` and `knowledge/` are |
| 162 | +server-side — they take effect immediately for all |
| 163 | +users. Changes to `programs/ai_agent.py` are |
| 164 | +client-side — users must update their install. |
| 165 | + |
| 166 | +When modifying `programs/ai_agent.py`, identify |
| 167 | +whether your change is in the **top half** (client |
| 168 | +code: `run()`, `_run_single_cycle()`, |
| 169 | +`_inject_user_params()`) or the **bottom half** |
| 170 | +(server code called when `run_on_server=False`). |
| 171 | +Top-half changes require user updates; bottom-half |
| 172 | +changes are effectively server-side. |
| 173 | + |
| 174 | +See ARCHITECTURE.md "Client-Server Update Model" |
| 175 | +for the full execution split diagram. |
| 176 | + |
| 177 | +--- |
| 178 | + |
| 179 | +## 7. Session State Persistence |
| 180 | + |
| 181 | +Every piece of state the agent needs across cycles |
| 182 | +must appear in three places: |
| 183 | + |
| 184 | +1. Written to `session.data["key"]` after the cycle |
| 185 | +2. Read from `session.data["key"]` on resume |
| 186 | +3. Included in `create_initial_state()` if passed |
| 187 | + through the graph |
| 188 | + |
| 189 | +When you add a new state field, grep for all three |
| 190 | +patterns and verify the field appears in each. Also |
| 191 | +register new `session_info` fields in |
| 192 | +`agent/contract.py` with a default value. |
| 193 | + |
| 194 | +--- |
| 195 | + |
| 196 | +## 8. CHANGELOG Format |
| 197 | + |
| 198 | +CHANGELOG entries for the agent follow a specific |
| 199 | +format: |
| 200 | + |
| 201 | +``` |
| 202 | +## Version NNN.NN (Short Title) |
| 203 | +
|
| 204 | +### Summary |
| 205 | +One paragraph describing the change. |
| 206 | +
|
| 207 | +### New files (N) |
| 208 | +| File | Lines | Purpose | |
| 209 | +|------|-------|---------| |
| 210 | +| `path/to/file.py` | 500 | Description | |
| 211 | +
|
| 212 | +### Modified files (N) |
| 213 | +| File | Changes | |
| 214 | +|------|---------| |
| 215 | +| `path/to/file.py` | What changed | |
| 216 | +
|
| 217 | +### Tests |
| 218 | +N new tests. |
| 219 | +Run with: `python3 tests/run_all_tests.py` |
| 220 | +``` |
| 221 | + |
| 222 | +See `docs/project/CHANGELOG.md` for examples. |
| 223 | + |
| 224 | +--- |
| 225 | + |
| 226 | +## 9. Test Patterns Specific to the Agent |
| 227 | + |
| 228 | +### File-exists guards |
| 229 | + |
| 230 | +Tests that open files outside `agent/` and |
| 231 | +`knowledge/` (e.g., `programs/ai_agent.py`, |
| 232 | +`wxGUI2/Programs/AIAgent.py`) MUST guard with |
| 233 | +`os.path.isfile()` because these directories may |
| 234 | +not be present in all deployment contexts: |
| 235 | + |
| 236 | +```python |
| 237 | +def test_something_in_ai_agent(): |
| 238 | + path = os.path.join(PROJECT_ROOT, |
| 239 | + "programs", "ai_agent.py") |
| 240 | + if not os.path.isfile(path): |
| 241 | + print(" SKIP (ai_agent.py not found)") |
| 242 | + return |
| 243 | + with open(path) as f: |
| 244 | + source = f.read() |
| 245 | + assert_in("expected_string", source) |
| 246 | +``` |
| 247 | + |
| 248 | +Files in `agent/` and `knowledge/` are always |
| 249 | +present — no guard needed. |
| 250 | + |
| 251 | +### Source-scanning search windows |
| 252 | + |
| 253 | +Tests that scan source code for specific patterns |
| 254 | +must account for function length: |
| 255 | + |
| 256 | +```python |
| 257 | +# WRONG — function may be longer than 5000 chars |
| 258 | +idx = src.find("def my_function") |
| 259 | +assert "expected" in src[idx:idx + 5000] |
| 260 | + |
| 261 | +# RIGHT — search to the next function definition |
| 262 | +idx = src.find("def my_function") |
| 263 | +next_def = src.find("\ndef ", idx + 1) |
| 264 | +assert "expected" in src[idx:next_def] |
| 265 | +``` |
| 266 | + |
| 267 | +### Don't swallow test failures |
| 268 | + |
| 269 | +`run_tests_with_fail_fast()` raises `AssertionError` |
| 270 | +on failure. If your `run_all_tests()` wraps it in a |
| 271 | +try/except, failures are silently swallowed: |
| 272 | + |
| 273 | +```python |
| 274 | +# WRONG — test failures discarded |
| 275 | +def run_all_tests(): |
| 276 | + try: |
| 277 | + run_tests_with_fail_fast() |
| 278 | + except Exception: |
| 279 | + pass |
| 280 | + |
| 281 | +# RIGHT — let failures propagate |
| 282 | +def run_all_tests(): |
| 283 | + run_tests_with_fail_fast() |
| 284 | +``` |
| 285 | + |
| 286 | +--- |
| 287 | + |
| 288 | +## 10. Checklist (Agent-Specific Additions) |
| 289 | + |
| 290 | +In addition to the CCTBX checklist: |
| 291 | + |
| 292 | +- [ ] New strategy flags added to `programs.yaml` |
| 293 | + before use in command-building code |
| 294 | +- [ ] New error patterns checked against all three |
| 295 | + classification systems |
| 296 | +- [ ] New `analysis_mode` values added to |
| 297 | + `_LLM_ONLY_MODES` if they don't need the RAG DB |
| 298 | +- [ ] Client vs server code path identified for any |
| 299 | + changes to `programs/ai_agent.py` |
| 300 | +- [ ] Session state fields appear in `session.data`, |
| 301 | + `create_initial_state()`, and `contract.py` |
| 302 | +- [ ] File-exists guards for tests opening files |
| 303 | + outside `agent/` or `knowledge/` |
| 304 | +- [ ] Source-scanning tests use function-boundary |
| 305 | + windows, not fixed character counts |
| 306 | +- [ ] CHANGELOG entry follows the standard table |
| 307 | + format |
| 308 | +- [ ] Documentation updated (OVERVIEW.md, |
| 309 | + ARCHITECTURE.md, DEVELOPER_GUIDE.md) for |
| 310 | + user-visible or architectural changes |
0 commit comments