
Commit 8f33425

chore:update docs (#295)
1 parent ececae7 commit 8f33425

12 files changed

Lines changed: 437 additions & 24 deletions

docs/ROADMAP.md

Lines changed: 16 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -1,20 +1,20 @@
11
# Roadmap: api.faculytics
22

3-
This roadmap tracks the implementation status of the `api.faculytics` backend against the product direction. It reflects the checked-in `develop` branch as of 2026-03-31.
3+
This roadmap tracks the implementation status of the `api.faculytics` backend against the product direction. It reflects the checked-in `develop` branch as of 2026-04-12.
44

55
## Project Vision
66

77
Provide a robust, analytics-driven bridge between Moodle learning environments and institutional assessment workflows, enabling data-informed decisions through synchronized academic data, asynchronous AI enrichment, and structured feedback collection from direct submissions and file-based ingestion.
88

99
## Status Snapshot
1010

11-
| Phase | Status | Notes |
12-
| --------------------------------------------- | --------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
13-
| Phase 1. Foundation & Core Synchronization | Complete | Core auth, Moodle sync, hydration, scheduling, and resilience are in place. |
14-
| Phase 2. Questionnaire & Ingestion Engine | Mostly complete | Questionnaire management, draft/submit flows, and CSV ingestion are live; self-serve file mapping is still pending. |
15-
| Phase 3. AI & Inference Pipeline | Mostly complete | End-to-end pipeline is shipped; production worker rollout and operator monitoring remain open. |
16-
| Phase 4. Analytics & Reporting Infrastructure | In progress | Materialized-view analytics, faculty reports, and PDF export are live; Excel export and long-term analytics scaling remain open. |
17-
| Phase 5. Governance & Ecosystem | In progress | Scoped access, admin tooling, and audit logging are implemented; finer-grained permissions and ecosystem integrations remain open. |
11+
| Phase | Status | Notes |
12+
| --------------------------------------------- | --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
13+
| Phase 1. Foundation & Core Synchronization | Complete | Core auth, Moodle sync, hydration, scheduling, and resilience are in place. |
14+
| Phase 2. Questionnaire & Ingestion Engine | Mostly complete | Questionnaire management, draft/submit flows, and CSV ingestion are live; self-serve file mapping is still pending. |
15+
| Phase 3. AI & Inference Pipeline | Mostly complete | End-to-end pipeline is shipped; production worker rollout and operator monitoring remain open. |
16+
| Phase 4. Analytics & Reporting Infrastructure | In progress | Materialized-view analytics, faculty reports, and PDF export are live; Excel export and long-term analytics scaling remain open. |
17+
| Phase 5. Governance & Ecosystem | In progress | Scoped access, admin tooling, audit logging, and audit query endpoints are implemented; finer-grained permissions and ecosystem integrations remain open. |
1818

1919
Cross-cutting platform capabilities already present in the codebase but not treated as separate roadmap phases include Redis-backed caching and throttling, structured health checks, request-scoped CLS metadata, and the authenticated `ChatKit` endpoint.
2020

@@ -28,7 +28,8 @@ Cross-cutting platform capabilities already present in the codebase but not trea
2828
- [x] **Institutional Hierarchy Sync:** Campuses, semesters, departments, programs, courses, and enrollments are mirrored from Moodle.
2929
- [x] **Per-User Hydration on Login:** Moodle logins refresh the user's courses, enrollments, sections, and institutional roles.
3030
- [x] **Section Sync from Moodle Groups:** Course groups are materialized locally as `Section` and attached to enrollments.
31-
- [x] **Institutional Authority Mapping:** Dean and chairperson scope is derived from Moodle category structure, with support for manual dean assignment.
31+
- [x] **Institutional Authority Mapping:** Dean and chairperson scope is derived from Moodle category structure, with support for manual dean assignment and a server-resolved dean-eligibility lookup for the admin UI.
32+
- [x] **User Scope & Role Backfill During Sync:** Enrollment sync now populates `user.campus/program/department` and derives `user.roles` from enrollments + institutional roles (protecting manually granted `SUPER_ADMIN`/`ADMIN`) as explicit post-enrollment phases.
3233
- [x] **Dynamic Sync Scheduling & SyncLog Observability:** Sync cadence is runtime-configurable and every run is recorded with per-phase metrics.
3334
- [x] **Semester Label Enrichment:** Moodle semester codes are parsed into display labels and academic year metadata.
3435
- [x] **Moodle Connectivity Resilience (FAC-33):** 10-second request timeouts and connectivity-specific failures prevent hanging auth and sync paths.
@@ -38,7 +39,7 @@ Cross-cutting platform capabilities already present in the codebase but not trea
3839

3940
- [x] **Recursive Schema Validation:** Questionnaire schemas enforce leaf-only questions and exact weight totals.
4041
- [x] **Dimension Registry & Admin API:** Canonical dimensions are seeded and can be managed through the dimensions module.
41-
- [x] **Questionnaire Lifecycle Management:** Questionnaires support creation, update, archive, publish, deprecate, and version detail flows.
42+
- [x] **Questionnaire Lifecycle Management:** Questionnaires support creation, update, archive, publish, deprecate, version detail, and version-from-template flows (draft seeded from any prior published version).
4243
- [x] **Institutional Snapshotting:** Submissions persist faculty, department, program, campus, and semester snapshots for historical stability.
4344
- [x] **Draft Save/Resume Flow:** Respondents can save, retrieve, list, and delete drafts before final submission.
4445
- [x] **Submission & Scoring:** Finalized submissions validate answers, compute normalized scores, and enforce duplicate-submission rules.
@@ -56,7 +57,8 @@ Cross-cutting platform capabilities already present in the codebase but not trea
5657
- [x] **Topic Modeling (FAC-46):** Topic discovery persists assignments, keywords, and run provenance.
5758
- [x] **Topic Labeling:** Topic clusters are labeled before recommendation generation.
5859
- [x] **Embedding Generation (FAC-46):** pgvector-backed embeddings are stored and upserted per submission.
59-
- [x] **Recommendations Engine v2 (FAC-55):** Recommendations are generated directly via OpenAI with structured output, confidence, and supporting evidence.
60+
- [x] **Recommendations Engine v2 (FAC-55):** Recommendations are generated directly via OpenAI with structured output, confidence, and pipeline-scoped supporting evidence (topic counts narrowed to the pipeline's `submissionIds`, preventing cross-faculty leakage).
61+
- [x] **LLM Worker Hardening:** Sentiment processor pins responses to the dispatched `submissionId` set, drops hallucinated IDs with observability logs, and terminally fails the stage when a batch is 100% hallucinated (retrying the LLM is counter-productive).
6062
- [x] **Worker Contracts & Inference Versioning:** Zod-validated contracts and version fields exist across pipeline runs.
6163
- [x] **Local Worker Simulation:** `mock-worker/` supports local development without deployed inference workers.
6264
- [ ] **RunPod / Production Worker Rollout:** Sentiment and topic-model stages still need production endpoint deployment and cutover.
@@ -81,9 +83,10 @@ Cross-cutting platform capabilities already present in the codebase but not trea
8183
- [x] **Scoped Dean/Chairperson Access:** `ScopeResolverService` restricts analytics, curriculum, and faculty queries to authorized departments and programs.
8284
- [x] **Institutional Role Administration:** Super admins can assign and remove manual dean/chairperson roles through admin endpoints.
8385
- [x] **Admin Directory APIs:** Super-admin endpoints support user listing, filtering, and institutional role management workflows.
84-
- [x] **Append-Only Audit Trail:** Auth, sync, questionnaire, and analysis actions are captured through the global audit pipeline.
86+
- [x] **Append-Only Audit Trail:** Auth, sync, questionnaire, analysis, and Moodle provisioning actions are captured through the global audit pipeline.
87+
- [x] **Audit Review Surface:** `GET /audit-logs` and `GET /audit-logs/:id` expose filterable, paginated audit queries (super-admin only) with stable ordering and LIKE-pattern sanitization.
88+
- [x] **Moodle Seeding Toolkit:** API-native provisioning of categories, bulk/quick courses, and fake users replaces the external Rust CLI, with live Moodle tree inspection and cascading admin filters.
8589
- [ ] **Fine-Grained Permission Model:** Access control is still role-centric rather than permission-centric.
86-
- [ ] **Audit Review Surface:** The write path exists, but there are no audit query/reporting endpoints for operators yet.
8790
- [ ] **Notification Engine:** Automated reminders and outbound notifications are still pending.
8891
- [ ] **External SIS Integration:** Moodle remains the only production integration surface for institutional data.
8992

docs/architecture/ai-inference-pipeline.md

Lines changed: 15 additions & 0 deletions
@@ -119,6 +119,19 @@ The recommendations stage does **not** use the batch message contract — see [R
119119

120120
See `docs/worker-contracts/` for full per-worker contracts.
121121

122+
### Dispatch-Set Pinning (LLM Workers)
123+
124+
Zod validates the **shape** of a worker response but cannot validate that the `submissionId` keys actually correspond to rows the API dispatched. For LLM-backed workers this matters: under some prompts the model hallucinates UUIDs that don't exist in the dispatched batch, and persisting them causes PostgreSQL FK violations that abort the whole batch transaction — losing even the valid results.
125+
126+
`SentimentProcessor.Persist()` pins the response against a dispatch set:
127+
128+
1. Build `dispatchedIds = new Set(job.data.items.map(i => i.submissionId))` before any DB work.
129+
2. Drop every result whose `submissionId` is not in `dispatchedIds`. Log `warn "Dropped X of Y sentiment results for run {runId} (unknown submissionIds)"` whenever the drop count is non-zero.
130+
3. If **all** results are dropped, call `orchestrator.OnStageFailed(pipelineId, 'sentiment_analysis', ...)` and return. Retry is not useful — more LLM calls will produce more hallucinations.
131+
4. The pre-existing `sentimentResultItemSchema.safeParse` loop still runs on the filtered set as a second validation layer.
132+
133+
Treat any new LLM-backed processor under `BaseAnalysisProcessor` as needing the same pattern. See [Decision #41 — LLM-Backed Worker Dispatch-Set Pinning](../decisions/decisions.md#41-llm-backed-worker-dispatch-set-pinning).
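The pinning steps above can be sketched roughly as follows. This is a minimal illustration, not the actual `SentimentProcessor` code: `DispatchedItem`, `SentimentResult`, `PinOutcome`, and `pinToDispatchSet` are hypothetical names, and treating an empty response as "not all dropped" is an assumption.

```typescript
// Hypothetical shapes; the real job and result types live in the codebase.
interface DispatchedItem {
  submissionId: string;
}
interface SentimentResult {
  submissionId: string;
  label: string;
}
interface PinOutcome {
  kept: SentimentResult[];
  dropped: number;
  allDropped: boolean;
}

// Keep only results whose submissionId was part of the dispatched batch.
function pinToDispatchSet(
  items: DispatchedItem[],
  results: SentimentResult[],
): PinOutcome {
  // Step 1: build the dispatch set before touching the database.
  const dispatchedIds = new Set(items.map((i) => i.submissionId));
  // Step 2: drop anything the API never dispatched.
  const kept = results.filter((r) => dispatchedIds.has(r.submissionId));
  const dropped = results.length - kept.length;
  // Step 3: a fully hallucinated batch is a terminal failure for the caller.
  // Assumption: an empty response is handled elsewhere, so it is not flagged here.
  const allDropped = results.length > 0 && kept.length === 0;
  return { kept, dropped, allDropped };
}

const pinned = pinToDispatchSet(
  [{ submissionId: 'a' }, { submissionId: 'b' }],
  [
    { submissionId: 'a', label: 'positive' },
    { submissionId: 'zzz', label: 'negative' }, // not dispatched
  ],
);
if (pinned.dropped > 0) {
  console.warn(
    `Dropped ${pinned.dropped} of 2 sentiment results (unknown submissionIds)`,
  );
}
```

In this sketch the caller would fail the stage when `allDropped` is true and otherwise pass `kept` on to the existing schema validation loop.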
134+
122135
## 4. Sentiment Gate
123136

124137
Between sentiment analysis and topic modeling, a **sentiment gate** filters the corpus:
@@ -292,6 +305,8 @@ Each `RecommendedAction` stores a `supportingEvidence` JSONB column with:
292305
- **Confidence level:** HIGH / MEDIUM / LOW
293306
- **basedOnSubmissions:** Total comment count in scope
294307

308+
> **Pipeline-scoped counts.** `TopicSource.commentCount` is derived from `TopicAssignment` rows filtered by **both** `topic.id IN (...)` and `submission.id IN (pipelineSubmissionIds)`, **not** from the `Topic.docCount` column. `Topic` is a shared entity: multiple pipelines across different faculty can produce assignments against the same topic, and `docCount` is a global counter over all of them. Scoping by `submissionIds` prevents cross-faculty evidence leakage and makes `confidenceLevel` reflect the current pipeline's evidence rather than the topic's global activity. Any future consumer of topic-derived evidence must apply the same scoping.
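As an in-memory illustration of the scoping rule, the sketch below counts assignments per topic only when the submission belongs to the current pipeline. The real implementation is a database query; `TopicAssignmentRow` and `countCommentsPerTopic` are hypothetical names used only for this example.

```typescript
// Hypothetical row shape standing in for a TopicAssignment record.
interface TopicAssignmentRow {
  topicId: string;
  submissionId: string;
}

// Count assignments per topic, restricted to the pipeline's submissions.
function countCommentsPerTopic(
  assignments: TopicAssignmentRow[],
  topicIds: string[],
  pipelineSubmissionIds: string[],
): Map<string, number> {
  const topics = new Set(topicIds);
  const submissions = new Set(pipelineSubmissionIds);
  const counts = new Map<string, number>(
    topicIds.map((id): [string, number] => [id, 0]),
  );
  for (const row of assignments) {
    // Both filters must hold: topic in scope AND submission in this pipeline.
    if (topics.has(row.topicId) && submissions.has(row.submissionId)) {
      counts.set(row.topicId, (counts.get(row.topicId) ?? 0) + 1);
    }
  }
  return counts;
}

const scopedCounts = countCommentsPerTopic(
  [
    { topicId: 't1', submissionId: 's1' },
    { topicId: 't1', submissionId: 's2' },
    { topicId: 't1', submissionId: 's9' }, // another faculty's pipeline
    { topicId: 't2', submissionId: 's1' },
  ],
  ['t1', 't2'],
  ['s1', 's2'],
);
console.log(scopedCounts.get('t1')); // 2, although t1 has 3 assignments globally
console.log(scopedCounts.get('t2')); // 1
```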
309+
295310
### Output Schema
296311

297312
Actions follow the `RecommendedActionItem` schema:

docs/architecture/analytics.md

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -81,9 +81,11 @@ All endpoints require `DEAN` or `SUPER_ADMIN` role. Dean scope is enforced via `
8181
| Method | Path | Query Params | Description |
8282
| ------ | ---------------------- | -------------------------------------- | ------------------------------------------------- |
8383
| GET | `/analytics/overview` | `semesterId` (required), `programCode` | Department overview with per-faculty stats |
84-
| GET | `/analytics/attention` | `semesterId` (required) | Faculty flagged for review with attention flags |
84+
| GET | `/analytics/attention` | `semesterId` (required), `programCode` | Faculty flagged for review with attention flags |
8585
| GET | `/analytics/trends` | `semesterId`, `minSemesters`, `minR2` | Faculty trend data with linear regression results |
8686

87+
When supplied, `programCode` on both `overview` and `attention` is trimmed, must be non-empty, and is capped at 20 characters.
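A minimal sketch of those rules as a plain function. The actual DTO validation is not shown in this diff, so `normalizeProgramCode` and `ProgramCodeResult` are hypothetical, and two points are assumptions: the parameter may be omitted entirely, and trimming happens before the length check.

```typescript
const PROGRAM_CODE_MAX_LENGTH = 20;

type ProgramCodeResult =
  | { ok: true; value: string | undefined }
  | { ok: false; error: string };

// Trim, reject empty strings, and cap the length at 20 characters.
function normalizeProgramCode(raw: string | undefined): ProgramCodeResult {
  // Assumption: an omitted programCode means "no program filter".
  if (raw === undefined) return { ok: true, value: undefined };
  const trimmed = raw.trim();
  if (trimmed.length === 0) {
    return { ok: false, error: 'programCode must not be empty' };
  }
  if (trimmed.length > PROGRAM_CODE_MAX_LENGTH) {
    return {
      ok: false,
      error: `programCode must be at most ${PROGRAM_CODE_MAX_LENGTH} characters`,
    };
  }
  return { ok: true, value: trimmed };
}

console.log(normalizeProgramCode('  BSCS  ')); // { ok: true, value: 'BSCS' }
console.log(normalizeProgramCode('   ').ok); // false
```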
88+
8789
### Department Overview (`/analytics/overview`)
8890

8991
Returns per-faculty stats for a semester with computed fields:
@@ -116,3 +118,15 @@ Falls back to the latest semester for scope resolution when `semesterId` is omit
116118
## Scope Resolution
117119

118120
Unlike `FacultyModule` and `CurriculumModule` which resolve to department UUIDs, the `AnalyticsService` resolves to **department codes** (via `ResolveDepartmentCodes()`). This is because the materialized views use `department_code_snapshot` (a string snapshot from submission time) rather than foreign key references to the live department table.
121+
122+
### Program-Level Scope Check
123+
124+
When callers pass `programCode` on `overview` or `attention`, the service validates it against `ScopeResolverService.ResolveProgramCodes(semesterId)`:
125+
126+
- `null` (super admin / dean) — any `programCode` accepted.
127+
- `string[]` (chairperson) — `programCode` must be in the list.
128+
- Out-of-scope requests **do not 403**. They short-circuit and return a well-formed empty payload with `lastRefreshedAt` populated.
129+
130+
The silent short-circuit avoids leaking existence information (a 403 tells the caller "that program exists but you can't see it"; an empty result does not). Chairpersons already cannot enumerate programs outside their scope via `/curriculum/programs` — that endpoint applies the same `ResolveProgramIds` filter.
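The scope rules above could look roughly like this. It is a sketch under stated assumptions: `OverviewPayload` with a `faculty` array is a hypothetical payload shape, and `isProgramInScope` and `overviewForProgram` are illustrative names rather than the service's real methods.

```typescript
// Hypothetical payload shape; only lastRefreshedAt is named in the text above.
interface OverviewPayload {
  faculty: unknown[];
  lastRefreshedAt: string;
}

// null means unrestricted (super admin / dean); an array lists allowed codes.
function isProgramInScope(
  allowed: string[] | null,
  programCode: string,
): boolean {
  return allowed === null || allowed.includes(programCode);
}

function overviewForProgram(
  allowed: string[] | null,
  programCode: string,
  lastRefreshedAt: string,
  load: () => unknown[],
): OverviewPayload {
  if (!isProgramInScope(allowed, programCode)) {
    // Out of scope: no 403, just a well-formed empty payload.
    return { faculty: [], lastRefreshedAt };
  }
  return { faculty: load(), lastRefreshedAt };
}

const refreshedAt = '2026-04-12T00:00:00Z';
const outOfScope = overviewForProgram(['BSCS'], 'BSIT', refreshedAt, () => [
  { facultyId: 1 },
]);
console.log(outOfScope.faculty.length); // 0
```

Skipping the loader entirely in the out-of-scope branch also means no query runs for programs the caller cannot see.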
131+
132+
`GetAttentionList` adds `AND program_code_snapshot = ?` to the `mv_faculty_semester_stats` source of the consistency-gap and skipped-signals subqueries. The trend-based signal joins `mv_faculty_trends` against `mv_faculty_semester_stats` on `(faculty_id, department_code_snapshot)` so trend rows can be filtered by the per-semester program snapshot — trend rows are not scoped to a single program by themselves.

docs/architecture/audit-trail.md

Lines changed: 60 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -49,7 +49,7 @@ Append-only, immutable. Does **not** extend `CustomBaseEntity` (no `updatedAt`,
4949

5050
Queries must use `filters: { softDelete: false }` to bypass the global soft-delete filter.
5151

52-
## MVP Actions
52+
## Action Codes
5353

5454
```typescript
5555
export const AuditAction = {
@@ -65,9 +65,16 @@ export const AuditAction = {
6565
ANALYSIS_PIPELINE_CREATE: 'analysis.pipeline.create',
6666
ANALYSIS_PIPELINE_CONFIRM: 'analysis.pipeline.confirm',
6767
ANALYSIS_PIPELINE_CANCEL: 'analysis.pipeline.cancel',
68+
MOODLE_PROVISION_CATEGORIES: 'moodle.provision.categories',
69+
MOODLE_PROVISION_COURSES: 'moodle.provision.courses',
70+
MOODLE_PROVISION_QUICK_COURSE: 'moodle.provision.quick-course',
71+
MOODLE_PROVISION_USERS: 'moodle.provision.users',
72+
MOODLE_BULK_PROVISION_COURSES: 'moodle.provision.bulk-courses',
6873
} as const;
6974
```
7075

76+
The `moodle.provision.*` actions are emitted by the Moodle seeding toolkit — see [Moodle Provisioning](../moodle/provisioning.md).
77+
7178
## Interceptor Path Detail
7279

7380
Endpoints are tagged with the `@Audited({ action, resource? })` decorator, which sets Reflector metadata. The `AuditInterceptor` reads this metadata and, on successful response (RxJS `tap`, not `finalize`), enqueues an audit event.
@@ -115,3 +122,55 @@ Audit failures never break the request:
115122
1. `AuditService.Emit()` wraps `queue.add()` in try/catch — logs a warning, returns void.
116123
2. `AuditInterceptor` wraps the entire `tap` callback in try/catch — errors are logged, never propagated.
117124
3. The `.catch()` on the `Emit()` promise handles async rejections.
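The first layer can be sketched as follows. This is illustrative only: `AuditQueue` is a hypothetical stand-in for the real queue client, and `emitAudit` with an injected `warn` callback is an assumption made so the example stays self-contained.

```typescript
// Hypothetical minimal queue interface; the real queue client is not shown here.
interface AuditQueue {
  add(name: string, payload: unknown): Promise<void>;
}

// Enqueue an audit event; on failure, log a warning and resolve anyway.
async function emitAudit(
  queue: AuditQueue,
  event: { action: string },
  warn: (message: string) => void,
): Promise<void> {
  try {
    await queue.add('audit', event);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    warn(`audit enqueue failed for ${event.action}: ${reason}`);
  }
}

// A queue that always fails, to show the request path is unaffected.
const brokenQueue: AuditQueue = {
  add() {
    throw new Error('queue unavailable');
  },
};
const auditWarnings: string[] = [];
void emitAudit(brokenQueue, { action: 'auth.login.success' }, (m) =>
  auditWarnings.push(m),
);
```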
125+
126+
## Query API
127+
128+
`AuditController` exposes read-only query endpoints for operators. All routes require `SUPER_ADMIN` — any other role receives `403 Forbidden`.
129+
130+
| Method | Path | Description |
131+
| ------ | ----------------- | ------------------------------------------------ |
132+
| GET | `/audit-logs` | Paginated, filterable list of audit records |
133+
| GET | `/audit-logs/:id` | Fetch a single record by UUID (`404` if missing) |
134+
135+
### List Filters (`ListAuditLogsQueryDto`)
136+
137+
| Field | Match type | Notes |
138+
| ---------------- | -------------------------------------------------------------- | -------------------------------------------------- |
139+
| `action` | Exact | e.g., `auth.login.success` |
140+
| `actorId` | Exact (UUID) | |
141+
| `actorUsername` | Case-insensitive partial (`$ilike %value%`) | Trimmed; `%`, `_`, `\` are escaped before matching |
142+
| `resourceType` | Exact | e.g., `User`, `AnalysisPipeline` |
143+
| `resourceId` | Exact | |
144+
| `from` / `to` | Inclusive range on `occurredAt` | ISO 8601 date strings |
145+
| `search` | OR `$ilike` across `actorUsername` / `action` / `resourceType` | Same escape rules |
146+
| `page` / `limit` | Inherited from `PaginationQueryDto` | Defaults `page=1`, `limit=10`; `limit` max `100` |
147+
148+
Explicit filters are combined with AND; `search` is always wrapped in its own `$or` so operators can express "admin login in January" by combining `search=login` with `from/to`.
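One way to picture the combination is a filter object in which explicit fields sit side by side (implicitly ANDed) and `search` contributes a single `$or` group. The sketch below builds such an object by hand; `AuditWhere` and `buildAuditWhere` are hypothetical names, the `$gte`/`$lte` operators for the inclusive range are an assumption, and the search term is assumed to be trimmed and LIKE-escaped already.

```typescript
interface ListAuditLogsQuery {
  action?: string;
  actorId?: string;
  resourceType?: string;
  from?: string; // ISO 8601
  to?: string; // ISO 8601
  search?: string; // assumed already trimmed and LIKE-escaped
}

interface AuditWhere {
  action?: string;
  actorId?: string;
  resourceType?: string;
  occurredAt?: { $gte?: Date; $lte?: Date };
  $or?: Array<Record<string, { $ilike: string }>>;
}

function buildAuditWhere(q: ListAuditLogsQuery): AuditWhere {
  const where: AuditWhere = {};
  // Explicit filters: every key present must match (AND).
  if (q.action) where.action = q.action;
  if (q.actorId) where.actorId = q.actorId;
  if (q.resourceType) where.resourceType = q.resourceType;
  if (q.from || q.to) {
    where.occurredAt = {};
    if (q.from) where.occurredAt.$gte = new Date(q.from);
    if (q.to) where.occurredAt.$lte = new Date(q.to);
  }
  // Free-text search: one OR group, itself ANDed with everything above.
  if (q.search) {
    const pattern = `%${q.search}%`;
    where.$or = [
      { actorUsername: { $ilike: pattern } },
      { action: { $ilike: pattern } },
      { resourceType: { $ilike: pattern } },
    ];
  }
  return where;
}

// "login activity in January": search plus a date range.
const januaryLogins = buildAuditWhere({
  search: 'login',
  from: '2026-01-01T00:00:00Z',
  to: '2026-01-31T23:59:59Z',
});
console.log(JSON.stringify(januaryLogins, null, 2));
```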
149+
150+
### Ordering & Pagination
151+
152+
Results are ordered `occurredAt DESC, id DESC`. The secondary sort on `id` is load-bearing: bursts of audit writes (logins, sync kickoff) can land within the same millisecond, so ordering by `occurredAt` alone would yield non-deterministic paging whenever timestamps tie.
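The effect of the tiebreaker can be shown with an in-memory comparator. This is only an analogy for the SQL `ORDER BY`: `AuditRow`, `compareAuditRows`, and `pageOf` are hypothetical, and JavaScript string comparison of IDs is not guaranteed to match the database's UUID ordering.

```typescript
interface AuditRow {
  id: string;
  occurredAt: Date;
}

// occurredAt DESC, then id DESC as a deterministic tiebreaker.
function compareAuditRows(a: AuditRow, b: AuditRow): number {
  const byTime = b.occurredAt.getTime() - a.occurredAt.getTime();
  if (byTime !== 0) return byTime;
  if (a.id === b.id) return 0;
  return a.id < b.id ? 1 : -1;
}

function pageOf<T>(rows: T[], page: number, limit: number): T[] {
  return rows.slice((page - 1) * limit, page * limit);
}

// Three rows written in the same millisecond, arriving in two different orders.
const sameInstant = new Date('2026-04-12T08:00:00.000Z');
const arrivalA: AuditRow[] = ['b', 'a', 'c'].map((id) => ({ id, occurredAt: sameInstant }));
const arrivalB: AuditRow[] = ['c', 'b', 'a'].map((id) => ({ id, occurredAt: sameInstant }));

const firstPageA = pageOf([...arrivalA].sort(compareAuditRows), 1, 2).map((r) => r.id);
const firstPageB = pageOf([...arrivalB].sort(compareAuditRows), 1, 2).map((r) => r.id);
console.log(firstPageA, firstPageB); // both [ 'c', 'b' ]
```

Without the `id` comparison, both inputs would compare as equal on every pair, and the page contents would depend on whatever order the rows happened to arrive in.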
153+
154+
`findAndCount` is issued with `filters: { softDelete: false }` — belt-and-suspenders, since the entity does not extend `CustomBaseEntity` and cannot be soft-deleted today.
155+
156+
### Response Shapes
157+
158+
```ts
159+
// GET /audit-logs
160+
{
161+
data: AuditLogItemResponseDto[],
162+
meta: {
163+
totalItems: number,
164+
itemCount: number,
165+
itemsPerPage: number,
166+
totalPages: number,
167+
currentPage: number,
168+
},
169+
}
170+
```
171+
172+
`AuditLogItemResponseDto` and `AuditLogDetailResponseDto` currently share the same shape (`id`, `action`, `actorId?`, `actorUsername?`, `resourceType?`, `resourceId?`, `metadata?`, `browserName?`, `os?`, `ipAddress?`, `occurredAt`). They are kept as separate DTOs on purpose: the list view may later strip heavy fields (`metadata`, `ipAddress`) for bandwidth/privacy without breaking the single-record contract.
173+
174+
### LIKE-Pattern Escaping
175+
176+
User-supplied strings are trimmed and sanitized before being wrapped in `%…%`. `%`, `_`, and `\` are replaced with their backslash-escaped variants so that a username containing `%` cannot silently widen the match to every row.
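A small sketch of that sanitization step. The function names `escapeLikePattern` and `toContainsPattern` are hypothetical; the codebase's own helper is not shown in this diff.

```typescript
// Trim, then backslash-escape the LIKE metacharacters %, _ and \.
function escapeLikePattern(input: string): string {
  return input.trim().replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Wrap the sanitized value for a "contains" match.
function toContainsPattern(input: string): string {
  return `%${escapeLikePattern(input)}%`;
}

console.log(toContainsPattern(' 50%_off ')); // %50\%\_off%
console.log(toContainsPattern('%')); // %\%%  (matches a literal percent sign only)
```

The backslash itself has to be in the escape set; otherwise an input ending in `\` could neutralize the escape that follows it.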
