docs/ROADMAP.md
16 additions & 13 deletions
@@ -1,20 +1,20 @@
 # Roadmap: api.faculytics

-This roadmap tracks the implementation status of the `api.faculytics` backend against the product direction. It reflects the checked-in `develop` branch as of 2026-03-31.
+This roadmap tracks the implementation status of the `api.faculytics` backend against the product direction. It reflects the checked-in `develop` branch as of 2026-04-12.

 ## Project Vision

 Provide a robust, analytics-driven bridge between Moodle learning environments and institutional assessment workflows, enabling data-informed decisions through synchronized academic data, asynchronous AI enrichment, and structured feedback collection from direct submissions and file-based ingestion.
| Phase 1. Foundation & Core Synchronization | Complete | Core auth, Moodle sync, hydration, scheduling, and resilience are in place. |
+| Phase 2. Questionnaire & Ingestion Engine | Mostly complete | Questionnaire management, draft/submit flows, and CSV ingestion are live; self-serve file mapping is still pending. |
+| Phase 3. AI & Inference Pipeline | Mostly complete | End-to-end pipeline is shipped; production worker rollout and operator monitoring remain open. |
+| Phase 4. Analytics & Reporting Infrastructure | In progress | Materialized-view analytics, faculty reports, and PDF export are live; Excel export and long-term analytics scaling remain open. |
+| Phase 5. Governance & Ecosystem | In progress | Scoped access, admin tooling, audit logging, and audit query endpoints are implemented; finer-grained permissions and ecosystem integrations remain open. |

 Cross-cutting platform capabilities already present in the codebase but not treated as separate roadmap phases include Redis-backed caching and throttling, structured health checks, request-scoped CLS metadata, and the authenticated `ChatKit` endpoint.
@@ -28,7 +28,8 @@ Cross-cutting platform capabilities already present in the codebase but not trea
 - [x] **Institutional Hierarchy Sync:** Campuses, semesters, departments, programs, courses, and enrollments are mirrored from Moodle.
 - [x] **Per-User Hydration on Login:** Moodle logins refresh the user's courses, enrollments, sections, and institutional roles.
 - [x] **Section Sync from Moodle Groups:** Course groups are materialized locally as `Section` and attached to enrollments.
-- [x] **Institutional Authority Mapping:** Dean and chairperson scope is derived from Moodle category structure, with support for manual dean assignment.
+- [x] **Institutional Authority Mapping:** Dean and chairperson scope is derived from Moodle category structure, with support for manual dean assignment and a server-resolved dean-eligibility lookup for the admin UI.
+- [x] **User Scope & Role Backfill During Sync:** Enrollment sync now populates `user.campus/program/department` and derives `user.roles` from enrollments + institutional roles (protecting manually granted `SUPER_ADMIN`/`ADMIN`) as explicit post-enrollment phases.
 - [x] **Dynamic Sync Scheduling & SyncLog Observability:** Sync cadence is runtime-configurable and every run is recorded with per-phase metrics.
 - [x] **Semester Label Enrichment:** Moodle semester codes are parsed into display labels and academic year metadata.
 - [x] **Moodle Connectivity Resilience (FAC-33):** 10-second request timeouts and connectivity-specific failures prevent hanging auth and sync paths.
@@ -38,7 +39,7 @@ Cross-cutting platform capabilities already present in the codebase but not trea
 - [x] **Dimension Registry & Admin API:** Canonical dimensions are seeded and can be managed through the dimensions module.
-- [x] **Questionnaire Lifecycle Management:** Questionnaires support creation, update, archive, publish, deprecate, and version detail flows.
+- [x] **Questionnaire Lifecycle Management:** Questionnaires support creation, update, archive, publish, deprecate, version detail, and version-from-template flows (draft seeded from any prior published version).
 - [x] **Institutional Snapshotting:** Submissions persist faculty, department, program, campus, and semester snapshots for historical stability.
 - [x] **Draft Save/Resume Flow:** Respondents can save, retrieve, list, and delete drafts before final submission.
@@ -56,7 +57,8 @@ Cross-cutting platform capabilities already present in the codebase but not trea
 - [x] **Topic Modeling (FAC-46):** Topic discovery persists assignments, keywords, and run provenance.
 - [x] **Topic Labeling:** Topic clusters are labeled before recommendation generation.
 - [x] **Embedding Generation (FAC-46):** pgvector-backed embeddings are stored and upserted per submission.
-- [x] **Recommendations Engine v2 (FAC-55):** Recommendations are generated directly via OpenAI with structured output, confidence, and supporting evidence.
+- [x] **Recommendations Engine v2 (FAC-55):** Recommendations are generated directly via OpenAI with structured output, confidence, and pipeline-scoped supporting evidence (topic counts narrowed to the pipeline's `submissionIds`, preventing cross-faculty leakage).
+- [x] **LLM Worker Hardening:** Sentiment processor pins responses to the dispatched `submissionId` set, drops hallucinated IDs with observability logs, and terminally fails the stage when a batch is 100% hallucinated (retrying the LLM is counter-productive).
 - [x] **Worker Contracts & Inference Versioning:** Zod-validated contracts and version fields exist across pipeline runs.
 - [x] **Local Worker Simulation:** `mock-worker/` supports local development without deployed inference workers.
 - [ ] **RunPod / Production Worker Rollout:** Sentiment and topic-model stages still need production endpoint deployment and cutover.
@@ -81,9 +83,10 @@ Cross-cutting platform capabilities already present in the codebase but not trea
 - [x] **Scoped Dean/Chairperson Access:** `ScopeResolverService` restricts analytics, curriculum, and faculty queries to authorized departments and programs.
 - [x] **Institutional Role Administration:** Super admins can assign and remove manual dean/chairperson roles through admin endpoints.
 - [x] **Admin Directory APIs:** Super-admin endpoints support user listing, filtering, and institutional role management workflows.
-- [x] **Append-Only Audit Trail:** Auth, sync, questionnaire, and analysis actions are captured through the global audit pipeline.
+- [x] **Append-Only Audit Trail:** Auth, sync, questionnaire, analysis, and Moodle provisioning actions are captured through the global audit pipeline.
+- [x] **Audit Review Surface:** `GET /audit-logs` and `GET /audit-logs/:id` expose filterable, paginated audit queries (super-admin only) with stable ordering and LIKE-pattern sanitization.
+- [x] **Moodle Seeding Toolkit:** API-native provisioning of categories, bulk/quick courses, and fake users replaces the external Rust CLI, with live Moodle tree inspection and cascading admin filters.
 - [ ] **Fine-Grained Permission Model:** Access control is still role-centric rather than permission-centric.
-- [ ] **Audit Review Surface:** The write path exists, but there are no audit query/reporting endpoints for operators yet.
 - [ ] **Notification Engine:** Automated reminders and outbound notifications are still pending.
 - [ ] **External SIS Integration:** Moodle remains the only production integration surface for institutional data.
docs/architecture/ai-inference-pipeline.md
15 additions & 0 deletions
@@ -119,6 +119,19 @@ The recommendations stage does **not** use the batch message contract — see [R

 See `docs/worker-contracts/` for full per-worker contracts.

+### Dispatch-Set Pinning (LLM Workers)
+
+Zod validates the **shape** of a worker response but cannot validate that the `submissionId` keys actually correspond to rows the API dispatched. For LLM-backed workers this matters: under some prompts the model hallucinates UUIDs that don't exist in the dispatched batch, and persisting them causes PostgreSQL FK violations that abort the whole batch transaction, losing even the valid results.
+
+`SentimentProcessor.Persist()` pins the response against a dispatch set:
+
+1. Build `dispatchedIds = new Set(job.data.items.map(i => i.submissionId))` before any DB work.
+2. Drop every result whose `submissionId` is not in `dispatchedIds`. Log `warn "Dropped X of Y sentiment results for run {runId} (unknown submissionIds)"` whenever the drop count is non-zero.
+3. If **all** results are dropped, call `orchestrator.OnStageFailed(pipelineId, 'sentiment_analysis', ...)` and return. Retry is not useful: more LLM calls will produce more hallucinations.
+4. The pre-existing `sentimentResultItemSchema.safeParse` loop still runs on the filtered set as a second validation layer.
+
+Treat any new LLM-backed processor under `BaseAnalysisProcessor` as needing the same pattern. See [Decision #41: LLM-Backed Worker Dispatch-Set Pinning](../decisions/decisions.md#41-llm-backed-worker-dispatch-set-pinning).
+
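The four pinning steps above can be sketched as a pure function. This is a minimal illustration and not the code in `SentimentProcessor.Persist()`: the `SentimentResult` shape, the `PinOutcome` type, and the name `pinToDispatchSet` are all hypothetical, and logging and the `OnStageFailed` call are left to the caller.

```ts
// Hypothetical result shape; the real one is defined by sentimentResultItemSchema.
interface SentimentResult {
  submissionId: string;
  label: string;
}

// Hypothetical summary returned to the caller.
interface PinOutcome {
  kept: SentimentResult[];
  droppedCount: number;
  // True when the worker returned results and none matched the dispatch set.
  allDropped: boolean;
}

function pinToDispatchSet(
  dispatchedSubmissionIds: readonly string[],
  results: readonly SentimentResult[],
): PinOutcome {
  // Step 1: build the dispatch set before touching the database.
  const dispatchedIds = new Set(dispatchedSubmissionIds);
  // Step 2: keep only results whose submissionId was actually dispatched.
  const kept = results.filter((r) => dispatchedIds.has(r.submissionId));
  const droppedCount = results.length - kept.length;
  // Step 3: signal the terminal-failure case to the caller.
  return {
    kept,
    droppedCount,
    allDropped: results.length > 0 && kept.length === 0,
  };
}
```

A caller would presumably log the warning when `droppedCount` is non-zero, fail the stage when `allDropped` is true, and only then run the schema validation loop over `kept`.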
 ## 4. Sentiment Gate

 Between sentiment analysis and topic modeling, a **sentiment gate** filters the corpus:
@@ -292,6 +305,8 @@ Each `RecommendedAction` stores a `supportingEvidence` JSONB column with:
 - **Confidence level:** HIGH / MEDIUM / LOW
 - **basedOnSubmissions:** Total comment count in scope

+> **Pipeline-scoped counts.** `TopicSource.commentCount` is derived from `TopicAssignment` rows filtered by **both** `topic.id IN (...)` and `submission.id IN (pipelineSubmissionIds)`, **not** from the `Topic.docCount` column. `Topic` is a shared entity: multiple pipelines across different faculty can produce assignments against the same topic, and `docCount` is a global counter over all of them. Scoping by `submissionIds` prevents cross-faculty evidence leakage and makes `confidenceLevel` reflect the current pipeline's evidence rather than the topic's global activity. Any future consumer of topic-derived evidence must apply the same scoping.
+
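To make the double filter concrete, here is an in-memory sketch of the counting rule. It assumes a flattened, hypothetical `TopicAssignmentRow` shape and a hypothetical `countCommentsByTopic` helper; the real implementation queries `TopicAssignment` in the database.

```ts
// Hypothetical flattened view of a TopicAssignment row.
interface TopicAssignmentRow {
  topicId: string;
  submissionId: string;
}

function countCommentsByTopic(
  assignments: readonly TopicAssignmentRow[],
  topicIds: readonly string[],
  pipelineSubmissionIds: readonly string[],
): Map<string, number> {
  const topics = new Set(topicIds);
  const submissions = new Set(pipelineSubmissionIds);
  const counts = new Map<string, number>();
  for (const row of assignments) {
    // Both conditions must hold: the topic is requested AND the submission
    // belongs to this pipeline. Dropping the second check would reproduce
    // the global docCount behaviour.
    if (!topics.has(row.topicId) || !submissions.has(row.submissionId)) continue;
    counts.set(row.topicId, (counts.get(row.topicId) ?? 0) + 1);
  }
  return counts;
}
```

With assignments from two pipelines against the same topic, only the rows whose `submissionId` is in the current pipeline contribute to that topic's count.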
 ### Output Schema

 Actions follow the `RecommendedActionItem` schema:
| GET |`/analytics/overview`|`semesterId` (required), `programCode`| Department overview with per-faculty stats |
-| GET | `/analytics/attention` | `semesterId` (required) | Faculty flagged for review with attention flags |
+| GET | `/analytics/attention` | `semesterId` (required), `programCode` | Faculty flagged for review with attention flags |
 | GET | `/analytics/trends` | `semesterId`, `minSemesters`, `minR2` | Faculty trend data with linear regression results |

+`programCode` on both `overview` and `attention` is trimmed, must be non-empty when provided, and is capped at 20 characters.
+
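The `programCode` rules can be sketched as a small normalizer. This is an assumption-laden illustration: the function name is hypothetical, and it assumes that an over-long value is rejected instead of truncated and that an omitted parameter is allowed, neither of which the diff states explicitly.

```ts
const MAX_PROGRAM_CODE_LENGTH = 20;

// Hypothetical helper; the real validation presumably lives in a query DTO.
function normalizeProgramCode(raw: string | undefined): string | undefined {
  // Assumption: programCode is optional, so an omitted value passes through.
  if (raw === undefined) return undefined;
  const trimmed = raw.trim();
  if (trimmed.length === 0) {
    throw new Error("programCode must not be empty");
  }
  // Assumption: values over the cap are rejected, not silently cut.
  if (trimmed.length > MAX_PROGRAM_CODE_LENGTH) {
    throw new Error(`programCode must be at most ${MAX_PROGRAM_CODE_LENGTH} characters`);
  }
  return trimmed;
}
```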
 ### Department Overview (`/analytics/overview`)

 Returns per-faculty stats for a semester with computed fields:
@@ -116,3 +118,15 @@ Falls back to the latest semester for scope resolution when `semesterId` is omit
 ## Scope Resolution

 Unlike `FacultyModule` and `CurriculumModule` which resolve to department UUIDs, the `AnalyticsService` resolves to **department codes** (via `ResolveDepartmentCodes()`). This is because the materialized views use `department_code_snapshot` (a string snapshot from submission time) rather than foreign key references to the live department table.
+
+### Program-Level Scope Check
+
+When callers pass `programCode` on `overview` or `attention`, the service validates it against `ScopeResolverService.ResolveProgramCodes(semesterId)`:
+
+- `null` (super admin / dean): any `programCode` accepted.
+- `string[]` (chairperson): `programCode` must be in the list.
+- Out-of-scope requests **do not 403**. They short-circuit and return a well-formed empty payload with `lastRefreshedAt` populated.
+
+The silent short-circuit avoids leaking existence information (a 403 tells the caller "that program exists but you can't see it"; an empty result does not). Chairpersons already cannot enumerate programs outside their scope via `/curriculum/programs`; that endpoint applies the same `ResolveProgramIds` filter.
+
+`GetAttentionList` adds `AND program_code_snapshot = ?` to the `mv_faculty_semester_stats` source of the consistency-gap and skipped-signals subqueries. The trend-based signal joins `mv_faculty_trends` against `mv_faculty_semester_stats` on `(faculty_id, department_code_snapshot)` so trend rows can be filtered by the per-semester program snapshot, since trend rows are not scoped to a single program by themselves.
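The scope check and silent short-circuit described above might look roughly like the following. The `ProgramScope` and `ScopedResult` types and both function names are hypothetical, and the payload is reduced to a list plus `lastRefreshedAt` for brevity.

```ts
// null = unrestricted (super admin / dean); string[] = chairperson's program codes.
type ProgramScope = readonly string[] | null;

// Hypothetical, simplified payload shape.
interface ScopedResult<T> {
  items: T[];
  lastRefreshedAt: string | null;
}

function isProgramAllowed(scope: ProgramScope, programCode?: string): boolean {
  if (programCode === undefined || scope === null) return true;
  return scope.includes(programCode);
}

function loadWithinScope<T>(
  scope: ProgramScope,
  programCode: string | undefined,
  lastRefreshedAt: string | null,
  load: () => T[],
): ScopedResult<T> {
  // Out-of-scope requests get a well-formed empty payload, not an error,
  // and the underlying query is never executed.
  if (!isProgramAllowed(scope, programCode)) {
    return { items: [], lastRefreshedAt };
  }
  return { items: load(), lastRefreshedAt };
}
```

Because the out-of-scope branch returns the same shape as a legitimately empty result, a caller cannot distinguish "no data" from "not allowed", which is the property the paragraph above relies on.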
The `moodle.provision.*` actions are emitted by the Moodle seeding toolkit — see [Moodle Provisioning](../moodle/provisioning.md).
+
 ## Interceptor Path Detail

 Endpoints are tagged with the `@Audited({ action, resource? })` decorator, which sets Reflector metadata. The `AuditInterceptor` reads this metadata and, on successful response (RxJS `tap`, not `finalize`), enqueues an audit event.
@@ -115,3 +122,55 @@ Audit failures never break the request:
 1. `AuditService.Emit()` wraps `queue.add()` in try/catch: it logs a warning and returns void.
 2. `AuditInterceptor` wraps the entire `tap` callback in try/catch: errors are logged, never propagated.
 3. The `.catch()` on the `Emit()` promise handles async rejections.
+
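The layered failure handling listed above can be compressed into one framework-free sketch. It is not the actual `AuditService` or `AuditInterceptor` code: `AuditQueue`, `emitSafely`, the `"audit"` job name, and the `warn` callback are hypothetical stand-ins, and the three layers are folded into a single function.

```ts
// Hypothetical minimal queue interface standing in for the real job queue.
interface AuditQueue {
  add(name: string, payload: unknown): Promise<unknown>;
}

function emitSafely(
  queue: AuditQueue,
  payload: unknown,
  warn: (message: string) => void,
): void {
  try {
    // Async rejections are handled on the promise itself.
    queue.add("audit", payload).catch((err: unknown) => {
      warn(`audit enqueue rejected: ${String(err)}`);
    });
  } catch (err) {
    // Synchronous throws are caught here and downgraded to a warning.
    warn(`audit enqueue threw: ${String(err)}`);
  }
}
```

Either failure mode ends in a logged warning and a normal return, so the request that triggered the audit event is unaffected.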
+## Query API
+
+`AuditController` exposes read-only query endpoints for operators. All routes require `SUPER_ADMIN`; any other role receives `403 Forbidden`.
+| `from` / `to` | Inclusive range on `occurredAt` | ISO 8601 date strings |
+| `search` | OR `$ilike` across `actorUsername` / `action` / `resourceType` | Same escape rules |
+| `page` / `limit` | Inherited from `PaginationQueryDto` | Defaults `page=1`, `limit=10`; `limit` max `100` |
+
+Explicit filters are combined with AND; `search` is always wrapped in its own `$or` so operators can express "admin login in January" by combining `search=login` with `from/to`.
+
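As an illustration of how the AND and `$or` parts compose, the sketch below builds a Mongo-style condition object from `from`, `to`, and `search` only. The `AuditQuery` and `AuditWhere` types and `buildWhere` are hypothetical, it assumes top-level keys are ANDed by the ORM, and LIKE escaping is omitted here for brevity.

```ts
// Hypothetical subset of the query DTO.
interface AuditQuery {
  from?: string; // ISO 8601
  to?: string; // ISO 8601
  search?: string;
}

// Hypothetical condition shape; top-level keys are assumed to be ANDed.
interface AuditWhere {
  occurredAt?: { $gte?: Date; $lte?: Date };
  $or?: Array<Record<string, { $ilike: string }>>;
}

function buildWhere(query: AuditQuery): AuditWhere {
  const where: AuditWhere = {};
  if (query.from !== undefined || query.to !== undefined) {
    // Inclusive range on occurredAt.
    where.occurredAt = {};
    if (query.from !== undefined) where.occurredAt.$gte = new Date(query.from);
    if (query.to !== undefined) where.occurredAt.$lte = new Date(query.to);
  }
  if (query.search !== undefined && query.search.trim().length > 0) {
    // search lives in its own $or group so it narrows, never widens,
    // whatever the explicit filters already selected.
    const pattern = `%${query.search.trim()}%`; // escaping omitted in this sketch
    where.$or = [
      { actorUsername: { $ilike: pattern } },
      { action: { $ilike: pattern } },
      { resourceType: { $ilike: pattern } },
    ];
  }
  return where;
}
```

For the "login in January" case, `buildWhere({ search: "login", from: "2026-01-01", to: "2026-01-31" })` yields one `occurredAt` range and one three-way `$or`, both of which must match.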
+### Ordering & Pagination
+
+Results are ordered `occurredAt DESC, id DESC`. The secondary sort on `id` is load-bearing: audit writes can land less than a millisecond apart, so ordering by `occurredAt` alone could tie and yield non-deterministic paging for bursty activity (logins, sync kickoff).
+
+`findAndCount` is issued with `filters: { softDelete: false }` as a belt-and-suspenders measure, since the entity does not extend `CustomBaseEntity` and cannot be soft-deleted today.
+
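The role of the `id` tie-breaker can be shown with a plain comparator. `AuditRow` and `compareAuditRows` are hypothetical names, and timestamps are reduced to epoch milliseconds so that ties are easy to construct.

```ts
// Hypothetical reduced row: occurredAt as epoch milliseconds.
interface AuditRow {
  id: string;
  occurredAt: number;
}

// Mirrors ORDER BY occurredAt DESC, id DESC.
function compareAuditRows(a: AuditRow, b: AuditRow): number {
  if (a.occurredAt !== b.occurredAt) return b.occurredAt - a.occurredAt;
  if (a.id === b.id) return 0;
  return a.id < b.id ? 1 : -1;
}
```

Sorting the same rows from two different starting orders gives the same sequence, which is what keeps page boundaries stable when several rows share a timestamp.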
+### Response Shapes
+
+```ts
+// GET /audit-logs
+{
+  data: AuditLogItemResponseDto[],
+  meta: {
+    totalItems: number,
+    itemCount: number,
+    itemsPerPage: number,
+    totalPages: number,
+    currentPage: number,
+  },
+}
+```
+
+`AuditLogItemResponseDto` and `AuditLogDetailResponseDto` currently share the same shape (`id`, `action`, `actorId?`, `actorUsername?`, `resourceType?`, `resourceId?`, `metadata?`, `browserName?`, `os?`, `ipAddress?`, `occurredAt`). They are kept as separate DTOs on purpose: the list view may later strip heavy fields (`metadata`, `ipAddress`) for bandwidth/privacy without breaking the single-record contract.
+
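One plausible way to derive the `meta` block from a `findAndCount`-style result is sketched below. `buildMeta` is a hypothetical helper, `totalPages` is assumed to be the ceiling of `totalItems / limit`, and enforcement of the `limit` maximum of 100 is left out.

```ts
interface PaginationMeta {
  totalItems: number;
  itemCount: number;
  itemsPerPage: number;
  totalPages: number;
  currentPage: number;
}

// Defaults follow the documented page=1, limit=10.
function buildMeta(
  totalItems: number,
  itemCount: number,
  page: number = 1,
  limit: number = 10,
): PaginationMeta {
  return {
    totalItems,
    itemCount,
    itemsPerPage: limit,
    // Assumption: a partially filled last page still counts as a page.
    totalPages: Math.ceil(totalItems / limit),
    currentPage: page,
  };
}
```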
+### LIKE-Pattern Escaping
+
+User-supplied strings are trimmed and sanitized before being wrapped in `%…%`. `%`, `_`, and `\` are replaced with their backslash-escaped variants so that a username containing `%` cannot silently widen the match to every row.
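A minimal sketch of the escaping rule follows. The helper names are hypothetical, and it assumes backslash is the active LIKE escape character on the database side.

```ts
// Escape the three characters that are special inside a LIKE pattern.
function escapeLikePattern(input: string): string {
  // A replacer function is used so the replacement is taken literally.
  return input.trim().replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Wrap the sanitized value for a "contains" match.
function toContainsPattern(input: string): string {
  return `%${escapeLikePattern(input)}%`;
}
```

With this in place, a search for `50%` becomes the pattern `%50\%%`, which matches the literal percent sign instead of acting as a wildcard.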