
Commit a010f4e

y4nder, claude, and ayacoders authored
Release April 14, 2026 -- FAC-122 to FAC-133 (#338)
* FAC-122 fix: harden sentiment pipeline against hallucinated submission IDs (#293) (#294)

  Adds a defensive ID filter in SentimentProcessor.Persist that validates worker results against the dispatched submission ID set before any DB work, preventing FK violations caused by LLM UUID drift.

* chore: update docs (#295)

* [STAGING] FAC-122.1 to FAC-123.2 fix schema drift (#311)

* FAC-122.1 chore: add @Index expression metadata for soft-delete partial indexes (#308)

* FAC-122.2 fix: explicitly type CustomBaseEntity.deletedAt to fix varchar(255) reflection (#309)

  TypeScript's emitDecoratorMetadata can't reflect optional Date types without initializers, causing MikroORM to fall back to varchar(255). This adds explicit type: 'datetime' to the deletedAt decorator and includes a migration to convert 29 affected tables from varchar(255) to timestamptz.

  Closes #306

* FAC-122.3 fix: resolve MikroORM migration drift for audit_log, recommended_action, and playing_with_neon (#310)

  - Add defaultRaw: 'now()' and length: 6 to AuditLog.occurredAt entity
  - Add Opt type annotation and app-level default to match codebase pattern
  - Update snapshot to align audit_log.occurred_at.default with entity
  - Fix Migration20260412153923 to drop/recreate matviews before altering deleted_at columns
  - Clean Migration20260412161915 to remove redundant deleted_at statements
  - Add SET DEFAULT now() to audit_log.occurred_at in migration UP

  Verified: npx mikro-orm migration:check exits 0, all 885 tests pass

* [STAGING] FAC-123 to FAC-130 (#324)

* FAC-123 feat: add user.home_department_id column and entity field (#312)

  Add nullable FK column for stable institutional home department, separate from sync-derived department field. Includes B-tree index for future dean authorization scoping queries.

* FAC-124 refactor: stop deriving user scope fields from enrollment counts (#313)

  Remove backfillUserScopes and deriveUserScopes methods that derived user.department_id, user.program_id, and user.campus_id based on "primary program wins" logic. These fields represented teaching load rather than institutional belonging.

  - Delete backfillUserScopes from EnrollmentSyncService (Phase 4)
  - Delete deriveUserScopes from MoodleUserHydrationService
  - Remove unused Campus/Program imports
  - Update Phase 5 to Phase 4 in code comments
  - Update documentation with historical context notes

  Existing scope field values remain frozen as fallback seed data for the FAC-125 home_department_id backfill.

* FAC-125 feat: source tracking + enrollment-based scope derivation (#314)

  Drops home_department_id (FAC-123) and adds department_source/program_source columns. Restores enrollment-based derivation in both cron and login paths through a shared deriveUserScopes() helper, with an atomic source guard so manual overrides survive sync, an equality guard so no-op runs do not bump updatedAt, and an env-stable moodleCategoryId tiebreaker.

  Adds a fill-if-null campus backfill in the cron path that mirrors UserRepository.UpsertFromMoodle's username-prefix lookup, so cron-discovered users get a campus before they ever log in without overwriting manual reassignments.

* FAC-126 fix: enforce role-vs-type and scope on questionnaire submissions (#317)

  Adds an authorization gate inside QuestionnaireService.submitQuestionnaire that rejects role/type mismatches and dean/chairperson out-of-scope faculty selections. Ingestion-engine and admin-generate bypass the gate via a new skipAuthorization options-bag flag with a logger.warn audit trail.

  Two pre-existing identity-related holes uncovered during review (body-trust respondentId and super-admin spoof bypass) are tracked separately as #315 and #316.

* FAC-127 feat: admin UI for manual faculty scope override (#318)

  Adds PATCH /admin/users/:id/scope-assignment for super admins to manually override user.department / user.program, flipping the matching source column to 'manual' so the next Moodle sync won't clobber the correction. Explicit null resets a field to auto-derived.

  Wires CurrentUserInterceptor + CommonModule/DataLoaderModule into AdminModule so audit log rows capture actorId, and introduces the first inline AuditService.Emit() usage with a pinned changedFields contract (SCOPE_FIELD_NAMES) for future inline emits to follow.

* FAC-128 feat: snapshot faculty home department on submissions (#320)

  Adds nullable faculty_department_id FK + code/name snapshot columns to questionnaire_submission so analytics can attribute a submission to the faculty member's home department rather than the course-owner department. Populated at submission-creation time from faculty.department; emits a plain-string logger.warn with locked grep-key [submission.faculty_department_missing] (before em.persist) when null, so the signal survives flush-time exceptions.

  Strictly additive — the existing department assignment, unique constraint, and historical rows are untouched. FAC-130 will migrate analytics reads to COALESCE(faculty_department_id, department_id).

  Also removes the stale publish-contract.yml workflow.

* FAC-129 refactor: filter dean faculty listing by home department (#322)

  Rewire `GET /faculty` primary query to filter by `user.department_id` instead of deriving faculty scope from enrollment→course→program→department joins. Home-dept faculty with zero scope-visible teaching now appear on the dean's roster; faculty teaching outside their home dept no longer leak into other deans' lists.

  Preserve the legacy enrollment-join semantics as a secondary endpoint `GET /faculty/cross-department-teaching`, narrowed to true cross-dept faculty only (home dept ≠ course-owning dept, home dept not NULL and not soft-deleted).

* FAC-130 refactor: aggregate analytics materialized view by faculty home department (#323)

* FAC-130 refactor: aggregate analytics materialized view by faculty home department

  Recreate mv_faculty_semester_stats to group by COALESCE(faculty_department_code_snapshot, department_code_snapshot) so dean dashboards aggregate by the faculty's institutional home department instead of the course-owner department. Column names are preserved, so analytics.service.ts, DTOs, and the frontend contract stay untouched. Historical submissions predating FAC-128 fall back to the course-owner code via COALESCE. mv_faculty_trends is recreated verbatim and inherits the new semantics transitively.

* FAC-130 fix: hide _atLeastOneField synthetic prop from Swagger schema

  The synthetic _atLeastOneField placeholder (carrier for class-validator's class-level AtLeastOneField constraint) was being auto-reflected by the @nestjs/swagger CLI plugin. Its 'never' type caused SchemaObjectFactory to recurse and trigger a circular-dependency error every time /swagger was accessed. @ApiHideProperty() tells the plugin to skip it; runtime validation behavior is unchanged.

* FAC-130.1 fix: status rate limiting (#326) (#327)

* FAC-130.2 refactor coverage stats handling for pipeline status queries (#328) (#330)

  Coverage stats (submissionCount, totalEnrolled, commentCount, responseRate) were computed once in CreatePipeline and cached on the AnalysisPipeline entity. GetPipelineStatus read them from the entity and never refreshed, so pipelines created early in data collection reported stale numbers forever — appearing as a hard "limit" to users when more submissions arrived afterwards.

  Now, while a pipeline is still in AWAITING_CONFIRMATION, GetPipelineStatus recomputes coverage live and persists the fresh values (including warnings) back to the entity so what the user sees matches what will be locked in at confirmation time. After confirmation, the stored snapshot is preserved untouched — it represents the corpus that was actually analyzed and must not drift.

  Refactors:
  - Extract BuildScopeFromPipeline helper (entity -> ScopeFilter)
  - Extract BuildCoverageWarnings helper (reused by CreatePipeline and GetPipelineStatus)

  Tests: 2 new cases cover the fresh-recompute path and the snapshot-preserved path for confirmed pipelines.

  https://claude.ai/code/session_01AsGM2DbyriRMyLWHr7Hwdw

  Co-authored-by: Claude <noreply@anthropic.com>

* [STAGING] FAC-131 feat: add campus head role + local user provisioning (#331) (#332)

  - Add CAMPUS_HEAD to UserRole enum and ScopeResolverService (Semester → Campus traversal)
  - Add POST /admin/users for non-Moodle local user provisioning (bcrypt + reserved "local-" prefix)
  - Add GET /admin/institutional-roles/campus-head-eligible-categories for depth-1 promotion
  - Add User.campus_source column + migration, mirror departmentSource/programSource pattern
  - Enforce local- namespace across Moodle inflows (sync skip guard, seed-users DTO rejection)
  - Extend controller guards (analytics/faculty/reports/curriculum) to allow CAMPUS_HEAD
  - Deny questionnaire submissions from CAMPUS_HEAD at service layer with clear message
  - Emit admin.user.create audit event manually from AdminUserService

* [STAGING] FAC-133 feat: add faculty enrollments by id endpoint (#335)

* [STAGING] FAC-132 feat: role-aware analysis pipeline triggering and output surfacing (#336) (#337)

  Backend half of the FAC-132 integration slice — wires role + scope guards onto the analysis pipeline endpoints and exposes the list/discovery surface the frontend needs.

  - AnalysisController: @UseJwtGuard(DEAN, CHAIRPERSON, CAMPUS_HEAD, SUPER_ADMIN) at the class level with method-level widening to include FACULTY for GET reads; new GET /analysis/pipelines list endpoint; ParseUUIDPipe on :id params.
  - PipelineOrchestratorService: scope-authorization helpers (assertCanCreatePipeline, fillAndAssertListScope, assertCanAccessPipeline) gate Create/Confirm/Cancel/GetPipelineStatus/GetRecommendations; scoped roles (DEAN/CHAIRPERSON/CAMPUS_HEAD) tried before FACULTY/STUDENT so multi-role Moodle users (DEAN+FACULTY) aren't falsely rejected; auto-fills the scope axis when the caller has exactly one assigned scope, else 400.
  - TD-8: partial unique index uq_analysis_pipeline_active_scope enforces one active pipeline per (semester, scope) tuple at the DB with a text-literal 'NONE' sentinel for nullable FKs. CreatePipeline wraps flush in a try/catch for UniqueConstraintViolationException and re-fetches the winner (idempotent race recovery). existingFilter now binds every non-provided scope field to null, matching the index.
  - TD-9: pipeline-status response 'scope' replaced with paired IDs + display values so the frontend can use IDs for cache keys and display values for UI. Create/Confirm/Cancel now also return PipelineSummary shape.
  - ScopeResolverService: add public ResolveCampusIds(semesterId) helper scoped to campuses hosting the given semester.
  - DI: AnalysisModule registers User (for RolesGuard.UserRepository) and imports DataLoaderModule (for CurrentUserInterceptor.UserLoader).
  - Tests: 62/62 passing — scope authorization matrix across all roles, 404-precedes-403 (AC-17), unique-index race handling, DEAN+FACULTY multi-role precedence, list-endpoint delegation.
  - Docs: Access Control section in analysis-pipeline.md and pipeline-scope addendum in scope-resolution.md.

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Aya <ayacoders@gmail.com>
1 parent 240b8a0 commit a010f4e
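
The FAC-125 and FAC-127 notes in the commit message describe a source guard (manual overrides survive sync) and an equality guard (no-op runs do not bump `updatedAt`). A minimal in-memory sketch of that idea follows. The `UserScope` shape, the `'auto'` source value, and the `applyDerivedDepartment` name are assumptions made for illustration; only the `'manual'` value and the two guards come from the commit message, and the actual change presumably performs the guard atomically at the database level.

```typescript
// Hypothetical shapes for illustration; not the project's real entity.
type ScopeSource = 'auto' | 'manual';

interface UserScope {
  departmentId: string | null;
  departmentSource: ScopeSource;
  updatedAt: Date;
}

// Applies a sync-derived department to a user.
// Returns true only when the row was actually changed.
function applyDerivedDepartment(
  user: UserScope,
  derivedDepartmentId: string | null,
  now: Date,
): boolean {
  // Source guard: a manual override must survive the sync.
  if (user.departmentSource === 'manual') return false;
  // Equality guard: a no-op run must not bump updatedAt.
  if (user.departmentId === derivedDepartmentId) return false;
  user.departmentId = derivedDepartmentId;
  user.updatedAt = now;
  return true;
}

const seededAt = new Date('2026-04-01T00:00:00Z');
const syncAt = new Date('2026-04-12T00:00:00Z');

const overridden: UserScope = { departmentId: 'dept-a', departmentSource: 'manual', updatedAt: seededAt };
const derived: UserScope = { departmentId: 'dept-a', departmentSource: 'auto', updatedAt: seededAt };

const manualChanged = applyDerivedDepartment(overridden, 'dept-b', syncAt); // blocked by the source guard
const noopChanged = applyDerivedDepartment(derived, 'dept-a', syncAt); // blocked by the equality guard
const realChanged = applyDerivedDepartment(derived, 'dept-b', syncAt); // applied
```

Returning a boolean lets a caller count how many rows a sync run really touched, which is one way to keep per-phase metrics honest.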

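The TD-8 note in the commit message describes idempotent race recovery: the unique index rejects a second active pipeline for the same scope, and the loser of the race re-fetches the winner instead of failing. The sketch below imitates that with an in-memory store. `InMemoryPipelineStore`, `UniqueViolationError`, `scopeKey`, and `createPipelineIdempotent` are hypothetical stand-ins, not the project's code; the real implementation relies on a PostgreSQL partial unique index and MikroORM's `UniqueConstraintViolationException`.

```typescript
interface Pipeline {
  id: string;
  semesterId: string;
  departmentId: string | null;
}

// Stand-in for the ORM's unique-constraint exception.
class UniqueViolationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'UniqueViolationError';
  }
}

// A 'NONE' sentinel makes a null scope value take part in the unique key,
// mirroring the text-literal sentinel mentioned for nullable FKs.
function scopeKey(semesterId: string, departmentId: string | null): string {
  return `${semesterId}|${departmentId ?? 'NONE'}`;
}

class InMemoryPipelineStore {
  private readonly byScope = new Map<string, Pipeline>();

  find(semesterId: string, departmentId: string | null): Pipeline | undefined {
    return this.byScope.get(scopeKey(semesterId, departmentId));
  }

  insert(pipeline: Pipeline): void {
    const key = scopeKey(pipeline.semesterId, pipeline.departmentId);
    if (this.byScope.has(key)) {
      throw new UniqueViolationError(`duplicate active scope ${key}`);
    }
    this.byScope.set(key, pipeline);
  }
}

// Tries to insert; on a unique violation, returns the pipeline that won the race.
function createPipelineIdempotent(store: InMemoryPipelineStore, candidate: Pipeline): Pipeline {
  try {
    store.insert(candidate);
    return candidate;
  } catch (err) {
    const isUniqueViolation = err instanceof Error && err.name === 'UniqueViolationError';
    if (!isUniqueViolation) throw err;
    const winner = store.find(candidate.semesterId, candidate.departmentId);
    if (!winner) throw err; // the winner vanished in the meantime; surface the original error
    return winner;
  }
}

const store = new InMemoryPipelineStore();
const first = createPipelineIdempotent(store, { id: 'p1', semesterId: 'sem-1', departmentId: null });
const second = createPipelineIdempotent(store, { id: 'p2', semesterId: 'sem-1', departmentId: null });
const other = createPipelineIdempotent(store, { id: 'p3', semesterId: 'sem-1', departmentId: 'dept-a' });
```

Both callers for the same scope end up holding the same pipeline, so a double-click or two concurrent requests resolve to one active run rather than an error.
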
92 files changed

Lines changed: 12556 additions & 1112 deletions


.github/workflows/publish-contract.yml

Lines changed: 0 additions & 96 deletions
This file was deleted.

_bmad-output/implementation-artifacts/tech-spec-fac-125-department-source-tracking.md

Lines changed: 570 additions & 0 deletions
Large diffs are not rendered by default.

_bmad-output/implementation-artifacts/tech-spec-fac-126-questionnaire-submission-authorization.md

Lines changed: 438 additions & 0 deletions
Large diffs are not rendered by default.

_bmad-output/implementation-artifacts/tech-spec-fac-127-admin-manual-scope-override.md

Lines changed: 798 additions & 0 deletions
Large diffs are not rendered by default.

_bmad-output/implementation-artifacts/tech-spec-fac-129-faculty-listing-home-department.md

Lines changed: 348 additions & 0 deletions
Large diffs are not rendered by default.

_bmad-output/implementation-artifacts/tech-spec-fac-130-analytics-mv-home-department.md

Lines changed: 429 additions & 0 deletions
Large diffs are not rendered by default.

_bmad-output/implementation-artifacts/tech-spec-fac-131-campus-head-role.md

Lines changed: 1689 additions & 0 deletions
Large diffs are not rendered by default.

_bmad-output/implementation-artifacts/tech-spec-fac-132-analysis-pipeline-interaction.md

Lines changed: 824 additions & 0 deletions
Large diffs are not rendered by default.

docs/ROADMAP.md

Lines changed: 16 additions & 13 deletions
@@ -1,20 +1,20 @@
 # Roadmap: api.faculytics

-This roadmap tracks the implementation status of the `api.faculytics` backend against the product direction. It reflects the checked-in `develop` branch as of 2026-03-31.
+This roadmap tracks the implementation status of the `api.faculytics` backend against the product direction. It reflects the checked-in `develop` branch as of 2026-04-12.

 ## Project Vision

 Provide a robust, analytics-driven bridge between Moodle learning environments and institutional assessment workflows, enabling data-informed decisions through synchronized academic data, asynchronous AI enrichment, and structured feedback collection from direct submissions and file-based ingestion.

 ## Status Snapshot

-| Phase | Status | Notes |
-| --------------------------------------------- | --------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
-| Phase 1. Foundation & Core Synchronization | Complete | Core auth, Moodle sync, hydration, scheduling, and resilience are in place. |
-| Phase 2. Questionnaire & Ingestion Engine | Mostly complete | Questionnaire management, draft/submit flows, and CSV ingestion are live; self-serve file mapping is still pending. |
-| Phase 3. AI & Inference Pipeline | Mostly complete | End-to-end pipeline is shipped; production worker rollout and operator monitoring remain open. |
-| Phase 4. Analytics & Reporting Infrastructure | In progress | Materialized-view analytics, faculty reports, and PDF export are live; Excel export and long-term analytics scaling remain open. |
-| Phase 5. Governance & Ecosystem | In progress | Scoped access, admin tooling, and audit logging are implemented; finer-grained permissions and ecosystem integrations remain open. |
+| Phase | Status | Notes |
+| --------------------------------------------- | --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| Phase 1. Foundation & Core Synchronization | Complete | Core auth, Moodle sync, hydration, scheduling, and resilience are in place. |
+| Phase 2. Questionnaire & Ingestion Engine | Mostly complete | Questionnaire management, draft/submit flows, and CSV ingestion are live; self-serve file mapping is still pending. |
+| Phase 3. AI & Inference Pipeline | Mostly complete | End-to-end pipeline is shipped; production worker rollout and operator monitoring remain open. |
+| Phase 4. Analytics & Reporting Infrastructure | In progress | Materialized-view analytics, faculty reports, and PDF export are live; Excel export and long-term analytics scaling remain open. |
+| Phase 5. Governance & Ecosystem | In progress | Scoped access, admin tooling, audit logging, and audit query endpoints are implemented; finer-grained permissions and ecosystem integrations remain open. |

 Cross-cutting platform capabilities already present in the codebase but not treated as separate roadmap phases include Redis-backed caching and throttling, structured health checks, request-scoped CLS metadata, and the authenticated `ChatKit` endpoint.

@@ -28,7 +28,8 @@ Cross-cutting platform capabilities already present in the codebase but not trea
 - [x] **Institutional Hierarchy Sync:** Campuses, semesters, departments, programs, courses, and enrollments are mirrored from Moodle.
 - [x] **Per-User Hydration on Login:** Moodle logins refresh the user's courses, enrollments, sections, and institutional roles.
 - [x] **Section Sync from Moodle Groups:** Course groups are materialized locally as `Section` and attached to enrollments.
-- [x] **Institutional Authority Mapping:** Dean and chairperson scope is derived from Moodle category structure, with support for manual dean assignment.
+- [x] **Institutional Authority Mapping:** Dean and chairperson scope is derived from Moodle category structure, with support for manual dean assignment and a server-resolved dean-eligibility lookup for the admin UI.
+- [x] **User Scope & Role Backfill During Sync:** Enrollment sync now populates `user.campus/program/department` and derives `user.roles` from enrollments + institutional roles (protecting manually granted `SUPER_ADMIN`/`ADMIN`) as explicit post-enrollment phases.
 - [x] **Dynamic Sync Scheduling & SyncLog Observability:** Sync cadence is runtime-configurable and every run is recorded with per-phase metrics.
 - [x] **Semester Label Enrichment:** Moodle semester codes are parsed into display labels and academic year metadata.
 - [x] **Moodle Connectivity Resilience (FAC-33):** 10-second request timeouts and connectivity-specific failures prevent hanging auth and sync paths.
@@ -38,7 +39,7 @@ Cross-cutting platform capabilities already present in the codebase but not trea

 - [x] **Recursive Schema Validation:** Questionnaire schemas enforce leaf-only questions and exact weight totals.
 - [x] **Dimension Registry & Admin API:** Canonical dimensions are seeded and can be managed through the dimensions module.
-- [x] **Questionnaire Lifecycle Management:** Questionnaires support creation, update, archive, publish, deprecate, and version detail flows.
+- [x] **Questionnaire Lifecycle Management:** Questionnaires support creation, update, archive, publish, deprecate, version detail, and version-from-template flows (draft seeded from any prior published version).
 - [x] **Institutional Snapshotting:** Submissions persist faculty, department, program, campus, and semester snapshots for historical stability.
 - [x] **Draft Save/Resume Flow:** Respondents can save, retrieve, list, and delete drafts before final submission.
 - [x] **Submission & Scoring:** Finalized submissions validate answers, compute normalized scores, and enforce duplicate-submission rules.
@@ -56,7 +57,8 @@ Cross-cutting platform capabilities already present in the codebase but not trea
 - [x] **Topic Modeling (FAC-46):** Topic discovery persists assignments, keywords, and run provenance.
 - [x] **Topic Labeling:** Topic clusters are labeled before recommendation generation.
 - [x] **Embedding Generation (FAC-46):** pgvector-backed embeddings are stored and upserted per submission.
-- [x] **Recommendations Engine v2 (FAC-55):** Recommendations are generated directly via OpenAI with structured output, confidence, and supporting evidence.
+- [x] **Recommendations Engine v2 (FAC-55):** Recommendations are generated directly via OpenAI with structured output, confidence, and pipeline-scoped supporting evidence (topic counts narrowed to the pipeline's `submissionIds`, preventing cross-faculty leakage).
+- [x] **LLM Worker Hardening:** Sentiment processor pins responses to the dispatched `submissionId` set, drops hallucinated IDs with observability logs, and terminally fails the stage when a batch is 100% hallucinated (retrying the LLM is counter-productive).
 - [x] **Worker Contracts & Inference Versioning:** Zod-validated contracts and version fields exist across pipeline runs.
 - [x] **Local Worker Simulation:** `mock-worker/` supports local development without deployed inference workers.
 - [ ] **RunPod / Production Worker Rollout:** Sentiment and topic-model stages still need production endpoint deployment and cutover.
@@ -81,9 +83,10 @@ Cross-cutting platform capabilities already present in the codebase but not trea
 - [x] **Scoped Dean/Chairperson Access:** `ScopeResolverService` restricts analytics, curriculum, and faculty queries to authorized departments and programs.
 - [x] **Institutional Role Administration:** Super admins can assign and remove manual dean/chairperson roles through admin endpoints.
 - [x] **Admin Directory APIs:** Super-admin endpoints support user listing, filtering, and institutional role management workflows.
-- [x] **Append-Only Audit Trail:** Auth, sync, questionnaire, and analysis actions are captured through the global audit pipeline.
+- [x] **Append-Only Audit Trail:** Auth, sync, questionnaire, analysis, and Moodle provisioning actions are captured through the global audit pipeline.
+- [x] **Audit Review Surface:** `GET /audit-logs` and `GET /audit-logs/:id` expose filterable, paginated audit queries (super-admin only) with stable ordering and LIKE-pattern sanitization.
+- [x] **Moodle Seeding Toolkit:** API-native provisioning of categories, bulk/quick courses, and fake users replaces the external Rust CLI, with live Moodle tree inspection and cascading admin filters.
 - [ ] **Fine-Grained Permission Model:** Access control is still role-centric rather than permission-centric.
-- [ ] **Audit Review Surface:** The write path exists, but there are no audit query/reporting endpoints for operators yet.
 - [ ] **Notification Engine:** Automated reminders and outbound notifications are still pending.
 - [ ] **External SIS Integration:** Moodle remains the only production integration surface for institutional data.

docs/architecture/ai-inference-pipeline.md

Lines changed: 15 additions & 0 deletions
@@ -119,6 +119,19 @@ The recommendations stage does **not** use the batch message contract — see [R

 See `docs/worker-contracts/` for full per-worker contracts.

+### Dispatch-Set Pinning (LLM Workers)
+
+Zod validates the **shape** of a worker response but cannot validate that the `submissionId` keys actually correspond to rows the API dispatched. For LLM-backed workers this matters: under some prompts the model hallucinates UUIDs that don't exist in the dispatched batch, and persisting them causes PostgreSQL FK violations that abort the whole batch transaction — losing even the valid results.
+
+`SentimentProcessor.Persist()` pins the response against a dispatch set:
+
+1. Build `dispatchedIds = new Set(job.data.items.map(i => i.submissionId))` before any DB work.
+2. Drop every result whose `submissionId` is not in `dispatchedIds`. Log `warn "Dropped X of Y sentiment results for run {runId} (unknown submissionIds)"` whenever the drop count is non-zero.
+3. If **all** results are dropped, call `orchestrator.OnStageFailed(pipelineId, 'sentiment_analysis', ...)` and return. Retry is not useful — more LLM calls will produce more hallucinations.
+4. The pre-existing `sentimentResultItemSchema.safeParse` loop still runs on the filtered set as a second validation layer.
+
+Treat any new LLM-backed processor under `BaseAnalysisProcessor` as needing the same pattern. See [Decision #41 — LLM-Backed Worker Dispatch-Set Pinning](../decisions/decisions.md#41-llm-backed-worker-dispatch-set-pinning).
+
 ## 4. Sentiment Gate

 Between sentiment analysis and topic modeling, a **sentiment gate** filters the corpus:
@@ -292,6 +305,8 @@ Each `RecommendedAction` stores a `supportingEvidence` JSONB column with:
 - **Confidence level:** HIGH / MEDIUM / LOW
 - **basedOnSubmissions:** Total comment count in scope

+> **Pipeline-scoped counts.** `TopicSource.commentCount` is derived from `TopicAssignment` rows filtered by **both** `topic.id IN (...)` and `submission.id IN (pipelineSubmissionIds)`, **not** from the `Topic.docCount` column. `Topic` is a shared entity: multiple pipelines across different faculty can produce assignments against the same topic, and `docCount` is a global counter over all of them. Scoping by `submissionIds` prevents cross-faculty evidence leakage and makes `confidenceLevel` reflect the current pipeline's evidence rather than the topic's global activity. Any future consumer of topic-derived evidence must apply the same scoping.
+
 ### Output Schema

 Actions follow the `RecommendedActionItem` schema:

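The dispatch-set pinning steps added to `ai-inference-pipeline.md` above can be sketched as a small pure function. This is an illustration only: `SentimentResult`, `PinOutcome`, and `pinToDispatchSet` are hypothetical names, and the real `SentimentProcessor.Persist()` additionally logs with the run ID, notifies the orchestrator on total failure, and runs the Zod `safeParse` pass afterwards.

```typescript
// Hypothetical result shape; the real worker contract has more fields.
interface SentimentResult {
  submissionId: string;
  label: string;
}

interface PinOutcome {
  kept: SentimentResult[];
  dropped: number;
  allDropped: boolean;
}

// Keeps only results whose submissionId was actually dispatched (steps 1 and 2),
// and reports whether the whole batch was unusable (step 3).
function pinToDispatchSet(
  dispatchedSubmissionIds: readonly string[],
  results: readonly SentimentResult[],
): PinOutcome {
  const dispatchedIds = new Set(dispatchedSubmissionIds);
  const kept = results.filter((r) => dispatchedIds.has(r.submissionId));
  const dropped = results.length - kept.length;
  return { kept, dropped, allDropped: results.length > 0 && kept.length === 0 };
}

const workerResults: SentimentResult[] = [
  { submissionId: 'sub-1', label: 'positive' },
  { submissionId: 'sub-999', label: 'negative' }, // not in the dispatched batch
];
const outcome = pinToDispatchSet(['sub-1', 'sub-2'], workerResults);

if (outcome.dropped > 0) {
  console.warn(
    `Dropped ${outcome.dropped} of ${workerResults.length} sentiment results (unknown submissionIds)`,
  );
}
```

Filtering before any database work means one stray ID costs one result rather than the whole batch transaction. In this sketch, a caller would check `allDropped` to decide whether to fail the stage instead of retrying.
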

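The pipeline-scoped counts note in the `ai-inference-pipeline.md` diff says that per-topic comment counts must be filtered by both the topic IDs and the pipeline's submission IDs. The in-memory sketch below shows why that differs from a global counter. `TopicAssignmentRow` and `countCommentsPerTopic` are assumed names for illustration; the real code presumably does this filtering in a database query over `TopicAssignment`.

```typescript
// Hypothetical flattened view of a TopicAssignment row.
interface TopicAssignmentRow {
  topicId: string;
  submissionId: string;
}

// Counts assignments per topic, restricted to the given topics AND the
// submissions that belong to the current pipeline.
function countCommentsPerTopic(
  assignments: readonly TopicAssignmentRow[],
  topicIds: readonly string[],
  pipelineSubmissionIds: readonly string[],
): Map<string, number> {
  const topics = new Set(topicIds);
  const submissions = new Set(pipelineSubmissionIds);
  const counts = new Map<string, number>();
  for (const row of assignments) {
    if (!topics.has(row.topicId) || !submissions.has(row.submissionId)) continue;
    counts.set(row.topicId, (counts.get(row.topicId) ?? 0) + 1);
  }
  return counts;
}

const assignments: TopicAssignmentRow[] = [
  { topicId: 't1', submissionId: 's1' }, // this pipeline
  { topicId: 't1', submissionId: 's2' }, // this pipeline
  { topicId: 't1', submissionId: 's9' }, // another faculty's pipeline, same shared topic
];

// Scoped count for t1 is 2, while a global counter over all rows would report 3.
const scoped = countCommentsPerTopic(assignments, ['t1'], ['s1', 's2']);
```
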