Changes from all commits — 19 commits
7ac6e2f
test(transcription): failing cases for build_transcript_payload
thc1006 Apr 18, 2026
13ba7c2
feat(transcription): add build_transcript_payload helper
thc1006 Apr 18, 2026
b2b675c
test(storage): failing cases for published_transcripts persistence
thc1006 Apr 18, 2026
69c9b9e
feat(storage): persist published-transcript bookkeeping in mongo
thc1006 Apr 18, 2026
26e1aa3
test(prodtrack): failing cases for publish_transcript / update_transc…
thc1006 Apr 18, 2026
c6eeabf
feat(prodtrack): publish_transcript / update_transcript on contract + SG
thc1006 Apr 18, 2026
dc5ee7b
test(api): failing cases for POST /playlists/{id}/publish-transcript
thc1006 Apr 18, 2026
abd142a
feat(api): POST /playlists/{id}/publish-transcript, flag-gated
thc1006 Apr 18, 2026
04a7f27
test(core): failing case for publishTranscript on ApiHandler
thc1006 Apr 18, 2026
6500a41
feat(core): PublishTranscript types + ApiHandler.publishTranscript
thc1006 Apr 18, 2026
6cba7c2
test(hooks): failing cases for usePublishTranscript
thc1006 Apr 18, 2026
737c5ec
test(ui): failing cases for PublishTranscriptDialog
thc1006 Apr 18, 2026
39d8247
feat(app): PublishTranscriptDialog + trigger in TranscriptPanel
thc1006 Apr 18, 2026
44a3ba3
docs(transcript): QUICKSTART, DEPLOYMENT, pipeline + ADR-005/006/007
thc1006 Apr 18, 2026
3407ee2
fix(transcript-publish): review-pass corrections
thc1006 Apr 18, 2026
3708264
fix(transcript-publish): second-pass corrections
thc1006 Apr 18, 2026
4c416da
fix(transcript-publish): third-pass corrections
thc1006 Apr 18, 2026
a366612
fix(transcript-publish): a11y + doc drift
thc1006 Apr 18, 2026
6faac71
style(tests): sort from-main import into third-party block
thc1006 Apr 19, 2026
46 changes: 46 additions & 0 deletions DEPLOYMENT.md
@@ -230,6 +230,52 @@ echo -n "new-value" | gcloud secrets versions add SECRET_NAME --data-file=-

---

## Transcript Publishing Setup (optional, issue #120)

`POST /playlists/{id}/publish-transcript` is feature-flagged off by default.
Turn it on only after the ShotGrid site is prepared.

### ShotGrid site-side checklist

1. In **Site Preferences -> Entities**, enable one of the `CustomEntityNN`
slots and set its display name (e.g. "DNA Note"). Note the slot
number — the API still addresses it as `CustomEntityNN`, not the
display name.
2. On that custom entity, add the following fields:
- `code` (text, built-in)
- `project` (entity link -> Project, built-in)
- `sg_playlist` (entity link -> Playlist)
- `sg_versions` (multi-entity link -> Version)
- `sg_meeting_id` (text)
- `sg_meeting_date` (date)
- `sg_platform` (list: `google_meet`, `teams`)
   - `sg_summary` (text, long; left blank in V1, users fill it in manually)
- `sg_transcript_body` (text, long)
3. Grant the DNA script user read/create/update on the new entity.

### DNA side

Set both variables. The endpoint stays 404 without the flag.

```
DNA_ENABLE_TRANSCRIPT_PUBLISH=true
SHOTGRID_TRANSCRIPT_ENTITY=CustomEntity05 # whichever slot you enabled
```

For the frontend build, also set the Vite flag so the Publish button
renders:

```
VITE_ENABLE_TRANSCRIPT_PUBLISH=true
```

If the flag is off or the custom entity has not been provisioned, the
backend returns 404 on that route; the frontend does not show the
Publish button. Dropping the flag reverts behaviour with no data
migration.
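
The flag and entity-slot variables above can be read with a small sketch like the following. This is a hypothetical helper, not the PR's actual code; the function names are assumptions, but the defaults (`false` flag, `CustomEntity01` slot) match the documented behaviour.

```python
import os


def transcript_publish_enabled() -> bool:
    # The endpoint behaves as absent (404) unless the flag is
    # explicitly the string "true".
    return os.getenv("DNA_ENABLE_TRANSCRIPT_PUBLISH", "false").lower() == "true"


def transcript_entity() -> str:
    # Falls back to the documented default slot when unset.
    return os.getenv("SHOTGRID_TRANSCRIPT_ENTITY", "CustomEntity01")
```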

---

## Authentication Setup

DNA uses Google OAuth for authentication. Users sign in with their Google accounts, and the backend validates Google tokens.
2 changes: 2 additions & 0 deletions QUICKSTART.md
@@ -152,6 +152,8 @@ The React app will be available at `http://localhost:5173`.
| `GEMINI_MODEL` | No | `gemini-2.5-flash` | Gemini model to use when `LLM_PROVIDER=gemini` |
| `GEMINI_TIMEOUT` | No | `30.0` | Request timeout in seconds when `LLM_PROVIDER=gemini` |
| `GEMINI_URL` | No | `https://generativelanguage.googleapis.com/v1beta/openai/` | Override the Gemini OpenAI-compatible base URL |
| `DNA_ENABLE_TRANSCRIPT_PUBLISH` | No | `false` | Set to `true` to enable `POST /playlists/{id}/publish-transcript`. When off, the endpoint returns 404. |
| `SHOTGRID_TRANSCRIPT_ENTITY` | No | `CustomEntity01` | ShotGrid custom entity slot used when publishing transcripts. Match whichever `CustomEntityNN` the site admin has enabled. |
| `PYTHONUNBUFFERED` | No | `1` | Disable Python output buffering |

### Vexa Service (`vexa` service)
92 changes: 92 additions & 0 deletions backend/docs/TRANSCRIPTION_PIPELINE.md
@@ -1340,3 +1340,95 @@ logging.getLogger("dna.events.event_publisher").setLevel(logging.DEBUG)
- The bot remains in the meeting during pause, ready to resume instantly
- `transcription_resumed_at` prevents replay of stale segments
- Minimal state changes: only a boolean flag and an optional timestamp

---

## Publishing to the Production Tracking System

Tracked by issue #120. Off by default behind `DNA_ENABLE_TRANSCRIPT_PUBLISH=true`.

### Pipeline

```
POST /playlists/{playlist_id}/publish-transcript {version_id}
-> storage.get_playlist_metadata(playlist_id) # meeting_id, platform
-> storage.get_segments_for_version(...) # existing call
-> build_transcript_payload(segments) # pure, dedupe + collapse
-> storage.get_published_transcript(...) # bookkeeping lookup
-> prodtrack.publish_transcript(entity_type from env, ...)
# create path: reads SHOTGRID_TRANSCRIPT_ENTITY
/ prodtrack.update_transcript(entity_type=existing.sg_entity_type, ...)
# update path: honours the bookkeeping row, not the current env
-> storage.upsert_published_transcript(...)
-> { transcript_entity_id, outcome: created | updated | skipped }
```
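
The created / updated / skipped branch in the flow above can be condensed into one pure function. This is a sketch under the assumption that the decision depends only on the stored `body_hash` from the bookkeeping row; the function name `decide_outcome` is hypothetical, not the PR's API.

```python
import hashlib
from typing import Optional


def decide_outcome(body: str, existing_hash: Optional[str]) -> tuple[str, str]:
    """Return (outcome, body_hash) for a publish attempt.

    'created' when no bookkeeping row exists, 'skipped' when the stored
    hash matches the new body, 'updated' otherwise.
    """
    body_hash = hashlib.sha256(body.encode("utf-8")).hexdigest()
    if existing_hash is None:
        return ("created", body_hash)
    if existing_hash == body_hash:
        return ("skipped", body_hash)
    return ("updated", body_hash)
```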

### Collections touched

| Collection | Used for |
|------------|----------|
| `segments` | Source of the transcript body (read-only here) |
| `playlist_metadata` | Pulls `meeting_id` + `platform` |
| `published_transcripts` | Stores the SG entity ID and body_hash per `(playlist_id, version_id, meeting_id)` |

### ShotGrid side

Publishes a row into `SHOTGRID_TRANSCRIPT_ENTITY` (default `CustomEntity01`).
Payload mapping:

| DNA field | ShotGrid field |
|-----------|----------------|
| `code` (auto) | `code` |
| `project_id` | `project` |
| `playlist_id` | `sg_playlist` |
| `[version_id]` | `sg_versions` |
| `meeting_id` | `sg_meeting_id` |
| `meeting_date` | `sg_meeting_date` |
| `platform` | `sg_platform` |
| `body` | `sg_transcript_body` |

`sg_summary` is intentionally left blank in V1 so studio staff can fill it
on the ShotGrid side without the publisher overwriting it.
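
The mapping table above can be sketched as a payload builder. The field keys come from the site-setup checklist; the entity-link dict shape matches the ShotGrid API's usual `{"type", "id"}` convention. The `code` format and the function name are assumptions for illustration only.

```python
from datetime import date


def to_sg_payload(
    project_id: int,
    playlist_id: int,
    version_id: int,
    meeting_id: str,
    meeting_date: date,
    platform: str,
    body: str,
) -> dict:
    # sg_summary is deliberately omitted so manual edits on the
    # ShotGrid side survive a re-publish.
    return {
        "code": f"Transcript {meeting_id} v{version_id}",  # assumed format
        "project": {"type": "Project", "id": project_id},
        "sg_playlist": {"type": "Playlist", "id": playlist_id},
        "sg_versions": [{"type": "Version", "id": version_id}],
        "sg_meeting_id": meeting_id,
        "sg_meeting_date": meeting_date.isoformat(),
        "sg_platform": platform,
        "sg_transcript_body": body,
    }
```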

### ADR-005: Custom entity, not a ShotGrid Note

**Decision:** Transcripts live in a custom entity (configurable via
`SHOTGRID_TRANSCRIPT_ENTITY`), not as ShotGrid `Note` rows.

**Rationale:**
- Notes are tied to review addressings and read state; transcripts are
reference material with different fields.
- Admins can restrict the custom-entity page per the mockup on #120
without affecting Notes.
- The field shape (playlist link + multi-version link + `sg_platform`
list + long `sg_transcript_body`) does not fit Note's schema.

### ADR-006: Idempotence via body_hash in MongoDB, not SG lookup

**Decision:** Track which `(playlist, version, meeting)` tuples have
been published in a local Mongo collection. Skip re-publish when the
new body_hash matches the stored one. The bookkeeping row also stores
`sg_entity_type`; the update path uses that value instead of the
current `SHOTGRID_TRANSCRIPT_ENTITY` env so studios can migrate to a
new custom-entity slot without breaking updates on already-published
rows.

**Rationale:**
- SG is not efficiently queryable for "has this been published before".
- The existing DraftNote publish path uses the same pattern
(`published_note_id` on the draft).
- Loss of the Mongo row is a known edge-case; duplicate SG rows in that
scenario are an acceptable V1 trade-off documented on issue #120.
- Pinning the entity_type to the bookkeeping row (not env) prevents
misdirected updates after a slot migration.

### ADR-007: Build publishable body at publish time, not ingest time

**Decision:** `build_transcript_payload` is called inside the publish
endpoint, not in the ingest pipeline.

**Rationale:**
- Dedup rules may change once issue #135 lands (Vexa-side segment IDs
become authoritative). Keeping the builder isolated means that change
is one file here rather than a re-ingest.
- The builder is pure and trivially testable, unlike the ingest loop.
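
A pure builder of the kind ADR-007 describes might look like the sketch below. The dedupe and collapse rules here (skip empty or immediately repeated texts, merge consecutive same-speaker lines) are illustrative assumptions, not the PR's actual `build_transcript_payload` logic, which may change once issue #135 lands.

```python
def build_transcript_body(segments: list[dict]) -> str:
    """Sketch of a pure segments -> body builder (hypothetical rules)."""
    lines: list[str] = []
    last_speaker = None
    last_text = None
    for seg in segments:
        speaker, text = seg.get("speaker", "?"), seg["text"].strip()
        if not text or text == last_text:
            continue  # skip empties and back-to-back duplicates
        if speaker == last_speaker:
            lines[-1] += " " + text  # collapse same-speaker runs
        else:
            lines.append(f"{speaker}: {text}")
        last_speaker, last_text = speaker, text
    return "\n".join(lines)
```

Because the builder takes plain dicts and returns a string, it tests without any storage or ShotGrid fixture, which is the point of keeping it out of the ingest loop.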
5 changes: 5 additions & 0 deletions backend/example.docker-compose.local.yml
@@ -14,3 +14,8 @@ services:
- VEXA_API_URL=http://vexa:8056
- OPENAI_API_KEY=your-openai-api-key
- AUTH_PROVIDER=none
# Transcript publishing (V1, disabled by default). Set to "true" to
# expose POST /playlists/{id}/publish-transcript. See DEPLOYMENT.md
# for the ShotGrid site-setup checklist the custom entity depends on.
- DNA_ENABLE_TRANSCRIPT_PUBLISH=false
- SHOTGRID_TRANSCRIPT_ENTITY=CustomEntity01
12 changes: 11 additions & 1 deletion backend/src/dna/models/__init__.py
@@ -27,6 +27,10 @@
PlaylistMetadata,
PlaylistMetadataUpdate,
)
from dna.models.published_transcript import (
PublishedTranscript,
PublishedTranscriptUpdate,
)
from dna.models.requests import (
CreateNoteRequest,
EntityLink,
@@ -36,6 +40,8 @@
GenerateNoteResponse,
PublishNotesRequest,
PublishNotesResponse,
PublishTranscriptRequest,
PublishTranscriptResponse,
SearchRequest,
SearchResult,
StatusOption,
@@ -69,6 +75,7 @@
"Version",
"Playlist",
"User",
"Transcript",
"DNAEntity",
"ENTITY_MODELS",
"EntityLink",
@@ -82,13 +89,17 @@
"StatusOption",
"PublishNotesRequest",
"PublishNotesResponse",
"PublishTranscriptRequest",
"PublishTranscriptResponse",
"DraftNote",
"DraftNoteBase",
"DraftNoteCreate",
"DraftNoteLink",
"DraftNoteUpdate",
"PlaylistMetadata",
"PlaylistMetadataUpdate",
"PublishedTranscript",
"PublishedTranscriptUpdate",
"StoredSegment",
"StoredSegmentCreate",
"generate_segment_id",
@@ -97,7 +108,6 @@
"BotStatusEnum",
"DispatchBotRequest",
"Platform",
"Transcript",
"TranscriptSegment",
"UserSettings",
"UserSettingsUpdate",
45 changes: 45 additions & 0 deletions backend/src/dna/models/published_transcript.py
@@ -0,0 +1,45 @@
"""Published transcript bookkeeping model.

Tracks which (playlist, version, meeting) has already been pushed to the
production tracking system so re-publishing can be idempotent. The actual
transcript content lives in SG; here we only keep the reference plus a
body_hash used to skip no-op re-publishes.
"""

from datetime import datetime
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class PublishedTranscriptUpdate(BaseModel):
"""Upsert payload for the published_transcripts collection."""

playlist_id: int
version_id: int
meeting_id: str
sg_entity_type: str = Field(
description="Custom entity type in the tracking system (e.g. CustomEntity01)"
)
sg_entity_id: int = Field(description="ID of the row created in tracking system")
author_email: str
body_hash: str = Field(description="sha256 of the published body for idempotence")
segments_count: int


class PublishedTranscript(BaseModel):
"""Full record for a row we have pushed to the tracking system."""

model_config = ConfigDict(populate_by_name=True)

id: str = Field(alias="_id")
playlist_id: int
version_id: int
meeting_id: str
sg_entity_type: str
sg_entity_id: int
author_email: str
body_hash: str
segments_count: int
created_at: datetime
updated_at: datetime
17 changes: 17 additions & 0 deletions backend/src/dna/models/requests.py
@@ -117,3 +117,20 @@ class PublishNotesResponse(BaseModel):
skipped_count: int
failed_count: int
total: int


class PublishTranscriptRequest(BaseModel):
"""Request to publish a version's captured transcript."""

version_id: int = Field(description="Version whose segments to publish")


class PublishTranscriptResponse(BaseModel):
"""Response from the publish-transcript endpoint."""

transcript_entity_id: int = Field(
description="Entity ID of the row in the tracking system"
)
outcome: str = Field(description="created | updated | skipped")
skipped_reason: Optional[str] = None
segments_count: int
12 changes: 12 additions & 0 deletions backend/src/dna/prodtrack_providers/mock_provider.py
@@ -596,3 +596,15 @@ def attach_file_to_note(
self, note_id: int, file_path: str, display_name: str
) -> bool:
return True

def publish_transcript(self, **_: object) -> int:
raise NotImplementedError(
"Transcript publishing requires a live ShotGrid connection. "
"Set PRODTRACK_PROVIDER=shotgrid to use it."
)

def update_transcript(self, **_: object) -> bool:
raise NotImplementedError(
"Transcript publishing requires a live ShotGrid connection. "
"Set PRODTRACK_PROVIDER=shotgrid to use it."
)
38 changes: 38 additions & 0 deletions backend/src/dna/prodtrack_providers/prodtrack_provider_base.py
@@ -1,4 +1,5 @@
import os
from datetime import date
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
@@ -190,6 +191,43 @@ def attach_file_to_note(
"""
raise NotImplementedError("Subclasses must implement this method.")

def publish_transcript(
self,
*,
project_id: int,
playlist_id: int,
version_id: int,
meeting_id: str,
meeting_date: date,
platform: str,
body: str,
) -> int:
"""Create a transcript row in the production tracking system.

Returns the entity ID of the newly-created row.
"""
raise NotImplementedError("Subclasses must implement this method.")

def update_transcript(
self,
*,
entity_type: str,
entity_id: int,
body: str,
meeting_date: date,
) -> bool:
"""Update body + meeting_date on an existing transcript entity.

`entity_type` must come from the caller's bookkeeping (whichever
custom-entity slot the row was originally created in). Reading the
current env var here would misfire if studios migrate between slots.

Only body and meeting_date are touched on purpose; summary and other
fields are left alone so manual edits on the tracking-system side
survive a re-publish.
"""
raise NotImplementedError("Subclasses must implement this method.")


def get_prodtrack_provider() -> ProdtrackProviderBase:
"""Get the production tracking provider."""