…CI/CD
- OpenAI streaming chat endpoint with get_events and provide_links tools
- Server actions for knowledge CRUD, reseed, import, usage metrics
- Event filtering/formatting utilities with timezone-aware LA time handling
- System prompt builder with profile-aware personalization and prefix caching
- Vector search context retrieval with retry/backoff
- Knowledge base JSON (55 entries: FAQ, tracks, judging, submission, general)
- CI/CD seed scripts for hackbot_knowledge to hackbot_docs
- Auth session extended with position, is_beginner, name fields
- Tailwind hackbot-slide-in animation keyframe
- Dependencies: ai@6, @ai-sdk/openai
Closes #441
…o hackbot-server-core
Pull request overview
Introduces the initial “HackBot” server core: a streaming chat API route backed by OpenAI + MongoDB vector search, plus supporting hackbot utilities/actions and deployment seeding.
Changes:
- Added /api/hackbot/stream streaming endpoint with get_events/provide_links tools and custom data-stream output.
- Implemented hackbot utilities (system prompt builder, event filtering/formatting, retry/backoff, embeddings) + server actions for knowledge/metrics.
- Added CI/CD seed scripts and workflow steps to embed knowledge docs into hackbot_docs; added new AI SDK dependencies.
Reviewed changes
Copilot reviewed 24 out of 25 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| tailwind.config.ts | Adds hackbot-slide-in keyframe + animation for HackBot UI. |
| scripts/hackbotSeedCI.mjs | CI seeding script to upsert knowledge docs into hackbot_docs with embeddings. |
| scripts/hackbotSeed.mjs | Local interactive seeding script for hackbot docs. |
| package.json | Adds hackbot:seed script + AI SDK dependencies. |
| package-lock.json | Locks new dependency graph for ai / @ai-sdk/openai and transitive deps. |
| auth.ts | Extends NextAuth user/session/jwt fields for HackBot personalization. |
| app/_types/hackbot.ts | Adds shared HackBot types for docs/messages/events/links. |
| app/_data/hackbot_knowledge_import.json | Adds initial knowledge base content for importing. |
| app/(api)/api/hackbot/stream/route.ts | Implements streaming HackBot endpoint + get_events and provide_links tools. |
| app/(api)/_utils/hackbot/systemPrompt.ts | Adds system prompt builder with profile/page-context personalization and caching strategy. |
| app/(api)/_utils/hackbot/retryWithBackoff.ts | Adds retry/backoff helper used by vector search embedding step. |
| app/(api)/_utils/hackbot/eventFormatting.ts | Adds LA-timezone-aware date parsing/formatting helpers. |
| app/(api)/_utils/hackbot/eventFiltering.ts | Adds profile relevance/recommendation and time-filtering helpers. |
| app/(api)/_utils/hackbot/embedText.ts | Adds embedding helper using the ai SDK + OpenAI embedding model. |
| app/(api)/_datalib/hackbot/getHackbotContext.ts | Adds vector-search context retrieval from hackbot_docs. |
| app/(api)/_actions/hackbot/saveKnowledgeDoc.ts | Adds server action to create/update knowledge docs + embeddings. |
| app/(api)/_actions/hackbot/reseedHackbot.ts | Adds server action to re-embed all knowledge docs into hackbot_docs. |
| app/(api)/_actions/hackbot/importKnowledgeDocs.ts | Adds server action to bulk import knowledge docs + embeddings. |
| app/(api)/_actions/hackbot/getUsageMetrics.ts | Adds server action to aggregate token usage metrics. |
| app/(api)/_actions/hackbot/getKnowledgeDocs.ts | Adds server action to list knowledge docs for admin UI. |
| app/(api)/_actions/hackbot/getHackerProfile.ts | Adds server action to read profile fields from session for prompt personalization. |
| app/(api)/_actions/hackbot/deleteKnowledgeDoc.ts | Adds server action to delete knowledge docs and embedded docs. |
| app/(api)/_actions/hackbot/clearKnowledgeDocs.ts | Adds server action to clear knowledge + embedded docs. |
| .github/workflows/staging.yaml | Runs hackbot seeding during deploy and syncs OpenAI-related env vars. |
| .github/workflows/production.yaml | Runs hackbot seeding during deploy and syncs OpenAI-related env vars. |

    const { messages, currentPath } = await request.json();

    if (!Array.isArray(messages) || messages.length === 0) {
      return Response.json({ error: 'Invalid request' }, { status: 400 });
    }

    const lastMessage = messages[messages.length - 1];

    if (lastMessage.role !== 'user') {
      return Response.json(
        { error: 'Last message must be from user.' },
        { status: 400 }
      );

The request validation only enforces that the last message is from the user; earlier entries in messages can still have role: "system" (or other unexpected roles), letting a caller inject/override your system prompt. Sanitize messages to allow only expected roles (user/assistant) and expected shapes before passing them to the model.
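
One way to act on this comment is a small allowlist-based sanitizer that runs before the messages reach the model. The sketch below is illustrative only: `ChatMessage`, `ALLOWED_ROLES`, and `sanitizeMessages` are hypothetical names, and it assumes each message carries a plain string `content`, as the route's use of `lastMessage.content` suggests.

```typescript
// Hypothetical shape for chat messages accepted from the client.
type ChatRole = 'user' | 'assistant';

interface ChatMessage {
  role: ChatRole;
  content: string;
}

const ALLOWED_ROLES: ReadonlySet<string> = new Set(['user', 'assistant']);

// Returns a cleaned copy of the messages, or null if any entry is malformed
// or carries a role the client is not allowed to send (for example "system").
function sanitizeMessages(input: unknown): ChatMessage[] | null {
  if (!Array.isArray(input) || input.length === 0) return null;

  const cleaned: ChatMessage[] = [];
  for (const entry of input) {
    if (typeof entry !== 'object' || entry === null) return null;
    const { role, content } = entry as Record<string, unknown>;
    if (typeof role !== 'string' || !ALLOWED_ROLES.has(role)) return null;
    if (typeof content !== 'string') return null;
    // Copy only the known fields so unexpected properties are dropped.
    cleaned.push({ role: role as ChatRole, content });
  }

  // Preserve the route's existing rule: the final message must be from the user.
  const last = cleaned[cleaned.length - 1];
  if (!last || last.role !== 'user') return null;
  return cleaned;
}
```

Rejecting the whole request on the first bad entry (rather than silently filtering it out) keeps the conversation history the model sees identical to what the client sent, which may be easier to debug.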

    // Run auth and context retrieval in parallel to save ~400 ms.
    let session;
    let docs;
    try {
      [session, { docs }] = await Promise.all([
        auth(),
        isSimpleGreeting
          ? Promise.resolve({ docs: [] })
          : retrieveContext(lastMessage.content),
      ]);
    } catch (e) {
      console.error('[hackbot][stream] Context retrieval error', e);
      return Response.json(
        { error: 'Search backend unavailable. Please contact an organizer.' },
        { status: 500 }
      );
    }

    // Build profile from session (after auth resolves)
    const sessionUser = (session as any)?.user as any;
    const profile: HackerProfile | null = sessionUser
      ? {
          name: sessionUser.name ?? undefined,
          position: sessionUser.position ?? undefined,
          is_beginner: sessionUser.is_beginner ?? undefined,
        }
      : null;

Because the endpoint proceeds even when auth() returns no session, unauthenticated callers can still trigger OpenAI calls and incur cost. If this route is intended for logged-in hackers only, return 401/403 when session is missing; otherwise add rate limiting / abuse protection (IP-based throttling, CAPTCHA, etc.).
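
A rough sketch of the gate this comment describes, combining a session check with a per-user throttle, might look like the following. Everything here is an assumption for illustration: `FixedWindowLimiter`, `SessionLike`, and `gateRequest` are hypothetical names, the limiter is in-memory only (so it would not be shared across serverless instances), and the status codes are one reasonable choice rather than the project's actual policy.

```typescript
// Minimal in-memory fixed-window limiter. Suitable for a single process only;
// a shared store would be needed when several instances serve the route.
class FixedWindowLimiter {
  private readonly hits = new Map<string, { count: number; windowStart: number }>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true if the caller identified by `key` may proceed.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First request, or the previous window has expired: start a new window.
      this.hits.set(key, { count: 1, windowStart: now });
      return true;
    }
    if (entry.count >= this.maxRequests) return false;
    entry.count += 1;
    return true;
  }
}

// Hypothetical session shape; the real one comes from auth().
interface SessionLike {
  user?: { id?: string };
}

// Decides which HTTP status to answer with before any OpenAI call is made.
function gateRequest(
  session: SessionLike | null,
  limiter: FixedWindowLimiter,
  now: number = Date.now()
): 200 | 401 | 429 {
  const userId = session?.user?.id;
  if (!userId) return 401; // no session: refuse before spending tokens
  return limiter.allow(userId, now) ? 200 : 429;
}
```

Keying the limiter on the user id only works once the route requires a session; for an endpoint that stays public, the key would have to be something like the client IP instead.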

    // Run auth and context retrieval in parallel to save ~400 ms.
    let session;
    let docs;
    try {
      [session, { docs }] = await Promise.all([
        auth(),
        isSimpleGreeting
          ? Promise.resolve({ docs: [] })
          : retrieveContext(lastMessage.content),
      ]);
    } catch (e) {
      console.error('[hackbot][stream] Context retrieval error', e);
      return Response.json(
        { error: 'Search backend unavailable. Please contact an organizer.' },
        { status: 500 }
      );
    }

Promise.all([auth(), retrieveContext(...)]) errors are all reported as a "Context retrieval" failure, even if the underlying failure was auth() (or JSON parsing). Split the try/catch so auth failures and search failures return accurate errors/status codes.
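
The two calls can stay parallel while still reporting failures separately, for instance by using `Promise.allSettled` instead of `Promise.all`. This is a sketch under stated assumptions: `loadSessionAndDocs` and the `Outcome` type are hypothetical, the two callbacks stand in for `auth()` and `retrieveContext(...)`, and the 503 status for search failures is a suggestion (the original code returns 500).

```typescript
// Result of the combined lookup: either both values, or which step failed.
type Outcome<S, D> =
  | { ok: true; session: S; docs: D[] }
  | { ok: false; source: 'auth' | 'search'; status: number; error: string };

// Runs both lookups in parallel but inspects each result on its own,
// so an auth failure is never reported as a search failure.
async function loadSessionAndDocs<S, D>(
  getSession: () => Promise<S>,
  getDocs: () => Promise<{ docs: D[] }>
): Promise<Outcome<S, D>> {
  const [sessionResult, docsResult] = await Promise.allSettled([
    getSession(),
    getDocs(),
  ]);

  if (sessionResult.status === 'rejected') {
    return {
      ok: false,
      source: 'auth',
      status: 500,
      error: 'Authentication check failed.',
    };
  }
  if (docsResult.status === 'rejected') {
    return {
      ok: false,
      source: 'search',
      status: 503, // assumed status; the original route responds with 500
      error: 'Search backend unavailable. Please contact an organizer.',
    };
  }
  return { ok: true, session: sessionResult.value, docs: docsResult.value.docs };
}
```

Because `Promise.allSettled` never rejects, no surrounding try/catch is needed for these two calls; the caller can log `source` and return the matching response.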

    "seed": "run-script-os",
    "seed:nix": "node --env-file='.env' \"scripts/dbSeed.mjs\"",
    "seed:windows": "node --env-file=\".\\.env\" \".\\scripts\\dbSeed.mjs\"",
    "hackbot:seed": "node --env-file='.env' scripts/hackbotSeedCI.mjs",

In CI, npm run hackbot:seed runs node --env-file='.env' ... but the workflow injects env vars directly and the repo does not include a .env file. This can cause the seed step to fail (or depend on a file that won't exist in Actions). Consider removing --env-file from hackbot:seed (or making it optional) so CI relies on the provided environment.
Suggested change:

    - "hackbot:seed": "node --env-file='.env' scripts/hackbotSeedCI.mjs",
    + "hackbot:seed": "node scripts/hackbotSeedCI.mjs",


    declare module 'next-auth/jwt' {
      interface JWT {
        id: string;
        email: string;
        role: string;
        position?: string;
        is_beginner?: boolean;
      }

JWT is missing the name field, but the callbacks assign token.name and later read it to populate session.user.name. This will cause TypeScript errors (and makes the augmented types inaccurate). Add name?: string | null to the next-auth/jwt JWT interface (and keep the User/Session typings consistent).

      ];
    }
    if (tags && tags.length > 0)
      query.tags = { $in: tags.map((t) => t.toLowerCase()) };

The tags tool parameter is documented as requiring events to have ALL specified tags, but the Mongo query uses $in, which matches events having any one of the tags. Use $all (or an $and of $in clauses) to match the documented behavior, or update the tool schema/description to reflect OR semantics.
Suggested change:

    - query.tags = { $in: tags.map((t) => t.toLowerCase()) };
    + query.tags = { $all: tags.map((t) => t.toLowerCase()) };

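
To make the difference between the two operators concrete, the sketch below mirrors their matching rules on an array field in plain TypeScript and builds either filter from a tag list. `matchesIn`, `matchesAll`, and `buildTagFilter` are hypothetical helpers written for this illustration; they are not part of the PR, and the predicates assume a non-empty `wanted` list, as the guard in the reviewed code does.

```typescript
// Mirrors `{ tags: { $in: wanted } }` on an array field:
// the event matches when it has at least one of the wanted tags.
function matchesIn(eventTags: string[], wanted: string[]): boolean {
  return wanted.some((t) => eventTags.includes(t));
}

// Mirrors `{ tags: { $all: wanted } }`:
// the event matches only when it has every wanted tag.
function matchesAll(eventTags: string[], wanted: string[]): boolean {
  return wanted.every((t) => eventTags.includes(t));
}

// Builds the tag portion of the query. With no tags, no filter is added.
function buildTagFilter(tags: string[] | undefined, mode: 'any' | 'all') {
  if (!tags || tags.length === 0) return {};
  const normalized = tags.map((t) => t.toLowerCase());
  return mode === 'all'
    ? { tags: { $all: normalized } }
    : { tags: { $in: normalized } };
}

const eventTags = ['workshop', 'beginner'];
const wanted = ['workshop', 'developer'];

// $in semantics: the shared "workshop" tag is enough to match.
console.log(matchesIn(eventTags, wanted)); // true
// $all semantics: "developer" is missing, so the event does not match.
console.log(matchesAll(eventTags, wanted)); // false
```

Whichever semantics is chosen, the tool's parameter description should state it explicitly, since the model decides which tags to pass based on that text.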

    export function isEventRelevantToProfile(
      ev: any,
      profile: HackerProfile | null
    ): boolean {
      const tags: string[] = Array.isArray(ev.tags)
        ? ev.tags.map((t: string) => t.toLowerCase())
        : [];
      const roleTags = tags.filter((t) => ROLE_TAGS.has(t));
      // No role tags → relevant to everyone
      if (roleTags.length === 0) return true;
      // No profile or profile has no useful fields → show everything
      if (!profile) return false;
      if (!profile.position && profile.is_beginner === undefined) return true;
      if (profile.position && roleTags.includes(profile.position.toLowerCase()))
        return true;
      if (profile.is_beginner && roleTags.includes('beginner')) return true;
      return false;

The comment says "No profile or profile has no useful fields → show everything", but the implementation returns false when profile is null and the event has role tags. Either adjust the comment or the logic so they match (this affects how unauthenticated/unknown users see tagged events).
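
If the code comment expresses the intended behavior (an assumption; the opposite fix would be to reword the comment), the null-profile branch can simply be folded into the "show everything" case. In the sketch below, the contents of `ROLE_TAGS` are hypothetical apart from `'beginner'`, which the reviewed code references, and `HackerProfile` is reduced to the three fields visible in the diff.

```typescript
// Fields taken from the profile object built in the stream route.
interface HackerProfile {
  name?: string;
  position?: string;
  is_beginner?: boolean;
}

// Hypothetical role tags; the real set is defined elsewhere in the PR.
const ROLE_TAGS: ReadonlySet<string> = new Set(['developer', 'designer', 'beginner']);

function isEventRelevantToProfile(
  ev: { tags?: unknown },
  profile: HackerProfile | null
): boolean {
  const tags: string[] = Array.isArray(ev.tags)
    ? ev.tags
        .filter((t): t is string => typeof t === 'string')
        .map((t) => t.toLowerCase())
    : [];
  const roleTags = tags.filter((t) => ROLE_TAGS.has(t));

  // No role tags: relevant to everyone.
  if (roleTags.length === 0) return true;

  // No profile, or a profile with no useful fields: show everything.
  if (!profile) return true;
  if (!profile.position && profile.is_beginner === undefined) return true;

  if (profile.position && roleTags.includes(profile.position.toLowerCase())) {
    return true;
  }
  return profile.is_beginner === true && roleTags.includes('beginner');
}
```

With this version an unauthenticated visitor sees role-tagged events, which is the behavior the comment promises; if hiding them was deliberate, only the comment needs to change.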

    const docs: KnowledgeDoc[] = raw.map((d: any) => ({
      id: String(d._id),
      type: d.type,
      title: d.title,
      content: d.content,
      url: d.url ?? null,
      createdAt: d.createdAt?.toISOString?.() ?? new Date().toISOString(),
      updatedAt: d.updatedAt?.toISOString?.() ?? new Date().toISOString(),
    }));

When mapping DB docs, missing createdAt/updatedAt are replaced with new Date().toISOString(), which can misrepresent the true timestamps and make the UI look like docs were just updated. Prefer returning null/omitting the field, or explicitly indicating the timestamp is unknown.
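
A minimal sketch of the "return null when unknown" option follows. The names `toIsoOrNull`, `RawKnowledgeDoc`, `KnowledgeDocView`, and `toKnowledgeDocView` are hypothetical, and adopting this would require the real `KnowledgeDoc` type and the admin UI to accept nullable timestamps, which the diff does not show.

```typescript
// Converts a stored timestamp to an ISO string, or null when it is missing
// or not a valid Date, instead of substituting the current time.
function toIsoOrNull(value: unknown): string | null {
  if (value instanceof Date && !Number.isNaN(value.getTime())) {
    return value.toISOString();
  }
  return null;
}

// Hypothetical shapes for the raw database document and the mapped result.
interface RawKnowledgeDoc {
  _id: unknown;
  type: string;
  title: string;
  content: string;
  url?: string | null;
  createdAt?: unknown;
  updatedAt?: unknown;
}

interface KnowledgeDocView {
  id: string;
  type: string;
  title: string;
  content: string;
  url: string | null;
  createdAt: string | null; // null means the timestamp is unknown
  updatedAt: string | null;
}

function toKnowledgeDocView(d: RawKnowledgeDoc): KnowledgeDocView {
  return {
    id: String(d._id),
    type: d.type,
    title: d.title,
    content: d.content,
    url: d.url ?? null,
    createdAt: toIsoOrNull(d.createdAt),
    updatedAt: toIsoOrNull(d.updatedAt),
  };
}
```

The UI can then render a null timestamp as "unknown" rather than showing a misleading "just updated" time.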
HackBot server core: API route, actions, utils, types, data, and CI/CD
What's Added:
All work was done on the hackbot branch and moved to this one for PR purposes.