Pre release #50
Dead-Bytes commented May 15, 2026
- Changes regarding factories for better handling in the pre-release
[TEST] feat/per-call-llm-creds — per-call LLM provider + API key overrides
[TEST] feat/pullfactory — PullFactory extension point + widened LLM payload overrides
… new phases and context
Pull request overview
Pre-release refactor that introduces factory-based injection seams in the GitHub ingest pipeline, threads per-job LLM credential overrides from job payloads through every LLM call site, splits KnowledgeDoc into source (upstream/state) + info (repo coordinates), and adds a bootstrap hook (bootstrapRuntime + seedConfig + seedLoggerFactory) so downstream consumers can wire in their own config/logger without env reads.
Changes:
- Moves repoUrl/branch off GithubKnowledgeSource onto a new required KnowledgeInfo field and updates all readers (routes, neo4j, mongo, pull pipeline).
- Adds PullFactory injection (mirror of SourceFactory) and an AskLlmOptions "llmCallContext" bag plumbed payload → StrategyContext → every askJsonLLM/askYesNoLLM call.
- Adds seedConfig/seedLoggerFactory (with a Proxy logger), a bootstrapRuntime entry point, a hand-written .d.ts public-surface shim for @bb/ingest-github, and refactors analyse-changed to read content through SourceReader instead of disk.
Reviewed changes
Copilot reviewed 59 out of 61 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| packages/types/src/{knowledge,job,index,README}.ts | New KnowledgeInfo, PayloadLlmOverrides; source no longer carries repoUrl/branch |
| packages/types/README.md | Docs for the new knowledge split and LLM override mixin |
| packages/server/src/{githubIndex,githubPull,githubCommits,localIndex}Route.ts | Read repo coordinates from info and 422 when missing |
| packages/neo4j/src/knowledge.ts | upsertKnowledgeNode reads info.repoUrl/info.branch |
| packages/mongo/src/processingStats.ts | Repo-name derivation now reads info.repoUrl |
| packages/logger/src/{logger,index}.ts + README | seedLoggerFactory, LoggerFactory type, Proxy default logger |
| packages/llm/src/{client,openrouter,index,README}.* | apiKey/provider per-call overrides; switch internal logging from console to logger |
| packages/llm/package.json | New @bb/logger workspace dep |
| packages/ingest-github/types/{index.d.ts,context.md} | Hand-written public .d.ts shim (mostly any-typed) |
| packages/ingest-github/src/types/{strategy,pipeline}.ts | llmCallContext on StrategyContext/ScanDeps/FileAnalyzer/SkipDeciderInput; new PullFactory* types |
| packages/ingest-github/src/pipeline/{pull,pull-helpers,run,scan,skip-decisions,README}.ts | runPull accepts a PullFactory; per-job LLM context extracted from payload; helpers extracted |
| packages/ingest-github/src/strategies/flat-folder/**/*.ts | Threads llmCallContext through every phase; analyse-changed now reads via SourceReader |
| packages/ingest-github/src/{adapters/llm-file-analyzer,bootstrap,index}.ts | Analyzer takes llmCallContext; new bootstrapRuntime; wider public re-exports |
| packages/ingest-github/{README,package.json} | pullFactory docs; .d.ts entry point and exports map |
| packages/config/src/{loader,writer,index}.ts + README | seedConfig / __isSeeded / ConfigSeededError (writes disabled when seeded) |
| packages/cli/src/output.d.ts | New generated-style declarations file |
| package.json | Adds --no-warn-ignored to the lint-staged eslint hook |
Comments suppressed due to low confidence (3)
packages/ingest-github/src/strategies/flat-folder/analyse-changed.ts:131
Binary detection is now run on Buffer.from(content, "utf8"), where content came from source.readFile(...), which decodes as UTF-8 (the disk reader uses readFile(..., "utf8")). Any non-UTF-8 bytes in a binary file are replaced with U+FFFD during decode, so the round-trip buffer no longer contains the original bytes, and looksBinary (which typically tests for NUL bytes / control chars in the raw bytes) will frequently miss real binaries it would previously have rejected. The contract for SourceReader.readFile returning string makes binary detection at this layer unreliable; consider adding a raw-bytes read on SourceReader, or moving the binary check inside the reader.
if (looksBinary(Buffer.from(content, "utf8"))) {
skipped += 1;
continue;
}
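The round-trip concern in this comment can be reproduced in isolation. The sketch below is only an illustration, not code from the PR: it uses the standard TextDecoder/TextEncoder globals instead of Buffer, and the names original, decoded, roundTrip and sameBytes are made up for the example.

```typescript
// Illustrative only: bytes that are not valid UTF-8 do not survive a
// decode/encode round trip, so a check on the re-encoded bytes sees
// different data than a check on the raw file would.
const original = new Uint8Array([0x89, 0x50, 0x4e, 0x47]); // first four bytes of a PNG file

// Decoding replaces the invalid lead byte 0x89 with U+FFFD.
const decoded = new TextDecoder("utf-8").decode(original);

// Re-encoding writes U+FFFD as three bytes (EF BF BD), not the original byte.
const roundTrip = new TextEncoder().encode(decoded);

// Hypothetical helper: byte-wise equality of two arrays.
function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  return a.length === b.length && a.every((byte, i) => byte === b[i]);
}

console.log(decoded.charCodeAt(0).toString(16)); // fffd
console.log(original.length, roundTrip.length); // 4 6
console.log(sameBytes(original, roundTrip)); // false
```

A NUL byte (0x00) is valid UTF-8 and would survive this round trip, so how much detection degrades depends on what looksBinary actually inspects.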
packages/ingest-github/src/strategies/flat-folder/analyse-changed.ts:166
absolutePath: relativePath mislabels the field: ScannedFile.absolutePath is documented as an absolute path elsewhere (and the index-side path in pipeline/scan.ts populates it with abs). Anything downstream that uses scanned.absolutePath (for example to re-read the file from disk, or to compute on-disk paths in the big-file phase) will now silently receive a relative path during pull, which can either crash or read from the worker's cwd. If the new design is that absolutePath is no longer meaningful when content is fetched through a SourceReader, the field should be made optional on ScannedFile (or renamed) rather than populated with a misnomer.
const scanned: ScannedFile = {
kind: "file",
relativePath,
absolutePath: relativePath,
sizeBytes,
content,
};
packages/ingest-github/src/strategies/flat-folder/analyse-changed.ts:108
Empty files are now silently counted as skipped and never produce a CondensedFileAnalysis. Previously, a zero-byte file passed straight through looksBinary/countLines/tokenLen (all returning 0) and was analysed normally. This is a behavior change in the pull pipeline: an existing condensed analysis for a path that just got truncated to empty will not be invalidated/replaced. Consider whether the desired behavior is to fall through (analyse and produce an empty/condensed record) or to explicitly delete the prior analysis when a file is emptied.
if (content.length === 0) {
skipped += 1;
continue;
}
logger.info(`pull: phase repo summary starting`);
throwIfCancelled(knowledgeId);
const orgId = resolveOrgId({ ...(knowledge.source.kind === "github" ? {} : {}) });

@@ -114,19 +125,10 @@ export async function analyseChangedFiles(input: AnalyseChangedInput): Promise<A
       continue;
     }

export interface KnowledgeInfo {
  repoUrl?: string;
  branch?: string;
  git_url?: string;
  githubInfo?: { commitId?: string; commitHashes?: string[]; branchName?: string };
  [key: string]: unknown;
}

export interface KnowledgeDoc {
  knowledgeId: string;
  source: KnowledgeSource;
  status: { state: KnowledgeState; totalFiles?: number; processedFiles?: number };
  createdAt: Date;
  updatedAt: Date;
  info: KnowledgeInfo;
}

@@ -177,7 +177,7 @@ function deriveRepoName(doc: KnowledgeDoc): string {
   } catch {
     // fall through
   }
-  return doc.source.repoUrl;
+  return doc.info.repoUrl ?? "";

export const logger = new Proxy({} as winston.Logger, {
  get(_target, prop, receiver) {
    const actual = getLogger("server");
    const value = Reflect.get(actual, prop, receiver);
    return typeof value === "function" ? (value as (...args: unknown[]) => unknown).bind(actual) : value;
  },
});

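The point of the Proxy is that the exported logger resolves the real instance on every property access, so a factory seeded later still takes effect for modules that imported logger earlier. A minimal sketch of that idea follows, with a hypothetical SimpleLogger in place of winston.Logger and a hypothetical seedFactory standing in for seedLoggerFactory. Unlike the excerpt, it does not forward the proxy as the Reflect.get receiver.

```typescript
// Hypothetical stand-ins for the real logger types.
interface SimpleLogger {
  name: string;
  info(message: string): string;
}
type SimpleLoggerFactory = (scope: string) => SimpleLogger;

// Default factory used until a consumer seeds its own.
let factory: SimpleLoggerFactory = (scope) => ({
  name: `default:${scope}`,
  info(message) {
    return `[${this.name}] ${message}`;
  },
});

// Hypothetical counterpart of seedLoggerFactory.
function seedFactory(next: SimpleLoggerFactory): void {
  factory = next;
}

// Every property read asks the current factory for the live logger.
const lazyLogger = new Proxy({} as SimpleLogger, {
  get(_target, prop) {
    const actual = factory("server");
    const value = Reflect.get(actual, prop);
    // Bind methods so `this` inside them is the real logger, not the proxy.
    return typeof value === "function" ? value.bind(actual) : value;
  },
});

console.log(lazyLogger.info("started")); // [default:server] started

seedFactory((scope) => ({
  name: `seeded:${scope}`,
  info(message) {
    return `[${this.name}] ${message}`;
  },
}));

console.log(lazyLogger.info("started")); // [seeded:server] started
```

The trade-off is one factory lookup per property access, which is cheap only if the factory caches its instances.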
import { seedConfig } from "@bb/config";
import { seedLoggerFactory, type LoggerFactory } from "@bb/logger";
import { connectMongo } from "@bb/mongo";
import { connectNeo4j } from "@bb/neo4j";

export interface BootstrapRuntimeOptions {
  config: unknown;
  loggerFactory: LoggerFactory;
}

export async function bootstrapRuntime(opts: BootstrapRuntimeOptions): Promise<void> {
  seedConfig(opts.config);
  seedLoggerFactory(opts.loggerFactory);
  await connectMongo();
  await connectNeo4j();
}

export function seedConfig(value: unknown): BytebellConfig {
  cached = configSchema.parse(value);
  seeded = true;
  return cached;
}

 export function registerGithubWorkers(deps: RegisterGithubWorkersDeps = {}): void {
   const runner = buildRunner(deps.sourceFactory);
   registerWorker(JobType.GithubIndex, createGithubIngestHandler({ runner }));
-  registerWorker(JobType.GithubPull, runPull);
+  const pullFactory = deps.pullFactory;
+  registerWorker(JobType.GithubPull, (msg) => runPull(msg, pullFactory));
 }

 - **[knowledge.ts](knowledge.ts)** — the `KnowledgeState` enum modeling
-  the lifecycle in [CLAUDE.md](../../../CLAUDE.md). v0 only ships the
-  enum; the full `Knowledge` document interface lands when domain CRUD
-  helpers in `@bb/mongo` need it.
+  the lifecycle in [CLAUDE.md](../../../CLAUDE.md), plus the
+  `KnowledgeDoc` document interface and its substructures:
+  - `KnowledgeSource` is a discriminated union (`GithubKnowledgeSource | LocalKnowledgeSource`)
+    that captures **what kind of upstream produced this knowledge** plus per-kind
+    state. For github: `commitId` (current head) and `commitHashes` (history).
+    For local: `sourcePath`. `source` does **not** carry `repoUrl` or `branch` —
+    those live on `info` (see below).
+  - `KnowledgeInfo` carries the human-readable repo coordinates the pipeline
+    needs every run: `repoUrl`, `branch`, plus an open index signature so
+    downstream consumers can stash extra fields without forcing schema changes
+    here. The pull pipeline reads `knowledge.info.repoUrl` / `knowledge.info.branch`
+    directly — that's the single source of truth for the URL/branch, no fallback.
+  - `KnowledgeDoc` carries both: `source` for upstream-type + indexed-commit
+    state, `info` for repo coordinates. Both are required on every doc.

[TEST] feature/progress-factory — progress reporting extension port for flat-folder strategy
export async function setKnowledgeBranch(knowledgeId: string, branch: string): Promise<void> {
  const result = await _getDb()
    .collection(Collections.Knowledge)
    .updateOne({ knowledgeId }, { $set: { "source.branch": branch, updatedAt: new Date() } });
  if (result.matchedCount === 0) {
    throw new KnowledgeNotFoundError(knowledgeId);
  }
}

progressContext.phaseChanged("indexing");
logger.info(`pull: phase repo summary starting`);
throwIfCancelled(knowledgeId);
const orgId = resolveOrgId({ ...(knowledge.source.kind === "github" ? {} : {}) });

export interface RepoEntry {
  knowledgeId: string;
  source:
    | { kind: "github"; repoUrl: string; branch?: string; commitId?: string; commitHashes?: string[] }
    | { kind: "local"; sourcePath: string };
  state: string;
  createdAt: string;
  updatedAt: string;
  fileCount: number;
}

/**
 * Commit fetching from GitHub REST API.
 *
 * SPDX-License-Identifier: AGPL-3.0-only WITH non-commercial-clause
 */

import { parseGithubRepo, USER_AGENT } from "./githubUrl.ts";

export interface CommitEntry {
  hash: string;
  shortHash: string;
  subject: string;
  author: string;
  date: string;
}

export type FetchCommitsResult =
  | { status: "ok"; commits: CommitEntry[] }
  | { status: "not_found" }
  | { status: "unauthorized" }
  | { status: "rate_limited" }
  | { status: "error"; message: string };

const COMMITS_PAGE_SIZE = 100;

interface GithubCommitPayload {
  sha?: unknown;
  commit?: {
    message?: unknown;
    author?: { name?: unknown; date?: unknown } | null;
    committer?: { name?: unknown; date?: unknown } | null;
  };
}

/**
 * Fetches up to `limit` commits on `branch` via GitHub's REST API. The
 * server route uses this in place of `git log` against a shallow local
 * clone — the picker should not depend on clone state.
 *
 * Paginates over `/commits` (capped at 100 per page) until either `limit`
 * is reached or the upstream returns a short page. Unauthenticated calls
 * work for public repos; private repos answer 404 until a token is
 * supplied, at which point the CLI re-requests with `Authorization`.
 */
export async function fetchRecentCommits(
  repoUrl: string,
  branch: string,
  limit: number,
  gitToken?: string,
): Promise<FetchCommitsResult> {
  const parsed = parseGithubRepo(repoUrl);
  if (parsed === null) {
    return { status: "error", message: `unparseable github url: ${repoUrl}` };
  }
  if (limit <= 0) {
    return { status: "ok", commits: [] };
  }

  const headers: Record<string, string> = {
    Accept: "application/vnd.github+json",
    "User-Agent": USER_AGENT,
    "X-GitHub-Api-Version": "2022-11-28",
  };
  if (gitToken !== undefined && gitToken.length > 0) {
    headers["Authorization"] = `Bearer ${gitToken}`;
  }

  const collected: CommitEntry[] = [];
  let page = 1;
  while (collected.length < limit) {
    const remaining = limit - collected.length;
    const perPage = Math.min(COMMITS_PAGE_SIZE, remaining);
    const url =
      `https://api.github.com/repos/${parsed.owner}/${parsed.repo}/commits` +
      `?sha=${encodeURIComponent(branch)}&per_page=${perPage}&page=${page}`;

    let response: Response;
    try {
      response = await fetch(url, { headers });
    } catch (cause: unknown) {
      const msg = cause instanceof Error ? cause.message : String(cause);
      return { status: "error", message: `github fetch failed: ${msg}` };
    }

    if (response.status === 404) {
      return { status: "not_found" };
    }
    if (response.status === 401) {
      return { status: "unauthorized" };
    }
    if (response.status === 403 && response.headers.get("x-ratelimit-remaining") === "0") {
      return { status: "rate_limited" };
    }
    if (!response.ok) {
      const body = await response.text().catch(() => "");
      return { status: "error", message: `github ${response.status}: ${body.slice(0, 200)}` };
    }

    const payload = (await response.json()) as GithubCommitPayload[];
    if (!Array.isArray(payload) || payload.length === 0) {
      break;
    }
    for (const item of payload) {
      const entry = toCommitEntry(item);
      if (entry !== null) {
        collected.push(entry);
        if (collected.length >= limit) {
          break;
        }
      }
    }
    if (payload.length < perPage) {
      break;
    }
    page += 1;
  }

  return { status: "ok", commits: collected };
}

function toCommitEntry(raw: GithubCommitPayload): CommitEntry | null {
  const sha = raw.sha;
  if (typeof sha !== "string" || sha.length === 0) {
    return null;
  }
  const message = typeof raw.commit?.message === "string" ? raw.commit.message : "";
  const subjectLine = message.split("\n", 1)[0] ?? "";
  const authorName =
    typeof raw.commit?.author?.name === "string"
      ? raw.commit.author.name
      : typeof raw.commit?.committer?.name === "string"
        ? raw.commit.committer.name
        : "";
  const authorDate =
    typeof raw.commit?.author?.date === "string"
      ? raw.commit.author.date
      : typeof raw.commit?.committer?.date === "string"
        ? raw.commit.committer.date
        : "";
  return {
    hash: sha,
    shortHash: sha.slice(0, 7),
    subject: subjectLine,
    author: authorName,
    date: authorDate,
  };
}

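Because FetchCommitsResult is a discriminated union on status, a caller can switch over it exhaustively and let the compiler flag any status added later. The sketch below redeclares the two types from the excerpt so it stands alone; describeResult is a hypothetical helper, not part of the PR.

```typescript
// Types copied from the excerpt above so the example is self-contained.
interface CommitEntry {
  hash: string;
  shortHash: string;
  subject: string;
  author: string;
  date: string;
}

type FetchCommitsResult =
  | { status: "ok"; commits: CommitEntry[] }
  | { status: "not_found" }
  | { status: "unauthorized" }
  | { status: "rate_limited" }
  | { status: "error"; message: string };

// Hypothetical helper: turns a result into a one-line message for a CLI or log.
function describeResult(result: FetchCommitsResult): string {
  switch (result.status) {
    case "ok":
      return `${result.commits.length} commit(s)`;
    case "not_found":
      return "repository or branch not found (or private and no token supplied)";
    case "unauthorized":
      return "token rejected";
    case "rate_limited":
      return "rate limit exhausted";
    case "error":
      return `failed: ${result.message}`;
    default: {
      // Compile-time exhaustiveness check: a new status variant makes this fail to type-check.
      const unreachable: never = result;
      return unreachable;
    }
  }
}

console.log(describeResult({ status: "ok", commits: [] })); // 0 commit(s)
console.log(describeResult({ status: "error", message: "github 500: boom" })); // failed: github 500: boom
```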
"noEmit": false,
"emitDeclarationOnly": true

"@bb/config": "workspace:*",
"@bb/errors": "workspace:*",
"@bb/types": "workspace:*",
"@bb/mongo": "workspace:*",

const commits = result.commits.map((c) => ({
  hash: c.sha,
  shortHash: c.sha.slice(0, 7),
  subject: c.message.split("\n")[0] ?? "",
  author: c.author,
  date: c.timestamp,
}));
const payload: CommitsResponse = { knowledgeId, branch, commits };

[DONE] feature/tokensa — implement token usage tracking and fix queue-bootstrap schema mismatches
Fix/orgpaths
Pull request overview
Copilot reviewed 168 out of 173 changed files in this pull request and generated 13 comments.
Comments suppressed due to low confidence (1)
packages/types/src/knowledge.ts:37
KnowledgeInfo mixes three overlapping representations of the same data:
- repoUrl and git_url (snake_case vs camelCase for the same concept),
- branch and githubInfo.branchName,
- githubInfo.commitId/githubInfo.commitHashes duplicate the canonical fields on GithubKnowledgeSource.
The README explicitly states info is the single source of truth for repo coordinates with "no fallback" — but the shape published here invites the opposite: multiple writers will populate different fields, readers won't know which to trust, and the duplicated commit info will drift from source.commitId/source.commitHashes. Recommend removing git_url and githubInfo entirely and keeping KnowledgeInfo to { repoUrl, branch } (plus the open index signature).
Also, repoUrl is optional on the type while githubPullRoute.ts and githubCommitsRoute.ts now treat it as required (returning 422 when missing). Either make it required for github-kind docs, or model KnowledgeInfo as a discriminated union mirroring KnowledgeSource.
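One way to read the discriminated-union suggestion is sketched below. It is only an illustration of the reviewer's second option, not the PR's code: the kind field on info and the names GithubKnowledgeInfo, LocalKnowledgeInfo, KnowledgeInfoSketch and repoUrlOf are assumptions made for the example.

```typescript
// Hypothetical shape: repoUrl is required for github-kind info and absent for local.
interface GithubKnowledgeInfo {
  kind: "github";
  repoUrl: string;
  branch?: string;
  [key: string]: unknown; // open index signature kept, as the review suggests
}

interface LocalKnowledgeInfo {
  kind: "local";
  [key: string]: unknown;
}

type KnowledgeInfoSketch = GithubKnowledgeInfo | LocalKnowledgeInfo;

// Narrowing on `kind` gives a `string` repoUrl with no undefined check
// and no need for a route-level 422 on github docs.
function repoUrlOf(info: KnowledgeInfoSketch): string | null {
  return info.kind === "github" ? info.repoUrl : null;
}

console.log(repoUrlOf({ kind: "github", repoUrl: "https://github.com/acme/widgets" }));
// https://github.com/acme/widgets
console.log(repoUrlOf({ kind: "local" })); // null
```

The cost of this shape is a second discriminant that must stay in sync with source.kind; the reviewer's first option (a required repoUrl checked at write time) avoids that.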
"extends": "../../../../tsconfig.base.json",
"include": ["src/**/*.ts", "src/**/*.tsx", "src/**/*.json"]

router.post("/api/v1/github/probe", async (req: Request, res: Response) => {
  const body = req.body as ProbeBody;
  if (typeof body.repoUrl !== "string" || body.repoUrl.length === 0) {
    res.status(400).json({ error: "repoUrl required" });
    return;
  }
  const repoUrl = body.repoUrl;
  const gitToken = typeof body.gitToken === "string" && body.gitToken.length > 0 ? body.gitToken : undefined;
  const targetBranch = typeof body.branch === "string" && body.branch.length > 0 ? body.branch : undefined;

  const result = await fetchDefaultBranch(repoUrl, gitToken);
  switch (result.status) {
    case "ok": {
      const defaultBranch = result.branch;
      const branchesResult = await fetchBranches(repoUrl, gitToken);
      const branches = branchesResult.status === "ok" ? branchesResult.branches : [];

      if (targetBranch !== undefined && !branches.includes(targetBranch)) {
        const suggestions = branches
          .filter((b: string) => b.toLowerCase().includes(targetBranch.toLowerCase()))
          .slice(0, 10);
        res.status(404).json({
          status: "branch_not_found",
          message: `Branch '${targetBranch}' not found.`,
          branches: suggestions.length > 0 ? suggestions : branches.slice(0, 20),
        });
        return;
      }

      res.status(200).json({ status: "ok", defaultBranch, branches });
      break;
    }

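The branch-suggestion rule in this handler (case-insensitive substring match, at most 10 hits, otherwise the first 20 branches) is easier to unit-test as a pure function. A minimal sketch follows; suggestBranches is a hypothetical name and the extraction is not part of the PR.

```typescript
// Hypothetical extraction of the suggestion logic from the route handler.
function suggestBranches(branches: string[], target: string): string[] {
  const needle = target.toLowerCase();
  // Up to 10 branches whose name contains the requested name, ignoring case.
  const matches = branches.filter((b) => b.toLowerCase().includes(needle)).slice(0, 10);
  // With no partial match, fall back to the first 20 known branches.
  return matches.length > 0 ? matches : branches.slice(0, 20);
}

console.log(suggestBranches(["main", "develop", "feature/login"], "FEATURE")); // [ 'feature/login' ]
console.log(suggestBranches(["main", "develop"], "release")); // [ 'main', 'develop' ]
```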
source: { kind: "github" },
info: { repoUrl, ...(branch !== undefined ? { branch } : {}) },

"imports": {
  "#src/*": "./src/*"
},

refactor: enhance LLM usage reporting and remove deprecated pricing logic
…e analysis phases
…r summary processing
…y across multiple files
Pull request overview
Copilot reviewed 180 out of 185 changed files in this pull request and generated 21 comments.
Comments suppressed due to low confidence (1)
packages/server/src/githubIndexRoute.ts:41
KnowledgeDoc now requires an info object (repoUrl/branch live there), but upsertKnowledge() currently only writes { source, status, updatedAt } and does not persist doc.info. That will drop info.repoUrl/info.branch on insert, causing later pull/commits routes (which read knowledge.info.*) to fail with 422 for newly indexed repos.
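A fix along the lines this comment implies would add info to the fields the upsert writes. The sketch below only builds the MongoDB update document, using the standard $set and $setOnInsert operators. The body of upsertKnowledge is not shown on this page, so the function name buildKnowledgeUpdate, the simplified KnowledgeDocSketch type and the createdAt handling are assumptions.

```typescript
// Simplified, hypothetical stand-in for KnowledgeDoc.
interface KnowledgeDocSketch {
  knowledgeId: string;
  source: { kind: "github" } | { kind: "local"; sourcePath: string };
  info: { repoUrl?: string; branch?: string };
  status: { state: string };
  createdAt: Date;
  updatedAt: Date;
}

// Builds the update document for an upsert keyed on knowledgeId.
function buildKnowledgeUpdate(doc: KnowledgeDocSketch) {
  return {
    // `info` is written alongside source/status so repo coordinates survive the insert.
    $set: { source: doc.source, info: doc.info, status: doc.status, updatedAt: doc.updatedAt },
    // Only applied when the upsert creates the document.
    $setOnInsert: { createdAt: doc.createdAt },
  };
}

const now = new Date();
const update = buildKnowledgeUpdate({
  knowledgeId: "k-1",
  source: { kind: "github" },
  info: { repoUrl: "https://github.com/acme/widgets", branch: "main" },
  status: { state: "created" },
  createdAt: now,
  updatedAt: now,
});
console.log(update.$set.info); // { repoUrl: 'https://github.com/acme/widgets', branch: 'main' }
```

Such a document would be passed to updateOne({ knowledgeId }, update, { upsert: true }), and later reads of knowledge.info.repoUrl would then find the value the index route supplied.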
   "incremental": true,
-  "noEmit": true
+  "noEmit": false,
+  "emitDeclarationOnly": true

 {
-  "extends": "../../tsconfig.base.json",
-  "compilerOptions": {
-    "rootDir": "./src",
-    "outDir": "./dist"
-  },
-  "include": ["src/**/*"]
+  "extends": "../../../../tsconfig.base.json",
+  "include": ["src/**/*.ts", "src/**/*.tsx", "src/**/*.json"]

 {
-  "extends": "../../tsconfig.base.json",
-  "compilerOptions": {
-    "rootDir": "./src",
-    "outDir": "./dist"
-  },
-  "include": ["src/**/*"],
-  "references": [{ "path": "../ingest-github" }]
+  "extends": "../../../../tsconfig.base.json",
+  "include": ["src/**/*.ts", "src/**/*.tsx", "src/**/*.json"]

/**
 * Updates the branch name of a GitHub knowledge entry.
 */
export async function setKnowledgeBranch(knowledgeId: string, branch: string): Promise<void> {
  const result = await _getDb()
    .collection(Collections.Knowledge)
    .updateOne({ knowledgeId }, { $set: { "source.branch": branch, updatedAt: new Date() } });
  if (result.matchedCount === 0) {
    throw new KnowledgeNotFoundError(knowledgeId);
  }

 const doc: KnowledgeDoc = {
   knowledgeId,
-  source: { kind: "github", repoUrl, ...(branch !== undefined ? { branch } : {}) },
+  source: { kind: "github" },
+  info: { repoUrl, ...(branch !== undefined ? { branch } : {}) },
   status: { state: KnowledgeState.Created },

"main": "./src/index.ts",
"types": "./src/index.ts",
"exports": {
  ".": "./src/index.ts"
},
"imports": {
  "#src/*": "./src/*"

 export async function upsertKnowledgeNode(doc: KnowledgeDoc): Promise<void> {
   const sourceKind = doc.source.kind;
-  const sourceUrl = doc.source.kind === "github" ? doc.source.repoUrl : doc.source.sourcePath;
-  const branch = doc.source.kind === "github" ? (doc.source.branch ?? null) : null;
+  const sourceUrl = doc.source.kind === "github" ? (doc.info.repoUrl ?? "") : doc.source.sourcePath;
+  const branch = doc.source.kind === "github" ? (doc.info.branch ?? null) : null;
   await _runCypher(UPSERT_KNOWLEDGE, {

function buildLlmOptions(payload: BusinessContextProcessingPayload): BusinessContextLlmOptions {
  const opts: BusinessContextLlmOptions = {};
  if (payload.llmApiKey !== undefined) {
    opts.apiKey = payload.llmApiKey;
  }
  if (payload.llmProvider !== undefined) {
    opts.provider = payload.llmProvider;
  }
  if (payload.llmModel !== undefined) {
    opts.model = payload.llmModel;
  }

[TEST] improvement/flat-folder-throughput — parallel scan, batched folder summaries, batched Neo4j upserts, concurrency-limited backfill