Skip to content

Merge test/fix index dorivika #59

Open
Dead-Bytes wants to merge 2 commits into
mainfrom
merge-test/fix_index_Dorivika
Open

Merge test/fix index dorivika #59
Dead-Bytes wants to merge 2 commits into
mainfrom
merge-test/fix_index_Dorivika

Conversation

@Dead-Bytes
Copy link
Copy Markdown
Collaborator

What changed

  • Added shared commit metadata helpers to validate and normalize indexed Git commit hashes.
  • Updated /api/v1/repos response shaping so legacy commitId: "latest" values do not leak into bytebell ls.
  • Updated bytebell ls and repo selector details to display the resolved indexed commit SHA instead of trusting raw source.commitId.
  • Updated GitHub pull no-op logic to use the same resolved indexed commit and normalize target SHAs before comparing.
  • Added write-side guards so invalid GitHub commit values such as "latest" cannot be persisted through setKnowledgeCommit or upsertKnowledge.
  • Added focused Bun tests for legacy "latest" metadata, mixed commit history shapes, invalid hashes, uppercase SHAs, dedupe, and empty/missing commit history.

Why

Fixes #55.

bytebell ls could show latest as the indexed commit for GitHub repositories. The issue was in the metadata/listing path: consumers trusted source.commitId directly, so a stale sentinel value like "latest" was rendered as if it were a real commit.

The fix makes commit display resolve only from valid 40-character Git SHAs. If commitId is invalid, ByteBell falls back to the newest valid SHA in commitHashes. Existing metadata is repaired at read time without requiring a Mongo migration, and future invalid writes are rejected.

Dorivika and others added 2 commits May 20, 2026 19:28
[TEST] bugfix/indexed-commit-display — resolve bytebell ls commit display
@Dead-Bytes Dead-Bytes requested review from Copilot and nitesh32 May 21, 2026 11:28
@Dead-Bytes Dead-Bytes added enhancement New feature or request Ready for Review Ready for review from reviewers testing changes are being tested labels May 21, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes issue #55 where GitHub repositories could display commitId: "latest" as if it were a real indexed commit. It introduces shared helpers to validate/normalize full 40-character Git SHAs, resolves the indexed commit from valid data only, and hardens both API responses and write paths to prevent invalid commit values from leaking or being persisted.

Changes:

  • Added commit SHA validation/normalization helpers (isFullCommitHash, normalizeCommitHashes, resolveIndexedCommit) and tests in @bb/types.
  • Normalized /api/v1/repos source payloads (including legacy shapes) so commitId="latest" doesn’t reach CLI consumers.
  • Updated CLI listing/details and GitHub pull no-op/validation logic to use resolved, normalized SHAs; added write-side guards in Mongo persistence.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
packages/types/src/knowledge.ts Adds commit-hash helpers and extends commitHashes typing to support legacy record entries.
packages/types/src/knowledge.test.ts Adds unit tests covering valid/invalid hashes, legacy shapes, normalization, and resolution behavior.
packages/types/src/index.ts Re-exports new helpers/types from @bb/types.
packages/server/src/reposRoute.ts Normalizes source in /api/v1/repos responses to avoid leaking legacy latest.
packages/server/src/knowledgeSourcePresenter.ts Adds presenter utilities to merge/normalize legacy repo metadata into a clean KnowledgeSource.
packages/server/src/knowledgeSourcePresenter.test.ts Adds tests ensuring legacy latest doesn’t leak and current sources are normalized.
packages/server/src/githubPullRoute.ts Uses shared hash validation + resolved indexed commit for no-op comparisons and validation.
packages/mongo/src/knowledge.ts Adds write-side commit validation/normalization guards for setKnowledgeCommit and upsertKnowledge.
packages/ingest-github/src/pipeline/run.ts Validates and lowercases resolved HEAD commit before persisting.
packages/ingest-github/src/pipeline/pull.ts Validates target/HEAD SHAs and uses resolved current commit for no-op checks.
packages/cli/src/repoSelectorPrompt.ts Displays resolved commit and normalized commit count in repo selector details.
packages/cli/src/LsCommand.ts Displays resolved commit and normalized commit count in bytebell ls.
Comments suppressed due to low confidence (2)

packages/server/src/githubPullRoute.ts:70

  • The 422 response message says the knowledge "has no commitId", but this branch now triggers when there is no resolvable full SHA (e.g. legacy commitId="latest" with empty/invalid commitHashes). Updating the message to reflect "no valid indexed commit" would make the error accurate and easier to debug.
    const currentCommit = resolveIndexedCommit(knowledge.source);
    if (currentCommit === undefined) {
      res.status(422).json({
        error: "knowledge not yet indexed",
        message: "pull requires a previously-indexed commit; this knowledge has no commitId. Run github_index first.",
      });

packages/ingest-github/src/pipeline/pull.ts:63

  • The error text says the knowledge "has no commitId", but the new logic rejects when there is no resolvable full SHA (including legacy commitId="latest"). Consider updating the message to something like "no valid indexed commit" so it matches the actual failure condition.
  const currentCommit = resolveIndexedCommit(knowledge.source);
  if (currentCommit === undefined) {
    throw new IngestError(
      knowledgeId,
      "pull requires a previously-indexed commit; this knowledge has no commitId. Run github_index first.",
    );

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +55 to +70
const hashes: string[] = [];
const seen = new Set<string>();
for (const item of value) {
const hash =
typeof item === "string"
? item
: typeof item === "object" && item !== null && typeof (item as { hash?: unknown }).hash === "string"
? (item as { hash: string }).hash
: "";
if (isFullCommitHash(hash)) {
const normalized = hash.toLowerCase();
if (!seen.has(normalized)) {
seen.add(normalized);
hashes.push(normalized);
}
}
Comment on lines 27 to 47
/**
* Records that this knowledge is now indexed at `commitHash`. Sets it as the
* current head pointer (`source.commitId`) and appends to the deduped history
* current indexed commit pointer (`source.commitId`) and appends to the deduped history
* array (`source.commitHashes`). Idempotent: re-recording the same commit is
* a no-op except for the `updatedAt` bump.
*
* Throws `KnowledgeNotFoundError` if the document doesn't exist.
*/
export async function setKnowledgeCommit(knowledgeId: string, commitHash: string): Promise<void> {
if (!isFullCommitHash(commitHash)) {
throw new Error(`invalid commit hash for knowledge ${knowledgeId}: ${commitHash}`);
}
const normalizedCommitHash = commitHash.toLowerCase();
const result = await _getDb()
.collection(Collections.Knowledge)
.updateOne(
{ knowledgeId },
{
$set: { "source.commitId": commitHash, updatedAt: new Date() },
$addToSet: { "source.commitHashes": commitHash },
$set: { "source.commitId": normalizedCommitHash, updatedAt: new Date() },
$addToSet: { "source.commitHashes": normalizedCommitHash },
},
@Dead-Bytes Dead-Bytes added Ready For Merge and removed testing changes are being tested labels May 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request Ready For Merge Ready for Review Ready for review from reviewers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[bug] All the repositories shows the commit as "latest"

3 participants