fix(backend): decode percent-encoded repo names for generic Git URLs #1415

Closed
rachit367 wants to merge 2 commits into sourcebot-dev:main from rachit367:rachit367/fix-generic-git-url-decode

@rachit367

Fixes #1384

Problem

A generic Git connection pointing directly at an HTTP(S) remote whose path contains encoded characters (e.g. https://github.com/test/Project%20Name%20With%20Spaces.git) stores the repo name with the encoding intact — github.com/test/Project%20Name%20With%20Spaces — in name, displayName, and the zoekt metadata.

compileGenericGitHostConfig_url derives the name from remoteUrl.pathname without decoding, while the file-origin path (compileGenericGitHostConfig_file) already calls decodeURIComponent. So the same remote produces different Sourcebot/zoekt identifiers depending on whether it's configured as a direct URL or discovered from a local repository origin.
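The mismatch can be reproduced with the standard URL API alone. The snippet below is a hypothetical illustration rather than the code in repoCompileUtils.ts: the variable names and the ".git" stripping are assumptions made only to show how the raw and decoded pathnames diverge.

```typescript
// Hypothetical illustration; not the actual Sourcebot implementation.
const remoteUrl = new URL("https://github.com/test/Project%20Name%20With%20Spaces.git");

// URL.pathname keeps percent-escapes exactly as written in the URL.
const rawPath = remoteUrl.pathname.replace(/\.git$/, "");

// decodeURIComponent turns "%20" back into a literal space.
const decodedPath = decodeURIComponent(rawPath);

const rawName = `${remoteUrl.host}${rawPath}`;
const decodedName = `${remoteUrl.host}${decodedPath}`;

console.log(rawName);     // github.com/test/Project%20Name%20With%20Spaces
console.log(decodedName); // github.com/test/Project Name With Spaces
```

The first string corresponds to what the direct-URL path stores today, the second to what the file-origin path produces for the same remote.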

(Re: @brendan-kellam's question on the issue — that inconsistency across the two code paths is the concrete impact.)

Fix

Decode remoteUrl.pathname in compileGenericGitHostConfig_url before building the name, mirroring the existing file-origin behavior.
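A minimal sketch of that decode step, assuming the name is built from the host plus the pathname with a trailing ".git" removed. The function name deriveRepoName and the suffix handling are hypothetical; the real function also builds displayName and the zoekt metadata, which are omitted here.

```typescript
// Hypothetical sketch of the decode step; deriveRepoName and the ".git"
// stripping are assumptions, not the code in repoCompileUtils.ts.
function deriveRepoName(remoteUrl: URL): string {
    // Decode before building the name so "%20" becomes a space.
    // Caveat: decodeURIComponent throws URIError on malformed escapes like "%GG".
    const decodedPath = decodeURIComponent(remoteUrl.pathname);
    const withoutSuffix = decodedPath.replace(/\.git$/, "");
    return `${remoteUrl.host}${withoutSuffix}`;
}

const repoName = deriveRepoName(
    new URL("https://github.com/test/Project%20Name%20With%20Spaces.git"),
);
console.log(repoName); // github.com/test/Project Name With Spaces
```

URLs without any escapes pass through unchanged, so the decode only affects names that were previously stored in encoded form.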

Tests

Added a compileGenericGitHostConfig_url case asserting a %20-encoded URL yields github.com/test/Project Name With Spaces in name, displayName, and zoekt.name — matching the existing _file test. tsc --noEmit is clean for the backend package.

(Note: the repo's generic-git _url tests assert forward-slash-joined names and only pass on POSIX CI, since path.join uses the platform separator; they fail identically on Windows before this change. The decode logic itself is platform-independent.)
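The separator behavior can be checked on any platform with Node's path.posix and path.win32 variants. The inputs below are illustrative only and are not taken from the test suite.

```typescript
import * as path from "node:path";

// Illustrative inputs; the real tests use their own fixtures.
const hostName = "github.com";
const repoPath = "/test/Project Name With Spaces";

// path.join resolves to one of these two depending on the platform.
const posixName = path.posix.join(hostName, repoPath);
const win32Name = path.win32.join(hostName, repoPath);

console.log(posixName); // github.com/test/Project Name With Spaces
console.log(win32Name); // github.com\test\Project Name With Spaces
```

An assertion that expects forward slashes therefore matches only the POSIX result, independent of whether the pathname was decoded first.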

rachit367 added 2 commits July 2, 2026 13:54
compileGenericGitHostConfig_url derived the repo name from the raw
remoteUrl.pathname, so a direct URL like .../Project%20Name.git kept
%20 in name, displayName and zoekt metadata. The file-based path already
decodes via decodeURIComponent, so the same remote produced inconsistent
identifiers depending on how it was configured. Decode the pathname to
match.

Fixes sourcebot-dev#1384
@coderabbitai

coderabbitai Bot commented Jul 2, 2026

Note

Currently processing new changes in this PR. This may take a few minutes, please wait...

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ae5840f9-a96a-4d8d-95c7-c1f9a1e0eea8

📥 Commits

Reviewing files that changed from the base of the PR and between fd6720f and 1aaac6e.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • packages/backend/src/repoCompileUtils.test.ts
  • packages/backend/src/repoCompileUtils.ts

@rachit367

Closing in favor of #1389, which predates this (opened 2026-06-29) and fixes the same issue more completely — it wraps the decode in a helper that preserves malformed percent-escapes (e.g. Project%GGName) instead of letting decodeURIComponent throw during config compilation, which my version here doesn't handle. Deferring to @DivyamTalwar's PR. Thanks!
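For illustration, a tolerant decode could look like the sketch below. This is a hypothetical helper, not the one in #1389: the name safeDecode is an assumption, and it falls back to the whole raw string when decoding fails, whereas the actual helper may treat partially valid input differently.

```typescript
// Hypothetical helper; the helper in #1389 may be named and behave differently.
function safeDecode(value: string): string {
    try {
        return decodeURIComponent(value);
    } catch (error) {
        // decodeURIComponent signals malformed escapes with URIError.
        if (error instanceof URIError) {
            return value; // keep the raw string instead of failing compilation
        }
        throw error;
    }
}

console.log(safeDecode("Project%20Name")); // Project Name
console.log(safeDecode("Project%GGName")); // Project%GGName
```

With a wrapper like this, a malformed repo path degrades to its encoded form rather than aborting config compilation.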

@rachit367 rachit367 closed this Jul 2, 2026

Development

Successfully merging this pull request may close these issues.

Generic Git URL configs keep percent-encoded repo names