feat: hybrid search implementation by barbara-celi · Pull Request #82 · vtexdocs/components

barbara-celi · 2026-05-05T19:13:32Z

Description

This PR adds hybrid search backend support to @vtexdocs/components as an alternative to Algolia, enabling both Help Center and Dev Portal to use the new VTEX Docs Hybrid Search API.

Changes:

Extended SearchConfig with a new backend option: { backend: 'hybrid', hybrid: {...} }.
Implemented a hybrid search adapter that translates InstantSearch queries into /api/search calls and transforms the responses into an Algolia-compatible format.
Exported new types: HybridSearchConfig and SearchBackendConfig.
Maintained full backward compatibility. Existing Algolia implementations work unchanged.
The hybrid backend is opt-in via configuration, with no breaking changes to component APIs.
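
A minimal sketch of how the opt-in could look at the type level. The hybrid fields follow the HybridSearchConfig diff in this PR; the Algolia-side fields (`appId`, `apiKey`, `index`) and the `isHybrid` guard are assumptions for illustration, not the package's actual definitions.

```typescript
// Hybrid fields as shown in this PR's diff.
interface HybridSearchConfig {
  apiEndpoint: string
  source: 'help-center' | 'dev-portal'
  defaultLimit?: number
}

// ASSUMPTION: placeholder Algolia fields, not the package's real type.
interface AlgoliaBackendConfig {
  backend?: 'algolia' // optional so existing consumers need no change
  appId: string
  apiKey: string
  index: string
}

interface HybridBackendConfig {
  backend: 'hybrid'
  hybrid: HybridSearchConfig
}

type SearchBackendConfig = AlgoliaBackendConfig | HybridBackendConfig

// Hypothetical guard: only an explicit backend: 'hybrid' selects the new adapter.
function isHybrid(config: SearchBackendConfig): config is HybridBackendConfig {
  return config.backend === 'hybrid'
}

const legacy: SearchBackendConfig = { appId: 'APP_ID', apiKey: 'KEY', index: 'docs' }
const hybrid: SearchBackendConfig = {
  backend: 'hybrid',
  hybrid: { apiEndpoint: '/api/search', source: 'help-center' },
}
```

Keeping `backend` optional on the Algolia branch is what would let existing configurations keep working without edits.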

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Requires change to documentation, which has been updated accordingly.

This commit introduces a new hybrid search adapter for the `@vtexdocs/components` package, allowing integration with the Help Center's API while maintaining backward compatibility with Algolia. Key changes include the addition of a new `HybridSearchConfig` interface, updates to the `search-config.ts` file to support hybrid search, and modifications to the `SearchConfig` function to handle both Algolia and hybrid configurations. The implementation aims for minimal code changes and reuses existing components.

This update modifies the request selection logic in the `search-config.ts` file to prioritize requests with a non-empty query. If no such request is found, it defaults to the first request in the array. This change enhances the hybrid search functionality by ensuring more relevant queries are processed.
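
The selection rule described above can be sketched as follows. The `SearchRequest` shape and the `selectRequest` name are assumptions for illustration; the actual code in `search-config.ts` is not shown in this conversation, and whether whitespace-only queries count as empty is not stated.

```typescript
// ASSUMPTION: simplified request shape, loosely modeled on what InstantSearch
// passes to a custom search client.
interface SearchRequest {
  indexName: string
  params?: { query?: string }
}

// Prefer the first request with a non-empty query; otherwise fall back to
// the first request in the array (undefined if the array is empty).
function selectRequest(requests: SearchRequest[]): SearchRequest | undefined {
  const withQuery = requests.find((r) => (r.params?.query ?? '').length > 0)
  return withQuery ?? requests[0]
}
```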


PedroAntunesCosta · 2026-05-20T15:36:55Z

+export interface HybridSearchConfig {
+  apiEndpoint: string
+  source: 'help-center' | 'dev-portal'
+  defaultLimit?: number


The helpcenter consumer (PR vtexdocs/helpcenter#456, src/utils/libraryConfig.ts:13) passes itemsPerPage: 10 into this config object, expecting it to set the default page size. With the field named defaultLimit here, that value is silently ignored — the destructure on line 77 falls back to the hardcoded 10.

Suggest aligning the names so the contract is consistent. Two options:

1. Rename here to itemsPerPage?: number (matches the surrounding InstantSearch / Algolia vocabulary; no consumer change needed).

2. Keep defaultLimit and update libraryConfig.ts on the helpcenter side to use the same name.

Either works; option 1 is friendlier to existing widget terminology, but option 2 keeps the components-side semantics explicit. Worth picking one before merge so the field actually has effect.

Another argument for option 2 is that it keeps components compatible with other portals.
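
A small sketch of why the mismatch goes unnoticed. `resolveLimit` is a hypothetical stand-in for the destructure mentioned above, and the value 25 is illustrative: the real consumer passes 10, which happens to equal the fallback and so hides the problem.

```typescript
interface HybridSearchConfig {
  apiEndpoint: string
  source: 'help-center' | 'dev-portal'
  defaultLimit?: number
}

// Hypothetical stand-in for the adapter's destructure: a missing
// defaultLimit falls back to the hardcoded 10.
function resolveLimit(config: HybridSearchConfig): number {
  const { defaultLimit = 10 } = config
  return defaultLimit
}

// Config built in a separate variable, as a consumer's config file might do.
// TypeScript does not run excess-property checks on non-literal arguments,
// so the unknown itemsPerPage field compiles without complaint.
const consumerConfig = {
  apiEndpoint: '/api/search',
  source: 'help-center' as const,
  itemsPerPage: 25,
}

const ignored = resolveLimit(consumerConfig) // 10: itemsPerPage has no effect
const honored = resolveLimit({
  apiEndpoint: '/api/search',
  source: 'help-center',
  defaultLimit: 25,
}) // 25
```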

PedroAntunesCosta · 2026-05-20T16:40:01Z

+    content: result.snippet || result.content || '',
+    hierarchy,
+    language: result.metadata?.locale || 'en',
+    type: 'content',
+    _highlightResult: {
+      content: {
+        value: result.snippet || result.content || '',
+        matchLevel: 'full',
+        fullyHighlighted: false,
+        matchedWords: [],
+      },
+      hierarchy: {
+        lvl0: {
+          value: hierarchy.lvl0,
+          matchLevel: 'none',
+        },
+        lvl1: {
+          value: hierarchy.lvl1,
+          matchLevel: result.title ? 'partial' : 'none',
+        },
+      },
+    },
+    _snippetResult: {
+      content: {
+        value: result.snippet || '',
+        matchLevel: 'full',
+      },
+    },
+  }


Issue: snippets render as raw markdown in the Help Center search results

A query like `sku` surfaces snippets such as `| sku_manufacturer_code | character varying(65535) | Code used by merchant to reference the manufacturer.`, which is literal markdown table syntax instead of plain text.

Why

The upstream /api/hybrid-search returns each hit's snippet as a raw substring of the indexed .md source. In transformHybridToAlgolia, that raw string is forwarded into the InstantSearch hit shape at three points:

content (line 337)

_highlightResult.content.value (line 343)

_snippetResult.content.value (line 361)

connectHighlight then renders _highlightResult.content.value as plain text inside SearchCard, so markdown characters appear verbatim. Algolia does not have this problem because its indexing pipeline strips markdown before storing the content attribute, so the search client only ever sees plain text.

Recommendation

Strip markdown inside transformHybridToAlgolia before assigning the snippet:

// use `cleanedSnippet` for `content`, `_highlightResult.content.value`, and `_snippetResult.content.value`
const cleanedSnippet = stripMarkdown(result.snippet || result.content || '')

A small regex pass (headings, emphasis, links, code fences, table pipes) or a strip-markdown + remark round-trip is enough.

Doing it here — rather than in customHighlight.tsx — avoids corrupting InstantSearch's highlight boundaries, which by that point are already split fragments. The adapter is the single choke point through which every hybrid hit flows, so the fix stays isolated.
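
One possible shape for the regex pass, as a rough sketch. `stripMarkdown` is named in the recommendation but not defined in the PR, so everything below is an assumption; it covers only the constructs listed above and is not a full markdown parser.

```typescript
// Hypothetical helper: a best-effort regex pass, not a complete markdown parser.
function stripMarkdown(input: string): string {
  return (
    input
      // Fence lines (three or more backticks or tildes): drop the marker, keep the code.
      .replace(/^[ \t]*(`{3,}|~{3,}).*$/gm, '')
      // Table separator rows such as |---|:---:|
      .replace(/^[ \t]*\|?[ \t]*:?-{3,}:?[ \t]*(\|[ \t]*:?-{3,}:?[ \t]*)*\|?[ \t]*$/gm, '')
      // Images and links: keep only the alt or link text.
      .replace(/!?\[([^\]]*)\]\([^)]*\)/g, '$1')
      // ATX heading markers at the start of a line.
      .replace(/^[ \t]{0,3}#{1,6}[ \t]+/gm, '')
      // Inline code spans.
      .replace(/`([^`]+)`/g, '$1')
      // Asterisk emphasis; "a * b" is left alone because the marker must touch the text.
      .replace(/(\*\*|\*)(\S|\S.*?\S)\1/g, '$2')
      // Underscore emphasis only at word edges, so snake_case identifiers survive.
      .replace(/(^|\W)(__|_)(\S|\S.*?\S)\2(?!\w)/g, '$1$3')
      // Table cell pipes become spaces, then whitespace is collapsed.
      .replace(/\|/g, ' ')
      .replace(/\s+/g, ' ')
      .trim()
  )
}

const cleaned = stripMarkdown(
  '| sku_manufacturer_code | character varying(65535) | Code used by merchant to reference the manufacturer. |'
)
// cleaned === 'sku_manufacturer_code character varying(65535) Code used by merchant to reference the manufacturer.'
```

The underscore rule is deliberately conservative: identifiers such as `sku_manufacturer_code` are common in these docs, and a naive emphasis pattern would delete their underscores.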

Long-term

The proper fix is upstream in vtexdocs/vtexdocs-mcp-app: either index plain text or return a sanitized snippet. This is the same surface as vtexdocs-mcp-app#46 (server-side facet counts) and is tracked in EDU-18399. Once that lands, the adapter-level stripping can be removed.

barbara-celi added 2 commits May 5, 2026 13:02

barbara-celi requested a review from PedroAntunesCosta May 5, 2026 19:13

barbara-celi self-assigned this May 5, 2026

barbara-celi added the release-minor Minor version bump label May 5, 2026

github-actions Bot approved these changes May 5, 2026


barbara-celi changed the title ~~[EDU-17906] - feat: hybrid search implementation~~ feat: hybrid search implementation May 5, 2026

barbara-celi mentioned this pull request May 5, 2026

[EDU-17906] feat: implement Hybrid Search API integration vtexdocs/helpcenter#456

Open

4 tasks

docs: remove outdated hybrid search implementation documentation

946079e

barbara-celi mentioned this pull request May 5, 2026

feat(hybrid-search): implement hybrid search API and documentation vtexdocs/devportal#1250

Draft

4 tasks

barbara-celi added 3 commits May 13, 2026 10:00

feat: enhance hybrid search configuration with caching and upstream fetch size

dfe539e

fix: update search results pagination and improve request handling

96433c2

refactor: streamline InfiniteHits component export and add TypeScript type assertion

977f53a

PedroAntunesCosta reviewed May 20, 2026

barbara-celi wants to merge 6 commits into main from feat/hybrid-search

2 participants