
feat(ai-index): add configurable generation options #46

Merged: rubenmarcus merged 3 commits into multivmlabs:main from KimHyeongRae0:feat/ai-index-config-options on May 14, 2026

@KimHyeongRae0 (Contributor) commented May 6, 2026

Implements the approved ai-index configuration options from #32.

Changes:

  • Adds aiIndex.maxChunkLength and aiIndex.maxKeywords to config types and resolved defaults.
  • Threads the resolved options through page-based and content-directory ai-index generation.
  • Documents the new options in the configuration reference.
  • Adds tests for defaults, partial overrides, chunk count, and keyword caps.
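A minimal sketch of how the two options might resolve, assuming simplified stand-ins for the config types (`AeoConfigSketch`, `ResolvedAeoConfigSketch`, and `resolveAiIndex` are hypothetical names, not the project's actual exports). The defaults of 2000 and 10 are the values reported in the Greptile summary on this PR.

```typescript
// Hypothetical, trimmed-down shapes; the real AeoConfig has many more fields.
interface AiIndexOptions {
  maxChunkLength?: number;
  maxKeywords?: number;
}

interface AeoConfigSketch {
  aiIndex?: AiIndexOptions;
}

interface ResolvedAeoConfigSketch {
  aiIndex: Required<AiIndexOptions>;
}

// Resolve each option independently so a partial override keeps the other default.
function resolveAiIndex(config: AeoConfigSketch): ResolvedAeoConfigSketch {
  return {
    aiIndex: {
      maxChunkLength: config.aiIndex?.maxChunkLength ?? 2000,
      maxKeywords: config.aiIndex?.maxKeywords ?? 10,
    },
  };
}

const defaults = resolveAiIndex({});
const partial = resolveAiIndex({ aiIndex: { maxKeywords: 5 } });
const zero = resolveAiIndex({ aiIndex: { maxChunkLength: 0 } });
```

Because `??` only falls back on `null` or `undefined`, an explicit `0` passes through unchanged (as in `zero` above), so catching non-positive values needs a separate validation step.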

Validation:

  • npm run test -- --run src/core/ai-index.test.ts src/core/utils.test.ts
  • npx tsc --noEmit
  • npm run test -- --run
  • npm run build
  • git diff --check

Closes #32

vercel Bot commented May 6, 2026

@KimHyeongRae0 is attempting to deploy a commit to the Cytonic Team on Vercel.

A member of the Team first needs to authorize it.

@KimHyeongRae0 KimHyeongRae0 marked this pull request as ready for review May 9, 2026 15:31

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

greptile-apps Bot commented May 9, 2026

Greptile Summary

This PR adds two configurable options — aiIndex.maxChunkLength and aiIndex.maxKeywords — to control content chunking and keyword extraction in ai-index.json generation, implementing the design approved in #32.

  • Types & resolution: AeoConfig gains an optional aiIndex block; ResolvedAeoConfig gains the same block as required with defaults of 2000 / 10. resolveConfig wires the ??-based defaults and validateConfig warns on non-positive values for both options.
  • Core logic: chunkContent and extractKeywords drop their internal default parameters and instead receive values threaded from config.aiIndex, applied consistently in both the page-based and content-directory code paths.
  • Docs & tests: The configuration reference now accurately describes maxChunkLength as a soft (paragraph-boundary) limit; all test fixtures are updated and new scenario tests cover chunk count, keyword cap, and partial overrides.
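To illustrate the soft-limit and keyword-cap behaviour described above, here is a rough sketch. It is not the project's implementation: `chunkContentSketch` and `extractKeywordsSketch` are hypothetical, and the splitting rule (blank lines) and ranking rule (word frequency) are assumptions made for the example.

```typescript
// Hypothetical chunker: splits on blank lines and packs paragraphs greedily.
// A paragraph is never split, so one long paragraph can exceed maxChunkLength.
function chunkContentSketch(content: string, maxChunkLength: number): string[] {
  const paragraphs = content
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";

  for (const paragraph of paragraphs) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (current && candidate.length > maxChunkLength) {
      // Adding this paragraph would overflow: close the current chunk first.
      chunks.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Hypothetical keyword extractor: counts words of 4+ letters, ranks by
// frequency, and keeps at most maxKeywords entries.
function extractKeywordsSketch(content: string, maxKeywords: number): string[] {
  const counts = new Map<string, number>();
  for (const word of content.toLowerCase().match(/[a-z]{4,}/g) ?? []) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, Math.max(0, maxKeywords))
    .map(([word]) => word);
}

const longParagraph = "a".repeat(50);
const chunks = chunkContentSketch(`${longParagraph}\n\nshort\n\ntiny`, 20);
const keywords = extractKeywordsSketch("alpha alpha alpha beta beta gamma", 2);
```

With a limit of 20, the 50-character paragraph still becomes its own chunk, which is the case the documentation wording is meant to cover.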

Confidence Score: 5/5

The change is well-scoped and safe to merge.

The new configuration fields are resolved with ??-safe defaults, threaded correctly through both generation paths, and validated with appropriate warnings. All test fixtures compile, new tests exercise the key scenarios, and the documentation accurately describes the soft-limit behaviour. No functional regressions were found in the changed code paths.

No files require special attention; the one minor gap is a missing negative-value assertion in the maxKeywords test in src/core/utils.test.ts.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/core/ai-index.ts | Threads maxChunkLength and maxKeywords from config.aiIndex into chunkContent and extractKeywords; removes hardcoded defaults from both functions. Correct. |
| src/core/utils.ts | Adds aiIndex resolution block in resolveConfig with ??-based defaults, and two validation guards in validateConfig for non-positive values. Clean and consistent with existing patterns. |
| src/types.ts | Adds optional aiIndex block to AeoConfig and required aiIndex block to ResolvedAeoConfig. Naming is distinct from the boolean generators.aiIndex toggle. |
| src/core/ai-index.test.ts | Adds aiIndex to baseConfig fixture and two new scenario tests for chunk-count and keyword-cap behaviour. Logic is correct. |
| src/core/utils.test.ts | New validateConfig suite covers defaults, partial overrides, and valid/omitted cases well; maxKeywords negative-value test is missing despite the description claiming it. |
| website/src/content/docs/reference/configuration.mdx | New aiIndex reference table accurately documents both options with defaults and explicit soft-limit semantics for maxChunkLength. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["AeoConfig"] -->|"resolveConfig (??defaults 2000/10)"| B["ResolvedAeoConfig"]
    A -->|"validateConfig (warns if <= 0)"| W["warnings[]"]
    B --> G["generateAIIndex(config)"]
    G --> P["pages loop"]
    G --> C["collectAIIndexEntries (contentDir)"]
    P --> CC["chunkContent(content, maxChunkLength)"]
    P --> EK["extractKeywords(content, maxKeywords)"]
    C --> CC2["chunkContent(mainContent, maxChunkLength)"]
    C --> EK2["extractKeywords(mainContent, maxKeywords)"]
    CC --> E["AIIndexEntry[]"]
    EK --> E
    CC2 --> E
    EK2 --> E
    E --> D["deduplicate + sort"]
    D --> J["ai-index.json"]
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
src/core/utils.test.ts:94-98
The test description says "warns when aiIndex.maxKeywords is zero or negative", but only the zero case is exercised. The parallel `maxChunkLength` test covers both `0` and `-1`; `maxKeywords` should do the same to match the description and stay consistent.

```suggestion
    it('warns when aiIndex.maxKeywords is zero or negative', () => {
      expect(validateConfig({ aiIndex: { maxKeywords: 0 } })).toEqual(
        expect.arrayContaining([expect.stringContaining('aiIndex.maxKeywords')])
      );
      expect(validateConfig({ aiIndex: { maxKeywords: -1 } })).toEqual(
        expect.arrayContaining([expect.stringContaining('aiIndex.maxKeywords')])
      );
    });
```

Reviews (4): Last reviewed commit: "chore(ai-index): address Greptile P2 nit..."

Comment thread src/core/utils.ts
Comment thread website/src/content/docs/reference/configuration.mdx
@rubenmarcus (Member)

Thanks @KimHyeongRae0 — this is clean. The implementation matches the spec exactly: aiIndex added to both AeoConfig and ResolvedAeoConfig, defaults wired in resolveConfig, options threaded through both generateAIIndex and collectAIIndexEntries, and the configuration.mdx update is a nice bonus.

Greptile passed and flagged only two non-blocking P2s:

  1. Optional: Add a validateConfig warning when maxChunkLength or maxKeywords is ≤ 0 (mirrors the existing warning for negative crawlDelay). Sub-1 values still produce sensible behavior, but the warning would catch typos.
  2. Doc tweak: maxChunkLength description could say "Target chunk length (splits on paragraph boundaries — a single long paragraph can exceed this)" to avoid implying a hard cap.

Both are nice-to-haves, not merge blockers. Happy to either:

  • Merge as-is and you (or I) follow up with the two nits in a small PR
  • Wait if you want to fold them in here

Your call. Closes #32 either way.
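The optional warning from item 1 could look roughly like the sketch below. `validateAiIndexSketch` and the warning text are hypothetical; the only behaviour taken from this thread is that a value ≤ 0 produces a warning mentioning the option name, and that omitted options produce none.

```typescript
// Hypothetical, trimmed-down config shape for illustration only.
interface AiIndexOptions {
  maxChunkLength?: number;
  maxKeywords?: number;
}

interface AeoConfigSketch {
  aiIndex?: AiIndexOptions;
}

// Returns human-readable warnings instead of throwing, so a typo is
// surfaced without failing the build.
function validateAiIndexSketch(config: AeoConfigSketch): string[] {
  const warnings: string[] = [];
  const aiIndex: AiIndexOptions = config.aiIndex ?? {};

  if (aiIndex.maxChunkLength !== undefined && aiIndex.maxChunkLength <= 0) {
    warnings.push(
      `aiIndex.maxChunkLength should be a positive number (received ${aiIndex.maxChunkLength})`
    );
  }
  if (aiIndex.maxKeywords !== undefined && aiIndex.maxKeywords <= 0) {
    warnings.push(
      `aiIndex.maxKeywords should be a positive number (received ${aiIndex.maxKeywords})`
    );
  }
  return warnings;
}

const zeroKeywords = validateAiIndexSketch({ aiIndex: { maxKeywords: 0 } });
const negativeChunk = validateAiIndexSketch({ aiIndex: { maxChunkLength: -1 } });
const omitted = validateAiIndexSketch({});
const valid = validateAiIndexSketch({ aiIndex: { maxChunkLength: 2000, maxKeywords: 10 } });
```

Checking `!== undefined` before comparing keeps omitted options silent, so only values the user actually wrote can trigger a warning.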

1. validateConfig now warns when aiIndex.maxChunkLength or maxKeywords
   is ≤ 0 (mirrors the existing warning pattern for negative crawlDelay).
   Sub-1 values still produce sensible behavior at runtime — the warning
   surfaces typos instead of letting them silently produce odd output.

2. configuration.mdx maxChunkLength description now explicitly calls
   out that it's a soft limit (chunks split on paragraph boundaries,
   so a single long paragraph can exceed the value). Avoids the
   "hard cap" reading.

Added 4 validateConfig tests covering the new warnings + omission cases.
176 tests pass; typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rubenmarcus (Member)

Applied both Greptile P2 nits directly: validateConfig warning for sub-1 maxChunkLength/maxKeywords (with 4 new tests) and the configuration.mdx wording fix. 176 tests pass.

@greptileai re-review please.

@rubenmarcus rubenmarcus merged commit 7281ef0 into multivmlabs:main May 14, 2026
2 of 4 checks passed
@KimHyeongRae0 (Contributor, Author)

Thanks for folding those in.

Development

Successfully merging this pull request may close these issues.

Proposal: add configurable ai-index generation options

2 participants