fix(audit): align schema pattern detection #50

Open
KimHyeongRae0 wants to merge 1 commit into multivmlabs:main from KimHyeongRae0:fix/audit-schema-detection

@KimHyeongRae0
Contributor

@KimHyeongRae0 KimHyeongRae0 commented May 12, 2026

Fixes audit/schema FAQ and HowTo detection drift. Validation: npm run lint, npm run test -- --run, npm run build.

@vercel

vercel Bot commented May 12, 2026

@KimHyeongRae0 is attempting to deploy a commit to the Cytonic Team on Vercel.

A member of the Team first needs to authorize it.

@KimHyeongRae0 force-pushed the fix/audit-schema-detection branch from fb7823c to 0663375 on May 12, 2026 16:48
@KimHyeongRae0 changed the title from "[codex] fix audit schema detection" to "fix(audit): align schema pattern detection" on May 12, 2026
@rubenmarcus
Member

Strong fix @KimHyeongRae0 — the audit was using regex shortcuts to detect FAQ/HowTo while the schema generator used proper detection logic, so the audit could report "FAQPage schema eligible" while the schema generator would silently produce nothing. Extracting detectFaqPatterns and detectHowToSteps into schema-patterns.ts and having both call sites consume the same source is exactly the right shape.

Three new tests covering the three cases (FAQ without answer, single-step "HowTo", real FAQ) make the regression boundary clear.

Approving. Will merge once the open queue clears.

@KimHyeongRae0 KimHyeongRae0 marked this pull request as ready for review May 14, 2026 07:35

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@KimHyeongRae0
Contributor Author

Thanks. Marked this ready for review.

@greptile-apps

greptile-apps Bot commented May 14, 2026

Greptile Summary

This PR eliminates the drift between how audit.ts detected FAQ/HowTo content and how schema.ts actually generated structured data. The two private detection functions (detectFaqPatterns, detectHowToSteps) are extracted from schema.ts into a new shared schema-patterns.ts module, and audit.ts now calls those shared functions instead of its own broader inline regexes.

  • Behavioral tightening in audit: the old audit FAQ regex matched any heading ending with ? (e.g. ## Pricing?); the new check requires a recognised question word and a non-empty answer paragraph, matching the actual schema-generation criteria. Similarly the old HowTo regex matched a single ## How to … heading; the new check requires at least two numbered step headings.
  • Refactor in schema.ts: the two private functions are deleted and re-imported from the shared module — schema generation behaviour is unchanged.
  • New tests cover two negative cases (non-question heading, single HowTo step) and one positive FAQ case, but there is no positive test for the HowTo 2-step pass path.
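The FAQ tightening described above can be illustrated with a minimal sketch. It is not the code from schema-patterns.ts, which is not shown in this thread: the question-word list, the `FaqItem` shape, the exact regexes, and the name `detectFaqSketch` are all assumptions chosen to match the summary's wording (question-word heading plus a non-empty answer).

```typescript
// Assumed question-word list; the real module's list is not shown in this thread.
const QUESTION_WORDS = ['what', 'why', 'how', 'when', 'where', 'who', 'which', 'can', 'does', 'is'];

interface FaqItem {
  question: string;
  answer: string;
}

// Broad check in the spirit of the old audit as summarised: any heading ending in "?".
const broadFaqRegex = /^#{1,6}\s+.+\?\s*$/m;

// Stricter sketch: question-word heading plus a non-empty answer before the next heading.
function detectFaqSketch(content: string): FaqItem[] {
  const lines = content.split('\n');
  const items: FaqItem[] = [];
  for (let i = 0; i < lines.length; i++) {
    const heading = /^#{1,6}\s+(.+\?)\s*$/.exec(lines[i]);
    if (!heading) continue;
    const question = heading[1].trim();
    const firstWord = question.split(/\s+/)[0].toLowerCase();
    if (QUESTION_WORDS.indexOf(firstWord) === -1) continue;

    // Gather text up to the next heading and treat it as the answer.
    const answerLines: string[] = [];
    for (let j = i + 1; j < lines.length && !/^#{1,6}\s/.test(lines[j]); j++) {
      answerLines.push(lines[j]);
    }
    const answer = answerLines.join(' ').replace(/\s+/g, ' ').trim();
    if (answer.length > 0) items.push({ question, answer });
  }
  return items;
}

const pricing = '## Pricing?\n\nPlans are listed on the pricing page.';
const realFaq = '## What is aeo.js?\n\naeo.js generates answer-engine assets for websites.';

console.log(broadFaqRegex.test(pricing));     // true: the broad regex accepts it
console.log(detectFaqSketch(pricing).length); // 0: heading has no question word
console.log(detectFaqSketch(realFaq).length); // 1
```

The `## Pricing?` page is the kind of input where a broad audit check and a stricter generator would disagree, which is the drift this PR removes by routing both through one function.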

Confidence Score: 4/5

Safe to merge; the refactoring faithfully replicates the original detection logic in both consumers, and the intentional tightening of the audit criteria aligns it with what schema generation actually produces.

The only gap is a missing positive test for the HowTo 2-step path — if a future change accidentally breaks the steps.length >= 2 guard or the step regex, no test would catch it on the audit side.

src/core/audit.test.ts — the HowTo passing path is the one area that would benefit from an additional test case.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/core/schema-patterns.ts | New shared module exporting detectFaqPatterns and detectHowToSteps, extracted verbatim from schema.ts; functions are identical to their originals. |
| src/core/audit.ts | Replaced inline regexes in auditSchemaPresence and auditCitability with calls to shared detection functions; behavioral narrowing is intentional alignment but the HowTo passing path lacks a test. |
| src/core/schema.ts | Removed both private detection functions and added import from schema-patterns.ts; behavior of schema generation is unchanged. |
| src/core/audit.test.ts | Adds three schema-presence tests (two negative, one positive FAQ), but omits a positive HowTo test (2+ steps), leaving that pass path uncovered. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[auditSite] --> B[auditSchemaPresence]
    A --> C[auditCitability]

    B --> D{any page matches?}
    D -->|FAQ path| E["detectFaqPatterns\nquestion-word heading + answer text"]
    D -->|HowTo path| F["detectHowToSteps\nnumbered steps, min 2 required"]
    E --> G[hasFaqOrHowTo]
    F --> G

    C --> H{any page matches?}
    H -->|FAQ path| I["detectFaqPatterns\nsame shared function"]
    I --> J[hasFaq]

    subgraph SP["schema-patterns.ts (shared)"]
        E
        F
        I
    end

    OLD1["audit.ts old: broad regex\nany heading ending with ?"] -."replaced by".-> E
    OLD2["audit.ts old: How to heading\nor Step N heading"] -."replaced by".-> F
    OLD3["schema.ts old: private fns\nidentical logic"] -."extracted to".-> SP
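The call-site wiring in the flowchart can be sketched roughly as follows. This is an illustration under assumptions rather than the code in audit.ts: the `Page` and `Detector` types are hypothetical, `hasFaqOrHowTo` and `hasFaq` are borrowed from the flowchart node labels and written here as standalone functions, and the detectors are passed in as parameters so the sketch runs without the real schema-patterns.ts.

```typescript
// Hypothetical shapes; the project's real page and check types are not shown here.
interface Page {
  pathname: string;
  content: string;
}

// A detector returns the items it found; an empty array means "no match".
type Detector = (content: string) => unknown[];

// Mirrors the auditSchemaPresence branch: FAQ or HowTo on any page.
function hasFaqOrHowTo(pages: Page[], detectFaq: Detector, detectHowTo: Detector): boolean {
  return pages.some(
    (p) => detectFaq(p.content).length > 0 || detectHowTo(p.content).length > 0
  );
}

// Mirrors the auditCitability branch: FAQ only, using the same shared detector.
function hasFaq(pages: Page[], detectFaq: Detector): boolean {
  return pages.some((p) => detectFaq(p.content).length > 0);
}

// Stand-in detectors for demonstration only; far cruder than the real ones.
const stubFaq: Detector = (c) => (c.indexOf('## What') === 0 ? [c] : []);
const stubHowTo: Detector = (c) => (c.split('## Step ').length - 1 >= 2 ? [c] : []);

const installPages: Page[] = [
  { pathname: '/install', content: '## Step 1: Install\n\n## Step 2: Configure' },
];

console.log(hasFaqOrHowTo(installPages, stubFaq, stubHowTo)); // true
console.log(hasFaq(installPages, stubFaq));                   // false
```

Passing the same detector to both checks is the point of the shared module: whatever the audit counts as FAQ content is, by construction, what the generator would also accept.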

Reviews (1): Last reviewed commit: "fix(audit): align schema pattern detecti..."

Comment thread src/core/audit.test.ts
Comment on lines +111 to +135
it('requires at least two HowTo steps when auditing schema presence', () => {
  const config = makeConfig({
    pages: [{
      pathname: '/install',
      content: '## Step 1: Install\n\nRun npm install aeo.js.',
    }],
  });
  const result = auditSite(config);
  const schemaPresence = result.categories.find(c => c.name === 'Schema Presence')!;
  const faqOrHowToCheck = schemaPresence.checks.find(c => c.label === 'FAQPage or HowTo schema');
  expect(faqOrHowToCheck?.passed).toBe(false);
});

it('passes schema presence for FAQ content that can generate FAQPage schema', () => {
  const config = makeConfig({
    pages: [{
      pathname: '/faq',
      content: '## What is aeo.js?\n\naeo.js generates answer-engine assets for websites.',
    }],
  });
  const result = auditSite(config);
  const schemaPresence = result.categories.find(c => c.name === 'Schema Presence')!;
  const faqOrHowToCheck = schemaPresence.checks.find(c => c.label === 'FAQPage or HowTo schema');
  expect(faqOrHowToCheck?.passed).toBe(true);
});

P2 Missing positive HowTo test case

The three new tests cover a non-question heading that should fail, a single-step heading that should fail, and a valid FAQ that should pass, but there is no test asserting that content with two or more step headings produces `passed: true`. Because `detectHowToSteps` returns `[]` for fewer than 2 steps, the only way to confirm the HowTo pass path works is to test it with at least two `## Step N:` headings. Without this, a regression in the `>= 2` guard or the step regex could silently go undetected.
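A rough illustration of the boundary this comment asks to cover, written as a self-contained sketch. `detectHowToStepsSketch`, the `HowToStep` shape, and the step regex are hypothetical stand-ins, since the real `detectHowToSteps` is not shown in this thread; only the "fewer than two steps returns `[]`" behaviour is taken from the comment.

```typescript
interface HowToStep {
  position: number;
  name: string;
}

// Hypothetical stand-in for detectHowToSteps; the real regex is not shown in this thread.
function detectHowToStepsSketch(content: string): HowToStep[] {
  const steps: HowToStep[] = [];
  for (const line of content.split('\n')) {
    const match = /^#{1,6}\s+Step\s+(\d+)[:.]?\s*(.*)$/i.exec(line);
    if (match) steps.push({ position: Number(match[1]), name: match[2].trim() });
  }
  // Fewer than two steps is treated as "not a HowTo", per the review comment.
  return steps.length >= 2 ? steps : [];
}

// Same single-step content as the existing negative test, then a second step added.
const oneStep = '## Step 1: Install\n\nRun npm install aeo.js.';
const twoSteps = oneStep + '\n\n## Step 2: Configure\n\nAdd the plugin to your build config.';

console.log(detectHowToStepsSketch(oneStep).length);  // 0: below the two-step minimum
console.log(detectHowToStepsSketch(twoSteps).length); // 2: the pass path
```

In the project's own test file, the equivalent case would presumably pass a two-step content string through `makeConfig` and `auditSite` and expect the 'FAQPage or HowTo schema' check to report `passed` as `true`, mirroring the two neighbouring tests.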

2 participants