Fix zero-row populate retry and refresh model config by giaphutran12 · Pull Request #117 · tinyfish-io/bigset

giaphutran12 · 2026-06-02T18:55:09Z

Summary

make refresh agents use the configured investigate model instead of hardcoded Qwen
tighten populate instructions so search/fetch-only runs must hand concrete leads to subagents
retry populate once with stricter instructions when the first pass inserts zero rows, then fail clearly if still empty

Verification

backend: npm ci --cache /private/tmp/bigset-npm-cache
backend: npm run build
git diff --check origin/main...HEAD
public PR gate: git diff --name-status origin/main...HEAD

coderabbitai · 2026-06-02T18:55:22Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 24a54e67-23a3-4f72-bb64-b15378ae695d

📥 Commits

Reviewing files that changed from the base of the PR and between a79cf22 and 3d87f56.

📒 Files selected for processing (2)

backend/src/mastra/agents/refresh.ts
backend/src/mastra/workflows/populate.ts

🚧 Files skipped from review as they are similar to previous changes (2)

backend/src/mastra/workflows/populate.ts
backend/src/mastra/agents/refresh.ts

📝 Walkthrough

This PR improves the populate orchestration pipeline and aligns agent model configuration. The populate agent's INSTRUCTIONS prompt is restructured into explicit numbered workflow steps with a "CRITICAL" checklist that enforces subagent invocation for concrete leads and clarifies row-limit termination. The populate workflow is enhanced with row-count verification after generation: if zero rows are inserted, it logs a warning and retries with stricter subagent requirements; if still empty after retry, it throws an error; otherwise it returns combined output. The refresh agent now derives its OpenRouter model from configured authContext instead of using a hardcoded value.

Sequence Diagram

sequenceDiagram
  participant PopulateWorkflow as Populate Workflow
  participant PopulateAgent as Populate Agent
  participant Subagent as Subagent
  participant Dataset as Dataset
  PopulateWorkflow->>PopulateAgent: agent.generate(initial prompt) with run_subagent requirement
  PopulateAgent->>Subagent: run_subagent(concrete leads)
  Subagent->>Dataset: insert rows
  PopulateWorkflow->>Dataset: count inserted rows
  alt rows == 0
    PopulateWorkflow->>PopulateAgent: agent.generate(retry prompt) with stricter subagent requirements
    PopulateAgent->>Subagent: run_subagent(3-5 candidates)
    Subagent->>Dataset: insert additional rows
    PopulateWorkflow->>Dataset: count rows again
    alt still 0 rows
      PopulateWorkflow->>PopulateWorkflow: throw error
    else rows > 0
      PopulateWorkflow->>PopulateWorkflow: return original + retry output
    end
  else rows > 0
    PopulateWorkflow->>PopulateWorkflow: return output
  end
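
The flow in the diagram can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the code in backend/src/mastra/workflows/populate.ts: the `OrchestratorAgent` and `PopulateDeps` interfaces, the `populateWithRetry` helper, and the `countRows` and `warn` callbacks are all hypothetical names introduced here. The only details taken from the review are the single retry, the clear failure when the dataset is still empty, and the 80-step budget for both passes.

```typescript
// Hypothetical stand-ins for the real agent and dataset access; the actual
// workflow code in the repository may be shaped differently.
interface GenerateResult {
  text: string;
}

interface OrchestratorAgent {
  generate(prompt: string, options: { maxSteps: number }): Promise<GenerateResult>;
}

interface PopulateDeps {
  agent: OrchestratorAgent;
  countRows: () => Promise<number>; // rows inserted so far for this dataset
  warn: (message: string) => void;
}

// Both passes get the same step budget, as requested in the review.
const MAX_STEPS = 80;

async function populateWithRetry(
  deps: PopulateDeps,
  prompt: string,
  retryPrompt: string,
): Promise<string> {
  const first = await deps.agent.generate(prompt, { maxSteps: MAX_STEPS });
  if ((await deps.countRows()) > 0) {
    return first.text; // first pass inserted rows, no retry needed
  }

  deps.warn("populate inserted zero rows; retrying once with stricter instructions");
  const retry = await deps.agent.generate(retryPrompt, { maxSteps: MAX_STEPS });

  if ((await deps.countRows()) === 0) {
    // Fail clearly instead of reporting an empty dataset as a success.
    throw new Error("populate inserted zero rows after one retry");
  }
  return `${first.text}\n${retry.text}`; // original + retry output
}
```

Passing the row counter in as a callback keeps the retry decision independent of how rows are stored, which also makes the zero-row path easy to exercise with a stub agent.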

Possibly related PRs

tinyfish-io/bigset#111: Overlaps on populate orchestrator updates including explicit stopping at ROW_LIMIT_REACHED / 100-row cap and run_subagent dispatch behavior.
tinyfish-io/bigset#85: Related changes to the populate workflow and agent execution/error handling.
tinyfish-io/bigset#81: Related orchestration and investigate_row subagent architecture that this PR builds on.

Suggested reviewers

simantak-dabhade

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically summarizes the main changes: fixing zero-row populate retry logic and refresh model configuration. |
| Description check | ✅ Passed | The description is directly related to the changeset, providing clear context for all three modified files and explaining the rationale behind each change. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

backend/src/mastra/agents/refresh.ts (1)
59-68: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate the investigateSubagent model slug before calling openrouter()

In backend/src/mastra/agents/refresh.ts, authContextSchema defines modelConfig and enforces investigateSubagent: z.string().min(1), so the authContext.modelConfig! non-null assertion should be safe. What’s still missing is validation that investigateSubagent is a valid OpenRouter model identifier/format—invalid strings can still fail at runtime when passed to openrouter(modelSlug). Add schema refinement (or a runtime guard) to enforce the expected model-id format/range.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/src/mastra/agents/refresh.ts` around lines 59 - 68, Validate the
investigateSubagent value before passing it to openrouter: add a schema
refinement to authContextSchema (or a runtime guard near refresh agent creation)
that enforces the allowed OpenRouter model-id format/range, then use the
validated value instead of raw authContext.modelConfig!.investigateSubagent;
specifically update the authContextSchema or add a helper that checks
authContext.modelConfig.investigateSubagent (the investigateSubagent string)
against the OpenRouter model-id pattern/range and throw/handle a clear error if
invalid before calling openrouter(modelSlug) in the Agent constructor.
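
A runtime guard of the kind suggested above might look like the sketch below. It is an illustration under stated assumptions: `assertModelSlug` and `MODEL_SLUG_PATTERN` are hypothetical names, and the `provider/model` shape with an optional `:variant` suffix is an assumed pattern, not an official OpenRouter specification. The real fix in the PR may use a schema refinement on authContextSchema instead.

```typescript
// Assumed slug shape: "<provider>/<model>" with an optional ":variant" suffix.
// This pattern is an illustration only; adjust it to the identifiers the
// deployment actually accepts.
const MODEL_SLUG_PATTERN =
  /^[a-z0-9][a-z0-9._-]*\/[a-z0-9][a-z0-9._-]*(:[a-z0-9._-]+)?$/i;

// Hypothetical helper: returns the trimmed slug, or throws a clear error
// before the value can reach the model provider.
function assertModelSlug(value: unknown): string {
  if (typeof value !== "string") {
    throw new Error("investigateSubagent model slug must be a string");
  }
  const slug = value.trim();
  if (!MODEL_SLUG_PATTERN.test(slug)) {
    throw new Error(`Invalid investigateSubagent model slug: "${slug}"`);
  }
  return slug;
}
```

The validated return value, rather than the raw authContext.modelConfig!.investigateSubagent string, would then be what is passed to openrouter(modelSlug), so a malformed configuration fails with a readable message at agent creation.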

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/src/mastra/workflows/populate.ts`:
- Around line 252-285: The retry call to the orchestrator uses the wrong step
budget—change the second agent.generate call that produces retryResult to use
maxSteps: 80 (matching the initial run) instead of 40; update the call site
where retryResult is created (agent.generate(retryPrompt, { maxSteps: 40 })) so
the retry has the full orchestrator budget and then continue to record metrics
and re-check rowCount as currently done.

---

Outside diff comments:
In `@backend/src/mastra/agents/refresh.ts`:
- Around line 59-68: Validate the investigateSubagent value before passing it to
openrouter: add a schema refinement to authContextSchema (or a runtime guard
near refresh agent creation) that enforces the allowed OpenRouter model-id
format/range, then use the validated value instead of raw
authContext.modelConfig!.investigateSubagent; specifically update the
authContextSchema or add a helper that checks
authContext.modelConfig.investigateSubagent (the investigateSubagent string)
against the OpenRouter model-id pattern/range and throw/handle a clear error if
invalid before calling openrouter(modelSlug) in the Agent constructor.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5c3eb323-cf1e-4082-8d13-9a84f32e7116

📥 Commits

Reviewing files that changed from the base of the PR and between 07b496c and 6725f4c.

📒 Files selected for processing (3)

backend/src/mastra/agents/populate.ts
backend/src/mastra/agents/refresh.ts
backend/src/mastra/workflows/populate.ts

giaphutran12 · 2026-06-06T02:40:23Z

Rechecked this head before asking for review.

Current state:

CodeRabbit, TruffleHog, and OSV all pass.
backend: npm ci && npm run build passes in a clean temp worktree.
Earlier CodeRabbit points are addressed in head: retry maxSteps is now 80, and investigateSubagent is validated before openrouter(modelSlug).

@MMeteorL @simantak-dabhade could one of you do the non-author review when you have a minute? Looks good from my side.

Fix zero-row populate retry and refresh model config

6725f4c

giaphutran12 self-assigned this Jun 2, 2026

giaphutran12 requested review from MMeteorL and simantak-dabhade June 2, 2026 18:55

coderabbitai Bot reviewed Jun 2, 2026

simantak-dabhade and others added 2 commits June 2, 2026 16:09

Merge branch 'main' into codex/fix-agent-zero-row-model-config

a79cf22

Address CodeRabbit populate retry comments

3d87f56

giaphutran12 wants to merge 3 commits into main from codex/fix-agent-zero-row-model-config
