
fix: improve error handling for missing repositories during crawl (#275) #280

Merged
coji merged 4 commits into main from fix/crawl-missing-repo-275
Apr 7, 2026

Conversation

@coji coji commented Apr 7, 2026

Closes #275

Summary

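  • Add an isResourceMissing option to paginateGraphQL. When repository === null is detected, throw GraphQLResourceMissingError immediately and avoid pointless page-size reduction retries
  • Include owner/repo in the labels of GraphQL calls so the failing repo can be identified
  • Wrap the crawl job's per-repo loop in try/catch so that a failure in one repo does not stop the whole job. Failure details are stored in output.failedRepos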
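
To illustrate the first point, here is a minimal sketch of how a missing-resource check can short-circuit a page-size reduction loop. It is not the code in batch/github/fetcher.ts: the option names other than isResourceMissing (label, pageSize, fetchPage, extract), the Page type, and the halving strategy are assumptions made for the example.

```typescript
// Hypothetical sketch. Only paginateGraphQL, isResourceMissing,
// GraphQLResourceMissingError and repositoryMissing are names from the PR;
// everything else is assumed for illustration.
class GraphQLResourceMissingError extends Error {
  constructor(label: string) {
    super(`GraphQL resource missing: ${label}`)
    this.name = 'GraphQLResourceMissingError'
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, GraphQLResourceMissingError.prototype)
  }
}

interface Page<TNode> {
  nodes: TNode[]
  hasNextPage: boolean
  endCursor: string | null
}

interface PaginateOptions<TResult, TNode> {
  label: string // e.g. "pulls(owner/repo)"
  pageSize: number
  fetchPage: (pageSize: number, cursor: string | null) => Promise<TResult>
  extract: (result: TResult) => Page<TNode>
  isResourceMissing?: (result: TResult) => boolean
}

// Shared predicate: the repository field came back as null.
const repositoryMissing = (result: { repository: unknown }): boolean =>
  result.repository === null

async function paginateGraphQL<TResult, TNode>(
  opts: PaginateOptions<TResult, TNode>,
): Promise<TNode[]> {
  const nodes: TNode[] = []
  let cursor: string | null = null
  let pageSize = opts.pageSize
  while (true) {
    let page: Page<TNode>
    try {
      const result = await opts.fetchPage(pageSize, cursor)
      // Check before extract(), which would otherwise fail on the null field.
      if (opts.isResourceMissing?.(result)) {
        throw new GraphQLResourceMissingError(opts.label)
      }
      page = opts.extract(result)
    } catch (e) {
      // A missing resource will not reappear with a smaller page: give up now.
      if (e instanceof GraphQLResourceMissingError || pageSize <= 1) throw e
      pageSize = Math.max(1, Math.floor(pageSize / 2))
      continue
    }
    nodes.push(...page.nodes)
    if (!page.hasNextPage) return nodes
    cursor = page.endCursor
  }
}
```

Without the early check, a null repository would make extract fail on every attempt, so the loop would shrink the page size all the way down to 1 before surfacing an error.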

Test plan

  • pnpm typecheck
  • pnpm lint
  • pnpm vitest run batch/github
  • Confirm in production that crawl runs to completion for an organization that includes a deleted repo

🤖 Generated with Claude Code

Summary by CodeRabbit

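  • New features

    • On job completion, the job now returns a list of failed repositories with details (repository label and error).
    • Added a check that, when a missing external resource is detected, aborts that processing immediately and treats it as a failure.
  • Bug fixes

    • Per-repository exceptions are now caught properly, so only the affected repository is skipped and the rest of the processing continues.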

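- Add an isResourceMissing option to paginateGraphQL so that repository
  null is thrown immediately as GraphQLResourceMissingError instead of
  entering the page-size reduction loop
- Include owner/repo in labels to make errors easier to pinpoint
- Wrap the crawl job's per-repo loop in try/catch, skipping failed repos
  and continuing with the others. Failure details are stored in output.failedRepos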

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai bot commented Apr 7, 2026

Warning

Rate limit exceeded

@coji has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 1 minutes and 22 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 1 minutes and 22 seconds.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dcedfcbd-01db-481b-9f1b-206f4067c449

📥 Commits

Reviewing files that changed from the base of the PR and between 333c10e and e7a5c15.

📒 Files selected for processing (4)
  • app/services/jobs/crawl.server.ts
  • batch/commands/crawl.ts
  • batch/github/fetcher.test.ts
  • batch/github/fetcher.ts
📝 Walkthrough

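This change localizes exceptions in the crawler's per-repository processing so the crawl continues, detects a missing GraphQL resource (repository == null) early and throws a dedicated exception, and adds a list of failed repositories to the job output.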

Changes

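| Cohort / File(s) | Summary |
|---|---|
| Crawler job error handling: app/services/jobs/crawl.server.ts | Wraps each repository's processing in try/catch; on an exception, skips that repository and adds failedRepos: { repoLabel, error }[] to the job output. Moves store/fetcher creation and the watermark update inside the try. (+114/-102) |
| GraphQL paging and missing-resource detection: batch/github/fetcher.ts | Introduces the new export GraphQLResourceMissingError. Adds an isResourceMissing option to paginateGraphQL so that, when repository == null or similar is detected, it throws GraphQLResourceMissingError immediately and aborts paging and retries. Updates labels to include owner/repo. (+46/-6) |
| Tests added/updated: batch/github/fetcher.test.ts | Adds tests for GraphQLResourceMissingError: when isResourceMissing is true, the error is thrown on the first GraphQL call, and the error message contains the label. (+50/-2) |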

Sequence Diagram(s)

sequenceDiagram
    participant Job as CrawlJob
    participant RepoLoop as Per-Repo Loop
    participant Fetcher as GitHub Fetcher
    participant API as GitHub GraphQL API
    participant Output as Job Output

    Job->>RepoLoop: Iterate repositories
    activate RepoLoop
    RepoLoop->>Fetcher: Start crawl for owner/repo
    activate Fetcher
    Fetcher->>API: GraphQL query (pulls connection)
    alt repository exists
        API-->>Fetcher: repository != null (connection present)
        Fetcher->>Fetcher: paginateGraphQL normal flow
        Fetcher-->>RepoLoop: success
    else repository missing
        API-->>Fetcher: repository == null
        Fetcher->>Fetcher: isResourceMissing -> true
        Fetcher-->>RepoLoop: throw GraphQLResourceMissingError
    end
    deactivate Fetcher
    RepoLoop->>RepoLoop: catch error, append to failedRepos
    RepoLoop->>Job: continue next repository
    deactivate RepoLoop
    Job->>Output: return results (including failedRepos)
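
The flow in the diagram can be sketched roughly as follows. This is an illustrative approximation rather than the code in app/services/jobs/crawl.server.ts: crawlAll, crawlRepo, and the Repo type are hypothetical, and fetchedRepos and pullCount are assumed to be plain counters.

```typescript
// Hypothetical sketch of the per-repo loop. failedRepos, repoLabel,
// fetchedRepos and pullCount are names from the PR; the rest is assumed.
interface Repo {
  owner: string
  repo: string
}

interface FailedRepo {
  repoLabel: string
  error: string
}

interface CrawlOutput {
  fetchedRepos: number
  pullCount: number
  failedRepos: FailedRepo[]
}

async function crawlAll(
  repos: Repo[],
  // Assumed to resolve with the number of pull requests fetched for one repo.
  crawlRepo: (repo: Repo) => Promise<number>,
): Promise<CrawlOutput> {
  const failedRepos: FailedRepo[] = []
  let fetchedRepos = 0
  let pullCount = 0

  for (const repo of repos) {
    const repoLabel = `${repo.owner}/${repo.repo}`
    try {
      const pulls = await crawlRepo(repo)
      pullCount += pulls
      fetchedRepos++
    } catch (e) {
      // Record the failure and move on to the next repository.
      failedRepos.push({
        repoLabel,
        error: e instanceof Error ? e.message : String(e),
      })
    }
  }

  return { fetchedRepos, pullCount, failedRepos }
}
```

Keeping the try/catch inside the loop body is what limits the blast radius to a single repository; a try/catch around the whole loop would still stop at the first failure.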

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

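🐇 Look, even when a repo has vanished,
note down the error and hop along to the next.
Failures go on the list, hope goes on with the run,
and the crawler slips through the hole again today. ✨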

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title is written in Japanese and accurately describes the main change: improved error handling for missing repositories. |
| Linked Issues check | ✅ Passed | The changes satisfy all major requirements of linked issue #275: (1) immediate detection of repository === null [#275], (2) per-repo try/catch so the whole loop does not stop [#275], (3) owner/repo included in error messages [#275]. |
| Out of Scope Changes check | ✅ Passed | All changes map directly to the requirements of issue #275. The modifications to the 3 files are all within the required scope: stronger repository null detection, per-repo error capture, and improved error messages. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
app/services/jobs/crawl.server.ts (2)

191-196: Consider using getErrorMessage()

The coding guidelines recommend using getErrorMessage() from app/libs/error-message.ts. The current pattern e instanceof Error ? e.message : String(e) works, but consider using getErrorMessage() for consistency across the project.

The same pattern also appears at Line 164.

♻️ Proposed fix
+import { getErrorMessage } from '~/app/libs/error-message'

// Line 164
-                step.log.warn(
-                  `Failed to fetch ${repoLabel}#${pr.number}: ${e instanceof Error ? e.message : e}`,
-                )
+                step.log.warn(
+                  `Failed to fetch ${repoLabel}#${pr.number}: ${getErrorMessage(e)}`,
+                )

// Line 192
-        const message = e instanceof Error ? e.message : String(e)
+        const message = getErrorMessage(e)

As per coding guidelines: "Use getErrorMessage() from app/libs/error-message.ts to extract error messages".
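
For readers unfamiliar with this kind of helper, a minimal version might look like the sketch below. The actual contents of app/libs/error-message.ts are not shown in this PR, so this body is an assumption that simply centralizes the inline pattern quoted above.

```typescript
// Hypothetical sketch: the real getErrorMessage() in app/libs/error-message.ts
// may handle more cases than this.
function getErrorMessage(e: unknown): string {
  // Error instances carry a message; anything else is stringified.
  return e instanceof Error ? e.message : String(e)
}
```

A shared helper also removes small inconsistencies between call sites, such as one site using String(e) and another interpolating e directly into a template literal.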

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/services/jobs/crawl.server.ts` around lines 191 - 196, Replace the manual
error-extraction pattern (e instanceof Error ? e.message : String(e)) with the
shared helper getErrorMessage() from app/libs/error-message.ts; specifically,
import getErrorMessage and use it to produce the message passed into
step.log.error and the error value pushed into failedRepos in the catch block
around the crawl logic (and the similar catch at the earlier occurrence near
line 164), so both step.log.error(...) and failedRepos.push({ repoLabel, error:
message }) use message = getErrorMessage(e).

251-251: Consider including failedRepos in the CLI output of batch/commands/crawl.ts

failedRepos is now part of the job output from app/services/jobs/crawl.server.ts, but the CLI output in batch/commands/crawl.ts (line 85) logs only fetchedRepos and pullCount, so failed repositories are not reported. Consider whether the details of crawl failures should be surfaced to the user.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/services/jobs/crawl.server.ts` at line 251, The CLI output in
batch/commands/crawl.ts currently logs only fetchedRepos and pullCount while
app/services/jobs/crawl.server.ts now returns failedRepos; update the logging in
the command handler (the spot that logs fetchedRepos and pullCount) to also
include failedRepos when present—format the message to list the count and/or
identifiers from failedRepos and handle the case where it's empty or undefined
so the CLI prints a clear summary of failed repositories alongside fetchedRepos
and pullCount.
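
One possible shape for such a CLI summary is sketched below. The function name formatCrawlSummary, the message wording, and the assumption that fetchedRepos and pullCount are numbers are all hypothetical; only the field names come from this review.

```typescript
// Hypothetical sketch of a CLI summary that also reports failedRepos.
interface FailedRepo {
  repoLabel: string
  error: string
}

interface CrawlOutput {
  fetchedRepos: number
  pullCount: number
  failedRepos?: FailedRepo[]
}

function formatCrawlSummary(output: CrawlOutput): string[] {
  const lines = [
    `Crawl completed: ${output.fetchedRepos} repos, ${output.pullCount} PRs fetched.`,
  ]
  // Treat a missing field the same as an empty list.
  const failed = output.failedRepos ?? []
  if (failed.length > 0) {
    lines.push(`${failed.length} repo(s) failed:`)
    for (const { repoLabel, error } of failed) {
      lines.push(`  - ${repoLabel}: ${error}`)
    }
  }
  return lines
}
```

Returning lines instead of logging directly keeps the formatting easy to test; the command handler can pass each line to whatever logger it already uses.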

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bbdd1f27-9827-4575-b197-c7912b8e684c

📥 Commits

Reviewing files that changed from the base of the PR and between 5d10bde and ed15723.

📒 Files selected for processing (2)
  • app/services/jobs/crawl.server.ts
  • batch/github/fetcher.ts

coji and others added 3 commits April 7, 2026 10:55
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
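- Use getErrorMessageForLog in crawl's per-repo handling
- Return GraphQLResourceMissingError as a sentinel from inside step.run and
  rethrow it outside, to prevent pointless retries by durably
- Share the repositoryMissing predicate in fetcher.ts (removes 4 duplicated occurrences)
- Share the inline type definitions in the tests
- Remove unnecessary comments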

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
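
The sentinel approach mentioned in the second commit can be sketched as follows. The Step interface here is a stand-in written for this example and is not durably's actual API; runRepoStep and StepOutcome are likewise hypothetical. The sketch assumes a step runner that retries whenever the step function throws.

```typescript
// Hypothetical sketch: Step, StepOutcome and runRepoStep are assumptions,
// not durably's real API.
class GraphQLResourceMissingError extends Error {
  constructor(message: string) {
    super(message)
    this.name = 'GraphQLResourceMissingError'
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, GraphQLResourceMissingError.prototype)
  }
}

interface Step {
  run<T>(name: string, fn: () => Promise<T>): Promise<T>
}

// Plain-object result so a permanent failure can leave the step as a value.
type StepOutcome<T> =
  | { kind: 'ok'; value: T }
  | { kind: 'resource-missing'; message: string }

async function runRepoStep<T>(
  step: Step,
  name: string,
  work: () => Promise<T>,
): Promise<T> {
  const outcome = await step.run<StepOutcome<T>>(name, async () => {
    try {
      return { kind: 'ok', value: await work() }
    } catch (e) {
      if (e instanceof GraphQLResourceMissingError) {
        // Returning instead of throwing means the runner sees a success
        // and does not retry something that cannot succeed.
        return { kind: 'resource-missing', message: e.message }
      }
      throw e // other errors may be transient, so let the runner retry
    }
  })
  if (outcome.kind === 'resource-missing') {
    // Rethrow outside the step so the per-repo catch can record it.
    throw new GraphQLResourceMissingError(outcome.message)
  }
  return outcome.value
}
```

The trade-off is a little extra plumbing in exchange for distinguishing permanent failures, which should fail fast, from transient ones, which still benefit from the runner's retries.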
@coji coji merged commit 4eb2a2f into main Apr 7, 2026
7 checks passed
@coji coji deleted the fix/crawl-missing-repo-275 branch April 7, 2026 02:13
Development

Successfully merging this pull request may close these issues.

crawl: improve error handling when a repository is missing

1 participant