
feat: enhanced error analysis with sub-job log redirection #112

Open
donhui wants to merge 11 commits into jenkinsci:main from donhui:subjob-log-extraction

donhui (Member) commented Mar 12, 2026

Issue #114

🎯 What problem does this PR solve?

In Jenkins pipelines that include sub-jobs, when a sub-job fails, the parent job's error report cannot directly locate the sub-job's logs, requiring users to manually search for them. This PR enhances the explainError function to recursively collect logs from downstream failed jobs and provide direct redirection to the actual error source via "View failure output".

🧩 Core Changes (3 dimensions)

1. 🚀 New Features

  • Downstream log collection (disabled by default, requires opt-in): Recursively scan logs of downstream failed jobs
  • Regex filtering: Only collect logs from specified downstream jobs (avoid scanning unrelated jobs)
  • Smart redirection: When a downstream job fails, "View failure output" directly points to the downstream logs
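
A minimal sketch of how the opt-in flag and the job-name regex might gate downstream collection. The class and method names (`DownstreamJobFilterSketch`, `shouldCollect`) are hypothetical, as are the choices to match the job's full name with `matches()` and to treat an empty regex as "all jobs"; this is not the plugin's actual code. Treating an invalid regex as a warning that disables collection follows a later commit note in this PR.

```java
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical helper for illustration; not the plugin's actual class. */
final class DownstreamJobFilterSketch {
    private static final Logger LOGGER =
            Logger.getLogger(DownstreamJobFilterSketch.class.getName());

    private final boolean enabled;
    private final Pattern jobPattern; // null means "no restriction" (assumption)

    DownstreamJobFilterSketch(boolean collectDownstreamLogs, String jobRegex) {
        boolean usable = collectDownstreamLogs; // collection is off unless opted in
        Pattern compiled = null;
        if (usable && jobRegex != null && !jobRegex.isEmpty()) {
            try {
                compiled = Pattern.compile(jobRegex);
            } catch (PatternSyntaxException e) {
                // An invalid pattern disables collection instead of failing the step.
                LOGGER.log(Level.WARNING,
                        "Invalid downstream job regex; downstream collection disabled", e);
                usable = false;
            }
        }
        this.enabled = usable;
        this.jobPattern = compiled;
    }

    /** Returns true only when collection is enabled and the job name passes the filter. */
    boolean shouldCollect(String jobFullName) {
        if (!enabled) {
            return false;
        }
        return jobPattern == null || jobPattern.matcher(jobFullName).matches();
    }
}
```

With a filter such as `team-a/.*`, only jobs under the `team-a` folder would be scanned; every other downstream job is skipped before any log is read.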

2. 🔒 Permissions & Security

  • Follows Jenkins permission model: Only collect logs from downstream jobs that the current user has permission to view
  • Placeholder shown for unauthorized jobs, preventing log content exposure
  • Supports authentication credential propagation
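
The placeholder behaviour can be sketched roughly as follows. The `Predicate<String>` stands in for the Jenkins permission check against the current user, and the placeholder wording, the header text, and the class name `DownstreamVisibilitySketch` are all assumptions made for illustration.

```java
import java.util.function.Predicate;

/** Hypothetical sketch; the real plugin checks Jenkins permissions on the downstream run. */
final class DownstreamVisibilitySketch {
    // Placeholder wording is an assumption; it deliberately carries no log content.
    static final String HIDDEN_PLACEHOLDER =
            "[downstream build not visible to the current user]";

    /**
     * Returns the downstream section for a job, or a placeholder when the
     * viewer is not allowed to read that job.
     */
    static String sectionFor(String jobFullName, String consoleLog,
                             Predicate<String> viewerCanRead) {
        if (!viewerCanRead.test(jobFullName)) {
            // The check happens before any log text is touched.
            return HIDDEN_PLACEHOLDER;
        }
        return "Downstream job: " + jobFullName + "\n" + consoleLog;
    }
}
```

The point of the design is ordering: the visibility decision comes first, so an unreadable job contributes only the fixed placeholder to the text sent for analysis.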

3. ⚙️ Performance & Stability

  • Configurable recursion depth and log line limits to prevent infinite scanning
  • Fast-failed downstream jobs marked as non-root-cause to reduce noise
  • Reuse existing ErrorExplanationAction from downstream runs to avoid duplicate analysis
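
The recursion guards can be illustrated with a small self-contained sketch. `MAX_DOWNSTREAM_DEPTH = 5` and the `visitedRunIds` set are named in this PR's commit notes; the `Build` class, the `collect` method, and the way the line budget is applied are hypothetical simplifications, not the plugin's implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class BoundedDownstreamScanSketch {
    static final int MAX_DOWNSTREAM_DEPTH = 5; // value named in the commit notes

    /** Hypothetical stand-in for a Jenkins run. */
    static final class Build {
        final String id;
        final boolean failed;
        final List<String> logLines;
        final List<Build> downstream = new ArrayList<>();

        Build(String id, boolean failed, List<String> logLines) {
            this.id = id;
            this.failed = failed;
            this.logLines = logLines;
        }
    }

    /** Collects log lines of failed downstream builds, at most maxLines in total. */
    static List<String> collect(Build root, int maxLines) {
        List<String> out = new ArrayList<>();
        Set<String> visitedRunIds = new HashSet<>();
        visitedRunIds.add(root.id); // the parent itself is never re-scanned
        walk(root, 0, maxLines, visitedRunIds, out);
        return out;
    }

    private static void walk(Build build, int depth, int maxLines,
                             Set<String> visitedRunIds, List<String> out) {
        if (depth >= MAX_DOWNSTREAM_DEPTH) {
            return; // depth guard
        }
        for (Build child : build.downstream) {
            if (out.size() >= maxLines) {
                return; // no capacity left, stop scanning
            }
            if (!child.failed || !visitedRunIds.add(child.id)) {
                continue; // skip successful builds and builds already seen
            }
            int remaining = maxLines - out.size();
            List<String> lines = child.logLines;
            out.addAll(lines.subList(0, Math.min(remaining, lines.size())));
            walk(child, depth + 1, maxLines, visitedRunIds, out);
        }
    }
}
```

The visited set handles both duplicates and cycles, while the depth and line limits bound the work even when the downstream graph is large.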

Testing done

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests that demonstrate the feature works or the issue is fixed

@donhui donhui requested a review from a team as a code owner March 12, 2026 02:29
@donhui donhui force-pushed the subjob-log-extraction branch 2 times, most recently from 62ccf9d to 22b4375 Compare March 12, 2026 02:54
@shenxianpeng shenxianpeng requested review from Copilot and removed request for a team March 12, 2026 21:50

Copilot AI (Contributor) left a comment

Pull request overview

Adds recursive downstream failure log discovery to improve AI explanations when an upstream build fails due to a triggered sub-job (e.g., build step or Cause.UpstreamCause).

Changes:

  • Extend PipelineLogExtractor to recursively collect failing downstream run logs (bounded depth + visited de-dup) and optionally reuse an existing ErrorExplanationAction from the sub-job.
  • Update BaseAIProvider prompt to correctly interpret downstream log sections and reused AI explanation blocks.
  • Add tests covering downstream inclusion/exclusion, fail-fast abort labeling logic, recursion guard, de-dup, and explanation reuse vs raw-log fallback; add optional dependency on pipeline-build-step.
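
As a rough illustration of the downstream sections described above, the sketch below renders one section per downstream run and picks the URL to redirect to. The `[AI explanation from sub-job]` marker is taken from this PR's commit notes; the `DownstreamRun` class, the header and note wording, and the rule used for a "genuine" failure (not successful and not aborted by fail-fast) are assumptions.

```java
import java.util.List;

final class DownstreamSectionSketch {
    /** Hypothetical view of a downstream run; field names are illustrative. */
    static final class DownstreamRun {
        final String jobFullName;
        final int number;
        final String result;
        final String url;
        final boolean abortedByFailFast;
        final String existingExplanation; // null when the sub-job has none
        final String rawLog;

        DownstreamRun(String jobFullName, int number, String result, String url,
                      boolean abortedByFailFast, String existingExplanation, String rawLog) {
            this.jobFullName = jobFullName;
            this.number = number;
            this.result = result;
            this.url = url;
            this.abortedByFailFast = abortedByFailFast;
            this.existingExplanation = existingExplanation;
            this.rawLog = rawLog;
        }
    }

    /** Renders a labeled section: header, result, optional note, then explanation or raw log. */
    static String render(DownstreamRun run) {
        StringBuilder sb = new StringBuilder();
        sb.append("### Downstream Job: ").append(run.jobFullName)
          .append(" #").append(run.number).append(" ###\n");
        sb.append("Result: ").append(run.result).append('\n');
        if (run.abortedByFailFast) {
            sb.append("NOTE: aborted by fail-fast; likely not the root cause\n");
        }
        if (run.existingExplanation != null) {
            // Reuse the sub-job's earlier analysis instead of re-sending its raw log.
            sb.append("[AI explanation from sub-job]\n").append(run.existingExplanation).append('\n');
        } else {
            sb.append(run.rawLog).append('\n');
        }
        return sb.toString();
    }

    /** Returns the URL of the first genuine downstream failure, else the parent's URL. */
    static String failureUrl(String parentUrl, List<DownstreamRun> runs) {
        for (DownstreamRun run : runs) {
            if (!"SUCCESS".equals(run.result) && !run.abortedByFailFast) {
                return run.url;
            }
        }
        return parentUrl;
    }
}
```

Skipping fail-fast aborts when choosing the URL keeps "View failure output" pointed at the build that actually broke rather than at a sibling that was merely interrupted.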

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/main/java/io/jenkins/plugins/explain_error/PipelineLogExtractor.java | Implements downstream run discovery (DownstreamBuildAction + UpstreamCause scan), recursion/de-dup, fail-fast abort labeling, and explanation reuse/URL redirection. |
| src/main/java/io/jenkins/plugins/explain_error/provider/BaseAIProvider.java | Enhances the LLM prompt so downstream sections (and reused sub-job explanations) are interpreted as intended. |
| src/test/java/io/jenkins/plugins/explain_error/PipelineLogExtractorTest.java | Adds unit/integration tests for downstream recursion behavior, fail-fast abort detection, depth guard, de-dup, and explanation reuse. |
| pom.xml | Declares optional dependency on pipeline-build-step to support DownstreamBuildAction discovery when available. |


donhui added 2 commits March 13, 2026 14:21
When a build fails due to a downstream job triggered via the build step or Cause.UpstreamCause, error logs from the failing sub-job are now collected recursively and included in the AI analysis.

Changes:

- Add optional pipeline-build-step dependency to read DownstreamBuildAction when installed

- PipelineLogExtractor: add downstream recursion with MAX_DOWNSTREAM_DEPTH=5 and visitedRunIds de-dup

- Discover sub-jobs via DownstreamBuildAction, with Cause.UpstreamCause scan as a universal fallback

- Wrap downstream sections with labeled header + Result; detect fail-fast aborts via InterruptedBuildAction and mark them non-root-cause

- Reuse ErrorExplanationAction output under an [AI explanation from sub-job] marker; otherwise extract raw logs via a child PipelineLogExtractor

- Redirect the failure URL to the first genuine downstream failure

- BaseAIProvider prompt updated to interpret downstream sections and AI explanation blocks correctly

Tests:

- PipelineLogExtractorTest: cover downstream failure inclusion, success exclusion, visitedRunIds de-dup, and explanation reuse vs raw-log fallback
- add explainError step parameters to enable downstream log collection and filter job full names with a regex

- skip UpstreamCause fallback when DownstreamBuildAction already finds matching downstream jobs

- keep scan guards for capacity, recursion depth, and per-job candidate limits

- document the new options in README and Jenkins step help files

- expand PipelineLogExtractor and step config tests for default-off and regex-filtered behavior
@donhui donhui force-pushed the subjob-log-extraction branch from b9195a5 to 3536a17 Compare March 13, 2026 06:21
donhui and others added 7 commits March 13, 2026 14:26
…actorTest.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- keep complete downstream job sections when explainError uses logPattern filtering

- continue filtering ordinary upstream lines by the configured regex

- add ErrorExplainer tests for downstream section preservation and upstream filtering behavior
- cache downstream build result before null and severity checks in appendDownstreamRunLog

- reuse local job name and build number values when building downstream section metadata
- include job and build information in downstream collection warning messages

- log the full exception stack trace via LOGGER.log(Level.WARNING, ..., e)
- add a VisibleForTesting constructor for setting downstreamDepth in PipelineLogExtractor

- update PipelineLogExtractorTest to use the test constructor instead of reflection
Primary changes:
- thread the current Authentication through ExplainErrorStep, ErrorExplainer, ConsoleExplainErrorAction, and PipelineLogExtractor so downstream discovery and extraction run against an explicit viewer context
- make the UpstreamCause fallback scan only jobs visible to that authentication, preventing all-job enumeration from including unreadable downstream jobs

Secondary changes:
- add a downstream run visibility check before reusing AI explanations or appending raw console output; unreadable build-step downstreams now emit a hidden placeholder instead of leaking content
- keep DownstreamBuildAction discovery functional by resolving the downstream Run under SYSTEM, while still enforcing viewer-based permission checks before any downstream content is included
- add Jenkins security tests covering hidden build-step downstreams and hidden UpstreamCause downstream jobs
- delete the unused private matchesDownstreamJob(Run<?, ?>) helper flagged by SpotBugs as UPM_UNCALLED_PRIVATE_METHOD
- keep the string-based downstream job regex matcher as the single implementation path
- no behavior changes; downstream filtering logic remains the same
@donhui donhui changed the title from "feat: collect downstream failure logs recursively" to "feat: enhance downstream failure extraction with filtering and access guards" Mar 13, 2026

donhui (Member, Author) commented Mar 13, 2026

@shenxianpeng The issues raised by Copilot have been fixed. How can I get Copilot to review the PR again? Thanks.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.



donhui and others added 2 commits March 13, 2026 17:46
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- truncate downstream headers, logs, and hidden placeholders to respect maxLines without overflowing the collected output
- only recurse into nested downstream builds when capacity remains, keeping downstream sections structurally bounded
- treat invalid downstream job regex values as a warning that disables collection instead of throwing from pattern compilation
- simplify PipelineLogExtractorTest by removing parent/build stubs that are no longer needed when downstream collection is opt-in

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

donhui (Member, Author) commented Mar 13, 2026

@shenxianpeng The new issue raised by Copilot has been fixed again.

panicking (Contributor) commented Mar 14, 2026

@donhui I find this pull request very difficult to understand. Please open an issue that fully describes the design you have in mind. If we keep merging changes of any kind, we cannot even keep track of the project's status. The proposed change is also too large: please split it into multiple pull requests and wait for each part to be reviewed.

@donhui donhui changed the title from "feat: enhance downstream failure extraction with filtering and access guards" to "feat: enhanced error analysis with sub-job log redirection" Mar 15, 2026

donhui (Member, Author) commented Mar 15, 2026

> @donhui I find this pull request very difficult to understand. Please open an issue that fully describes the design you have in mind. If we keep merging changes of any kind, we cannot even keep track of the project's status. The proposed change is also too large: please split it into multiple pull requests and wait for each part to be reviewed.

@panicking Thanks for the suggestions! I have created issue #114 and updated the PR description to make the purpose of this PR clearer.

3 participants