Skip to content

Conversation

@bripeticca
Copy link
Contributor

@bripeticca bripeticca commented Nov 11, 2025

This PR refactors diagnostic handling in the Swift build system by introducing a dedicated message handler and per-task output buffering to properly parse and emit compiler diagnostics individually.

Key Changes

SwiftBuildSystemMessageHandler

  • Introduced a new dedicated handler class to process SwiftBuildMessage events from the build operation
  • Moved message handling logic out of inline nested functions for better organization and testability
  • Maintains build state, progress animation, and diagnostic processing in a single cohesive component

Per-Task Data Buffering

  • Added taskDataBuffer struct in BuildState to capture compiler output per task signature
  • New TaskDataBuffer struct allows for using LocationContext or LocationContext2 as a subscript key to fetch the appropriate data buffer for a task, defaulting to the global buffer if no associated task or target can be identified.
  • Task output is accumulated in the buffer as .output messages arrive
  • Buffer contents are processed when tasks complete, ensuring all output is captured before parsing
  • Failed tasks with no useful or apparent message will be demoted to an info log level to avoid creating too much noise on the output.

* Built tentative test class SwiftBuildSystemOutputParser to
  handle the compiler output specifically
* Added a handleDiagnostic method to possibly substitute the
  emitEvent local scope implementation of handling a
  SwiftBuildMessage diagnostic
* the flag `appendToOutputStream` helps us to determine whether
  a diagnostic is to be emitted or whether we'll be emitting
  the compiler output via OutputInfo
* separate the emitEvent method into the SwiftBuildSystemMessageHandler
@bripeticca bripeticca force-pushed the swb/diagnosticcodesnippet branch from a20f020 to f3aaabf Compare November 20, 2025 18:56
@bripeticca bripeticca force-pushed the swb/diagnosticcodesnippet branch from 1dcaaeb to c48e606 Compare November 20, 2025 18:58
@bripeticca
Copy link
Contributor Author

@swift-ci please test

@bripeticca bripeticca changed the title [WIP] Capture code snippet from diagnostic compiler output Capture code snippet from diagnostic compiler output Nov 20, 2025
@bripeticca bripeticca marked this pull request as ready for review November 20, 2025 19:21
Copy link
Contributor

@owenv owenv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thie generally lgtm but I have some concerns about the regex-based parsing when we emit the textual compiler output.

  1. Perf - It's important this is fast so that it doesn't block the end of the build if a command produces huge quantities of output. It's hard to say if this will be a real issue without some testing
  2. We're re-parsing information which we're already getting from the compiler in structured form. I see the appeal of not reporting a diagnostic twice if multiple compile jobs report it though

@jakepetroules
Copy link
Contributor

2. We're re-parsing information which we're already getting from the compiler in structured form. I see the appeal of not reporting a diagnostic twice if multiple compile jobs report it though

I also pointed this out inline, though I'm not sure why the re-parsing relates to deduplication? We should be able to deduplicate diagnostics whether we parse them again or use the existing ones.

@bripeticca
Copy link
Contributor Author

bripeticca commented Nov 21, 2025

though I'm not sure why the re-parsing relates to deduplication

(@jakepetroules @owenv -- tagging for visibility, GitHub notifications can be weird)

Re-parsing doesn't affect the de-duplication -- when tracking the data buffer per task, I also track whether we've emitted the associated output for a given task (using its signature) and guard against this before emitting for that task again. We only go down this path (emitting for a given task) once we've received the task completed event.

I mentioned this inline as well but for visibility: the re-parsing just addresses the fact that for a given task signature, we have an accumulated data buffer that contains all possible diagnostic messages coming from the compiler. I find that the user ergonomics aren't great when simply emitting the entire string blob through the observability scope, since these diagnostics can have varying severities and we'll have to decide up-front which severity to choose to emit the entire string of all diagnostics.

I do also maintain a list of the DiagnosticInfo that we omit in favour of emitting the OutputInfo containing the same diagnostic messages, but with the plus of having the pre-formatted code snippet (the DiagnosticInfo is missing the pre-formatted code snippet but contains enough information to recreate it ourselves, and it was suggested that we instead fall back to using the OutputInfo for that reason).

Perhaps some more discussion is needed here. 😄

* Remove taskStarted outputStream emissions
* Propagate exception if unable to find path
Copy link
Contributor

@bkhouri bkhouri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work, but we should also include automated tests as part of this change to ensure we won't regress the behaviour.

public protocol DiagnosticsHandler: Sendable {
func handleDiagnostic(scope: ObservabilityScope, diagnostic: Diagnostic)

func printToOutput(message: String)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's the DiagnosticHandlers responsibility to print the diagnostic here. We should instead have a "LoggingHandler" which would know where to emit the "message", whether it's to stdout, stderr, to a file, or some other place.

/// Tracks the task IDs for failed tasks.
private var failedTasks: [Int] = []
/// Tracks the tasks by their signature for which we have already emitted output.
private var tasksEmitted: Set<String> = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question (possibly-blocking): what is the difference between tasksEmitted and taskIDsEmitted? If they are tracking the same thing, can we remove one? If we need both, could you please provide an explanation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many taskIDs can map to a singular taskSignature, and so I initially thought it could be helpful to track both separately. However, I think simply tracking by taskSignature may be sufficient here, so I removed taskIDsEmitted in the latest change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many taskIDs can map to a singular taskSignature

I think that shouldn't be true within a single build operation. Are you seeing that to not be the case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have seen this behaviour locally, usually between two tasks that have an execution description of Emit Swift module (arm64) and Compile <filename>.swift (arm64) respectively; they'll share a task signature but have differing task IDs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a bug. The signatures are supposed to be globally unique.

cc @owenv

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect we should avoid using the signatures at all right now, and use only the numeric task IDs for safety.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, these two tasks also produce the same output (particularly the diagnostics) which helped to reduce duplicate logs. I'll try some alternative approaches locally but I may have to track both ID and signature for the time being.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created an issue to investigate this in a separate PR: #9485

* Defer emission of diagnostics to completed task event
* Remove global data buffer from TaskDataBuffer
* Remove emits to observabilityScope for taskStarted events
* Move SwiftBuildSystemMessageHandler to its own file
* Rename printToOutput to print(_ output:verbose:)
* Remove unprocessedDiagnostics parameter from message handler
* Remove taskIDsEmitted in favour of using tasksEmitted
Modified the map to track these by targetID rather than name;
since the `BuildResult` is expecting a map by targetName, another
computed property representing the same map by targetName has also
been added
@bripeticca
Copy link
Contributor Author

@swift-ci please test

@bripeticca
Copy link
Contributor Author

@swift-ci please test windows

Copy link

@PhantomInTheWire PhantomInTheWire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @bripeticca just a quick one do you think your changes might fix #8877?

@bripeticca
Copy link
Contributor Author

@PhantomInTheWire I'll take a look and see if this addresses the issue, thanks for the heads up!

@bripeticca
Copy link
Contributor Author

@swift-ci please test

@bripeticca
Copy link
Contributor Author

@swift-ci please test windows

@bripeticca
Copy link
Contributor Author

Errors seem unrelated to this change, re-triggering tests

@swift-ci please test linux

@bripeticca
Copy link
Contributor Author

@swift-ci please test macOS

@bripeticca
Copy link
Contributor Author

@swift-ci please test linux

@bripeticca
Copy link
Contributor Author

@PhantomInTheWire It looks like this change will definitely affect possible solutions for #8877 , but I am skeptical that it will resolve it entirely. If you're interested in taking a look at it still, I would highly recommend doing so once this PR is merged 😄

@PhantomInTheWire
Copy link

sure i'll take a look once this is merged, @bripeticca

…t stream

Some minor fixes to how task output is handled in the SwiftBuildSystemMessageHandler.
@jakepetroules
Copy link
Contributor

@swift-ci test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants