
fix(cli): optimize protobuf buf generate to reuse single working dir #13673

Merged
Swimburger merged 17 commits into main from
devin/1773863405-parallelize-protobuf-generation
Mar 19, 2026

Member

@Swimburger Swimburger commented Mar 18, 2026

Description

Refs performance investigation of dual buf generate passes in OSSWorkspace.ts.

Proto files go through buf generate twice—once via protoc-gen-openapi (OpenAPI path) and once via protoc-gen-fern (IR path). This PR:

  1. Parallelizes the two passes so protoc-gen-fern runs concurrently with protoc-gen-openapi.
  2. Eliminates repeated per-file setup by preparing a single working directory once per proto root and reusing it for every proto target (instead of creating a new temp dir, copying the proto root, running which checks, writing configs, and resolving deps for each file).
  3. Groups all protobuf specs by proto root so both explicit-target and no-target specs share the same prepared directory.
  4. Caches protobuf → OpenAPI results at the OSSWorkspace level so that toFernWorkspace() and validateOSSWorkspace() share the same generated specs instead of running buf generate twice.

For N proto files, this eliminates (N−1) × the expensive setup operations: temp dir creation, recursive copy, 2 which subprocess calls, config file writes, and buf dep update. The workspace-level cache further eliminates a full duplicate pass that previously ran during validation.
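
The prepare-once, generate-per-target flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ProtoSpec`, `PreparedDir`, `groupByRoot`, and `generateAll` are hypothetical names, and the `prepare` / `generateFromPrepared` callbacks stand in for the real generator methods, whose signatures may differ.

```typescript
// Hypothetical, reduced shapes for illustration; the real ProtobufSpec has more fields.
interface ProtoSpec {
    absoluteFilepathToProtobufRoot: string;
    target: string | undefined;
}

interface PreparedDir {
    root: string;
}

// Group specs by proto root, preserving the first-seen order of roots.
function groupByRoot(specs: ProtoSpec[]): Map<string, ProtoSpec[]> {
    const groups = new Map<string, ProtoSpec[]>();
    for (const spec of specs) {
        const group = groups.get(spec.absoluteFilepathToProtobufRoot) ?? [];
        group.push(spec);
        groups.set(spec.absoluteFilepathToProtobufRoot, group);
    }
    return groups;
}

// Runs the expensive setup once per root, then the cheap step once per target.
async function generateAll(
    specs: ProtoSpec[],
    prepare: (root: string) => Promise<PreparedDir>,
    generateFromPrepared: (dir: PreparedDir, target: string | undefined) => Promise<string>
): Promise<string[]> {
    const outputs: string[] = [];
    for (const [root, group] of Array.from(groupByRoot(specs).entries())) {
        const prepared = await prepare(root); // one-time setup for this root
        for (const spec of group) {
            outputs.push(await generateFromPrepared(prepared, spec.target));
        }
    }
    return outputs;
}
```

With N targets under a single root, `prepare` runs once and `generateFromPrepared` runs N times, which is where the (N−1) saved setups come from.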

Benchmark Results

Tested with 21 proto targets (all with explicit targets):

| Branch         | Run 1  | Run 2  | Run 3  | Average |
|----------------|--------|--------|--------|---------|
| main           | 7.223s | 7.368s | 7.371s | 7.321s  |
| PR (optimized) | 4.311s | 3.956s | 3.893s | 4.053s  |

44% faster (~3.3s saved) — measured before the caching fix, which eliminates an additional full duplicate pass visible in fern generate (where both toFernWorkspace and workspace validation independently called getAllOpenAPISpecs).

Changes Made

  • ProtobufOpenAPIGenerator — new prepare() / generateFromPrepared() API: prepare() does all one-time setup (temp dir, copy proto root, write buf.gen.yaml + buf.yaml, resolve buf and protoc-gen-openapi binaries via auto-download helpers, run buf dep update). generateFromPrepared() only runs buf generate <target> and renames the output to a unique temp file.
  • Auto-download integration: prepare() calls ensureBufResolved() and ensureProtocGenOpenAPIResolved() — matching the pattern in generateLocal() — so the auto-download feature from feat(cli): auto-download protoc-gen-openapi binary when not on PATH #13667 works correctly in the optimized path. If protoc-gen-openapi isn't on PATH, it's resolved via resolveProtocGenOpenAPI and the download directory is prepended to buf's PATH via envOverride.
  • Debug logging for binary resolution: ensureBufCommand() now logs the resolved path when buf is found on PATH (e.g. Found buf on PATH: /usr/local/bin/buf) and when auto-downloaded. ensureProtocGenOpenAPIResolved() does the same for protoc-gen-openapi. Helps users debug which binary is being used.
  • getAllOpenAPISpecs — group-by-root refactor: All protobuf specs (both explicit-target and no-target) are grouped by absoluteFilepathToProtobufRoot. Each group calls prepare() once, then iterates targets with generateFromPrepared(). Removed the old existingBufLockContents caching approach and convertProtobufToOpenAPI (dead code after refactor).
  • OSSWorkspace.getOpenAPISpecsCached(): New public method that caches the Promise<OpenAPISpec[]> from getAllOpenAPISpecs(), keyed by relativePathToDependency. Both getOpenAPIIr() and getIntermediateRepresentation() now use this cache, and validateOSSWorkspace calls it instead of getAllOpenAPISpecs directly. This eliminates the duplicate protoc-gen-openapi pass that occurred during fern generate.
  • Extracted makeOpenApiSpec helper: Shared logic for constructing OpenAPISpec objects from generator results.
  • OSSWorkspace.getIntermediateRepresentation: protoc-gen-fern IR generation kicked off as a promise at method start, runs concurrently with the OpenAPI pipeline, results merged after.
  • Extracted generateAllProtobufIRs helper: Processes protobuf specs sequentially (not parallel—npm install -g in ProtobufIRGenerator is not safe to run concurrently). The cross-pass parallelism with the OpenAPI path is still preserved.
  • Dead code cleanup: Removed convertProtobufToOpenAPI, unused bufLockContents field from PreparedWorkingDir, unused MaybeValid and isNonNullish imports.
  • Orphaned seed folder cleanup: Removed 3 orphaned csharp-sdk seed folders via pnpm seed clean.
  • Added versions.yml entry (4.37.1).
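
The workspace-level cache described for getOpenAPISpecsCached() might look roughly like the sketch below. It is an assumption-laden illustration rather than the actual OSSWorkspace code: `SpecCache`, `getCached`, and the `OpenAPISpecs` alias are hypothetical, and only the key expression `relativePathToDependency ?? ""` is taken from the PR description.

```typescript
// Stand-in for OpenAPISpec[]; the real type is richer.
type OpenAPISpecs = string[];

type Loader = (relativePathToDependency: string | undefined) => Promise<OpenAPISpecs>;

class SpecCache {
    // Caches the in-flight promise, not the resolved value, so concurrent callers share one run.
    private readonly cache = new Map<string, Promise<OpenAPISpecs>>();
    private readonly load: Loader;

    constructor(load: Loader) {
        this.load = load;
    }

    getCached(relativePathToDependency?: string): Promise<OpenAPISpecs> {
        const key = relativePathToDependency ?? "";
        let pending = this.cache.get(key);
        if (pending === undefined) {
            pending = this.load(relativePathToDependency);
            this.cache.set(key, pending);
        }
        return pending;
    }
}
```

Because the promise itself is stored, a rejection is also shared: every later caller with the same key sees the same failure, which matches the trade-off noted in the PR description.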

Important for Review

  • Merge ordering change: Protobuf IRs are now always merged after all OpenAPI/AsyncAPI/OpenRPC documents, whereas before they were interleaved in the allSpecs loop. Verified that mergeIntermediateRepresentation uses ir1.x ?? ir2.x for scalars (first-wins) and spread/concat for collections — protobuf specs were already processed after OpenAPI docs in the old ordering, so precedence is preserved.
  • Caching stores a Promise: getOpenAPISpecsCached caches the in-flight promise, so concurrent callers share the same work. If the promise rejects, subsequent callers get the same rejection (acceptable since failures are fatal).
  • Grouping assumes shared config per root: When multiple specs share the same absoluteFilepathToProtobufRoot, only the first spec's dependencies, generateLocally, and relativeFilepathToProtobufRoot are used for prepare(). This matches the data model (one proto root = one config block in generators.yml), but is not validated at runtime.
  • generateFromPrepared assumes output file exists after buf generate: If buf generate succeeds (exit code 0) but produces no output, the subsequent rename() will throw. protoc-gen-openapi always writes a minimal OpenAPI doc even for proto files with no services, and the old code had the same implicit assumption.
  • Stricter error handling in multi-file path: The old loop silently skipped files where convertProtobufToOpenAPI returned undefined; the new generateFromPrepared calls failAndThrow on non-zero exit. This is intentionally stricter.
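
To make the merge-ordering point concrete, here is a reduced sketch of first-wins scalar merging with concatenated collections. `MiniIR` and `mergeMiniIR` are hypothetical; the real mergeIntermediateRepresentation operates on a much larger IR type, and this only mirrors the `ir1.x ?? ir2.x` and concat behavior described above.

```typescript
// Hypothetical, heavily reduced IR shape for illustration only.
interface MiniIR {
    apiName: string | undefined;
    services: string[];
}

function mergeMiniIR(ir1: MiniIR, ir2: MiniIR): MiniIR {
    return {
        // Scalar: the first defined value wins.
        apiName: ir1.apiName ?? ir2.apiName,
        // Collection: concatenated in argument order.
        services: [...ir1.services, ...ir2.services]
    };
}
```

Under this scheme the result depends on argument order, so what matters for the refactor is that protobuf IRs still come after the OpenAPI-derived IRs, as they did before.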

Human Review Checklist

  • Confirm mergeIntermediateRepresentation gives the same result under the new ordering (protobuf IRs now merged after all other specs instead of interleaved). Verified: scalar fields use ir1 ?? ir2 (first-wins), collections use concat; old ordering already had protobuf after OpenAPI
  • Verify protoc-gen-openapi always writes output/openapi.yaml on exit code 0 — verified: plugin always writes output when invoked; old code had same assumption
  • Sanity-check that the prepare() setup logic mirrors doGenerateLocal() — verified step-by-step: air-gapped detection, tmp dir, cp, config writes, binary resolution via ensureBufResolved/ensureProtocGenOpenAPIResolved, envOverride, buf.yaml, buf dep update all match
  • Verify auto-download envOverride is correctly propagated through both prepare() and generateFromPrepared() — verified: stored in PreparedWorkingDir, passed to both buf dep update and buf generate executables
  • Verify prepare() uses ensureBufResolved() / ensureProtocGenOpenAPIResolved() (not raw which checks) so buf auto-download from feat(cli): auto-download protoc-gen-openapi binary when not on PATH #13667 is not regressed
  • Verify getOpenAPISpecsCached keying by relativePathToDependency is correct — the cache key is relativePathToDependency ?? "", so the common case (undefined) shares a single cache entry across getOpenAPIIr, getIntermediateRepresentation, and validateOSSWorkspace

Testing

  • pnpm run check (biome) passes
  • TypeScript compilation passes (tsc --noEmit)
  • All CI checks pass (compile, lint, test, test-ete, depcheck, versions.yml, seed tests)
  • Benchmarked locally with 21 proto targets: 44% faster (7.3s → 4.1s)
  • pnpm seed clean run — removed 3 orphaned seed folders
  • No unit tests added — this is a performance optimization on the protobuf generation path which requires actual proto files + buf + protoc-gen-openapi to exercise.

Link to Devin session: https://app.devin.ai/sessions/1574d18acdd340968bb63a4bf2f686fe
Requested by: @Swimburger


Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>

@claude claude bot left a comment

⚠️ Code review skipped — your organization's overage spend limit has been reached.

Code review is billed via overage credits. To resume reviews, an organization admin can raise the monthly limit in Settings → Usage.

Once credits are available, reopen this pull request to trigger a review.

@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions
Contributor

🌱 Seed Test Selector

Select languages to run seed tests for:

  • Python
  • TypeScript
  • Java
  • Go
  • Ruby
  • C#
  • PHP
  • Swift
  • Rust
  • OpenAPI
  • Postman

How to use: Click the ⋯ menu above → "Edit" → check the boxes you want → click "Update comment". Tests will run automatically and snapshots will be committed to this PR.

Contributor

@devin-ai-integration devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

@devin-ai-integration
Contributor

Performance Benchmark

Benchmarked fern check on the csharp-grpc-proto-exhaustive fixture (1 proto target + OpenAPI):

| Run   | Main (before) | PR (after) |
|-------|---------------|------------|
| Run 1 | 1656ms        | 1401ms     |
| Run 2 | 1410ms        | 1407ms     |
| Run 3 | 1422ms        | 1409ms     |
| Run 4 | 1383ms        | 1481ms     |
| Run 5 | 1429ms        | 1386ms     |
| Avg   | 1460ms        | 1416ms     |

Result: ~44ms faster (3%) on this small fixture with only 1 proto target file.

Expected scaling with more proto files

This fixture only has 1 proto target, so the parallelization benefit is limited to running the two buf generate passes (protoc-gen-openapi and protoc-gen-fern) concurrently.

For the ~17 proto files mentioned in the original investigation, the gains are larger because:

  1. Cross-pass parallelism: Both buf generate passes now run concurrently instead of serially (~2× for that portion)
  2. Within-pass parallelism: Per-file OpenAPI generation (16 of 17 files) runs in parallel after the first file resolves buf.lock

Estimated savings with 17 proto files at ~100ms each:

  • Before: ~100ms × 17 (OpenAPI) + ~100ms × 17 (IR) = ~3.4s serial
  • After: ~100ms × 1 (first file) + ~100ms (parallel remaining) + ~100ms (parallel IR) ≈ ~300ms wall-clock

…oto files

Instead of creating a new temp dir, copying the proto root, running which
checks, writing configs, and resolving deps for EVERY proto file, we now:

1. prepare() - does all setup once: temp dir, copy, which checks, buf.yaml,
   buf dep update
2. generateFromPrepared() - only runs 'buf generate <target>' per file

For N proto files this eliminates (N-1) × expensive setup operations:
- temp dir creation + recursive copy of proto root
- 2 'which' subprocess calls (buf, protoc-gen-openapi)
- buf.yaml + buf.gen.yaml file writes
- buf dep update (network call to resolve dependencies)

Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
@devin-ai-integration devin-ai-integration bot changed the title from "fix(cli): parallelize protobuf buf generate passes" to "fix(cli): optimize protobuf buf generate to reuse single working dir" Mar 18, 2026
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
devin-ai-integration bot and others added 4 commits March 18, 2026 20:59
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…epare/generateFromPrepared

Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…s explicit targets

Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

View 11 additional findings in Devin Review.

Comment on lines +43 to +48
    const preparedDir = await generator.prepare({
        absoluteFilepathToProtobufRoot: representative.absoluteFilepathToProtobufRoot,
        relativeFilepathToProtobufRoot: representative.relativeFilepathToProtobufRoot,
        local: representative.generateLocally,
        deps: representative.dependencies
    });
Contributor

🟡 Grouping by proto root silently ignores differing dependencies and generateLocally of non-representative specs

When multiple ProtobufSpec objects share the same absoluteFilepathToProtobufRoot, they are grouped together and only the first spec's dependencies, relativeFilepathToProtobufRoot, and generateLocally are passed to generator.prepare() at getAllOpenAPISpecs.ts:43-48. Any other specs in the same group with different dependencies or generateLocally values have their properties silently ignored.

In the old code, each ProtobufSpec was processed independently via convertProtobufToOpenAPI, which called generator.generate() with that spec's own deps and local flag. The new grouping optimization assumes all specs sharing the same root have identical configuration, but this is not validated. If a second spec has additional or different dependencies, buf generate could fail or produce incorrect output because its deps were never resolved in the shared working directory.

Prompt for agents
In packages/cli/workspace/lazy-fern-workspace/src/utils/getAllOpenAPISpecs.ts, after selecting the representative spec (line 38) and before calling generator.prepare() (line 43), add a validation that all specs in the group have the same dependencies, generateLocally, and relativeFilepathToProtobufRoot values. If they differ, either merge the dependencies (take the union) or fall back to processing specs independently via the old generate() path. At minimum, log a warning when discrepancies are detected.
Contributor

This is a valid observation but not a bug in practice — specs sharing the same absoluteFilepathToProtobufRoot come from the same protobuf block in generators.yml, so they always share the same dependencies, generateLocally, and relativeFilepathToProtobufRoot. The grouping mirrors the data model: one proto root = one config.

The old code also didn't handle this case differently — each explicit-target spec passed its own config to generate(), but those configs were always identical since they originated from the same source.

I'll leave this as-is since adding validation for an impossible state would be over-engineering, but happy to add an assertion if the reviewer prefers.
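
If the reviewer did want the assertion mentioned above, one possible shape is sketched below. Everything here is hypothetical: `GroupedSpec`, `configKey`, and `assertSharedConfig` are illustrative names, the field types are guesses (the real ProtobufSpec may differ), and treating dependency order as irrelevant is an assumption.

```typescript
// Hypothetical subset of ProtobufSpec; field types are assumptions.
interface GroupedSpec {
    absoluteFilepathToProtobufRoot: string;
    relativeFilepathToProtobufRoot: string;
    generateLocally: boolean;
    dependencies: string[];
}

// Canonical string for the config that prepare() takes from the representative spec.
function configKey(spec: GroupedSpec): string {
    return JSON.stringify({
        relativeRoot: spec.relativeFilepathToProtobufRoot,
        generateLocally: spec.generateLocally,
        dependencies: [...spec.dependencies].sort() // assumes dependency order does not matter
    });
}

// Returns the representative (first) spec, or throws if any spec in the group disagrees with it.
function assertSharedConfig(group: GroupedSpec[]): GroupedSpec {
    const representative = group[0];
    if (representative === undefined) {
        throw new Error("Cannot pick a representative from an empty group");
    }
    const expected = configKey(representative);
    for (const spec of group) {
        if (configKey(spec) !== expected) {
            throw new Error(
                `Specs under ${spec.absoluteFilepathToProtobufRoot} disagree on ` +
                    "dependencies, generateLocally, or relativeFilepathToProtobufRoot"
            );
        }
    }
    return representative;
}
```

Calling `assertSharedConfig(group)` just before `prepare()` would turn the "impossible state" into a loud failure rather than a silently ignored config.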

devin-ai-integration bot and others added 2 commits March 18, 2026 22:45
…ace condition

Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
@devin-ai-integration
Contributor

Self-Review

Reviewed the full diff across all 4 changed files. Here are my findings:

Issues Found

1. convertProtobufToOpenAPI is now dead code
The exported convertProtobufToOpenAPI function in getAllOpenAPISpecs.ts (lines 83-110) is no longer called by anyone — the main getAllOpenAPISpecs function now uses the prepare/generateFromPrepared path directly. Grepped the entire repo and confirmed it's only referenced at its own definition. Should be removed to avoid confusion.

2. bufLockContents in generateFromPrepared() return is unused
generateFromPrepared() returns { absoluteFilepath, bufLockContents } but no caller ever reads bufLockContents from the result. In getAllOpenAPISpecs, the result is passed to makeOpenApiSpec() which only uses result.absoluteFilepath. The field exists for API consistency with generate(), but it's dead data in the new path. Could simplify or document why it's kept.

3. Missing output/ directory cleanup between generateFromPrepared() calls
After rename(outputPath, uniqueOutput), the output/ directory remains (empty) in the prepared working dir. Not a bug — buf generate recreates it on the next call — but worth noting there's no cleanup of the temp files created by tmp.file(). These persist until process exit via tmp-promise defaults, which is fine for reasonable N.

Confirmed Correct

  • Sequential generateAllProtobufIRs: Fixed from Promise.all to for...of loop per Devin Review comment — npm install -g race condition resolved.
  • envOverride propagation: Correctly flows through prepare() → PreparedWorkingDir → generateFromPrepared() → buf executable. Also correctly applied to buf dep update in prepare().
  • Merge ordering: Protobuf IRs merged after OpenAPI/AsyncAPI/OpenRPC — matches old behavior where protobuf specs were processed after OpenAPI docs in the allSpecs loop.
  • Error handling: generateFromPrepared() calls failAndThrow on non-zero exit (intentionally stricter than old silent-skip behavior). Documented in PR description.
  • Group-by-root assumption: Valid per data model — specs sharing same absoluteFilepathToProtobufRoot originate from the same protobuf block in generators.yml.
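
The switch from Promise.all to a for...of loop noted in the first item can be illustrated with a small generic helper. `runSequentially` is a hypothetical name, not a function from the PR; it only demonstrates why awaiting inside the loop keeps at most one task in flight.

```typescript
// Runs tasks one at a time, in order; each task starts only after the previous one has resolved.
// By contrast, Promise.all(items.map(task)) would start every task immediately.
async function runSequentially<T, R>(items: T[], task: (item: T) => Promise<R>): Promise<R[]> {
    const results: R[] = [];
    for (const item of items) {
        results.push(await task(item));
    }
    return results;
}
```

This trades some throughput for safety when the task has a non-reentrant side effect, such as the global npm install mentioned above.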

Recommendation

Remove convertProtobufToOpenAPI (dead code) and optionally simplify generateFromPrepared() return type. Happy to make these changes if desired.

devin-ai-integration bot and others added 2 commits March 18, 2026 23:19
…ents from generateFromPrepared

Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…on (resolve versions.yml conflict)

Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
devin-ai-integration bot and others added 3 commits March 18, 2026 23:46
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…on (resolve versions.yml conflict)

Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…d bufLockContents from PreparedWorkingDir

Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
devin-ai-integration bot and others added 2 commits March 19, 2026 01:18
Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
…PATH

Co-Authored-By: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
@Swimburger Swimburger enabled auto-merge (squash) March 19, 2026 01:31
@Swimburger Swimburger merged commit 4d74a4f into main Mar 19, 2026
301 checks passed
@Swimburger Swimburger deleted the devin/1773863405-parallelize-protobuf-generation branch March 19, 2026 01:41
HoaX7 pushed a commit to HoaX7/fern that referenced this pull request Mar 25, 2026
…ern-api#13673)

Co-authored-by: Niels Swimberghe <3382717+Swimburger@users.noreply.github.com>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>