test(cli): support JSON stream decoding in golden tests to fix protojson instability #6433

Open
sachin9058 wants to merge 6 commits into mindersec:main from sachin9058:fix/json-stream-tests
Conversation

@sachin9058
Contributor

Summary

This PR fixes unstable CLI golden tests caused by protojson formatting differences and newline-delimited JSON output.

Currently, CLI outputs can be a JSON stream (multiple JSON objects), which makes string-based comparisons and JSONEq unsuitable. This results in flaky tests across environments (local vs CI).

This change introduces a decoder-driven approach using json.Decoder to parse JSON streams and compare structured data instead of raw strings.

Key changes:

  • Added support for newline-delimited JSON (JSON streams) in golden file comparison

  • Retained JSONEq for single JSON objects (backward compatibility)

  • Added graceful fallback to string comparison for non-JSON outputs (YAML, tables, etc.)

  • Removed heuristic-based detection in favor of decoder-driven parsing

  • Added unit tests for:

    • JSON stream parsing
    • Single JSON object parsing
    • Invalid JSON handling
    • Non-JSON input (YAML/text)
    • Empty JSON array edge case

This ensures deterministic and environment-independent test behavior.
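
The decoder-driven comparison described in this summary can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `decodeStream` and `sameStream` are hypothetical names, and the sketch assumes that comparing decoded values with `reflect.DeepEqual` is an acceptable stand-in for the structural comparison.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"reflect"
	"strings"
)

// decodeStream reads every top-level JSON value from s.
// It returns an error if any part of the input is not valid JSON.
// (Hypothetical helper, not the function added in the PR.)
func decodeStream(s string) ([]any, error) {
	dec := json.NewDecoder(strings.NewReader(s))
	var values []any
	for {
		var v any
		err := dec.Decode(&v)
		if errors.Is(err, io.EOF) {
			return values, nil // clean end of input
		}
		if err != nil {
			return nil, err
		}
		values = append(values, v)
	}
}

// sameStream reports whether two JSON streams hold the same values in the
// same order, ignoring whitespace and object key order.
func sameStream(expected, actual string) (bool, error) {
	want, err := decodeStream(expected)
	if err != nil {
		return false, fmt.Errorf("expected: %w", err)
	}
	got, err := decodeStream(actual)
	if err != nil {
		return false, fmt.Errorf("actual: %w", err)
	}
	return reflect.DeepEqual(want, got), nil
}

func main() {
	golden := "{\"name\":\"a\",\"id\":1}\n{\"name\":\"b\",\"id\":2}\n"
	output := "{\"id\": 1, \"name\": \"a\"}\n{\"id\": 2,  \"name\": \"b\"}\n"
	ok, err := sameStream(golden, output)
	fmt.Println(ok, err)
}
```

Because both sides are decoded before comparison, differences in spacing or key order between the golden file and the actual output do not affect the result.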

Fixes #6430


Testing

The changes were tested using the existing CLI test suite and additional unit tests.

Steps:

  1. Run all tests:

    go test ./...
    
  2. Verified:

    • JSON stream outputs are correctly parsed and compared structurally
    • Single JSON outputs continue to work with JSONEq
    • Non-JSON outputs (YAML, tables) fall back to string comparison
    • No flaky failures observed across runs

All existing tests pass, and new tests validate the updated behavior.
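
The three comparison paths verified above (single JSON, JSON stream, plain text) could be selected along these lines. This is only a sketch under assumptions: `classify` and the `outputKind` constants are hypothetical names for illustration, and the actual test helper in the PR may decide differently.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// outputKind names the comparison strategy chosen for a piece of CLI output.
type outputKind string

const (
	kindJSON   outputKind = "json"   // one JSON value: compare like JSONEq
	kindStream outputKind = "stream" // several JSON values: compare decoded values
	kindText   outputKind = "text"   // anything else: exact string match
)

// classify decides how output should be compared (hypothetical helper).
func classify(out []byte) outputKind {
	if json.Valid(out) {
		return kindJSON
	}
	// Not a single JSON value; count how many values the decoder accepts.
	dec := json.NewDecoder(bytes.NewReader(out))
	count := 0
	for {
		var v any
		err := dec.Decode(&v)
		if errors.Is(err, io.EOF) {
			break
		}
		if err != nil {
			return kindText // YAML, tables, or malformed JSON
		}
		count++
	}
	if count > 1 {
		return kindStream
	}
	return kindText
}

func main() {
	samples := []string{"[]", "{\"a\":1}\n{\"a\":2}\n", "name: demo\n"}
	for _, s := range samples {
		fmt.Printf("%-6s %q\n", classify([]byte(s)), s)
	}
}
```

Letting the decoder decide, rather than inspecting the first character of the output, is what makes the empty-array and YAML cases fall out without special handling.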

- decode newline-delimited JSON using json.Decoder
- compare structured output instead of raw strings
- add tests for stream, single JSON, invalid, and non-JSON cases
- eliminate flakiness caused by protojson formatting
@sachin9058 sachin9058 requested a review from a team as a code owner May 1, 2026 07:52
@DharunMR
Contributor

DharunMR commented May 1, 2026

IMO you need to add those JSON testdata files again to verify that it passes.

sachin9058 added 5 commits May 1, 2026 15:55
…o-end coverage

- decode newline-delimited JSON using json.Decoder
- compare structured output instead of raw strings
- retain JSONEq for single JSON objects
- fallback to string comparison for non-JSON outputs
- add unit tests for JSON stream parsing
- add artifact list JSON golden test with fixture

fixes mindersec#6430
@DharunMR
Contributor

DharunMR commented May 1, 2026

Actually, you probably need to add the JSON stream testdata and the profile and repo test cases that were removed in PR #6417, since they fall directly under the scope of what we're fixing here.

@sachin9058
Contributor Author

@DharunMR Yes, thank you. I forgot to add the golden files earlier; I had only included parsing tests. I've now added them, and CI is passing. However, the NATS test is still failing. It looks flaky, because it passes consistently when I run it locally multiple times. For now, I've retriggered CI. Let's see what happens.

@sachin9058
Contributor Author

sachin9058 commented May 1, 2026

Yes, once CI passes and it gets merged, I will rebase and restore the golden files that I removed there.

@coveralls

Coverage Status

coverage: 60.595% (+0.04%) from 60.56% — sachin9058:fix/json-stream-tests into mindersec:main

@sachin9058
Contributor Author

Well, CI is green now. The JSON stream test is passing now; let's see what @evankanderson thinks about this fix.

@sachin9058
Contributor Author

@DharunMR
The artifact list command now emits newline-delimited JSON, so the stream parsing path is exercised in tests. This PR mainly ensures the comparison logic is robust for such outputs.
Happy to extend coverage further if needed, but wanted to keep this PR focused.

@DharunMR
Contributor

DharunMR commented May 1, 2026

Yep, I thought adding those test cases back would be good, but we can wait for @evankanderson.

@sachin9058
Contributor Author

@evankanderson Hey Evan, can you take a look at this?

Member

@evankanderson evankanderson left a comment

I think this will probably work, but I'm concerned that we're only partially hiding the underlying non-determinism, because the golden files on-disk won't necessarily have the same formatting each time they are generated.

Comment on lines -87 to +102
    cmd.Println(out)
    if _, err := cmd.OutOrStdout().Write([]byte(out + "\n")); err != nil {
        return cli.MessageAndError("Error writing yaml to stdout", err)
    }
Member

This is basically the same thing as cmd.Println() except that it logs an error message if we try to redirect to something which doesn't accept writes. I'm not sure the error message in that case is a big improvement, but I don't want to block the rest of the value of this PR on that.

    if err != nil {
        return cli.MessageAndError("Error getting json from proto", err)
    // Emit each artifact as a separate JSON object (JSON stream)
    for _, art := range artifactList.Results {
Member

I think it's fine for the output to be newline-delimited JSON (JSON Lines); it also seems fine to define that we'll only emit a single JSON element.

In either case, it would be great to record CLI patterns somewhere -- maybe in DEVELOPMENT.md? Right now, we're not totally consistent about any of the following:

  • Output columns and names
  • JSON output contents
  • Positional parameters and flags like --file
  • Tabular output and multi-row output

It would be great to have a "design philosophy" around these that people could reference when making changes, such that the CLI starts to feel more consistently and thoughtfully designed.

I don't think it needs to happen in this PR, but it would be useful to start making some of these decisions explicit.

Comment on lines +25 to +30
    // WithCLIClient is an alias for WithRPCClient kept for backwards/alternate usage
    // by tests or callers expecting a "CLI"-named helper.
    func WithCLIClient[T any](ctx context.Context, client T) context.Context {
        return WithRPCClient[T](ctx, client)
    }

Member

I don't understand why we wouldn't just switch over to one consistent name here -- this is in internal, so nothing outside the codebase should be depending on it.

Comment on lines +127 to 131
    // Try standard JSON comparison first
    if json.Valid(expected) && json.Valid([]byte(actual)) {
        // if it's valid json compare the objects (ignores spaces/newlines)
        require.JSONEq(t, string(expected), actual, "JSON Output does not match golden file")
    } else {
        // if it's a table, txt, or yaml fallback to exact string matching
        require.Equal(t, string(expected), actual, "Output does not match golden file")
        return
    }
Member

I'm concerned that using require.JSONEq here or some other non-byte comparison will cause the golden files to "churn" when we run --update.

Taking the output from protojson and running it through the standard Go encoding/json module feels like it might be the better approach (and would avoid needing to add a bunch of complex test evaluation logic).

Member

It's less efficient, but we're talking about some in-memory operations on strings <10KB vs filesystem output, so the extra parsing is probably in the noise.

@evankanderson
Member

(Sorry, was out Friday and had a busy weekend.) Thanks to @DharunMR for taking a first look at this PR and helping get some of the details correct.

Development

Successfully merging this pull request may close these issues.

Unstable JSON output in CLI tests due to protojson formatting

4 participants