Improve error disambiguation in final JSON results #25

@cjonas9

Description

What problem does your feature solve?

At the end of a run, blaster emits a JSON summary of how the endpoint performed. The summary includes a dict of errors -> error_stats (string -> struct{}), where the struct contains the error code, a count of how many times it was seen as a response, and the first and last time it was seen. From this alone, it is difficult for an operator to tell what the errors imply about the RPC's buckling point for that endpoint.

What would you like to see?

I think the most ergonomic way to resolve this is to record the RPS at which the errors occurred. There are several options for this, but I propose keeping a list of ranges over which the error was encountered (e.g. 'rps_seen_at' : [[8], [10, 20], [22, 30]]). This tells the operator where in the RPS range the endpoint faltered, buckled, and eventually failed, without having to sift through the logs manually.
This is a low-overhead addition because we already have a merge range helper in this repo (it's used for the ledger window functionality).
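
To make the proposed `rps_seen_at` shape concrete, here is a rough sketch of how observed RPS values could be collapsed into ranges. It is written in Python purely for illustration and does not use the repo's existing merge range helper; `merge_rps_ranges`, `format_ranges`, and the `step` parameter are hypothetical names, and it assumes the load is ramped in whole-number RPS steps.

```python
import json


def merge_rps_ranges(rps_values, step=1):
    """Collapse observed RPS values into inclusive [lo, hi] ranges.

    Values at most `step` apart are treated as contiguous.
    `step` is an assumption about how the load test ramps.
    """
    ranges = []
    for rps in sorted(set(rps_values)):
        if ranges and rps - ranges[-1][1] <= step:
            ranges[-1][1] = rps  # extend the current range
        else:
            ranges.append([rps, rps])  # start a new range
    return ranges


def format_ranges(ranges):
    """Render single-value ranges as [x], matching the example in this issue."""
    return [[lo] if lo == hi else [lo, hi] for lo, hi in ranges]


if __name__ == "__main__":
    # Hypothetical observations: the error appeared at 8, 10..20, and 22..30 RPS.
    seen = [8] + list(range(10, 21)) + list(range(22, 31))
    print(json.dumps({"rps_seen_at": format_ranges(merge_rps_ranges(seen))}))
```

With the sample observations above, the error's entry would carry `[[8], [10, 20], [22, 30]]`, so the gaps at 9 and 21 RPS stay visible without reading the logs.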

What alternatives are there?

For less granularity, we could just keep the first and last RPS at which we saw the error. For more, we could create a histogram of { RPS : % of requests at that RPS that errored }. The latter is the most useful summary of what happened at each RPS, but it also makes the results section more verbose; this may be the tradeoff point where the answer is simply to go read the logs if one needs that much detail.
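
For comparison, both alternatives could be sketched roughly as follows. This is Python for illustration only; `first_last_rps`, `error_rate_by_rps`, and the `(rps, errored)` sample shape are hypothetical and not taken from blaster.

```python
from collections import Counter


def first_last_rps(samples):
    """Less granular option: lowest and highest RPS at which a request errored.

    `samples` is a list of (rps, errored) pairs, one per request (assumed shape).
    Returns None if nothing errored.
    """
    errored = [rps for rps, failed in samples if failed]
    return (min(errored), max(errored)) if errored else None


def error_rate_by_rps(samples):
    """More granular option: percentage of requests that errored at each RPS."""
    total = Counter()
    failed = Counter()
    for rps, did_fail in samples:
        total[rps] += 1
        if did_fail:
            failed[rps] += 1
    return {rps: round(100.0 * failed[rps] / total[rps], 1) for rps in sorted(total)}


if __name__ == "__main__":
    # Made-up samples purely to show the output shapes.
    samples = [(10, False), (10, False), (20, True), (20, False), (30, True), (30, True)]
    print(first_last_rps(samples))
    print(error_rate_by_rps(samples))
```

The histogram grows by one entry per RPS level tested, which is where the extra verbosity in the results section would come from.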

Metadata

Assignees

Labels

enhancement: New feature or request
performance-pod-scrum: Board items for performance pod work (anything related to load testing or adjacent work)

Type

No type

Projects

Status

To Do

Relationships

None yet

Development

No branches or pull requests
