chore(Datastore): Add Operation and Attempt metrics for HttpJson transport#12063
chore(Datastore): Add Operation and Attempt metrics for HttpJson transport#12063
Conversation
jinseopkim0
left a comment
There was a problem hiding this comment.
nit: s/Datatore/Datastore/ in title of the PR.
| <O> O invokeRpc(Callable<O> block, String startSpan, String methodName) { | ||
| TraceUtil.Span span = otelTraceUtil.startSpan(startSpan); | ||
| Stopwatch stopwatch = isHttpTransport ? Stopwatch.createStarted() : null; | ||
| String operationStatus = StatusCode.Code.OK.toString(); |
There was a problem hiding this comment.
If an exception other than RetryHelperException occurs, then we would be returning operationStatus = OK in the finally block, right? Would it be better to initially set operationStatus to something like UNKNOWN and set it to OK just before returning?
e.g.
String operationStatus = StatusCode.Code.UNKNOWN.toString();
try (TraceUtil.Scope ignored = span.makeCurrent()) {
// ...
O result = RetryHelper.runWithRetries(...);
operationStatus = StatusCode.Code.OK.toString();
return result;
} catch (RetryHelperException e) {
// ...
There was a problem hiding this comment.
I believe all exceptions get caught and re-thrown as a RetryHelperException: https://github.com/googleapis/sdk-platform-java/blob/56aa3438393e1992bb8f688291fb12030ec9bc0e/java-core/google-cloud-core/src/main/java/com/google/cloud/RetryHelper.java#L51-L55
There was a problem hiding this comment.
I see a TODO in the link:
// TODO: remove RetryHelperException, throw InterruptedException or
// ExecutionException#getCause() explicitly
I'm thinking it would be more future-proof if we do not assume that all exceptions will be RetryHelperExceptions. (Also, would it be possible for JVM throw errors like OutOfMemoryErrors?) WDYT?
There was a problem hiding this comment.
Changing that in RetryHelper may happen in the future and that change impacts quite a bit of code in java-core. I don't think this PR needs to plan for it.
If I catch for Exception, then I'll need to modify code here: https://github.com/googleapis/sdk-platform-java/blob/7ab6d2e5784f264558b7aca6f23e771361dbea3a/java-core/google-cloud-core/src/main/java/com/google/cloud/BaseServiceException.java#L270
as the logic in DatstoreException.translateAndThrow(e) calls BaseServiceException.translate(ex);
I'll update it to default to UNKNOWN and we can update the RetryHelper in a future change
This adds operation and attempt metrics that are being recorded via Gax. Since gRPC can integrate natively with Gax, this metrics are captured. This instruments the metrics to be collected via HttpJson.
Sample view of the metrics collected:
