-
Notifications
You must be signed in to change notification settings - Fork 11
Reduce complexity of CriticalPathQueuingDurationDataProvider #194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3852557 to
3bb8c61
Compare
| public List<CompleteEvent> getCompleteEvents() { | ||
| completeEvents.sort(Comparator.comparing((e) -> e.start)); | ||
| return ImmutableList.copyOf(completeEvents); | ||
| synchronized (this) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like completeEvents is actually mutable, and mutated in ProfileThread.addEvent. So if getCompleteEvents is called, then addEvent, then getCompleteEvents again, the second call will return a possibly unsorted list.
Is it possible to make the object immutable?
If not, is there any guarantee of call order we could enforce? For example, if addEvent is never expected to be called after getCompleteEvents, then getCompleteEvents should "freeze" the object (set some flag), no longer allowing mutation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, although addEvent is only currently called when creating the BazelProfile
Lines 133 to 134 in cdb011d
| // TODO: Use success response to take action on errant events. | |
| profileThread.addEvent(traceEvent); |
It definitely wouldn't hurt to, for example, use an immutable list here to ensure we don't break the ordering afterwards
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make it so the code guarantees correct calling order? For example add a ProfileThread.Builder class with a mutable completeEvents that you can call addEvent on, then the final ProfileThread would have an ImmutableList as completeEvents (and it'd be sorted).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a check for this that throws an ISE if such modifications are attempted in 5aac8c1.
(also, a test for this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, we were also doing a lot of expensive copying and re-sorting in other places which I missed. Switching to a builder pattern cleaned all of those up, should be done in 9868671
PTAL
Signed-off-by: Felipe <afrueda97@outlook.com>
3bb8c61 to
cdb011d
Compare
|
Apologies for the force-push, I had accidentally pushed an old version of this work |
…ng sorted Signed-off-by: Felipe <afrueda97@outlook.com>
Signed-off-by: Felipe <afrueda97@outlook.com>
…d data Signed-off-by: Felipe <afrueda97@outlook.com>
| * modified anymore. | ||
| */ | ||
| public boolean addEvent(JsonObject event) { | ||
| if (sorted) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are instances of ProfileThread expected to be accessed concurrently?
Since getCompleteEvents synchronizes on this to access sorted, we must do the same here. Or make sorted and AtomicBoolean.
If access to some members must be synchronized, then it's likely that we should make all member access thread-safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I think GitHub linked you to the previous commit, this code is gone now in favor of the builder pattern you suggested
(which, being fully immutable after built, this class should also now be safe to access concurrently)
Signed-off-by: Felipe <afrueda97@outlook.com>
Signed-off-by: Felipe <afrueda97@outlook.com>
|
Friendly ping |
|
Sorry I'll only be able to get to this review towards the end of the week 😞 |
laszlocsomor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
everything other than CriticalPathQueuingDurationDataProvider.java looks good; I didn't review that file because I'm not sure it's related?
| import java.util.stream.Collectors; | ||
| import javax.annotation.Nullable; | ||
|
|
||
| public class ProfileThread { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional, for a follow-up: use @AutoValue and @AutoBuilder to remove some of the boilerplate (getters, equals/hashCode/toString)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are the changes in this file relevant to this PR?
like the new variables such as cPathActionNameToSelfEvent and related logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are, unfortunately GitHub does a great job of not showing the differences.
Essentially the changes are intended to:
- Reduce how many loops are performed over
bazelProfile.getCriticalPath().get().getCompleteEvents()and.getThreads().flatMap((thread) -> thread.getCompleteEvents().stream()). It should be happening just once for each of these lists now. - "Unroll" complicated stream-based code into simpler for loops, to make it easier to get a sense of its performance and scalability
- Use more efficient data structures for lookups, which are the
HashMaps andHashMultimaps at the top.
laszlocsomor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % comments
...vocation/analyzer/dataproviders/remoteexecution/CriticalPathQueuingDurationDataProvider.java
Show resolved
Hide resolved
| // We have found cases where the end time of the critical path event is less than the end | ||
| // time of the processing event. This might be a bug / inconsistency in Bazel profile | ||
| // writing. | ||
| if (!cPathEvent.end.almostEquals(event.end) && cPathEvent.end.compareTo(event.end) <= 0) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This came from L96 on the left, and that check was only there for "documentation purposes".
Now we'll actually do the check. Is that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. I've reverted this to an equivalent of how it originally was (always-false check now).
...vocation/analyzer/dataproviders/remoteexecution/CriticalPathQueuingDurationDataProvider.java
Outdated
Show resolved
Hide resolved
85bbc81 to
66ff455
Compare
Signed-off-by: Felipe <afrueda97@outlook.com>
Signed-off-by: Felipe <afrueda97@outlook.com>
059a6db to
4f6fab5
Compare
Signed-off-by: Felipe <afrueda97@outlook.com>
This provider currently performs several full loops over the Bazel profiles. This results in it not scaling well performance and memory consumption-wise.
Also at the profile level, some expensive copying and re-sorting of bazel profile data is performed each time it is requested by the providers.
Allocations:

CPU:

This PR aims to reduce the amount of loops over the data in
CriticalPathQueuingDurationDataProviderto just one, enabling it to scale better.Also, cleans the expensive copying and re-sorting of bazel profile data each time it is requested, by switching thread building to a Builder pattern, which sorts the entries internally as needed before returning the immutable sorted result.
Thanks to @iamricard for doing a lot of the initial work and investigation on this (the branch was opened in another repository so I didn't know how to cleanly credit him in the commits I moved over to here)