feat: Mvtx Short circuit dataplane changes#3123
feat: Mvtx Short circuit dataplane changes#3123vaibhavtiwari33 wants to merge 32 commits intomainfrom
Conversation
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
… put them as part of monovertex Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
…former spec Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
…condition spec Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
… splitter Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
…ss rules are set in spec. Add a check in dataplane code to only send messages when corresponding sub-sinks are defined Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3123 +/- ##
==========================================
- Coverage 79.75% 79.39% -0.36%
==========================================
Files 291 292 +1
Lines 65142 65441 +299
==========================================
+ Hits 51951 51958 +7
- Misses 12638 12928 +290
- Partials 553 555 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Requesting review for implementation outline before writing tests for the final implementation |
Signed-off-by: Vigith Maurice <vigith@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
yhl25
left a comment
There was a problem hiding this comment.
Should we consider injecting the bypass router (with sink handle) directly into the component start methods?
This way, components could write directly to the sink handle when bypass conditions match, rather than writing to an intermediate stream that then gets routed. It would reduce stream hops and cloning, also it would simplify the forwarder and gives more control for the components to do routing.
| // Read from a chunked stream of messages | ||
| let chunked_stream = input_stream.chunks_timeout(batch_size, chunk_timeout); | ||
| tokio::pin!(chunked_stream); |
There was a problem hiding this comment.
Why do we need to chunk the stream?
There was a problem hiding this comment.
write, write_to_fallback and write_to_on_success methods accept batch of messages and hence I'm collecting/building this batch by chunking the stream.
| // for each message read from the chunked stream determine which sink it should be sent to | ||
| // based on the bypass conditions and wrap it in respective MessageToSink enum value | ||
| for msg in msgs { | ||
| let msg_clone = msg.clone(); |
There was a problem hiding this comment.
Do we really need to clone the complete message?
There was a problem hiding this comment.
This was done for brevity in the rest of the while loop. If we remove this message cloning, we can't prove to the compiler that the original msg wasn't moved inside the enum for the None case in the subsequent match conditional.
This cloning can certainly be avoided by moving the match logic within these if/else statements.
We should be able to do that, my only concern was that these components are also used by the pipeline, thus inherently changing the component behaviour for one specific case of monovertex might introduce unwarranted complexity within the component logic. |
What this PR does / why we need it
This PR relates to the dataplane side implementation for short-circuiting in monovertex.
The implementation follows the design doc here
Related issues
Fixes #3101
Testing
Spec testing
✅ Deploy bypass spec with all currently supported bypass sinks
✅ Error when setting bypass config for a sink that doesn't exist:
Testing Scenarios
1. ✅ Bypass defines all sink types, but UDF only sets tags relevant to fallback and on-success sink
Expected behaviour: fallback and on-success sink receive data, but primary sink doesn't receive anything
Bypass spec:
Map UDF:
UDF doesn't set correct tag for sending data to primary sink -
Sink logs:
Fallback sink logs:
Similar logs for on-success sink are observed
2. ✅ Bypass defines fallback and on-success sink types, and UDF only sets tags relevant to fallback and on-success sink
Expected behaviour: All sinks receive data respective data.
Explanation, if conditions for primary sink aren't defined then we don't want to block users from sending data to it.
Bypass spec:
Map UDF:
UDF doesn't set any tag for sending data to primary sink, implementation is same as above.
Primary Sink logs:
Fallback sink logs:
Similar logs for on-success sink are observed
3. ✅ Bypass rules are not defined
Expected Behaviour: All the messages should just end up in primary sink and nothing in any other sink.
Map UDF:
Same implementation as above is used to add tags to messages.
Primary Sink logs:
Fallback and on-success sink logs:
4. ✅ Bypass rules are defined for fallback and sink. Both Transformer and Map route messages.
Expected behaviour: Messages tagged for fallback/onSuccess from transformer/UDF respectively should end up in fallback/onSuccess sink.
Bypass spec:
Source Transformer:
Map UDF:
Fallback/OnSuccess sink logs:
Explanation -
From UDFandFrom Source Transformersignifies that these messages are directly being written from UDF/Transformer hence bypassing components in between.Primary Sink logs:
Explanation -
From UDFmessages are ones that UDF tagged with 'sink' tag, while others are ones that it didn't tag.5. ✅ Bypass rules are defined for all sinks. Both Transformer and Map route messages.
Expected behaviour: Messages tagged for fallback/onSuccess/sink from transformer/UDF respectively should end up in respective sinks. Messages left out from being forwarded should be dropped.
Bypass spec:
Map/Transformer UDF are the same as above.
Fallback sink logs (onSuccess sink has similar logs):
Primary Sink logs:
Special notes for reviewers
Anything notable for review (risk, rollout, follow-ups).