Fix typed proxy access to generic skeleton event storage #394
rudresh-systream wants to merge 3 commits into eclipse-score:main
Signed-off-by: Rudresh Shirwal <rudresh.shirwal@systream.io>
37ebeb3 to dcb34ed
Added a dedicated verification application to prove the fix for #311 in the second commit.

Why this application was created

The bug is specifically about interoperability between a GenericSkeleton producer and a typed proxy consumer, so the verification app was created to reproduce that exact runtime architecture instead of relying only on unit tests.

The app was added as a separate test application. It contains one binary that can run in two modes, generic_skeleton and typed_proxy, selected with --mode.

How the application was created

1. GenericSkeleton side

The GenericSkeleton side registers the service events using explicit sample metadata. Both required events from the typed interface were registered so the typed proxy can create all required event control views correctly. This side proves the producer is using the GenericSkeleton storage path that originally triggered the bug.

2. Typed proxy side

The typed proxy side searches for the service, instantiates the typed proxy, subscribes to the event, receives samples through a callback, validates the received data, and exits cleanly after the configured number of cycles. This side proves the consumer is using the normal typed proxy path, not a GenericProxy.

Build and integration test

The app was built with:

    bazel build //score/mw/com/test/generic_skeleton_typed_proxy:generic_skeleton_typed_proxy
    bazel build //score/mw/com/test/generic_skeleton_typed_proxy:generic_skeleton_typed_proxy-pkg

The automated integration test was executed with:

    bazel test //score/mw/com/test/generic_skeleton_typed_proxy/integration_test:generic_skeleton_typed_proxy

The test passed.

Manual runtime verification

The application was also run manually as two real processes from the repo. Before running, the runtime configs were copied to the repo root:

    mkdir -p etc
    cp score/mw/com/test/generic_skeleton_typed_proxy/mw_com_config.json etc/mw_com_config.json
    cp score/mw/com/test/generic_skeleton_typed_proxy/logging.json etc/logging.json

Terminal 1 ran the GenericSkeleton producer:

    bazel-bin/score/mw/com/test/generic_skeleton_typed_proxy/generic_skeleton_typed_proxy \
      --mode generic_skeleton \
      --cycle-time 40 \
      --num-cycles 0

Terminal 2 ran the typed proxy consumer:

    bazel-bin/score/mw/com/test/generic_skeleton_typed_proxy/generic_skeleton_typed_proxy \
      --mode typed_proxy \
      --num-cycles 25

Evidence from GenericSkeleton terminal

The GenericSkeleton side successfully created and offered the LoLa service, then continuously sent samples. This proves the GenericSkeleton producer was active and publishing samples through GenericSkeleton-created shared memory.

Evidence from typed proxy terminal

The typed proxy successfully discovered the service. The proxy callback was then invoked and valid samples were received, the proxy continued receiving sequential samples, and finally the typed proxy unsubscribed and terminated cleanly.

How this proves the bug resolution

Before the fix, this exact combination was broken: the typed proxy could interpret the shared-memory event storage incorrectly because GenericSkeleton and typed skeletons used different event storage representations. The fix changed typed proxy sample access so it uses raw event slot access with the actual sample type's size and alignment.

The verification app confirms that the typed proxy can now correctly consume samples from GenericSkeleton-created event storage, which is the issue described in #311.
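The layout mismatch behind this bug can be illustrated with a small sketch. It is not code from the repository: ExampleSample, ElementsNeeded, TypedElementCount, GenericElementCount, GenericStorageCoversSlots and CheckLayout are hypothetical names, and the round-up sizing rule is an assumption based on the PR description's statement that the generic path calculates storage size in units of std::max_align_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sample type, used only for illustration.
struct ExampleSample {
    std::uint32_t counter;
    std::uint8_t payload[20];
};

// Number of elements of type Unit needed to cover `slots` samples of type T
// (rounded up; the rounding rule is an assumption of this sketch).
template <typename Unit, typename T>
constexpr std::size_t ElementsNeeded(std::size_t slots) {
    return (slots * sizeof(T) + sizeof(Unit) - 1U) / sizeof(Unit);
}

// Element count a typed array of T reports for `slots` samples.
template <typename T>
constexpr std::size_t TypedElementCount(std::size_t slots) {
    return ElementsNeeded<T, T>(slots);
}

// Element count when the same bytes are sized in units of std::max_align_t.
template <typename T>
constexpr std::size_t GenericElementCount(std::size_t slots) {
    return ElementsNeeded<std::max_align_t, T>(slots);
}

// The generic storage always has enough bytes for the typed slots,
// even though its element count is generally a different number.
template <typename T>
constexpr bool GenericStorageCoversSlots(std::size_t slots) {
    return GenericElementCount<T>(slots) * sizeof(std::max_align_t) >= slots * sizeof(T);
}

// Runtime self-check of the two relationships above for one slot count.
inline void CheckLayout(std::size_t slots) {
    assert(TypedElementCount<ExampleSample>(slots) == slots);
    assert(GenericStorageCoversSlots<ExampleSample>(slots));
}
```

For example, if sizeof(std::max_align_t) is 32 and sizeof(ExampleSample) is 24, five slots occupy 120 bytes: a typed array reports 5 elements, while storage sized in std::max_align_t units reports 4. A consumer that reads the stored element count as a slot count would then see the wrong number of slots, although the memory region itself is large enough.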
Summary:
Implemented a compatibility fix for the GenericSkeleton ↔ typed ProxyEvent shared-memory interaction described in #311.
Root cause
The issue arises because GenericSkeleton and typed skeletons create and interpret EventDataStorage differently.

For typed skeletons, creating the storage internally creates a typed DynamicArray<SampleType>.

For GenericSkeleton/GenericSkeletonEvent, the storage is instead created as EventDataStorage<std::max_align_t>, with the storage size calculated in units of std::max_align_t.

As a result, the underlying raw memory region may be large enough, but the metadata/layout interpretation of the DynamicArray<T> becomes incompatible between the producer and consumer sides.

The typed proxy path was still assuming a typed EventDataStorage<T> representation and therefore interpreted slot count/layout using the wrong type information. This creates an incompatibility whenever a typed proxy consumes events whose storage was created by a GenericSkeleton.

This issue is especially visible because the DynamicArray element count depends on the template type T, which differs between the generic and typed paths.

Investigated solution approaches
Several approaches were considered:
1. Extending DynamicArray<T>

One option was to add a constructor allowing externally managed/preallocated storage while preserving a typed DynamicArray<T> interface.

This option was not chosen.

2. Replacing all event storage with DynamicArray<std::byte>

Another option was to fully migrate all event storage handling to raw byte storage.

While architecturally clean, this would require significantly broader refactoring. Given the scope and risk, this approach was considered too invasive for the current issue.
3. Fixing typed proxy access to use raw slot storage (chosen solution)
The implemented solution changes typed proxy sample access so it no longer depends on interpreting the shared memory as EventDataStorage<T>.

Instead, the proxy accesses the shared-memory event region through raw slot memory and metadata, and calculates slot/sample access using sizeof(T) and alignof(T).

This approach was selected because it keeps the existing shared-memory layout compatible while allowing typed proxies to correctly consume data produced by GenericSkeleton.
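The raw slot access described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in proxy_event.h: DemoSample, SlotAddress, ReadSlot and RoundTripThroughGenericStorage are hypothetical names, samples are assumed to be tightly packed and trivially copyable, and the demo copies samples out with std::memcpy to stay well-defined, whereas a real proxy would more likely hand out pointers into shared memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical sample type, used only for illustration.
struct DemoSample {
    std::uint32_t counter;
    std::uint32_t value;
};

// Returns the address of slot `index`, or nullptr if the index is out of
// range or the region is not suitably aligned for T.
template <typename T>
const unsigned char* SlotAddress(const void* base, std::size_t slot_count, std::size_t index) {
    if (base == nullptr || index >= slot_count) {
        return nullptr;
    }
    if (reinterpret_cast<std::uintptr_t>(base) % alignof(T) != 0U) {
        return nullptr;
    }
    // The stride is sizeof(T), independent of the element type the producer
    // used when it sized the storage.
    return static_cast<const unsigned char*>(base) + index * sizeof(T);
}

// Copies the sample stored in slot `index` into `out`; returns false on failure.
template <typename T>
bool ReadSlot(const void* base, std::size_t slot_count, std::size_t index, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch assumes trivially copyable samples");
    const unsigned char* slot = SlotAddress<T>(base, slot_count, index);
    if (slot == nullptr) {
        return false;
    }
    std::memcpy(&out, slot, sizeof(T));
    return true;
}

// Stand-in for the producer/consumer pair: storage is sized in units of
// std::max_align_t, filled slot by slot, then read back through ReadSlot.
bool RoundTripThroughGenericStorage() {
    constexpr std::size_t kSlots = 4;
    constexpr std::size_t kUnits =
        (kSlots * sizeof(DemoSample) + sizeof(std::max_align_t) - 1U) / sizeof(std::max_align_t);
    std::max_align_t storage[kUnits] = {};
    assert(sizeof(storage) >= kSlots * sizeof(DemoSample));

    auto* bytes = reinterpret_cast<unsigned char*>(storage);
    for (std::size_t i = 0; i < kSlots; ++i) {
        const DemoSample sample{static_cast<std::uint32_t>(i),
                                static_cast<std::uint32_t>(i * 10U)};
        std::memcpy(bytes + i * sizeof(DemoSample), &sample, sizeof(DemoSample));
    }

    for (std::size_t i = 0; i < kSlots; ++i) {
        DemoSample received{};
        if (!ReadSlot<DemoSample>(storage, kSlots, i, received)) {
            return false;
        }
        if (received.counter != i || received.value != i * 10U) {
            return false;
        }
    }

    // An out-of-range slot index must be rejected.
    DemoSample unused{};
    return !ReadSlot<DemoSample>(storage, kSlots, kSlots, unused);
}
```

The point of the sketch is that the consumer needs only the base address, the slot count, and its own sizeof(T)/alignof(T); it never has to agree with the producer on the element type of the underlying array.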
Files changed
score/mw/com/impl/bindings/lola/proxy_event.h

Main functional fix.
Updated typed ProxyEvent<T> sample access logic to avoid relying on interpreting the shared-memory region as EventDataStorage<T>. Instead, access is now performed through raw event slot metadata and pointer arithmetic based on the actual sample type size/alignment.
This makes typed proxies compatible with GenericSkeleton-created storage.

score/mw/com/impl/bindings/lola/skeleton_memory_manager.h
score/mw/com/impl/bindings/lola/skeleton_memory_manager.cpp

Supporting changes around event storage creation and raw slot metadata handling.
These changes ensure both generic and typed paths expose compatible storage information to consumers.

score/mw/com/impl/bindings/lola/skeleton.cpp

Adjusted skeleton-side integration to align with the updated event storage access model and metadata usage.

score/mw/com/impl/bindings/lola/proxy_event_test.cpp
score/mw/com/impl/bindings/lola/test/proxy_event_test_resources.cpp
score/mw/com/impl/bindings/lola/test/proxy_event_test_resources.h

Updated and extended test resources/coverage to validate the new access path.
Validation performed
Builds and tests were executed successfully after the changes, and all relevant LoLa event/proxy/skeleton tests passed.