[all] support for OT Posix node with RCP using realtime UART#740
[all] support for OT Posix node with RCP using realtime UART#740EskoDijk wants to merge 1 commit into
Conversation
Summary of ChangesHello @EskoDijk, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the simulation platform by introducing robust support for real-time OpenThread Posix nodes configured as Radio Co-Processors (RCPs). This foundational change enables more advanced and realistic simulation scenarios, such as the future integration of actual OpenThread Border Routers. The core dispatcher logic has been refined to seamlessly manage both real-time and virtual-time interactions, and internal event handling mechanisms have been improved for clarity and efficiency, making the simulation more versatile and easier to debug. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces significant new functionality by adding support for real-time simulation with OpenThread Posix nodes using an RCP (Radio Co-Processor) and real-time UART communication. The changes involve extensive refactoring of the simulation's core dispatcher, node management, and UART handling to accommodate both virtual-time and real-time interactions. Key updates include a new waitForSimulation() method to streamline event processing, improved UART/log output queueing, and the integration of new build scripts for RCP and Posix executables. The addition of rcp as a node type and the corresponding CLI command restrictions in real-time mode are well-documented and implemented. Overall, the changes enhance the simulation's capabilities and flexibility for more complex testing scenarios.
I am having trouble creating individual review comments. Click here to see my feedback.
ot-rfsim/src/uart.c (41-320)
The extensive new #if OPENTHREAD_SIMULATION_VIRTUAL_TIME_UART == 0 block implements the real-time UART functionality using termios, poll, and atexit handlers. This is a significant and complex addition, correctly handling raw mode, baud rate configuration, and non-blocking I/O for real-time communication with an RCP. The use of dup and dup2 to redirect stdin/stdout/stderr is a common pattern for this type of setup.
simulation/node.go (1118-1161)
The new lineReaderStdOut function is a crucial component for real-time RCP support. It correctly parses stdout lines from the Posix host process, distinguishing between OTNS status pushes, OpenThread log lines, and regular UART writes, and then posts appropriate events to the dispatcher. This enables the simulation to react to real-time output from the RCP's host.
simulation/node.go (1053-1071)
The complete refactoring of processUartData to use bytes.Buffer and bytes.IndexByte for line parsing is a significant improvement. This approach is more efficient and robust for handling fragmented UART data, correctly identifying complete lines, and consuming the OtCliPrompt when present. This is a critical change for reliable UART communication.
simulation/node.go (108-132)
The conditional logic for constructing the exec.Cmd based on cfg.IsRcp is a core part of the RCP support. It correctly sets up the executable path and arguments, including the spinel+hdlc+forkpty URL with forkpty-arg parameters, which is essential for launching the Posix host process that forks the RCP.
simulation/simulation.go (132-133)
The added check if cfg.IsRcp && !s.networkInfo.Real prevents RCP nodes from being added in non-real-time simulation modes. This is a critical validation to ensure that RCP nodes are only used in their intended real-time environment, preventing misconfigurations and runtime errors.
dispatcher/dispatcher.go (607-608)
The change in the return condition for processNextEvent from len(d.nodes) > 0 to nextEventTime <= d.pauseTime || len(d.nodes) > 0 is a crucial functional update. It ensures that the simulation continues if there are pending events, even if all nodes are currently asleep. This is important for scenarios where events might be scheduled far in the future or when external events need to be processed.
types/node_config.go (59-88)
The refactoring of UpdateNodeConfigFromType to reset all flags first and return an error for unknown types is a significant improvement. This makes the function more robust and explicit about handling invalid node types. The addition of the RCP case correctly sets IsRouter and IsRcp for RCP nodes.
types/node_config.go (43-50)
Adding IsRcp and HostExePath to the NodeConfig struct is fundamental for supporting RCP nodes. IsRcp allows the system to identify RCP nodes, and HostExePath provides the path to the executable that drives the RCP, which is crucial for the new real-time functionality.
logger/parse.go (42-43)
The addition of otnsStatusPushLogPattern and ParseOtnsStatusPush is a key part of the new RCP functionality, allowing the simulation to parse and react to specific status messages from the OpenThread Posix host. This is well-implemented and necessary for the new feature.
ot-rfsim/CMakeLists.txt (71-74)
The introduction of the OT_SIMULATION_VIRTUAL_TIME_UART option in CMakeLists.txt is a fundamental change to control the UART behavior. Setting OPENTHREAD_SIMULATION_VIRTUAL_TIME_UART=1 conditionally allows for flexible compilation based on whether virtual or real-time UART is desired.
simulation/simulation.go (377-380)
The refactoring of OnUartWrite to directly call node.processUartData(data) and remove the node.uartReader <- data channel send simplifies the UART data flow. This change, along with the bytes.Buffer in Node, makes the UART processing more direct and efficient.
simulation/node_config.go (313-322)
The new FindHostExecutableBasedOnConfig function is essential for correctly determining the executable path for the RCP's host process. This separation of concerns ensures that the correct binary is located for the host component of an RCP node.
simulation/simulation.go (536-544)
The new waitForSimulation helper function is an excellent addition. It centralizes the logic for waiting for simulation progress and handling events, reducing code duplication across the codebase. The isBusyWaiting parameter provides flexibility for different calling contexts.
ot-rfsim/src/platform-rfsim.h (202-213)
The addition of platformUartUpdateFdSet and platformUartProcess declarations is crucial for integrating real-time UART handling into the platform's event loop. These functions enable the system to monitor and process UART I/O alongside other platform events.
ot-rfsim/src/system.c (146-148)
Conditionally calling platformUartUpdateFdSet based on OPENTHREAD_SIMULATION_VIRTUAL_TIME_UART is a key part of the real-time UART implementation. This ensures that the file descriptor sets are correctly updated for select() when real-time UART is in use.
ot-rfsim/src/system.c (175)
The call to platformUartProcess() is essential for handling real-time UART events within the main system loop. This ensures that incoming and outgoing UART data is processed promptly, which is critical for RCP communication.
dispatcher/dispatcher.go (916)
Adding d.RecvEvents() after syncAliveNodes() is a critical change. It ensures that any events generated by nodes during their time advancement (e.g., alarms, radio states) are immediately processed before the dispatcher proceeds. This prevents potential race conditions or delays in event handling, especially in a real-time context.
dispatcher/dispatcher.go (1378-1389)
The NotifyCommand function's updated signature and logic are essential for handling commands in the new real-time RCP mode. The isNodeProcessEnding parameter allows the dispatcher to differentiate between regular commands and node termination, ensuring appropriate time synchronization and event processing. Forcing advanceNodeTime when not ending ensures the node's virtual time catches up with the simulation, which is vital for correct real-time operation.
simulation/node.go (308-316)
Refactoring inputCommand to use a switch statement based on node.uartType clearly separates the logic for real-time and virtual-time UART communication. Passing false for isNodeProcessEnding to NotifyCommand is also correct for regular command input.
simulation/node.go (175-182)
The conditional assignment of node.uartType and the launch of node.lineReaderStdOut for RCP nodes are crucial for directing stdout processing to the new real-time UART handler. This correctly distinguishes between virtual-time and real-time UART data flows.
simulation/node.go (68-84)
The changes to the Node struct, including adding nameStr, isExiting, and replacing uartReader with uartLine (a bytes.Buffer), are significant. This refactoring centralizes UART data buffering and processing within the uartLine buffer, which is a more efficient and robust approach for handling incoming UART data, especially with the introduction of real-time UART.
pylibs/unittests/test_basic.py (956-957)
Increasing the go duration to 20 seconds and adding self.assertFormPartitions(1) in testUdpTcp ensures that the network has sufficient time to form a single partition before proceeding with UDP/TCP tests. This improves the reliability of the test by establishing a stable network state.
dispatcher/dispatcher.go (294-309)
The removal of the conditional if len(d.nodes) == 0 || duration.duration < 0 block in the Run() method simplifies the logic. However, it's important to ensure that the RecvEvents() and time.Sleep() calls previously handled in that block are now correctly managed elsewhere to prevent busy-waiting if no nodes are present or no simulation progress is expected. The RecvEvents() call was moved to syncAliveNodes(), which is called after goSimulateForDuration(). If goSimulateForDuration() doesn't advance time (e.g., no events), and there are no nodes, the loop might become tight. However, processNextEvent itself has a time.Sleep if simSpeed < MaxSimulateSpeed, which should prevent a busy loop. This change appears to be a valid refactoring to centralize event processing.
dispatcher/dispatcher.go (465-468)
Adding a logger.Errorf for RecvEvents timeout is a good improvement for debugging and identifying potential deadlocks or unresponsive nodes in the simulation. This will be very helpful in diagnosing issues.
simulation/node.go (224-225)
Adding node.cmd.ProcessState != nil to the signalExit condition makes the check more robust. It ensures that a signal is only sent if the process exists and has not already exited, preventing potential errors.
pylibs/unittests/test_basic.py (860-871)
The new assertions in testRealtimeMode for the go command's behavior and time advancement in real-time mode are important. They validate that the go command correctly fails and that the simulation time progresses with real clock time, which are key aspects of the new real-time feature.
pylibs/otns/cli/OTNS.py (265)
Changing the type hint for the time property from int to float is correct, as the simulation time now supports microsecond resolution, making it a floating-point value.
dispatcher/dispatcher.go (598-599)
Changing evt.NodeId > 0 to evt.NodeId != InvalidNodeId is a more robust and semantically correct way to check for a valid node ID, especially since InvalidNodeId is explicitly defined as 0.
ot-rfsim/src/platform-rfsim.c (56)
Replacing the mutable static otIp6Address unspecifiedIp6Address with a static const otIp6Address kUnspecifiedIp6Address is a good practice. It ensures that the address is constant and cannot be accidentally modified, improving code safety and clarity.
simulation/node.go (1193-1201)
The refactoring of readLine to use node.S.waitForSimulation(false) and a select statement simplifies the logic for reading lines from the node. This change improves readability and ensures that simulation progression is managed consistently.
simulation/node.go (1215-1228)
The simplification of expectLine by replacing the default case with node.S.waitForSimulation(true) centralizes simulation advancement. Additionally, improving the error message for timeout to include the expected line (expected '%v') will make debugging easier.
simulation/node_config.go (125-126)
Adding Rcp and RcpHost fields to ExecutableConfig is necessary to define the executables for the new RCP node type and its associated host process. This allows for proper configuration and lookup of these binaries.
simulation/node_config.go (145-146)
Adding default values for Rcp (ot-rcp) and RcpHost (ot-cli) in DefaultExecutableConfig provides sensible defaults for the new RCP functionality, simplifying initial setup.
simulation/node_config.go (162-163)
Adding IsRcp and HostExePath to DefaultNodeConfig ensures that the default configuration for new nodes includes the necessary fields for RCP support.
simulation/node_config.go (192-198)
Adding error handling for nodeCfg.UpdateNodeConfigFromType() and assigning nodeCfg.HostExePath in NodeConfigFinalize makes the node configuration process more robust. It ensures that if an unknown node type is provided, an error is logged, and the executable path is marked as invalid, preventing further issues.
ot-rfsim/src/openthread-core-rfsim-config.h (334)
The removal of the OPENTHREAD_SIMULATION_VIRTUAL_TIME definition is consistent with the new approach of using OPENTHREAD_SIMULATION_VIRTUAL_TIME_UART to manage UART simulation behavior. This indicates a shift towards more granular control over virtualized components.
simulation/node.go (98-104)
Adding logic to remove the flash file for RCP nodes (%d_%x.data) ensures that RCP nodes start with a clean state, similar to how other node types handle their flash files. This is important for consistent and reproducible simulation runs.
simulation/simulation.go (168)
Replacing s.d.RecvEvents() with s.waitForSimulation(false) centralizes the logic for waiting for simulation events. This improves consistency and maintainability by using the new helper function for event processing during node initialization.
simulation/simulation.go (228-231)
The logic to enable autogo (s.autoGoChange <- true) only after the first node is successfully added is a good optimization. It prevents autogo from starting prematurely when no nodes are present, ensuring that simulation only begins when there's something to simulate.
ot-rfsim/script/build_all (45)
Adding rcp and posix to the VER loop ensures that the build process includes the new RCP and Posix CLI executables, which are essential for the real-time simulation feature.
simulation/simulation.go (387-393)
Adding logic to remove the OtCliPrompt from log messages in OnLogWrite ensures that only the actual log content is processed and displayed. This improves the cleanliness and readability of log output, especially when dealing with CLI-based nodes.
simulation/simulation.go (492-494)
Changing s.d.NotifyCommand(nodeid) to s.d.NotifyCommand(nodeid, true) and s.d.RecvEvents() to s.waitForSimulation(false) in DeleteNode is important. The true flag for isNodeProcessEnding correctly signals node termination, and using waitForSimulation ensures consistent event processing during node cleanup.
simulation/node.go (1171-1186)
The simplification of expectEvent by replacing the default case with a call to node.S.waitForSimulation(true) centralizes the logic for advancing the simulation and processing events. This reduces code duplication and ensures consistent simulation progression when waiting for specific events.
simulation/types.go (43)
Modifying doneOrErrorRegexp to include ^\s* at the beginning makes the regular expression more robust by matching 'Done' or 'Error' strings even if they are preceded by whitespace at the start of a line. This improves parsing reliability for CLI output.
dispatcher/types.go (41)
Increasing DefaultReadTimeout from 5 seconds to 10 seconds might be necessary to accommodate slower response times from RCP nodes or during complex real-time operations. However, a longer timeout could also mask underlying performance issues or delays in node communication. It's important to monitor if this increased timeout is truly sufficient and doesn't lead to excessive waiting in other scenarios.
dispatcher/send_queue.go (93)
Changing evt.NodeId = 0 to evt.NodeId = InvalidNodeId improves clarity and consistency, as InvalidNodeId is explicitly defined for this purpose.
types/ot_types.go (50-57)
The introduction of DefaultOtRcpStackVendorIeeeOui, OtCliPrompt, and OtCliPromptLen constants/variables is essential for RCP support. DefaultOtRcpStackVendorIeeeOui is used in generating EUI-64 addresses for RCPs, and OtCliPrompt helps in parsing CLI output from the RCP host, improving the robustness of UART communication.
types/ot_types.go (132-137)
The GetDefaultRcpIeeeEui64 function provides a standardized way to construct the expected IEEE EUI-64 address for simulated RCP nodes. This is useful for pre-calculating or verifying addresses, although the note about OPENTHREAD_CONFIG_STACK_VENDOR_OUI is a good caveat.
types/types.go (63)
Adding RCP as a new node type constant is necessary to support the new real-time RCP functionality throughout the simulation framework.
web/site/js/vis/PixiVisualizer.js (260-265)
The simplified _applyReal() function now consistently sets add, del, and radio abilities to true regardless of real mode, while speed is conditionally enabled. This change aligns with the new real-time mode where node management operations are still allowed, but speed control is restricted.
60f12b4 to
12dd96b
Compare
78a1e02 to
12ef30b
Compare
12ef30b to
eca2090
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces significant new functionality by adding support for OT Posix nodes with RCPs in a realtime simulation context. The changes are extensive, touching upon the dispatcher, simulation core, CLI, and build systems. The introduction of a realtime simulation mode with time-drift correction is a key feature. I appreciate the refactoring efforts that simplify event processing with waitForSimulation() and streamline UART/log handling, which enhances code readability and maintainability. The addition of tests for the new realtime functionality is also a great inclusion. My review has identified a few areas for improvement, including replacing magic numbers with constants for better code clarity and addressing a potential unintended change in behavior for WIFI nodes that arose from refactoring. Overall, this is a well-executed and valuable contribution.
eca2090 to
e414c3f
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces comprehensive support for OT Posix nodes with RCP using realtime UART, enabling more realistic simulation scenarios. Key changes include refactoring the dispatcher and simulation core to handle both realtime and virtual-time UART interactions, adding new CLI commands for RCP and host executable management, and updating documentation and unit tests to reflect these new capabilities. The implementation of a PID-style controller for dynamic speed adjustment in realtime mode is a notable enhancement for maintaining time synchronization. Overall, the changes are well-integrated and significantly expand the simulation's functionality.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This is a large but well-executed pull request that adds significant new functionality for real-time simulation with RCP nodes. The changes are comprehensive, touching many parts of the system from build scripts to the simulation core and visualization. The introduction of a PID controller for time synchronization and auto-detection of CLI echo behavior are particularly well-implemented. The code is also refactored in several places, improving clarity and maintainability. I have one suggestion for improving code clarity in the new lineReaderStdOut function.
This adds support for running a standard OT Posix node in a realtime simulation, where the Posix process performs realtime SPINEL/HDLC UART communication with an RCP. Support for RCPs is the basis to for upcoming features like including real OTBRs in a simulation, either running on the local host or running in a Docker. The dispatcher/simulation core is updated to allow handling of both 'realtime UART' (stdout) based CLI and logging interactions and 'virtual-time UART' interactions by nodes in the same simulation. The real-time simulation mode adapts the simulation speed slightly, around speed=~1, to keep the time drift w.r.t. the system clock near 0. A method `waitForSimulation()` is introduced to avoid the need for custom simulation event processing loops at different places in the code. Also, the queueing and filtering of UART/log output of OT nodes is simplified to make the code better readable and the data flow easier to understand and debug. The CLI (over PTY/UART/...) echo behavior differs for Linux Posix nodes that are controlled via stdin/stdout: there is no 'echo' behavior. Depending on OS, there could be. To handle all possible combinations of platform/node-type, there is a function 'setupCli' added that autodetects whether echo is used for a node, or not, and whether the CLI connection is functional. To achieve this feature, some stale/unnecessary code related to the 'assurePrompt()' function was removed, also making the node CLI code more maintainable. The 'openthread' submodule is bumped to the latest version to get some required OT new features in. Co-authored-by: François Michel <francoismichel.dev@gmail.com>
e414c3f to
7e6f6ac
Compare
|
("Develop" CI tests failing due to issue #737 . Review was requested to @francoismichel but I can't add him to the "Reviewers" field.) |
|
Now marked as draft, due to finding an issue that should be resolved I think:
|
|
(also, the Matter support PR got merged and the 'external nodes' PR is about to be merged too, which requires some rebasing and changes in the present PR. So best do that on top of the pending changes.) |
|
Now superseded by PR #769 . This contains a new event handling loop for realtime mode, that can immediately receive incoming events even when the dispatcher has an ongoing "sleep period". |
This adds support for running a standard OT Posix node in a realtime simulation, where
the Posix process performs realtime SPINEL/HDLC UART communication with an RCP.
Support for RCPs is the basis to for upcoming features like including real OTBRs in a
simulation, either running on the local host or running in a Docker.
The dispatcher/simulation core is updated to allow handling of both 'realtime UART'
(stdout) based CLI and logging interactions and 'virtual-time UART' interactions by
nodes in the same simulation. The real-time simulation mode adapts the simulation
speed slightly, around speed=~1, to keep the time drift w.r.t. the system clock
near 0.
A method
waitForSimulation()is introduced to avoid the need for custom simulationevent processing loops at different places in the code. Also, the queueing and
filtering of UART/log output of OT nodes is simplified to make the code better
readable and the data flow easier to understand and debug.
The CLI (over PTY/UART/...) echo behavior differs for Linux Posix nodes that are
controlled via stdin/stdout: there is no 'echo' behavior. Depending on OS, there could
be. To handle all possible combinations of platform/node-type, there is a function
'setupCli' added that autodetects whether echo is used for a node, or not, and whether
the CLI connection is functional.
To achieve this feature, some stale/unnecessary code related to the 'assurePrompt()'
function was removed, also making the node CLI code more maintainable.
The 'openthread' submodule is bumped to the latest version to get some required OT
new features in.