Remove Arena's Gr00t Docker; use Isaac-Gr00t clients#607
Remove Arena's Gr00t Docker; use Isaac-Gr00t clients#607cvolkcvolk wants to merge 14 commits intomainfrom
Conversation
Delete the entire networking module: PolicyServer, PolicyClient, MessageSerializer, ServerSidePolicy, ActionProtocol, RemotePolicyConfig, and the server runner CLI. These duplicate Isaac-GR00T's gr00t/policy/server_client.py. Each policy framework owns its own server and client. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
These classes wrapped Arena's now-deleted PolicyClient. Framework wrappers will use each framework's native client instead. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
This class wrapped GR00T's Gr00tPolicy inside Arena's ServerSidePolicy abstraction. The GR00T repo already provides its own server via gr00t/eval/run_gr00t_server.py. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
test_policy_client.py and test_action_chunking_client.py tested the now-removed PolicyClient and ActionChunkingClientSidePolicy. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
The remote/local distinction is no longer an Arena-level concern. Each framework wrapper handles its own lifecycle internally. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
The is_remote/shutdown_remote pattern is removed. Framework wrappers handle their own client lifecycle internally. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Dockerfile.gr00t_server launched Arena's remote_policy_server_runner, which is now deleted. The GR00T repo provides its own server via gr00t/eval/run_gr00t_server.py with its own Docker setup. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Thin wrapper that connects to a GR00T policy server via GR00T's own PolicyClient (gr00t.policy.server_client). Reuses the same obs/action translation pipeline as the local Gr00tClosedloopPolicy — only the inference call changes from in-process to remote. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Remove load_gr00t_policy_from_config() and the top-level Gr00tPolicy import from gr00t_core.py so the module no longer pulls in transformers/torch model stack at import time. The local Gr00tClosedloopPolicy now loads the model via its own _load_policy() method. This keeps gr00t_core.py lightweight for the remote wrapper which only needs obs/action translation. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Replace the import of gr00t.experiment.launch_finetune (which pulls in tyro, FinetuneConfig, and the training experiment module) with the 10-line inlined equivalent. The remote wrapper no longer requires any GR00T training dependencies. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
|
I'm sorry I'd prefer if we do MRs in small chunks. Claude is good at generating +++ lines of codes, and for this refactoring, I think we shall be careful how it was written by Claude. |
xyao-nv
left a comment
There was a problem hiding this comment.
Let's break down into smaller MRs once we agree on the design.
Update ensure_groot_deps_in_path to automatically find and add the Isaac-GR00T submodule to sys.path if gr00t is not already importable. This allows the remote wrapper to import PolicyClient without requiring a full pip install of the GR00T package or manual PYTHONPATH setup. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
|
@xyao-nv yes this is just prototyping. Most test will likely fail. Not meant for review. |
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This PR cleanly removes Arena's duplicated networking stack (remote_policy/ module, Docker files, server/client wrappers) in favor of using GR00T's native PolicyClient. The new Gr00tRemoteClosedloopPolicy is a well-designed thin wrapper that reuses the existing obs/action translation pipeline. The introduction of the ActionScheduler ABC is a solid design improvement that enables future scheduling strategies (temporal ensembling, pass-through, etc.). One initialization-order bug needs fixing before merge.
Design Assessment
Design is sound. The "each framework owns its own server and client" philosophy is the right direction. Removing ~1,800 lines of duplicated networking code and replacing it with a thin 237-line wrapper around the upstream client significantly reduces maintenance burden. The ActionScheduler ABC is a clean strategy pattern that decouples scheduling logic from policy logic.
Findings
🔴 Critical: _load_policy() references self.device before it is assigned — isaaclab_arena_gr00t/policy/gr00t_closedloop_policy.py:84
In Gr00tClosedloopPolicy.__init__, _load_policy() is called at line 84, but self.device is not assigned until line 88. Inside _load_policy(), self.device is passed to Gr00tPolicy(... device=self.device ...), which will raise AttributeError: 'Gr00tClosedloopPolicy' object has no attribute 'device'.
The old code avoided this because load_gr00t_policy_from_config() read from policy_config.policy_device directly rather than self.device.
Fix: either move self.device = config.policy_device (and the multi-GPU rank logic) above the _load_policy() call, or have _load_policy() use self.policy_config.policy_device instead of self.device:
# Basic attributes — must be set before _load_policy() which uses self.device
self.num_envs = config.num_envs
self.device = config.policy_device
self.task_mode = TaskMode(self.policy_config.task_mode_name)
world_size = get_world_size()
if world_size > 1 and "cuda" in config.policy_device:
local_rank = get_local_rank()
self.device = f"cuda:{local_rank}"
self.policy: Gr00tPolicy = self._load_policy()
🟡 Warning: reset() always resets the full remote server even for partial env resets — isaaclab_arena_gr00t/policy/gr00t_remote_closedloop_policy.py:231
When policy_runner.py calls policy.reset(env_ids=...) for a subset of terminated/truncated environments, Gr00tRemoteClosedloopPolicy.reset() unconditionally calls self._client.reset() which resets the entire server-side model state. In a multi-env setup, this resets the server even when only one env needs resetting, potentially invalidating in-flight predictions for other envs.
This matches the local policy's behavior (self.policy.reset() also resets everything), so it's not a regression — but it's worth noting as a correctness concern for multi-env evaluation. Consider guarding the server reset:
def reset(self, env_ids: torch.Tensor | None = None):
if env_ids is None:
env_ids = slice(None)
self._client.reset() # Only full-reset the server on full reset
self._action_scheduler.reset(env_ids)🟡 Warning: sys.path pollution in inlined load_modality_config — isaaclab_arena_gr00t/utils/io_utils.py:221
The inlined modality config loader does sys.path.append(str(path.parent)) without cleanup. If load_gr00t_modality_config_from_file() is called multiple times (e.g., different configs), each call permanently appends to sys.path. Additionally, if the modality config module's stem collides with another module already on sys.path, importlib.import_module(path.stem) could import the wrong module.
Consider using a temporary sys.path modification or importlib.util.spec_from_file_location for more isolated loading:
spec = importlib.util.spec_from_file_location(path.stem, str(path))
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)🔵 Suggestion: Gr00tRemoteClosedloopPolicy has no close() or lifecycle management for the client — isaaclab_arena_gr00t/policy/gr00t_remote_closedloop_policy.py
The old ClientSidePolicy had a shutdown_remote() method to explicitly close the ZMQ socket. The new Gr00tRemoteClosedloopPolicy holds a Gr00tPolicyClient reference but provides no way to clean up the connection. While policy_runner.py no longer calls shutdown logic (the remote cleanup block was removed), it would be good practice to implement close() on the policy so callers can deterministically release the network connection.
Test Coverage
- Deleted tests:
test_policy_client.pyandtest_action_chunking_client.pywere correctly removed — they tested the now-deletedPolicyClientandActionChunkingClientSidePolicy. - Missing: No new tests for
Gr00tRemoteClosedloopPolicy. This is understandable since it requires a running GR00T policy server, but an integration test (similar to the deletedtest_action_chunking_client.pywhich used a dummy server fixture) would be valuable to verify the obs/action translation pipeline works end-to-end through the remote path. - Missing: No unit tests for the new
ActionScheduler/ActionChunkSchedulerrename. The backward-compat aliasActionChunkingState = ActionChunkSchedulershould be exercised by existing tests that importActionChunkingState.
CI Status
⏳ Pre-commit check is pending.
Verdict
Minor fixes needed
The self.device initialization order bug (🔴) must be fixed — it will cause an AttributeError at construction time for Gr00tClosedloopPolicy. The other findings are improvements worth considering but aren't blocking. Overall this is a well-structured refactoring that significantly reduces Arena's maintenance surface area.
| config.policy_config_yaml_path, Gr00tClosedloopPolicyConfig | ||
| ) | ||
| self.policy: Gr00tPolicy = load_gr00t_policy_from_config(self.policy_config) | ||
| self.policy: Gr00tPolicy = self._load_policy() |
There was a problem hiding this comment.
🔴 Critical: _load_policy() uses self.device which hasn't been assigned yet.
self.device is set on line 88, but _load_policy() is called here on line 84. Inside _load_policy(), Gr00tPolicy(... device=self.device ...) will raise AttributeError.
The old load_gr00t_policy_from_config() avoided this by reading policy_config.policy_device directly.
Fix: move the self.device assignment (lines 88-93, including the multi-GPU rank logic) above this line, or have _load_policy() use self.policy_config.policy_device.
| if path.exists() and path.suffix == ".py": | ||
| sys.path.append(str(path.parent)) | ||
| importlib.import_module(path.stem) | ||
| else: |
There was a problem hiding this comment.
🟡 sys.path.append(str(path.parent)) permanently pollutes sys.path and risks module name collisions if the stem matches an existing package. Consider importlib.util.spec_from_file_location() for isolated loading:
spec = importlib.util.spec_from_file_location(path.stem, str(path))
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
Summary [Draft; Do not merge]
This is a first shot at removing the Gr00t Docker and its dependecies. The Gr00t model is run through the official provided Isaac-Gr00t policy-server implementation.
remote_policy/module that re-implemented Isaac-GR00T'sserver_client.py(ZMQ+msgpack server, client, serializer)ClientSidePolicy,ActionChunkingClientSidePolicy, andGr00tRemoteServerSidePolicywrapper classesis_remote/shutdown_remotepattern fromPolicyBaseandpolicy_runner.pyDockerfile.gr00t_serverandrun_gr00t_server.sh(GR00T repo provides its own server)Gr00tRemoteClosedloopPolicy— a thin wrapper that uses GR00T's nativePolicyClientand reuses the same obs/action translation pipeline as the localGr00tClosedloopPolicyArena no longer maintains any networking code. Each policy framework (GR00T, OpenPI, etc.) owns its own server and client.
Motivation
Arena's
remote_policy/module duplicated Isaac-GR00T'sgr00t/policy/server_client.py— same ZMQ REP/REQ pattern, same msgpack serializer, same endpoint dispatch. Every protocol change in GR00T required a matching change in Arena. Meanwhile, other frameworks use entirely different protocols (OpenPI uses WebSocket, OpenVLA uses HTTP REST), making Arena's single-protocol approach a dead end for "bring any policy" evaluation.The observation and action translation (
gr00t_core.py) is the hard, valuable part and it already exists and is kept.What stays the same
PolicyBaseABC (minusis_remote)ActionChunkingState(multi-env GPU chunk buffer)gr00t_core.py(obs/action translation pipeline)Gr00tClosedloopPolicy(local, in-process)policy_runner.py/eval_runner.pyevaluation loopExample: Gr00t policy running through the official Isaac-Gr00t repository (instead of IsaacLab-Arena-Gr00t)
The policy model runs in GR00T's own Docker container, and Arena runs in its base Docker container.
You will need two terminals.
Terminal 1: Start the GR00T Policy Server
This starts the official Isaac-GR00T Docker container and launches the policy server inside it. The server loads the model and listens for inference requests over ZMQ.
Build the GR00T Docker image (first time only):
cd submodules/Isaac-GR00T bash docker/build.shThis builds a Docker image called
gr00t-devbased onnvcr.io/nvidia/pytorch:25.04-py3with all GR00T dependencies installed.Find the server IP — inside the GR00T container, run:
Note this IP (e.g.,
192.168.222.2). You will use it in Terminal 2.Launch the policy server:
Wait until you see:
Server is ready and listening on tcp://0.0.0.0:5555Terminal 2: Run Arena Evaluation
This starts the Arena base Docker container (Isaac Sim + Arena, no GR00T model dependencies) and runs the evaluation against the remote policy server.
Start the Arena Docker container:
Run the evaluation inside the Arena container:
Replace
<GROOT_SERVER_IP>with the IP from Terminal 1 (e.g.,192.168.222.2).The Arena container has no GR00T model dependencies. It only uses GR00T's lightweight
PolicyClient(ZMQ + msgpack) to send observations and receive actions. All model inference happens in the GR00T container.