-
Notifications
You must be signed in to change notification settings - Fork 602
feat(agents): agent-encode PoseStamped + position nav skills #2000
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: legacy-dev-dont-merge
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -12,6 +12,7 @@ | |
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import math | ||
| import time | ||
| from typing import Any | ||
|
|
||
|
|
@@ -37,6 +38,14 @@ | |
| logger = setup_logger() | ||
|
|
||
|
|
||
| def _make_position_goal(x: float, y: float, yaw_rad: float, frame_id: str) -> PoseStamped: | ||
| return PoseStamped( | ||
| position=make_vector3(x, y, 0.0), | ||
| orientation=Quaternion.from_euler(make_vector3(0.0, 0.0, yaw_rad)), | ||
| frame_id=frame_id, | ||
| ) | ||
|
|
||
|
|
||
| class NavigationSkillContainer(Module): | ||
| _latest_image: Image | None = None | ||
| _latest_odom: PoseStamped | None = None | ||
|
|
@@ -243,6 +252,78 @@ def _navigate_using_semantic_map(self, query: str) -> str: | |
| message = f"Found a location in the semantic map matching '{query}'." | ||
| return self._navigate_to(goal_pose, message) | ||
|
|
||
| @skill | ||
| def navigate_to_position( | ||
| self, | ||
| x: float, | ||
| y: float, | ||
| yaw_deg: float = 0.0, | ||
| frame_id: str = "map", | ||
| ) -> str: | ||
| """Navigate to an absolute position in the given frame. | ||
|
|
||
| Use this to act on a pose returned by another tool — pass its x, y, yaw_deg, and frame_id straight in. | ||
|
|
||
| Args: | ||
| x: target x in meters | ||
| y: target y in meters | ||
| yaw_deg: final heading in degrees, 0 = facing +x | ||
| frame_id: coordinate frame | ||
| """ | ||
| if not self._skill_started: | ||
| raise ValueError(f"{self} has not been started.") | ||
|
|
||
| goal = _make_position_goal(x, y, math.radians(yaw_deg), frame_id) | ||
| return self._navigate_to(goal, f"Heading to ({x:.2f}, {y:.2f}) in {frame_id}") | ||
|
|
||
| @skill | ||
| def rotate_toward_position( | ||
| self, | ||
| x: float, | ||
| y: float, | ||
| frame_id: str = "map", | ||
| ) -> str: | ||
| """Rotate in place to face an absolute position in the given frame. | ||
|
|
||
| Pass the target x and y from a pose's fields. Yaw is computed against the robot's current odometry. | ||
|
|
||
| Args: | ||
| x: target x in meters | ||
| y: target y in meters | ||
| frame_id: coordinate frame | ||
| """ | ||
| if not self._skill_started: | ||
| raise ValueError(f"{self} has not been started.") | ||
|
|
||
| if self._latest_odom is None: | ||
| return "No odometry data received yet, cannot rotate." | ||
|
|
||
| cur_x = self._latest_odom.position.x | ||
| cur_y = self._latest_odom.position.y | ||
| yaw_rad = math.atan2(y - cur_y, x - cur_x) | ||
|
|
||
| goal = _make_position_goal(cur_x, cur_y, yaw_rad, frame_id) | ||
| return self._navigate_to(goal, f"Rotating to face ({x:.2f}, {y:.2f}) in {frame_id}") | ||
|
Comment on lines
+298
to
+306
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In dimos, map↔world is published as an identity TF (dimos/navigation/rosnav.py:187-194), and existing skills follow the same implicit-frame convention — tag_location reads odom (frame "world") and stamps the result "map" (navigation.py:162-166), and _get_goal_pose_from_result hardcodes "map" (navigation.py:357). A frame_id string-equality check would actively break the default Go2 flow (odom publishes frame_id="world", default skill arg is "map") without catching any real geometric mismatch. Proper handling needs a TF-tree lookup applied uniformly across all nav skills — separate PR. |
||
|
|
||
| @skill | ||
| def current_pose(self) -> str: | ||
| """Return the robot's current pose. | ||
|
|
||
| Use this to reason about a target pose relative to the robot (e.g. distance or bearing). Compare only against poses with the same frame_id. | ||
| """ | ||
| if not self._skill_started: | ||
| raise ValueError(f"{self} has not been started.") | ||
|
|
||
| if self._latest_odom is None: | ||
| return "No odometry data received yet." | ||
|
|
||
| data = self._latest_odom.agent_encode() | ||
| return ( | ||
| f"Pose: ({data['x']}, {data['y']}, {data['z']})\n" | ||
| f"Yaw: {data['yaw_deg']}°\n" | ||
| f"Frame: {data['frame_id']}" | ||
| ) | ||
|
|
||
| @skill | ||
| def stop_navigation(self) -> str: | ||
| """Immediatly stop moving.""" | ||
|
|
@@ -273,8 +354,4 @@ def _get_goal_pose_from_result(self, result: dict[str, Any]) -> PoseStamped | No | |
| pos_y = first.get("pos_y", 0) | ||
| theta = first.get("rot_z", 0) | ||
|
|
||
| return PoseStamped( | ||
| position=make_vector3(pos_x, pos_y, 0), | ||
| orientation=Quaternion.from_euler(make_vector3(0, 0, theta)), | ||
| frame_id="map", | ||
| ) | ||
| return _make_position_goal(pos_x, pos_y, theta, "map") | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -16,7 +16,7 @@ | |
|
|
||
| import math | ||
| import time | ||
| from typing import TYPE_CHECKING, BinaryIO, TypeAlias | ||
| from typing import TYPE_CHECKING, Any, BinaryIO, TypeAlias | ||
|
|
||
| if TYPE_CHECKING: | ||
| from rerun._baseclasses import Archetype | ||
|
|
@@ -82,6 +82,23 @@ def __str__(self) -> str: | |
| f"euler=[{math.degrees(self.roll):.1f}, {math.degrees(self.pitch):.1f}, {math.degrees(self.yaw):.1f}])" | ||
| ) | ||
|
|
||
| def agent_encode(self) -> dict[str, Any]: | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. PoseStamped is not just for simple land robots. It's used for drones too, so roll and pitch should not be excluded. |
||
| """Render the pose for an LLM agent. | ||
|
|
||
| Returns a flat dict with ``frame_id``, ``x``, ``y``, ``z``, | ||
| ``yaw_deg``. Pass these straight into navigation skills (e.g. | ||
| ``navigate_to_position``) to act on the pose. To reason about | ||
| the pose relative to the robot, the agent should fetch its own | ||
| pose (e.g. via ``current_pose``) and combine the two. | ||
| """ | ||
| return { | ||
| "frame_id": self.frame_id, | ||
| "x": round(self.x, 3), | ||
| "y": round(self.y, 3), | ||
| "z": round(self.z, 3), | ||
| "yaw_deg": round(math.degrees(self.yaw), 1), | ||
| } | ||
|
|
||
| def to_rerun(self) -> Archetype: | ||
| """Convert to rerun Transform3D format. | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The coordinates are always absolute so I don't think it makes sense to give the option of selecting a
frame_id.