feat: Ardupilot support for Gazebo (+ hardware) drone simulation with video stream and in a warehouse environment #1576

Draft
snktshrma wants to merge 1 commit into dimensionalOS:dev from snktshrma:drone-gazebo
Conversation

@snktshrma

  • Gazebo + ArduPilot SITL: the Gazebo video stream (RTP on UDP port 5600) can be configured as the forward-facing camera source in the connection module. New blueprints: basic, basic-with-spatial, agentic; all registered.
  • Position & motion: MAVLink position target in local NED with velocity feedforward (type mask fixed).
  • New agent skills: position target, move-by-distance, and yaw (rotate to heading).
  • Tracking skill in Gazebo.
  • Sim: MAVLink uses local position NED when present.
  • Docs: README section for Gazebo + ArduPilot (SITL, plugin, gz sim, sim_vehicle.py, DimOS).
[Image: Screenshot from 2026-03-16 19-48-05]
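
For reviewers checking the type-mask fix, here is a minimal sketch of how a position-plus-velocity-feedforward mask for MAVLink's SET_POSITION_TARGET_LOCAL_NED can be derived. The bit values come from the MAVLink POSITION_TARGET_TYPEMASK enum, where a set bit means "ignore this field". The helper names (`position_velocity_mask`, `build_target`) and the dict layout are illustrative assumptions, not code from this PR.

```python
from enum import IntFlag


class PositionTargetTypemask(IntFlag):
    """MAVLink POSITION_TARGET_TYPEMASK bits; a set bit means 'ignore this field'."""
    X_IGNORE = 1
    Y_IGNORE = 2
    Z_IGNORE = 4
    VX_IGNORE = 8
    VY_IGNORE = 16
    VZ_IGNORE = 32
    AX_IGNORE = 64
    AY_IGNORE = 128
    AZ_IGNORE = 256
    FORCE_SET = 512
    YAW_IGNORE = 1024
    YAW_RATE_IGNORE = 2048


MAV_FRAME_LOCAL_NED = 1  # MAV_FRAME enum value for the local NED frame


def position_velocity_mask() -> int:
    """Use position and velocity; ignore acceleration, yaw, and yaw rate."""
    t = PositionTargetTypemask
    return int(t.AX_IGNORE | t.AY_IGNORE | t.AZ_IGNORE
               | t.YAW_IGNORE | t.YAW_RATE_IGNORE)


def build_target(north_m: float, east_m: float, down_m: float,
                 vn: float = 0.0, ve: float = 0.0, vd: float = 0.0) -> dict:
    """Hypothetical helper: collect the fields of a local-NED position target.

    In NED, 'down' is positive, so a point 5 m above the origin has down_m = -5.
    """
    return {
        "coordinate_frame": MAV_FRAME_LOCAL_NED,
        "type_mask": position_velocity_mask(),
        "x": north_m, "y": east_m, "z": down_m,
        "vx": vn, "vy": ve, "vz": vd,
    }


if __name__ == "__main__":
    target = build_target(10.0, 0.0, -5.0, vn=1.0)
    print(hex(target["type_mask"]))
```

The resulting mask is 0x0DC0 (3520). A mask that also sets the VX/VY/VZ ignore bits would silently drop the velocity feedforward, which is the kind of mismatch worth checking for here.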

@snktshrma changed the title from "feat: Ardupilot-Gazebo drone simulation with video stream and in a warehouse environment" to "feat: Ardupilot support for hardware and Gazebo drone simulation with video stream and in a warehouse environment" on Mar 16, 2026
… tracking, spatial model and warehouse environment
@snktshrma changed the title from "feat: Ardupilot support for hardware and Gazebo drone simulation with video stream and in a warehouse environment" to "feat: Ardupilot support for Gazebo (+ hardware) drone simulation with video stream and in a warehouse environment" on Mar 16, 2026
@uchibeke

This is a genuinely exciting project — natural language control for humanoids, quadrupeds, drones, and robotic arms is the kind of thing that makes the "agentic AI" category concrete in a way that purely software agents don't.

One thing that jumps out immediately from the architecture: physical hardware agents need pre-action authorization at a fundamentally higher assurance level than software agents. A hallucinated Jira write is recoverable. A hallucinated command to a robotic arm or drone is not.

The standard approach to this problem in software agents — prompt-layer instructions like "always ask before acting" — doesn't hold under adversarial conditions. Prompt injection can instruct an agent to skip confirmation steps. For physical hardware, that failure mode is unacceptable.

The pattern that works: before_tool_call hook enforcement

APort Agent Guardrails implements pre-action authorization at the platform hook level, not the prompt level. Every tool call is intercepted and evaluated against a YAML policy before it executes. The model cannot skip it — there's no prompt or agent response that bypasses the hook.

For physical agents in dimos, this maps to: capability scope enforcement before actuator commands reach hardware. You'd define a policy manifest for each robot/drone's authorized capabilities, and any tool call outside that scope is denied at the framework level before it propagates to the hardware adapter.
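
To make the idea concrete, here is a rough sketch of a framework-level gate that checks every tool call against a capability policy before the call is dispatched. Everything in it is hypothetical: `CapabilityPolicy`, `GuardedToolRunner`, the tool names, and the distance limit are illustrative and are not the APort or dimos API. The policy is also written directly in Python rather than loaded from YAML to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet


class ToolCallDenied(Exception):
    """Raised when a tool call falls outside the declared capability scope."""


@dataclass(frozen=True)
class CapabilityPolicy:
    """Hypothetical policy: which tools may run, plus one argument limit."""
    allowed_tools: FrozenSet[str]
    max_move_distance_m: float = 5.0

    def check(self, tool: str, args: Dict[str, Any]) -> None:
        if tool not in self.allowed_tools:
            raise ToolCallDenied(f"tool '{tool}' is not in the allowed scope")
        if tool == "move_by_distance":
            distance = abs(float(args.get("distance_m", 0.0)))
            if distance > self.max_move_distance_m:
                raise ToolCallDenied(
                    f"distance {distance} m exceeds {self.max_move_distance_m} m"
                )


class GuardedToolRunner:
    """Runs tools only after the policy check passes (the 'before tool call' hook)."""

    def __init__(self, policy: CapabilityPolicy,
                 tools: Dict[str, Callable[..., Any]]) -> None:
        self._policy = policy
        self._tools = tools

    def call(self, tool: str, **args: Any) -> Any:
        # The check runs in framework code, so model output cannot skip it.
        self._policy.check(tool, args)
        return self._tools[tool](**args)


if __name__ == "__main__":
    tools = {
        "move_by_distance": lambda distance_m: f"moving {distance_m} m",
        "arm_motors": lambda: "armed",
    }
    policy = CapabilityPolicy(allowed_tools=frozenset({"move_by_distance"}))
    runner = GuardedToolRunner(policy, tools)
    print(runner.call("move_by_distance", distance_m=2.0))
    try:
        runner.call("arm_motors")
    except ToolCallDenied as exc:
        print("denied:", exc)
```

The design point is that denial happens before the hardware adapter is ever invoked, so a prompt-injected instruction can at most request a call that the gate then rejects.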

The underlying spec — the Open Agent Protocol (OAP), DOI: 10.5281/zenodo.18901596 — also defines agent passports: signed capability manifests that declare what an agent is authorized to do. In a multi-agent dimos workflow (say, a planner agent delegating to a hardware execution agent), passports give you chain-of-custody verification that the executing agent is actually scoped for physical actuation.

"When your agent commands a robotic arm, you need verified capability scope, not a prompt."

A few specifics for dimos:

  • No sidecar or infrastructure required — YAML policy + hook, zero overhead
  • Works at the framework level — integrates before commands reach hardware adapters
  • Public CTF results: 1,149 adversarial social-engineering attempts against the guardrail, 0 bypasses
  • Apache 2.0, open standard

The physical-hardware angle makes the authorization problem more urgent, not just more interesting. Happy to discuss how OAP might fit into the dimos architecture — whether as a framework-level gate before hardware commands, or as a passport layer for multi-agent physical workflows.

Repo: https://github.com/aporthq/aport-agent-guardrails
Spec: https://github.com/aporthq/aport-spec
DOI: https://doi.org/10.5281/zenodo.18901596

2 participants