RecordReplay #1575

@leshy

intro

For now we record a robot by hardcoding an RPC call to record() @

dimos/dimos/robot/unitree/go2/connection.py:193

and replay by injecting a custom fake connection class in place of the actual connection @

dimos/dimos/robot/unitree/go2/connection.py:77

class ReplayConnection(UnitreeWebRTCConnection):
    dir_name = "unitree_go2_bigoffice"

This is pretty dumb. We should be able to record any module or group of modules, and easily replace their outputs with recorded ones in a blueprint. We want a recorder class.

This is a vague sketch

Recorder produces a Recording that's a memory2 store.

StoreConfig is @ /dimos/memory2/store/base.py:65
Store is @ /dimos/memory2/store/base.py:80

StoreConfig basically just configures the backend.


from abc import ABC

from dimos.memory2.store.base import Store  # see path above; Topic and Module are the dimos types


class RecordReplay(ABC):
    topics: list[Topic]

    def __init__(self, store: Store | None = None, filename: str | None = None):
        ...
        # should init a default store (sqlite) if store is not set, using filename? thinking of ergonomics here..

    # idk, maybe we want the topics attribute on a recorder to be some magical class with topics as attributes?
    # think about ergonomics
    # store for example has the `store.streams.some_stream` magic namespace

    # I'll provide the most naive possible methods here but you can design whatever:

    def add_topic(self, topic: Topic):
        ...

    def remove_topic(self, topic: Topic):
        ...

    # inspects the module's Out topics, records all of them
    def add_module(self, module: Module):
        ...

    def remove_module(self, module: Module):
        ...

    def start_recording(self):
        ...

    def stop_recording(self):
        ...

    # some timeline and playback tooling?

    def play(self, playback_speed: float | None = None):
        ...

    def seek(self, seconds: float | None = None):
        ...

    def get_seek(self) -> float:
        ...  # idk I'm vibing

    # some timeline editing tooling like:
    def delete_section(self, time_range):
        ...
    # idea is you'd record, then replay (viewing in rerun) and choose to trim some stuff, then save
    # idk if the actual memory2.store supports convenient point deletion? or range deletion? should we add it?
    # should we keep trimming as an extra/todo for later?

    # maybe remove_topic here also DELETES the topic from the store, if the recorder already made a recording?

    def save(self):
        ...
    # idk, do we call save() after trimming or is it auto-saved?

"A recording" this produces is whatever is required to re-init the RecordReplay. Since we use sqlite, I guess that's just an sqlite file.

idk if we want to keep my proposal of __init__ taking pluggable alternative stores. We can choose to keep it simple, so __init__ can also just take the sqlite filename.
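
To make the "a recording is just an sqlite file" idea concrete, here is a minimal sketch using only the stdlib sqlite3 module. It is not the memory2 Store API: the SqliteRecording class, its table layout and its method names are all hypothetical and only show how little is needed to re-init from a filename.

```python
import sqlite3


class SqliteRecording:
    """Hypothetical stand-in for a memory2-backed recording (stdlib sqlite3 only)."""

    def __init__(self, filename: str = ":memory:"):
        # Re-opening the same file is all it takes to "re-init" the recording.
        self.db = sqlite3.connect(filename)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS messages (ts REAL, topic TEXT, payload BLOB)"
        )

    def append(self, ts: float, topic: str, payload: bytes) -> None:
        self.db.execute("INSERT INTO messages VALUES (?, ?, ?)", (ts, topic, payload))

    def topics(self) -> list[str]:
        rows = self.db.execute("SELECT DISTINCT topic FROM messages ORDER BY topic")
        return [row[0] for row in rows]

    def messages(self, start: float = 0.0):
        """Yield (ts, topic, payload) in time order, starting at `start` seconds."""
        yield from self.db.execute(
            "SELECT ts, topic, payload FROM messages WHERE ts >= ? ORDER BY ts",
            (start,),
        )

    def save(self) -> None:
        self.db.commit()
```

With this shape, `SqliteRecording("mydb.db")` both creates a new recording and reopens an existing one, which is the simple `__init__(filename)` variant discussed above.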

memory2/test_e2e.py shows how to record sensor data into the Store

The above is basically a "video editor UI", but as an API.
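
The trimming question (delete_section) can be sketched on plain (ts, topic, payload) tuples, without assuming anything about what memory2 supports. The function below is hypothetical; the close_gap flag is an assumption about what a "video editor" style trim would want, namely that later messages shift earlier so the timeline stays contiguous.

```python
def delete_section(messages, start, end, close_gap=True):
    """Drop messages with start <= ts < end; optionally shift later ones earlier."""
    removed = end - start
    out = []
    for ts, topic, payload in messages:
        if start <= ts < end:
            continue  # inside the trimmed range
        if close_gap and ts >= end:
            ts -= removed  # keep the timeline contiguous after the cut
        out.append((ts, topic, payload))
    return out
```

If the store only supports appends, the same logic could run as a copy into a fresh store on save(), which would also answer the "auto-saved or explicit save()" question in favor of explicit.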

UI

This is the "video editor UI", but as a CLI.

So ideally you start something that looks like lcmspy, select the modules (TBD!) or topics you want, and press record. You can also press stop, play, rewind etc. The cool thing is that the dimos pubsub Topic type is something Recorder can emit into.

If you run rerun-bridge in a side terminal, it listens to LCM, so you can watch your recording.
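
A playback loop that emits into topics could look roughly like the sketch below. The publish callback stands in for whatever the dimos Topic publish call is (not shown here, and not assumed), and sleep is injectable so the timing can be tested without waiting.

```python
import time


def play(messages, publish, playback_speed=1.0, start=0.0, sleep=time.sleep):
    """Re-emit (ts, topic, payload) tuples with their original spacing, scaled by speed."""
    prev = start
    for ts, topic, payload in messages:
        if ts < start:
            continue  # seek: skip everything before the start position
        sleep((ts - prev) / playback_speed)
        publish(topic, payload)
        prev = ts
```

Sleeping on relative gaps is the simplest option but drifts over long recordings; scheduling against a wall-clock start time would be the more careful variant.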

Allow trimming the recording, deleting, and showing a timeline: basically a nice simple terminal VLC :) This can be vibed super easily, I don't code my cli tooling at all.

You can use different colors or braille (like dtop) for streams, depending on message counts, bytes, or whatever seems cool. Don't bother too much with details, I vibe these things; if the RecordReplay API is good this is easy to re-vibe.
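
As a rough idea of a per-stream timeline, here is a sketch that buckets message timestamps and renders them with block characters (braille would work the same way). The function name and the bucketing are illustrative only, not how dtop does it.

```python
BLOCKS = " ▁▂▃▄▅▆▇█"


def sparkline(timestamps, duration, width=20):
    """Render message density over [0, duration) as one row of block characters."""
    counts = [0] * width
    for ts in timestamps:
        if 0 <= ts < duration:
            bucket = min(int(ts / duration * width), width - 1)
            counts[bucket] += 1
    peak = max(counts) or 1  # avoid dividing by zero for an empty stream
    return "".join(BLOCKS[round(c / peak * (len(BLOCKS) - 1))] for c in counts)
```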

Usage

blueprint.replay(RecordReplay) or something like this? It should kill modules whose OUTs are in the recording itself, but run the rest of the blueprint (talk to Paul if needed; this is the biggest unknown for me but should be easy).
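
The "kill modules whose OUTs are in the recording" decision can be sketched as plain set logic. Everything here is hypothetical: module names map to sets of OUT topic names, and the sketch assumes a module is only disabled when the recording covers all of its outputs, reporting partial overlaps separately since it is unclear what should happen to those.

```python
def plan_replay(module_outputs, recorded):
    """Split modules into (disable, keep, partial) given the recorded topic names."""
    disable, keep, partial = [], [], []
    for name, outs in module_outputs.items():
        if outs and outs <= recorded:
            disable.append(name)  # recording fully replaces this module
        elif outs & recorded:
            partial.append(name)  # some outputs recorded, some not: needs a decision
        else:
            keep.append(name)
    return disable, keep, partial
```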

blueprint.replay(RecordReplay.seek(10).map(every_other_msg).map(rate_limiter)) <- in theory (for fun, idk if useful) we can offer all memory2 query/filter methods on this? Just pass them down to the individual streams.
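
What every_other_msg and rate_limiter might do on a single stream, sketched as plain generator stages. This does not use the memory2 query API; pipe is a hypothetical helper standing in for chained .map() calls, and messages are assumed to be (ts, topic, payload) tuples.

```python
def every_other_msg(stream):
    """Keep messages 0, 2, 4, ... of the stream."""
    for i, msg in enumerate(stream):
        if i % 2 == 0:
            yield msg


def rate_limiter(min_interval):
    """Build a stage that drops messages arriving sooner than min_interval seconds."""
    def apply(stream):
        last = None
        for ts, topic, payload in stream:
            if last is None or ts - last >= min_interval:
                last = ts
                yield ts, topic, payload
    return apply


def pipe(stream, *stages):
    """Apply stages left to right, like chained .map() calls."""
    for stage in stages:
        stream = stage(stream)
    return stream
```

Because each stage is lazy, passing the chain down to individual streams would not need the whole recording in memory.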

We also want to hook --replay into this, right? So a blueprint could potentially have a config setting for initing its "appropriate" recording, so that if --replay is used, this recording is replayed.

But --replay should also support loading files, right? (the cli expects a standard sqlite store)
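
One way to get both behaviors from a single flag, sketched with stdlib argparse: a bare --replay falls back to the blueprint's default recording, and --replay FILE loads that file. This is not how the dimos CLI is actually wired; the sentinel and the resolve_recording helper are hypothetical.

```python
import argparse

USE_BLUEPRINT_DEFAULT = "<blueprint-default>"  # sentinel for a bare --replay


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dimos")
    parser.add_argument(
        "--replay",
        nargs="?",  # the FILE argument is optional
        const=USE_BLUEPRINT_DEFAULT,
        default=None,
        metavar="FILE",
        help="replay the blueprint's default recording, or FILE if given",
    )
    return parser


def resolve_recording(replay_arg, blueprint_default):
    """Return the recording to replay, or None when --replay was not passed."""
    if replay_arg is None:
        return None
    if replay_arg == USE_BLUEPRINT_DEFAULT:
        return blueprint_default
    return replay_arg
```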

Misc

We store recordings in the LFS system, dimos/data/.lfs. dimos/utils/data.py helps with this. There are docs also, find them in docs/

Later / Extras

Fancy blueprint level default recording control

You could allow a blueprint to specify default modules to record (via blueprint config? bonus points: think about higher order blueprints appending, not rewriting), so we can get dimos --record (module OUT topics are easily inspectable).

so like


unitree_go2_basic = (
    autoconnect(
        with_vis,
        go2_connection(),
        websocket_vis(),
    )
    .global_config(n_workers=4, robot_model="unitree_go2")
    .configurators(ClockSyncConfigurator())
    .default_record_modules(go2_connection) # or part of .global_config? idk
)

# so when I feed it into a recorder (deployed blueprint? undeployed? idk)

recorder = Recorder("mydb.db") # relative path, so it's placed in dimos/data via `get_data`
recorder.record_blueprint(unitree_go2_basic) # it knows the topics

# later (this will disable go2_connection, since it discovers that the recorder offers those topics)
unitree_go2_basic.replay_recording(Recorder("mydb.db"))
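
On the "appending, not rewriting" point for higher order blueprints, a minimal sketch with a frozen dataclass. BlueprintSketch is a hypothetical stand-in, not the dimos blueprint class; it only shows that each default_record_modules call can return a new object whose list extends the previous one.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BlueprintSketch:
    """Hypothetical stand-in for a blueprint; modules are identified by name."""

    modules: tuple[str, ...] = ()
    record_modules: tuple[str, ...] = ()

    def default_record_modules(self, *names: str) -> "BlueprintSketch":
        # Append to what lower-level blueprints already declared, skipping duplicates.
        merged = self.record_modules + tuple(
            n for n in names if n not in self.record_modules
        )
        return replace(self, record_modules=merged)
```

Returning a new object means a base blueprint stays unchanged when a higher order one adds modules to record.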


actually always record

We will have a rolling db for all runs; robots record all of their sensors by default. If you like your run, you can go into the cli, run the "video editing / player" cli tool and rerun-bridge, trim, and commit to LFS.

(memory2 and fancy memory functionality will use this db as well, in realtime for historical queries etc)
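
A rolling db needs some retention rule. Below is a sketch of age-based pruning on a plain sqlite table; the schema and function names are hypothetical and say nothing about how memory2 lays out its data.

```python
import sqlite3


def open_rolling_db(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS messages (ts REAL, topic TEXT, payload BLOB)")
    # Index on ts so pruning and time-range queries don't scan the whole table.
    db.execute("CREATE INDEX IF NOT EXISTS messages_ts ON messages (ts)")
    return db


def prune_older_than(db: sqlite3.Connection, now: float, max_age_s: float) -> int:
    """Delete messages older than the retention window; return how many were dropped."""
    cur = db.execute("DELETE FROM messages WHERE ts < ?", (now - max_age_s,))
    db.commit()
    return cur.rowcount
```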

external module level record

We don't have a system to know, from OUTSIDE of dimos, which modules emit which topics. Nice to add: then in your UI you can see and select modules and not just topics (can be a tree of depth 1, right? module->topics).

If you use dimos --dtop and type dtop in another terminal, that's a system that reports modules and worker states (via LCM broadcast). We could have a similar (or the same, idk) system that reports the blueprint being run, for example (I think it's serializable), so you can inspect it externally.
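
A sketch of what such a report could carry, assuming a module -> OUT topics mapping serialized as JSON. The transport (LCM broadcast or otherwise) is left out, and both function names are hypothetical.

```python
import json


def encode_report(module_topics: dict[str, list[str]]) -> bytes:
    """Serialize a module -> OUT topics mapping for broadcast (transport not shown)."""
    return json.dumps(module_topics, sort_keys=True).encode("utf-8")


def render_tree(report: bytes) -> str:
    """Render the decoded report as a depth-1 tree: module, then its topics."""
    lines = []
    for module, topics in json.loads(report).items():
        lines.append(module)
        lines.extend(f"  - {topic}" for topic in topics)
    return "\n".join(lines)
```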
