[Discussion] Support for loggers other than wandb #536

@robmarkcole

I've briefly investigated supporting MLflow as a self-hosted alternative to wandb (which requires a license for commercial usage). The current approach couples “logging” to the wandb API and to wandb-specific assumptions, which makes it hard to add other loggers cleanly. The main coupling points I found:

  • Non-scalar metric logging (e.g., confusion matrices) is implemented via wandb.plot. Other loggers (MLflow, TensorBoard, CSV) don’t necessarily have the same plotting API, so you need either per-backend implementations or a generic artifact representation.
  • The “is wandb initialized?” checks encode a wandb-specific lifecycle. Other backends have different notions of an active run, run IDs, and resume semantics.
  • Project management/resume currently stores a wandb_id and tries to resume by setting logger.init_args.id. That only makes sense for WandbLogger-like backends; MLflow uses run_id, others have no equivalent.
  • Automatically injecting/configuring a specific logger inside the CLI (rather than letting Lightning instantiate the logger from config) creates a second “source of truth” for logging behavior, which becomes a combinatorial mess once you support multiple backends.
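
To make the points above concrete, here is a minimal sketch of what a backend-neutral logging interface could look like. Everything in it is hypothetical: `MetricBackend`, `ConfusionMatrix`, `CsvBackend`, and `resume_state` are names invented for illustration and are not part of rslearn, Lightning, wandb, or MLflow. The idea is that non-scalar metrics are passed around as plain data, and each backend decides how to render or store them and whether it has a resumable run ID at all.

```python
from __future__ import annotations

import csv
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol, Sequence


@dataclass
class ConfusionMatrix:
    """Backend-neutral representation of a non-scalar metric (hypothetical)."""

    labels: Sequence[str]
    counts: Sequence[Sequence[int]]  # counts[i][j]: true class i, predicted class j

    def to_json(self) -> str:
        return json.dumps(
            {"labels": list(self.labels), "counts": [list(row) for row in self.counts]}
        )


class MetricBackend(Protocol):
    """Hypothetical interface a wandb, MLflow, or CSV adapter would implement."""

    @property
    def run_id(self) -> str | None: ...

    def log_scalar(self, name: str, value: float, step: int) -> None: ...

    def log_confusion_matrix(self, name: str, cm: ConfusionMatrix, step: int) -> None: ...


class CsvBackend:
    """File-based backend: scalars go to one CSV, matrices to JSON files."""

    def __init__(self, out_dir: Path) -> None:
        self.out_dir = out_dir
        self.out_dir.mkdir(parents=True, exist_ok=True)
        self._scalars = self.out_dir / "scalars.csv"

    @property
    def run_id(self) -> str | None:
        # Plain files have no notion of a resumable run.
        return None

    def log_scalar(self, name: str, value: float, step: int) -> None:
        is_new = not self._scalars.exists()
        with self._scalars.open("a", newline="") as f:
            writer = csv.writer(f)
            if is_new:
                writer.writerow(["step", "name", "value"])
            writer.writerow([step, name, value])

    def log_confusion_matrix(self, name: str, cm: ConfusionMatrix, step: int) -> None:
        # No plotting API here, so store the raw data as an artifact.
        (self.out_dir / f"{name}_step{step}.json").write_text(cm.to_json())


def resume_state(backend: MetricBackend) -> dict:
    """What project management could persist instead of a hard-coded wandb_id."""
    return {"backend": type(backend).__name__, "run_id": backend.run_id}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        backend = CsvBackend(Path(tmp))
        backend.log_scalar("val_loss", 0.5, step=1)
        backend.log_confusion_matrix(
            "val_cm", ConfusionMatrix(["water", "land"], [[3, 1], [0, 4]]), step=1
        )
        print(resume_state(backend))
```

A wandb or MLflow adapter would implement the same three members, returning its own run identifier from `run_id` and rendering the confusion matrix however that backend supports. The resume logic would then only need to check whether `run_id` is `None`, rather than assuming a wandb-style `id` argument.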

If the stance is to support only wandb in rslearn, I appreciate how that keeps things simple, but it would be good to clarify this.
