Refactor/media#1500

Merged
SAKURA-CAT merged 8 commits into main from feat/media
Mar 14, 2026

@SAKURA-CAT
Member

Refactors the media logic and simplifies the type definitions. An adapter module now maps protobuf enums to user-facing semantics.

Introduce EChartsItem and make EChartsValue hold repeated items; regenerate Go/Python protobuf bindings. Move transform implementations from swanlab/sdk/internal/run/data/transforms to swanlab/sdk/internal/run/transforms and add normalize_media_input helper. Revise TransformType/TransformMediaType APIs (new abstract methods, build_metric_record, column_type, type, changed transform signatures) and simplify validation. Update RecordBuilder to dispatch on TransformMediaType and lists, return transformer classes for implicit column creation, and map MetricRecord -> media type more strictly. Update BackgroundConsumer, run APIs (log_text/log_scalar), Text/Scalar transforms, typings, imports and unit tests accordingly. Misc: remove obsolete data package init files and fix related imports.
Restructure and add metric protobuf schema and generated bindings: move protos from protos/swanlab/data/v1 to protos/swanlab/metric/*, add new column and data proto definitions (including media and scalar subtypes), and remove the old metric.proto. Generate Go protos under core/proto/swanlab/metric/... and Python protos under swanlab/proto/swanlab/metric/..., update record proto/pb files, and apply related minor updates to SDK transform and unit tests. This change provides the new metric/column/data schema (audio, image, video, text, echarts, scalar) and their language bindings.
Modify proto definitions (protos/swanlab/metric/column/v1/column.proto, protos/swanlab/metric/data/v1/data.proto, protos/swanlab/record/v1/record.proto) and regenerate protobuf outputs. Remove outdated generated files under core/proto/swanlab/data/v1 and swanlab/proto/swanlab/data/v1, and update regenerated Go and Python protobuf artifacts (pb.go, pb2.py, and .pyi) for metric/column, metric/data and record. Also apply a small update to swanlab/sdk/internal/run/transforms/text/__init__.py. Cleans up old artifacts and aligns generated code with the updated .proto definitions.
@SAKURA-CAT SAKURA-CAT self-assigned this Mar 14, 2026
@SAKURA-CAT SAKURA-CAT added the 💪 enhancement New feature or request label Mar 14, 2026
Update the module docstring to indicate that the ColumnType enum adapter also serves as a path mapping. Documentation-only change; no functional code modified.
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant refactoring of SwanLab's data and media handling infrastructure. The primary goal is to streamline the internal representation of metrics and columns using a more organized Protocol Buffer structure and to simplify the interaction between user input and these internal types via a new adapter layer. These changes improve the system's maintainability, consistency, and extensibility for future data types.

Highlights

  • Protocol Buffer Refactoring: The core Protocol Buffer definitions for data and metrics have been extensively refactored, moving into a new 'metric' namespace for better organization and clarity. Old metric and echarts definitions were removed, and new data and media definitions were introduced.
  • Simplified Type Definitions: Type definitions within the Protocol Buffers have been simplified, particularly for ColumnType, which now directly includes specific media types like video and echarts, removing the generic 'ANY' type and the associated 'ColumnError' message.
  • Adapter Module Introduction: A new adapter module has been added to the Python SDK. This module facilitates the mapping between user-friendly string inputs (e.g., 'must', 'allow' for resume modes) and the internal Protobuf enum values, enhancing semantic clarity and ease of use.
  • Python SDK Integration: The Python SDK has been updated to align with the new Protocol Buffer structure. This includes changes to event handling, metric definition, data transformation logic, and the removal/renaming of several data-related modules and files.
  • Enhanced Data Transformation: The TransformType abstract class was refactored to include new methods for building DataRecord envelopes and specifying ColumnType. The transform method is no longer static, allowing for more flexible data processing, and a new normalize_media_input utility was added for consistent media handling.
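
The adapter idea in the highlights above can be sketched as a small bidirectional map. This is a minimal illustration and not the PR's actual `BiMap`: the `ResumeMode` enum is a stand-in for a generated protobuf enum, the `"never"` entry and the `inverse` method name are assumptions, and only `'must'` and `'allow'` come from the summary.

```python
from enum import Enum
from typing import Dict, Generic, Hashable, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V", bound=Hashable)


class ResumeMode(Enum):
    """Hypothetical stand-in for a generated protobuf enum."""

    MUST = 1
    ALLOW = 2
    NEVER = 3  # assumed member, not named in the PR summary


class BiMap(Generic[K, V]):
    """One-to-one mapping that can be read in both directions."""

    def __init__(self, forward: Dict[K, V]) -> None:
        inverse = {value: key for key, value in forward.items()}
        # A duplicate value would make the reverse lookup ambiguous.
        if len(inverse) != len(forward):
            raise ValueError("BiMap values must be unique")
        self._forward = dict(forward)
        self._inverse = inverse

    def __getitem__(self, key: K) -> V:
        # Forward lookup: user string -> enum member.
        return self._forward[key]

    def inverse(self, value: V) -> K:
        # Reverse lookup: enum member -> user string.
        return self._inverse[value]


resume = BiMap(
    {"must": ResumeMode.MUST, "allow": ResumeMode.ALLOW, "never": ResumeMode.NEVER}
)
```

Here `resume["must"]` yields the enum member and `resume.inverse(ResumeMode.ALLOW)` yields the user-facing string, so both directions share one table.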

Changelog
  • core/proto/swanlab/data/v1/echarts.pb.go
    • Removed the ECharts Protocol Buffer definition.
  • core/proto/swanlab/data/v1/metric.pb.go
    • Removed the MetricRecord Protocol Buffer definition.
  • core/proto/swanlab/data/v1/scalar.pb.go
    • Removed the Scalar Protocol Buffer definition.
  • core/proto/swanlab/metric/column/v1/column.pb.go
    • Renamed from core/proto/swanlab/data/v1/column.pb.go.
    • Updated the ColumnType enum to include VIDEO and ECHARTS and removed ANY.
    • Removed the ColumnError message.
    • Adjusted field numbers and package name.
  • core/proto/swanlab/metric/data/v1/data.pb.go
    • Added a new DataRecord Protocol Buffer definition to encapsulate metric data.
  • core/proto/swanlab/metric/data/v1/media/audio.pb.go
    • Renamed from core/proto/swanlab/data/v1/audio.pb.go.
    • Updated the AudioValue message to remove the path field and adjusted field numbers.
  • core/proto/swanlab/metric/data/v1/media/echarts.pb.go
    • Added a new EChartsValue and EChartsItem Protocol Buffer definition.
  • core/proto/swanlab/metric/data/v1/media/image.pb.go
    • Renamed from core/proto/swanlab/data/v1/image.pb.go.
    • Updated the ImageValue message to remove the path field and adjusted field numbers.
  • core/proto/swanlab/metric/data/v1/media/text.pb.go
    • Renamed from core/proto/swanlab/data/v1/text.pb.go.
    • Updated the TextValue message to remove the path field and adjusted field numbers.
  • core/proto/swanlab/metric/data/v1/media/video.pb.go
    • Renamed from core/proto/swanlab/data/v1/video.pb.go.
    • Updated the VideoValue message to remove the path field and adjusted field numbers.
  • core/proto/swanlab/metric/data/v1/scalar/scalar.pb.go
    • Renamed from core/proto/swanlab/data/v1/scalar.pb.go.
    • Adjusted package name.
  • core/proto/swanlab/record/v1/record.pb.go
    • Updated the Record message to use the new ColumnRecord and DataRecord types.
    • Adjusted field numbers for various record types.
  • protos/swanlab/data/v1/echarts.proto
    • Removed the ECharts Protocol Buffer schema.
  • protos/swanlab/data/v1/metric.proto
    • Removed the MetricRecord Protocol Buffer schema.
  • protos/swanlab/data/v1/scalar.proto
    • Removed the Scalar Protocol Buffer schema.
  • protos/swanlab/metric/column/v1/column.proto
    • Renamed from protos/swanlab/data/v1/column.proto.
    • Updated the ColumnType enum and removed ColumnError message.
    • Adjusted package name and field numbers.
  • protos/swanlab/metric/data/v1/data.proto
    • Added a new DataRecord Protocol Buffer schema.
  • protos/swanlab/metric/data/v1/media/audio.proto
    • Renamed from protos/swanlab/data/v1/audio.proto.
    • Updated AudioValue to remove the path field.
  • protos/swanlab/metric/data/v1/media/echarts.proto
    • Added a new EChartsValue and EChartsItem Protocol Buffer schema.
  • protos/swanlab/metric/data/v1/media/image.proto
    • Renamed from protos/swanlab/data/v1/image.proto.
    • Updated ImageValue to remove the path field.
  • protos/swanlab/metric/data/v1/media/text.proto
    • Renamed from protos/swanlab/data/v1/text.proto.
    • Updated TextValue to remove the path field.
  • protos/swanlab/metric/data/v1/media/video.proto
    • Renamed from protos/swanlab/data/v1/video.proto.
    • Updated VideoValue to remove the path field.
  • protos/swanlab/metric/data/v1/scalar/scalar.proto
    • Renamed from protos/swanlab/data/v1/scalar.proto.
    • Adjusted package name.
  • protos/swanlab/record/v1/record.proto
    • Updated import paths for metric and column protos.
    • Adjusted field numbers within the Record message.
  • swanlab/proto/swanlab/data/__init__.py
    • Renamed to swanlab/proto/swanlab/metric/__init__.py.
  • swanlab/proto/swanlab/data/v1/__init__.py
    • Renamed to swanlab/proto/swanlab/metric/column/__init__.py.
  • swanlab/proto/swanlab/data/v1/audio_pb2.py
    • Removed the generated Python protobuf file for audio.
  • swanlab/proto/swanlab/data/v1/column_pb2.py
    • Removed the generated Python protobuf file for column.
  • swanlab/proto/swanlab/data/v1/echarts_pb2.py
    • Removed the generated Python protobuf file for echarts.
  • swanlab/proto/swanlab/data/v1/echarts_pb2.pyi
    • Removed the generated Python protobuf type hints for echarts.
  • swanlab/proto/swanlab/data/v1/image_pb2.py
    • Removed the generated Python protobuf file for image.
  • swanlab/proto/swanlab/data/v1/metric_pb2.py
    • Removed the generated Python protobuf file for metric.
  • swanlab/proto/swanlab/data/v1/metric_pb2.pyi
    • Removed the generated Python protobuf type hints for metric.
  • swanlab/proto/swanlab/data/v1/text_pb2.py
    • Removed the generated Python protobuf file for text.
  • swanlab/proto/swanlab/data/v1/video_pb2.py
    • Removed the generated Python protobuf file for video.
  • swanlab/proto/swanlab/metric/column/v1/column_pb2.py
    • Added the new generated Python protobuf file for column.
  • swanlab/proto/swanlab/metric/column/v1/column_pb2.pyi
    • Renamed from swanlab/proto/swanlab/data/v1/column_pb2.pyi and updated type hints.
  • swanlab/proto/swanlab/metric/column/v1/column_pb2_grpc.py
    • Renamed from swanlab/proto/swanlab/data/v1/column_pb2_grpc.py and updated gRPC generated code.
  • swanlab/proto/swanlab/metric/data/v1/data_pb2.py
    • Added the new generated Python protobuf file for data records.
  • swanlab/proto/swanlab/metric/data/v1/data_pb2.pyi
    • Added the new generated Python protobuf type hints for data records.
  • swanlab/proto/swanlab/metric/data/v1/data_pb2_grpc.py
    • Renamed from swanlab/proto/swanlab/data/v1/metric_pb2_grpc.py and updated gRPC generated code.
  • swanlab/proto/swanlab/metric/data/v1/media/audio_pb2.py
    • Added the new generated Python protobuf file for audio media.
  • swanlab/proto/swanlab/metric/data/v1/media/audio_pb2.pyi
    • Renamed from swanlab/proto/swanlab/data/v1/audio_pb2.pyi and updated type hints.
  • swanlab/proto/swanlab/metric/data/v1/media/audio_pb2_grpc.py
    • Renamed from swanlab/proto/swanlab/data/v1/audio_pb2_grpc.py and updated gRPC generated code.
  • swanlab/proto/swanlab/metric/data/v1/media/echarts_pb2.py
    • Added the new generated Python protobuf file for ECharts media.
  • swanlab/proto/swanlab/metric/data/v1/media/echarts_pb2.pyi
    • Added the new generated Python protobuf type hints for ECharts media.
  • swanlab/proto/swanlab/metric/data/v1/media/echarts_pb2_grpc.py
    • Renamed from swanlab/proto/swanlab/data/v1/echarts_pb2_grpc.py and updated gRPC generated code.
  • swanlab/proto/swanlab/metric/data/v1/media/image_pb2.py
    • Added the new generated Python protobuf file for image media.
  • swanlab/proto/swanlab/metric/data/v1/media/image_pb2.pyi
    • Renamed from swanlab/proto/swanlab/data/v1/image_pb2.pyi and updated type hints.
  • swanlab/proto/swanlab/metric/data/v1/media/image_pb2_grpc.py
    • Renamed from swanlab/proto/swanlab/data/v1/image_pb2_grpc.py and updated gRPC generated code.
  • swanlab/proto/swanlab/metric/data/v1/media/text_pb2.py
    • Added the new generated Python protobuf file for text media.
  • swanlab/proto/swanlab/metric/data/v1/media/text_pb2.pyi
    • Renamed from swanlab/proto/swanlab/data/v1/text_pb2.pyi and updated type hints.
  • swanlab/proto/swanlab/metric/data/v1/media/text_pb2_grpc.py
    • Renamed from swanlab/proto/swanlab/data/v1/text_pb2_grpc.py and updated gRPC generated code.
  • swanlab/proto/swanlab/metric/data/v1/media/video_pb2.py
    • Added the new generated Python protobuf file for video media.
  • swanlab/proto/swanlab/metric/data/v1/media/video_pb2.pyi
    • Renamed from swanlab/proto/swanlab/data/v1/video_pb2.pyi and updated type hints.
  • swanlab/proto/swanlab/metric/data/v1/media/video_pb2_grpc.py
    • Renamed from swanlab/proto/swanlab/data/v1/video_pb2_grpc.py and updated gRPC generated code.
  • swanlab/proto/swanlab/metric/data/v1/scalar/scalar_pb2.py
    • Added the new generated Python protobuf file for scalar data.
  • swanlab/proto/swanlab/metric/data/v1/scalar/scalar_pb2.pyi
    • Renamed from swanlab/proto/swanlab/data/v1/scalar_pb2.pyi and updated type hints.
  • swanlab/proto/swanlab/metric/data/v1/scalar/scalar_pb2_grpc.py
    • Renamed from swanlab/proto/swanlab/data/v1/scalar_pb2_grpc.py and updated gRPC generated code.
  • swanlab/sdk/cmd/log.py
    • Updated import path for the Text transform class.
  • swanlab/sdk/internal/adapter/__init__.py
    • Added a new module for mapping user semantics to Protobuf enums.
  • swanlab/sdk/internal/adapter/bimap.py
    • Added a BiMap class for bidirectional mapping of values.
  • swanlab/sdk/internal/bus/__init__.py
    • Updated imports to reflect changes in event types.
  • swanlab/sdk/internal/bus/events.py
    • Replaced MetricDefineEvent with ScalarDefineEvent.
    • Updated ParseResult to return TransformType class instead of a string.
    • Removed DataTransferType from imports.
  • swanlab/sdk/internal/context/metrics.py
    • Updated ScalarMetric and MediaMetric to store ColumnType directly.
    • Adjusted define_media signature to accept ColumnType.
    • Removed MediaTransferType import.
  • swanlab/sdk/internal/context/transformer.py
    • Refactored TransformType and TransformMediaType abstract classes.
    • Added build_data_record and column_type abstract methods.
    • Removed static transform method and related signature validation logic.
  • swanlab/sdk/internal/run/__init__.py
    • Added a new log_scalar method for logging scalar values.
    • Updated log_text to handle lists of Text objects and use normalize_media_input.
    • Changed define_scalar to emit ScalarDefineEvent.
    • Updated import paths for Text and normalize_media_input.
  • swanlab/sdk/internal/run/consumer.py
    • Updated event handling logic for MetricLogEvent and ScalarDefineEvent.
    • Adjusted imports for new event types and ColumnType.
  • swanlab/sdk/internal/run/data/__init__.py
    • Removed the data module.
  • swanlab/sdk/internal/run/data/transforms/__init__.py
    • Removed the data.transforms module.
  • swanlab/sdk/internal/run/data/transforms/text/__init__.py
    • Removed the old Text transform module.
  • swanlab/sdk/internal/run/record_builder.py
    • Modified build_log to handle lists of TransformMediaType and return the TransformType class.
    • Updated build_column_from_log to infer column type from TransformType.
    • Replaced build_column_from_define with build_column_from_scalar_define.
    • Updated build_run and build_finish to use the new adapter module for enum mapping.
    • Removed internal helper methods _infer_column_type and _metric_to_media_type.
  • swanlab/sdk/internal/run/transforms/__init__.py
    • Added a new transforms module with normalize_media_input utility.
  • swanlab/sdk/internal/run/transforms/audio/__init__.py
    • Renamed from swanlab/sdk/internal/run/data/transforms/audio/__init__.py.
  • swanlab/sdk/internal/run/transforms/echarts/__init__.py
    • Renamed from swanlab/sdk/internal/run/data/transforms/echarts/__init__.py.
  • swanlab/sdk/internal/run/transforms/image/__init__.py
    • Renamed from swanlab/sdk/internal/run/data/transforms/image/__init__.py.
  • swanlab/sdk/internal/run/transforms/scalar/__init__.py
    • Renamed from swanlab/sdk/internal/run/data/transforms/scalar/__init__.py.
    • Updated Scalar class to implement new TransformType methods.
  • swanlab/sdk/internal/run/transforms/text/__init__.py
    • Added a new Text transform class implementing new TransformType methods.
  • swanlab/sdk/typings/run/data.py
    • Removed MediaTransferType and DataTransferType type aliases.
  • tests/unit/sdk/internal/adapter/test_bimap.py
    • Added new unit tests for the BiMap adapter.
  • tests/unit/sdk/internal/run/data/test_scalar.py
    • Updated import path for Scalar transform.
  • tests/unit/sdk/internal/run/data/test_text.py
    • Updated import paths and tests for Text transform to reflect new TransformType implementation and build_data_record method.
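
The changelog mentions a `normalize_media_input` utility and a `log_text` that accepts lists of `Text` objects. One plausible shape for such a helper is sketched below. The signature, the error behavior, and the `Text` stand-in class are all assumptions made for illustration; the helper in the PR may differ.

```python
from typing import Any, List, Type, TypeVar

T = TypeVar("T")


class Text:
    """Hypothetical stand-in for the SDK's Text media object."""

    def __init__(self, content: str) -> None:
        self.content = content


def normalize_media_input(data: Any, media_cls: Type[T]) -> List[T]:
    """Accept a single media object or a list/tuple of them; return a list."""
    items = list(data) if isinstance(data, (list, tuple)) else [data]
    if not items:
        raise ValueError("media input must not be empty")
    for item in items:
        # Mixed or foreign types are rejected instead of silently coerced.
        if not isinstance(item, media_cls):
            raise TypeError(
                f"expected {media_cls.__name__}, got {type(item).__name__}"
            )
    return items
```

With a helper like this, a caller such as `log_text` can always iterate over a list, whether the user passed one object or several.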
Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

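This refactor greatly improves the media logic and the Protobuf definitions.
The main highlights are:

  1. Clearer Protobuf structure: metric-related definitions were moved under swanlab/metric and organized by column and data, which makes the layout easier to follow.
  2. Adapter pattern introduced: the new adapter module uses BiMap to map user input strings to internal Protobuf enums. This is a very good practice that improves maintainability and decoupling.
  3. Improved data transformation flow: the TransformType refactor and the use of singledispatchmethod in RecordBuilder make the data processing flow more robust and extensible.
  4. Simplified media type definitions: the path field was removed from the media Protobuf definitions, moving responsibility for path management up to the SDK layer and simplifying the message structure.

Overall, this is a high-quality refactor that significantly improves code quality. I found one potential bug and one documentation inconsistency; see the specific review comments.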
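
As a rough illustration of the `singledispatchmethod` dispatch mentioned in the review, the sketch below routes scalars, a media object, and lists of either through one entry point. The `Text` class, the dictionary record shape, and the handler names are hypothetical; the real `RecordBuilder` builds protobuf records and may be organized differently.

```python
from functools import singledispatchmethod
from typing import Any, Dict, List


class Text:
    """Hypothetical stand-in for the SDK's Text transform."""

    def __init__(self, content: str) -> None:
        self.content = content


class RecordBuilder:
    @singledispatchmethod
    def build_log(self, value: Any, key: str, step: int) -> List[Dict[str, Any]]:
        # Fallback for types that have no registered handler.
        raise TypeError(f"unsupported value for {key!r}: {type(value).__name__}")

    @build_log.register(int)
    @build_log.register(float)
    def _build_scalar(self, value, key, step):
        return [{"key": key, "step": step, "type": "scalar", "value": float(value)}]

    @build_log.register(Text)
    def _build_text(self, value, key, step):
        return [{"key": key, "step": step, "type": "text", "value": value.content}]

    @build_log.register(list)
    def _build_list(self, value, key, step):
        # Each element is dispatched again on its own type.
        records: List[Dict[str, Any]] = []
        for item in value:
            records.extend(self.build_log(item, key, step))
        return records
```

Dispatch happens on the type of the first positional argument after `self`, so supporting a new media type only requires registering one more handler.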

Introduce swanlab.sdk.internal.adapter.dirname and filename modules to centralize directory and filename conventions (constants for files, media, debug; metadata, config, requirements, conda and a run(run_id) helper). Export them from adapter.__init__.py and update RunContext to use these constants instead of hardcoded strings, add a debug_dir property, and use adapter.filename.run for the run file path. This centralizes naming conventions and reduces literal strings in the context code.
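
The centralization described in this commit might look roughly like the sketch below. Every string value here, the `run-<id>.log` pattern, and the use of classes instead of modules are assumptions for illustration; the commit message names the constants but not their values.

```python
from pathlib import Path


class dirname:
    """Directory name conventions (values are hypothetical)."""

    FILES = "files"
    MEDIA = "media"
    DEBUG = "debug"


class filename:
    """File name conventions (values are hypothetical)."""

    METADATA = "metadata.json"
    CONFIG = "config.yaml"
    REQUIREMENTS = "requirements.txt"
    CONDA = "conda.yaml"

    @staticmethod
    def run(run_id: str) -> str:
        # Assumed naming pattern for the per-run file.
        return f"run-{run_id}.log"


class RunContext:
    """Builds paths from the shared constants instead of string literals."""

    def __init__(self, run_dir: Path, run_id: str) -> None:
        self.run_dir = Path(run_dir)
        self.run_id = run_id

    @property
    def media_dir(self) -> Path:
        return self.run_dir / dirname.MEDIA

    @property
    def debug_dir(self) -> Path:
        return self.run_dir / dirname.DEBUG

    @property
    def run_file(self) -> Path:
        return self.run_dir / filename.run(self.run_id)
```

Keeping the literals in one place means a renamed directory is changed once and every path built from it follows.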
Change TransformType.transform from a staticmethod to an instance method with signature transform(self, *, key: str, step: int, data: str = None) and simplify content fallback to use data or an empty string. Remove the redundant type() classmethod. In RecordBuilder, rename adapter.column_type[col_type] to media_type_str and pass the original col_type (enum) to metrics.define_media while using media_type_str for the media directory path—aligning argument order and clarifying variable names.
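
A minimal sketch of the instance-method shape described in this commit follows. The `ColumnType` enum is a stand-in for the generated protobuf enum, making `column_type` a classmethod is an assumption, and the `data or ""` fallback is one reading of "use data or an empty string".

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Optional


class ColumnType(Enum):
    """Hypothetical stand-in for the generated protobuf ColumnType enum."""

    TEXT = 1


class TransformType(ABC):
    @classmethod
    @abstractmethod
    def column_type(cls) -> ColumnType:
        """Column type this transformer produces (classmethod is assumed)."""

    @abstractmethod
    def transform(self, *, key: str, step: int, data: Optional[str] = None) -> str:
        """Instance method with keyword-only arguments, as in the commit."""


class Text(TransformType):
    @classmethod
    def column_type(cls) -> ColumnType:
        return ColumnType.TEXT

    def transform(self, *, key: str, step: int, data: Optional[str] = None) -> str:
        # Fall back to an empty string when no data is supplied.
        return data or ""
```

Because `column_type` is available on the class in this sketch, a caller can hold the transformer class itself and read the column type without constructing an instance.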
Replace usages of unittest.mock.patch with pytest's monkeypatch across several tests to make mocking more consistent and robust. Key changes:

- tests/benchmark/sdk/internal/run/bench_run_helper.py: switch to monkeypatch for suppressing console warnings in validate_key benchmarks.
- tests/unit/sdk/cmd/login/test_login_e2e.py: migrate patches to monkeypatch, simulate prompt calls with a side-effect list, make host configurable (use fake.swanlab.cn in one test), and ensure client.exists/reset are monkeypatched. Adjusted calls to pass relogin where needed and assert saved .netrc content.
- tests/unit/sdk/internal/pkg/fs/test_fs_dir.py: replace MagicMock sys.modules hack with simple module-like objects, convert time and TemporaryFile patches to monkeypatch-based replacements, and capture console warnings into a list for assertions. Clean up environment variable tests accordingly.
- tests/unit/sdk/internal/pkg/fs/test_fs_write.py: replace os.replace patch with monkeypatch and validate atomic write rollback behavior.
- tests/unit/sdk/internal/run/test_run_fmt.py: replace console patching with monkeypatch capturing console.warning calls, and adapt tests to avoid global patch use.

Overall these changes modernize tests to use pytest fixtures, reduce reliance on context-managed patching, and make warning/assertion checks explicit and side-effect free.
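
The pattern of capturing `console.warning` calls into a list with `monkeypatch` can be sketched as follows. The `console` object and the truncating `validate_key` are hypothetical stand-ins written for this example; they are not the SDK's implementations.

```python
import types

import pytest

# Hypothetical stand-in for the SDK's console object.
console = types.SimpleNamespace(warning=print)


def validate_key(key: str, max_len: int = 8) -> str:
    """Toy validator: warn and truncate keys that are too long."""
    if len(key) > max_len:
        console.warning(f"key {key!r} truncated to {max_len} characters")
        return key[:max_len]
    return key


def test_validate_key_warns(monkeypatch: pytest.MonkeyPatch) -> None:
    warnings = []
    # Swap the warning function for list.append so calls can be asserted on.
    # pytest restores the original attribute when the test ends.
    monkeypatch.setattr(console, "warning", warnings.append)

    assert validate_key("x" * 10) == "x" * 8
    assert len(warnings) == 1
    assert "truncated" in warnings[0]
```

Compared with a `unittest.mock.patch` context manager, the fixture needs no nesting and the captured messages are plain strings that can be asserted on directly.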
@SAKURA-CAT SAKURA-CAT merged commit f416e5b into main Mar 14, 2026
18 checks passed
@SAKURA-CAT SAKURA-CAT deleted the feat/media branch March 14, 2026 17:53