Commit d95ef5a

Update SDK to version 0.36.0rc0
1 parent 3327b43 commit d95ef5a

23 files changed

Lines changed: 1353 additions & 798 deletions

CHANGELOG.md

Lines changed: 17 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,23 @@
11
# Changelog
22

3+
# 0.36.0rc0
4+
## Breaking Changes
5+
- `MessagePath.parents()` now requires a `list[str]` (`path_in_schema`) instead of a dot-delimited `str`. This reflects the shift from ambiguous dot-separated paths to explicit schema path components for correct handling of nested and dot-containing field names. If you were calling `MessagePath.parents("pose.position.x")`, update to `MessagePath.parents(["pose", "position", "x"])`. Passing a string now raises a `TypeError` with guidance on how to migrate.
6+
- `MessagePath.parts()` has been removed. Use `MessagePathRecord.path_in_schema` directly to obtain path components.
7+
8+
## Features Added
9+
- Introduced `QueryContentMode`, allowing search endpoints to return Roboto entities with or without custom metadata. Initial support is for dataset queries in particular, since datasets can store large amounts of `metadata`, which is known to affect search latency and response size. More entity types will be supported in the future.
10+
- Improved `Topic.get_data` and `Topic.get_data_as_df` performance for Parquet-backed data.
11+
- `Topic.create_from_df()` and `File.add_topic()` now support DataFrames containing nested column types (structs, lists, list<struct>). Previously, only top-level primitive columns were fully supported.
12+
- `AddMessagePathRequest` now accepts a `path_in_schema` field to explicitly specify the field's location in the source data schema as an ordered list of path components. Relatedly, `Topic.add_message_path()` and `Topic.update_message_path()` now accept an optional `path_in_schema` parameter.
13+
14+
## Bugs Fixed
15+
- Updated behavior to not retry requests when server response exceeds the maximum safe payload size.
16+
317
# 0.35.2
18+
## Features Added
19+
- Added limit, sort_by and sort_direction parameters to `v1/datasets/<dataset_id>/files/query`
20+
421
# 0.35.1
522
### Simplified File Transfer API
623

@@ -21,7 +38,6 @@ File upload and download operations have been simplified. The high-level methods
2138

2239
## Features Added
2340
- Added generic file upload API endpoints (`/v1/files/upload/*`) that support uploading files to any association type (datasets, topics, etc.), replacing the dataset-specific upload endpoints.
24-
- Added limit, sort_by and sort_direction parameters to `v1/datasets/<dataset_id>/files/query`
2541

2642
## Bugs Fixed
2743
- CLI version checker now queries GitHub Releases instead of PyPI, ensuring users are only prompted to upgrade to CLI versions that are actually published and available.
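Note on the 0.36.0rc0 breaking change: the changelog motivates `path_in_schema` by pointing at field names that contain dots. The snippet below is a minimal illustration of that ambiguity using plain Python strings only. The field name `sensor.v1` is a hypothetical example, not something taken from the SDK.

```python
# Hypothetical schema path whose first component itself contains a dot.
path_in_schema = ["sensor.v1", "temperature"]

# Joining the components gives a readable dot-delimited string...
dotted = ".".join(path_in_schema)

# ...but splitting that string cannot recover the original components,
# because the delimiter and the literal dot are indistinguishable.
naive_parts = dotted.split(".")

print(dotted)                          # sensor.v1.temperature
print(naive_parts)                     # ['sensor', 'v1', 'temperature']
print(naive_parts == path_in_schema)   # False
```

This is why splitting an old dot-delimited path is only a safe migration when no component contains a literal dot.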

RELEASE_NOTES.md

Lines changed: 13 additions & 2 deletions
@@ -1,3 +1,14 @@
1-
# 0.35.2
1+
# 0.36.0rc0
2+
## Breaking Changes
3+
- `MessagePath.parents()` now requires a `list[str]` (`path_in_schema`) instead of a dot-delimited `str`. This reflects the shift from ambiguous dot-separated paths to explicit schema path components for correct handling of nested and dot-containing field names. If you were calling `MessagePath.parents("pose.position.x")`, update to `MessagePath.parents(["pose", "position", "x"])`. Passing a string now raises a `TypeError` with guidance on how to migrate.
4+
- `MessagePath.parts()` has been removed. Use `MessagePathRecord.path_in_schema` directly to obtain path components.
5+
26
## Features Added
3-
- Added limit, sort_by and sort_direction parameters to `v1/datasets/<dataset_id>/files/query`
7+
- Introduced `QueryContentMode`, allowing search endpoints to return Roboto entities with or without custom metadata. Initial support is for dataset queries in particular, since datasets can store large amounts of `metadata`, which is known to affect search latency and response size. More entity types will be supported in the future.
8+
- Improved `Topic.get_data` and `Topic.get_data_as_df` performance for Parquet-backed data.
9+
- `Topic.create_from_df()` and `File.add_topic()` now support DataFrames containing nested column types (structs, lists, list<struct>). Previously, only top-level primitive columns were fully supported.
10+
- `AddMessagePathRequest` now accepts a `path_in_schema` field to explicitly specify the field's location in the source data schema as an ordered list of path components. Relatedly, `Topic.add_message_path()` and `Topic.update_message_path()` now accept an optional `path_in_schema` parameter.
11+
12+
## Bugs Fixed
13+
- Updated behavior to not retry requests when server response exceeds the maximum safe payload size.
14+

build-support/lockfiles/python-default.lock

Lines changed: 598 additions & 635 deletions
Large diffs are not rendered by default.

src/roboto/api_version.py

Lines changed: 7 additions & 1 deletion
@@ -27,14 +27,20 @@ class RobotoApiVersion(StrEnum):
2727
v2026_01_02 = "2026-01-02"
2828
"""Release date for v0.35.0 of the Roboto Python SDK"""
2929

30+
v2026_02_02 = "2026-02-02"
31+
"""Content mode introduced for query APIs."""
32+
33+
v2026_02_11 = "2026-02-11"
34+
"""path_in_schema is now a required field on AddMessagePathRequest"""
35+
3036
@staticmethod
3137
def latest() -> RobotoApiVersion:
3238
"""Get the latest available API version.
3339
3440
Returns:
3541
The most recent API version supported by the platform.
3642
"""
37-
return RobotoApiVersion.v2026_01_02
43+
return RobotoApiVersion.v2026_02_11
3844

3945
def is_latest(self) -> bool:
4046
"""Check if this API version is the latest available version.
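Note on `latest()`: in this diff the return value is a hardcoded member, so it has to be bumped by hand whenever a version is added. Because the values are ISO 8601 dates, which sort chronologically as strings, the latest member could in principle be derived instead. The sketch below shows that alternative with a stand-in enum. `ApiVersionSketch` is hypothetical and is not the SDK's `RobotoApiVersion`. The body of `is_latest()` is an assumption, since the diff shows only its docstring.

```python
from enum import Enum


class ApiVersionSketch(str, Enum):
    """Stand-in for the enum in the diff; not the SDK class."""

    v2026_01_02 = "2026-01-02"
    v2026_02_02 = "2026-02-02"
    v2026_02_11 = "2026-02-11"

    @staticmethod
    def latest() -> "ApiVersionSketch":
        # ISO 8601 date strings compare in chronological order,
        # so the maximum value is the most recent version.
        return max(ApiVersionSketch, key=lambda version: version.value)

    def is_latest(self) -> bool:
        # Assumed behaviour: compare against latest().
        return self is ApiVersionSketch.latest()


print(ApiVersionSketch.latest().value)           # 2026-02-11
print(ApiVersionSketch.v2026_01_02.is_latest())  # False
```

The hardcoded form used in the commit has its own advantage: a version can be declared in the enum before it is promoted to latest.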

src/roboto/cli/datasets/search.py

Lines changed: 9 additions & 6 deletions
@@ -8,15 +8,16 @@
88
import json
99
import typing
1010

11-
from ...domain.datasets import Dataset
1211
from ...query import (
1312
Comparator,
1413
Condition,
1514
ConditionGroup,
1615
ConditionOperator,
16+
QueryContentMode,
1717
QuerySpecification,
1818
SortDirection,
1919
)
20+
from ...roboto_search import RobotoSearch
2021
from ..command import (
2122
KeyValuePairsAction,
2223
RobotoCommand,
@@ -27,6 +28,7 @@
2728

2829
def search(args, context: CLIContext, parser: argparse.ArgumentParser):
2930
conditions: list[typing.Union[Condition, ConditionGroup]] = []
31+
3032
if args.metadata:
3133
for key, value in args.metadata.items():
3234
conditions.append(
@@ -50,18 +52,19 @@ def search(args, context: CLIContext, parser: argparse.ArgumentParser):
5052
query_args: dict[str, typing.Any] = {
5153
"sort_direction": SortDirection.Descending,
5254
}
55+
5356
if conditions:
5457
if len(conditions) == 1:
5558
query_args["condition"] = conditions[0]
5659
else:
5760
query_args["condition"] = ConditionGroup(conditions=conditions, operator=ConditionOperator.And)
5861

5962
query = QuerySpecification(**query_args)
60-
results = Dataset.query(
61-
spec=query,
62-
roboto_client=context.roboto_client,
63-
owner_org_id=args.org,
64-
)
63+
searcher = RobotoSearch.for_roboto_client(context.roboto_client, args.org)
64+
65+
# Fetch dataset metadata, since we want to dump all dataset fields as JSON on the cmdline
66+
results = searcher.find_datasets(query, content_mode=QueryContentMode.RecordWithMeta)
67+
6568
print(json.dumps([result.to_dict() for result in results], indent=2))
6669

6770

src/roboto/domain/datasets/dataset.py

Lines changed: 31 additions & 4 deletions
@@ -34,7 +34,7 @@
3434
NoopProgressMonitor,
3535
TqdmProgressMonitor,
3636
)
37-
from ...query import QuerySpecification
37+
from ...query import DEFAULT_PAGE_SIZE, QueryContentMode, QuerySpecification
3838
from ...sentinels import (
3939
NotSet,
4040
NotSetType,
@@ -98,6 +98,7 @@ class Dataset:
9898

9999
__roboto_client: RobotoClient
100100
__record: DatasetRecord
101+
__content_mode: QueryContentMode
101102
__file_service: FileService
102103

103104
@classmethod
@@ -350,7 +351,7 @@ def query(
350351
Found dataset: Other Roboto Test
351352
"""
352353
roboto_client = RobotoClient.defaulted(roboto_client)
353-
spec = spec if spec is not None else QuerySpecification()
354+
spec = spec if spec is not None else QuerySpecification(limit=DEFAULT_PAGE_SIZE)
354355

355356
known = set(DatasetRecord.model_fields.keys())
356357
actual = set()
@@ -386,17 +387,22 @@ def query(
386387
def __eq__(self, other: object) -> bool:
387388
if not isinstance(other, Dataset):
388389
return False
389-
return self.record == other.record
390+
391+
# Only compare core dataset entity fields
392+
exclude_meta: dict[str, typing.Any] = {"metadata": dict()}
393+
return self.record.model_copy(update=exclude_meta) == other.record.model_copy(update=exclude_meta)
390394

391395
def __init__(
392396
self,
393397
record: DatasetRecord,
394398
roboto_client: typing.Optional[RobotoClient] = None,
395399
file_service: typing.Optional[FileService] = None,
400+
content_mode: typing.Optional[QueryContentMode] = None,
396401
) -> None:
397402
self.__roboto_client = RobotoClient.defaulted(roboto_client)
398403
self.__file_service = file_service or FileService(self.__roboto_client)
399404
self.__record = record
405+
self.__content_mode = content_mode or QueryContentMode.RecordWithMeta
400406

401407
def __repr__(self) -> str:
402408
return self.__record.model_dump_json()
@@ -454,7 +460,26 @@ def metadata(self) -> dict[str, typing.Any]:
454460
Returns a copy of the dataset's metadata dictionary containing arbitrary
455461
key-value pairs for storing custom information. Supports nested structures
456462
and dot notation for accessing nested fields.
463+
464+
Note: this attribute is kept for backward compatibility. Prefer :py:meth:`get_metadata()`,
465+
since metadata may need to be loaded on-demand from the server.
457466
"""
467+
468+
return self.get_metadata()
469+
470+
def get_metadata(self) -> dict[str, typing.Any]:
471+
"""Return custom metadata associated with this dataset.
472+
473+
Returns a copy of the dataset's metadata dictionary containing arbitrary
474+
key-value pairs for storing custom information. Supports nested structures
475+
and dot notation for accessing nested fields.
476+
"""
477+
478+
if self.__content_mode is not QueryContentMode.RecordWithMeta:
479+
# Force metadata to be fetched from the DB.
480+
self.refresh()
481+
self.__content_mode = QueryContentMode.RecordWithMeta
482+
458483
return self.__record.metadata.copy()
459484

460485
@property
@@ -552,7 +577,9 @@ def create_directory(
552577
Create a directory with intermediate directories:
553578
554579
>>> directory = dataset.create_directory(
555-
... name="final", parent_path="path/to/deep", create_intermediate_dirs=True
580+
... name="final",
581+
... parent_path=pathlib.Path("path/to/deep"),
582+
... create_intermediate_dirs=True,
556583
... )
557584
>>> print(directory.relative_path)
558585
path/to/deep/final
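Note on the `Dataset` changes: two behaviours in this file interact. Equality now ignores `metadata`, and `get_metadata()` refreshes the record on first access when the dataset was loaded without metadata. The following is a rough, standard-library-only sketch of both ideas. Every class and function name in it is a stand-in. `RecordOnly` is a hypothetical mode name, since only `RecordWithMeta` appears in the diff, and `fetch` stands in for `refresh()`.

```python
import dataclasses
import enum
import typing


class ContentModeSketch(enum.Enum):
    RecordOnly = "record_only"            # hypothetical name
    RecordWithMeta = "record_with_meta"   # name taken from the diff


@dataclasses.dataclass
class RecordSketch:
    dataset_id: str
    name: str
    metadata: dict = dataclasses.field(default_factory=dict)


class DatasetSketch:
    def __init__(
        self,
        record: RecordSketch,
        fetch: typing.Callable[[str], RecordSketch],
        content_mode: typing.Optional[ContentModeSketch] = None,
    ) -> None:
        self._record = record
        self._fetch = fetch  # stand-in for a server round trip
        self._content_mode = content_mode or ContentModeSketch.RecordWithMeta

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, DatasetSketch):
            return False
        # Compare core fields only by blanking metadata on both copies.
        mine = dataclasses.replace(self._record, metadata={})
        theirs = dataclasses.replace(other._record, metadata={})
        return mine == theirs

    def get_metadata(self) -> dict:
        if self._content_mode is not ContentModeSketch.RecordWithMeta:
            # Metadata was not part of the original response; load it once.
            self._record = self._fetch(self._record.dataset_id)
            self._content_mode = ContentModeSketch.RecordWithMeta
        return self._record.metadata.copy()


calls: list = []


def fake_fetch(dataset_id: str) -> RecordSketch:
    calls.append(dataset_id)
    return RecordSketch(dataset_id, "demo", {"weather": "rain"})


light = DatasetSketch(RecordSketch("ds_1", "demo"), fake_fetch, ContentModeSketch.RecordOnly)
full = DatasetSketch(RecordSketch("ds_1", "demo", {"weather": "rain"}), fake_fetch)

print(light == full)         # True
print(light.get_metadata())  # {'weather': 'rain'}
print(calls)                 # ['ds_1']
```

Without the metadata-blind comparison, the same dataset fetched in two content modes would compare unequal, which is presumably why `__eq__` was changed alongside the lazy loading.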

src/roboto/domain/topics/message_path.py

Lines changed: 24 additions & 33 deletions
@@ -58,54 +58,45 @@ class MessagePath:
5858
__topic_data_service: TopicDataService
5959

6060
@staticmethod
61-
def parents(path: str) -> list[str]:
62-
"""Get parent paths for a message path in dot notation.
61+
def parents(path_in_schema: list[str]) -> list[str]:
62+
"""Get parent paths for a message path.
6363
64-
Given a message path in dot notation, returns a list of its parent paths
65-
ordered from most specific to least specific.
64+
Given a path_in_schema (list of path components), returns a list of its
65+
parent paths ordered from most specific to least specific.
6666
6767
Args:
68-
path: Message path in dot notation (e.g., "pose.pose.position.x").
68+
path_in_schema: List of path components (e.g., ["pose", "pose", "position", "x"]).
6969
7070
Returns:
7171
List of parent paths in dot notation, ordered from most to least specific.
7272
73+
Raises:
74+
TypeError: If a string is passed instead of a list. This method previously
75+
accepted a dot-delimited string; passing a string now would silently
76+
iterate over its characters and produce wrong results.
77+
7378
Examples:
74-
>>> path = "pose.pose.position.x"
75-
>>> MessagePath.parents(path)
79+
>>> path_in_schema = ["pose", "pose", "position", "x"]
80+
>>> MessagePath.parents(path_in_schema)
7681
['pose.pose.position', 'pose.pose', 'pose']
7782
7883
>>> # Single level path has no parents
79-
>>> MessagePath.parents("velocity")
84+
>>> MessagePath.parents(["velocity"])
8085
[]
8186
"""
82-
parent_parts = MessagePath.parts(path)[:-1]
87+
# Guard against callers using the old str-based signature.
88+
# Strings are iterable in Python, so passing one wouldn't raise but would
89+
# silently produce nonsense by iterating over individual characters.
90+
if isinstance(path_in_schema, str):
91+
raise TypeError(
92+
"MessagePath.parents() now requires a list[str] (path_in_schema), "
93+
"not a dot-delimited string. Use MessagePathRecord.path_in_schema "
94+
"or split your path into components."
95+
)
96+
97+
parent_parts = path_in_schema[:-1]
8398
return [MessagePath.DELIMITER.join(parent_parts[:i]) for i in range(len(parent_parts), 0, -1)]
8499

85-
@staticmethod
86-
def parts(path: str) -> list[str]:
87-
"""Split message path in dot notation into its constituent parts.
88-
89-
Splits a message path string into individual components, useful for
90-
programmatic manipulation of message path hierarchies.
91-
92-
Args:
93-
path: Message path in dot notation (e.g., "pose.pose.position.x").
94-
95-
Returns:
96-
List of path components in order from root to leaf.
97-
98-
Examples:
99-
>>> path = "pose.pose.position.x"
100-
>>> MessagePath.parts(path)
101-
['pose', 'pose', 'position', 'x']
102-
103-
>>> # Single component path
104-
>>> MessagePath.parts("velocity")
105-
['velocity']
106-
"""
107-
return path.split(MessagePath.DELIMITER)
108-
109100
@classmethod
110101
def from_id(
111102
cls,
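Note on the new `parents()` signature: the function below is a standalone re-statement of the logic shown in this hunk, written so it runs without the SDK. The `DELIMITER` value of `"."` is an assumption inferred from the docstring output, and the error message is abbreviated.

```python
DELIMITER = "."  # assumed value of MessagePath.DELIMITER


def parents(path_in_schema: list) -> list:
    """Return parent paths, most specific first (mirrors the diff's logic)."""
    # A str is iterable, so without this guard an old-style call would
    # iterate over characters instead of failing.
    if isinstance(path_in_schema, str):
        raise TypeError("expected a list[str] of path components, not a dot-delimited string")

    parent_parts = path_in_schema[:-1]
    return [DELIMITER.join(parent_parts[:i]) for i in range(len(parent_parts), 0, -1)]


print(parents(["pose", "pose", "position", "x"]))
# ['pose.pose.position', 'pose.pose', 'pose']

print(parents(["velocity"]))
# []

# Migrating an old call site: splitting is only safe when no component
# contains a literal dot.
legacy_path = "pose.position.x"
print(parents(legacy_path.split(DELIMITER)))
# ['pose.position', 'pose']

try:
    parents(legacy_path)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Where a `MessagePathRecord` is available, reading its `path_in_schema` (as the changelog recommends) avoids the split entirely.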

src/roboto/domain/topics/operations.py

Lines changed: 15 additions & 0 deletions
@@ -95,6 +95,10 @@ class AddMessagePathRequest(pydantic.BaseModel):
9595
canonical_data_type: Normalized Roboto data type that enables specialized
9696
platform features for maps, images, timestamps, and other data.
9797
metadata: Initial key-value pairs to associate with the message path.
98+
path_in_schema: List of path components representing the field's location
99+
in the original data schema. Unlike message_path, which assumes dots
100+
separate path parts implying nested data, this preserves the exact path
101+
from the source data for accurate attribute access.
98102
"""
99103

100104
message_path: str
@@ -105,6 +109,10 @@ class AddMessagePathRequest(pydantic.BaseModel):
105109
description="Initial key-value pairs to associate with this topic message path for discovery and search, e.g. "
106110
+ "`{ 'min': 0.71, 'max': 1.77, 'classification': 'my-custom-classification-tag' }`",
107111
)
112+
path_in_schema: list[str] = pydantic.Field(
113+
description="List of path components representing the field's location in the source data schema. "
114+
"For nested fields like 'position.x', this would be ['position', 'x'].",
115+
)
108116

109117
model_config = pydantic.ConfigDict(extra="ignore")
110118

@@ -133,6 +141,12 @@ class UpdateMessagePathRequest(pydantic.BaseModel):
133141
ability to interpret and visualize the data.
134142
"""
135143

144+
path_in_schema: typing.Union[list[str], NotSetType] = NotSet
145+
"""List of path components representing the field's location in the source data schema (optional).
146+
147+
For nested fields like 'position.x', this would be ['position', 'x'].
148+
"""
149+
136150
model_config = pydantic.ConfigDict(extra="ignore", json_schema_extra=NotSetType.openapi_schema_modifier)
137151

138152
def has_updates(self) -> bool:
@@ -145,6 +159,7 @@ def has_updates(self) -> bool:
145159
return (
146160
is_set(self.data_type)
147161
or is_set(self.canonical_data_type)
162+
or is_set(self.path_in_schema)
148163
or (is_set(self.metadata_changeset) and self.metadata_changeset.has_changes())
149164
)
150165
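Note on `has_updates()`: `UpdateMessagePathRequest` uses a `NotSet` sentinel so that "field omitted" can be told apart from "field set to an empty value". The sketch below reproduces that pattern with the standard library only. `NotSetTypeSketch`, `NOT_SET`, `is_set`, and `UpdateRequestSketch` are stand-ins written for illustration, not the SDK's `sentinels` module or its pydantic model.

```python
import dataclasses
import typing


class NotSetTypeSketch:
    """Sentinel type marking a field the caller did not provide."""

    def __repr__(self) -> str:
        return "NotSet"


NOT_SET = NotSetTypeSketch()


def is_set(value: typing.Any) -> bool:
    # Anything other than the sentinel counts as set, including None and [].
    return not isinstance(value, NotSetTypeSketch)


@dataclasses.dataclass
class UpdateRequestSketch:
    data_type: typing.Union[str, NotSetTypeSketch] = NOT_SET
    path_in_schema: typing.Union[list, NotSetTypeSketch] = NOT_SET

    def has_updates(self) -> bool:
        # Each updatable field must be listed here; the diff adds
        # path_in_schema to the real method for the same reason.
        return is_set(self.data_type) or is_set(self.path_in_schema)


print(UpdateRequestSketch().has_updates())                                  # False
print(UpdateRequestSketch(path_in_schema=["position", "x"]).has_updates())  # True
print(UpdateRequestSketch(path_in_schema=[]).has_updates())                 # True
```

The last line shows the point of the sentinel: an explicitly supplied empty list still counts as an update, whereas a plain `None` or falsy check would drop it.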

src/roboto/domain/topics/parquet/__init__.py

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@
55
# file, You can obtain one at https://mozilla.org/MPL/2.0/.
66

77
from .arrow_to_roboto import (
8-
field_to_message_path_request,
8+
generate_message_path_requests,
99
)
1010
from .ingestion import (
1111
make_topic_filename_safe,
@@ -17,7 +17,7 @@
1717
)
1818

1919
__all__ = (
20-
"field_to_message_path_request",
20+
"generate_message_path_requests",
2121
"make_topic_filename_safe",
2222
"ParquetParser",
2323
"ParquetTopicReader",
