Skip to content

Commit eb44ed4

Browse files
authored
Merge branch 'main' into feature/python-to-arrow-mapping
2 parents d60c870 + 5e8f74c commit eb44ed4

4 files changed

Lines changed: 99 additions & 1 deletion

File tree

.beads/issues.jsonl

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -44,7 +44,6 @@
4444
{"id":"vgi-python-k7x","title":"Use Mapping instead of dict in extract_argument_specs signature","description":"The arg_types parameter in extract_argument_specs() is typed as dict[str, pa.DataType]. Using Mapping[str, pa.DataType] from collections.abc would be more flexible, accepting any mapping type.","status":"closed","priority":4,"issue_type":"task","created_at":"2026-01-05T11:51:21.021496-05:00","created_by":"rusty","updated_at":"2026-01-05T12:03:51.771301-05:00","closed_at":"2026-01-05T12:03:51.771301-05:00","close_reason":"Closed"}
4545
{"id":"vgi-python-kz4","title":"Rename TableInOutGeneratorFunction to TableInOutGenerator for consistency","description":"Naming inconsistency: TableFunctionGenerator uses *Generator suffix, but TableInOutGeneratorFunction uses *GeneratorFunction suffix. Rename TableInOutGeneratorFunction to TableInOutGenerator for consistency. Also consider renaming ScalarFunctionGenerator if needed.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-04T20:06:41.581028-05:00","created_by":"rusty","updated_at":"2026-01-04T21:43:58.141038-05:00","closed_at":"2026-01-04T21:43:58.141038-05:00","close_reason":"PR #7 created: https://github.com/Query-farm/vgi-python/pull/7"}
4646
{"id":"vgi-python-l1u","title":"Consider custom __repr__ for ArgumentSpec","description":"The default dataclass __repr__ includes the full Arrow type repr which can be verbose. Consider a custom __repr__ that's more concise for debugging, e.g., 'ArgumentSpec(name=\"count\", pos=0, type=int64)' instead of showing the full pa.DataType object.","status":"closed","priority":4,"issue_type":"task","created_at":"2026-01-05T11:51:21.415976-05:00","created_by":"rusty","updated_at":"2026-01-05T12:15:02.029743-05:00","closed_at":"2026-01-05T12:15:02.029743-05:00","close_reason":"Closed"}
47-
{"id":"vgi-python-l5z","title":"Update existing tests that use arg_types parameter","description":"In tests/test_argument_spec.py:\n- Update all calls to extract_argument_specs() that pass arg_types\n- Remove the arg_types parameter from test function calls\n- Ensure tests still pass with auto-inference","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-05T15:44:56.81929-05:00","created_by":"rusty","updated_at":"2026-01-05T15:44:56.81929-05:00","dependencies":[{"issue_id":"vgi-python-l5z","depends_on_id":"vgi-python-coi","type":"blocks","created_at":"2026-01-05T15:45:13.980985-05:00","created_by":"rusty"}]}
4847
{"id":"vgi-python-lec","title":"Add test coverage for testing.py helper edge cases","notes":"Coverage: 89% in vgi/testing.py. Missing tests for:\n- Lines 421-422, 450-451: StopIteration handling in _process_batch\n- Lines 468-472: FINISHED status during data phase\n- Lines 485-486, 502-503: _finalize edge cases\n\nLow priority since these are test helpers.","status":"closed","priority":4,"issue_type":"task","created_at":"2026-01-04T22:15:34.006563-05:00","created_by":"rusty","updated_at":"2026-01-05T12:07:48.026786-05:00","closed_at":"2026-01-05T12:07:48.026786-05:00","close_reason":"Closed"}
4948
{"id":"vgi-python-lzc","title":"Extract duplicated sort_key function in argument_spec","description":"The sort_key function is duplicated at lines 139-142 and 309-312 in argument_spec.py. Extract it to a module-level function to follow DRY principles.","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-05T11:51:19.141041-05:00","created_by":"rusty","updated_at":"2026-01-05T11:55:22.535816-05:00","closed_at":"2026-01-05T11:55:22.535816-05:00","close_reason":"Closed"}
5049
{"id":"vgi-python-m45","title":"Create tests/test_argument_spec.py","description":"## Overview\n\nCreate comprehensive tests for the argument specification serialization module.\n\n## File Location\n\n`tests/test_argument_spec.py`\n\n## Test Classes and Cases\n\n### TestArgumentSpecToSchema\n\nTest converting ArgumentSpec objects to Arrow schema.\n\n#### test_positional_arguments_preserve_order\n- Create specs with positions 0, 1, 2\n- Convert to schema\n- Verify field order matches position order\n- Verify field types are preserved\n\n#### test_named_arguments_have_metadata\n- Create spec with position='key' (named)\n- Convert to schema\n- Verify field has `vgi_arg=named` metadata\n\n#### test_mixed_positional_and_named\n- Create mix of positional (0, 1) and named ('format', 'verbose') specs\n- Convert to schema\n- Verify positional come first, then named\n- Verify named have correct metadata\n\n#### test_table_input_uses_null_type\n- Create spec with is_table_input=True\n- Convert to schema\n- Verify field type is pa.null()\n- Verify field has `vgi_type=table` metadata\n\n#### test_any_type_uses_null_type\n- Create spec with is_any_type=True\n- Convert to schema\n- Verify field type is pa.null()\n- Verify field has `vgi_type=any` metadata\n\n#### test_varargs_has_metadata\n- Create spec with is_varargs=True and arrow_type=pa.int64()\n- Convert to schema\n- Verify field type is pa.int64() (element type preserved)\n- Verify field has `vgi_varargs=true` metadata\n\n### TestSchemaToArgumentSpecs\n\nTest converting Arrow schema back to ArgumentSpec objects.\n\n#### test_positional_arguments_from_schema\n- Create schema with 3 fields (no metadata)\n- Convert to specs\n- Verify positions are 0, 1, 2\n\n#### test_named_arguments_from_metadata\n- Create schema with `vgi_arg=named` metadata on fields\n- Convert to specs\n- Verify position is field name string\n\n#### test_table_input_detected\n- Create schema with `vgi_type=table` metadata\n- Convert to specs\n- Verify is_table_input=True\n\n#### test_any_type_detected\n- Create schema with `vgi_type=any` metadata\n- Convert to specs\n- Verify is_any_type=True\n\n#### test_varargs_detected\n- Create schema with `vgi_varargs=true` metadata\n- Convert to specs\n- Verify is_varargs=True\n\n### TestRoundTrip\n\nTest that specs survive serialization round-trip.\n\n#### test_complex_arrow_types_preserved\nTest each of these types round-trips correctly:\n- pa.int64(), pa.float32(), pa.utf8()\n- pa.list_(pa.float64())\n- pa.struct([pa.field('a', pa.int32()), pa.field('b', pa.string())])\n- pa.map_(pa.string(), pa.int64())\n- pa.decimal128(10, 2)\n- pa.timestamp('us', tz='UTC')\n\n#### test_full_function_signature_roundtrip\n- Create specs matching a realistic function:\n - count: int, position 0\n - data: TableInput, position 1\n - extra: float varargs, position 2\n - format: str, named 'format'\n- Convert to schema, serialize to bytes, deserialize, convert back to specs\n- Verify all specs match original\n\n### TestExtractArgumentSpecs\n\nTest extracting specs from function classes.\n\n#### test_extract_from_simple_function\n- Define function class with Arg descriptors\n- Call extract_argument_specs with arg_types dict\n- Verify specs match descriptors\n\n#### test_extract_table_input\n- Define function with Arg[TableInput]\n- Extract specs\n- Verify is_table_input=True\n\n#### test_extract_any_arrow\n- Define function with Arg[AnyArrow]\n- Extract specs\n- Verify is_any_type=True\n\n#### test_extract_varargs\n- Define function with Arg[int](2, varargs=True)\n- Extract specs\n- Verify is_varargs=True\n\n### TestEdgeCases\n\n#### test_empty_schema\n- Convert empty list of specs to schema\n- Verify empty schema works\n- Convert back, verify empty list\n\n#### test_only_named_arguments\n- Create specs with only named arguments (no positional)\n- Round-trip and verify\n\n#### test_only_positional_arguments\n- Create specs with only positional arguments (no named)\n- Round-trip and verify\n\n## Test Utilities\n\nConsider creating fixtures for common patterns:\n- `make_spec()` helper for creating ArgumentSpec\n- Sample function classes for extraction tests","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T11:18:53.312911-05:00","created_by":"rusty","updated_at":"2026-01-05T11:32:35.580879-05:00","closed_at":"2026-01-05T11:32:35.580879-05:00","close_reason":"Created comprehensive tests with 43 passing test cases","dependencies":[{"issue_id":"vgi-python-m45","depends_on_id":"vgi-python-cd0","type":"blocks","created_at":"2026-01-05T11:19:30.779207-05:00","created_by":"rusty"}]}

tests/test_argument_spec.py

Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -570,3 +570,57 @@ def test_combined_metadata(self) -> None:
570570
assert field.metadata is not None
571571
assert field.metadata.get(VGI_ARG_KEY) == VGI_ARG_NAMED
572572
assert field.metadata.get(VGI_TYPE_KEY) == VGI_TYPE_ANY
573+
574+
575+
class TestArgumentSpecRepr:
576+
"""Test ArgumentSpec __repr__ method."""
577+
578+
def test_positional_argument_repr(self) -> None:
579+
"""Positional argument should show integer position."""
580+
spec = ArgumentSpec(name="count", position=0, arrow_type=pa.int64())
581+
result = repr(spec)
582+
assert 'name="count"' in result
583+
assert "pos=0" in result
584+
assert "int64" in result
585+
assert "flags=" not in result # No flags when all False
586+
587+
def test_named_argument_repr(self) -> None:
588+
"""Named argument should show quoted string position."""
589+
spec = ArgumentSpec(name="format", position="format", arrow_type=pa.utf8())
590+
result = repr(spec)
591+
assert 'name="format"' in result
592+
assert 'pos="format"' in result
593+
assert "string" in result or "utf8" in result
594+
595+
def test_flags_shown_when_true(self) -> None:
596+
"""Flags should only appear when True."""
597+
spec = ArgumentSpec(
598+
name="data",
599+
position=0,
600+
arrow_type=pa.null(),
601+
is_table_input=True,
602+
)
603+
result = repr(spec)
604+
assert "flags=[table_input]" in result
605+
606+
def test_multiple_flags(self) -> None:
607+
"""Multiple flags should all be shown."""
608+
spec = ArgumentSpec(
609+
name="cols",
610+
position=0,
611+
arrow_type=pa.utf8(),
612+
is_varargs=True,
613+
)
614+
result = repr(spec)
615+
assert "varargs" in result
616+
617+
def test_any_type_flag(self) -> None:
618+
"""any_type flag should be shown."""
619+
spec = ArgumentSpec(
620+
name="value",
621+
position=0,
622+
arrow_type=pa.null(),
623+
is_any_type=True,
624+
)
625+
result = repr(spec)
626+
assert "any_type" in result

vgi/argument_spec.py

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -109,6 +109,17 @@ class ArgumentSpec:
109109
is_varargs: True if this argument collects all remaining positional
110110
arguments (varargs=True).
111111
112+
Note:
113+
For named arguments, the Python attribute name (``name``) and the SQL
114+
key (``position``) are assumed to be identical. This is the standard
115+
convention::
116+
117+
format = Arg[str]("format") # name="format", position="format"
118+
119+
If they differ, the ``position`` value will be lost during schema
120+
round-trip serialization, as only ``name`` is stored in the Arrow
121+
schema field name.
122+
112123
"""
113124

114125
name: str
@@ -118,6 +129,27 @@ class ArgumentSpec:
118129
is_any_type: bool = False
119130
is_varargs: bool = False
120131

132+
def __repr__(self) -> str:
133+
"""Return concise repr showing key attributes."""
134+
# Build position display: integer or quoted string
135+
pos = self.position if isinstance(self.position, int) else f'"{self.position}"'
136+
137+
# Build flags list (only show if True)
138+
flags = []
139+
if self.is_table_input:
140+
flags.append("table_input")
141+
if self.is_any_type:
142+
flags.append("any_type")
143+
if self.is_varargs:
144+
flags.append("varargs")
145+
146+
flags_str = f", flags=[{', '.join(flags)}]" if flags else ""
147+
148+
return (
149+
f'ArgumentSpec(name="{self.name}", pos={pos}, '
150+
f"type={self.arrow_type}{flags_str})"
151+
)
152+
121153

122154
# =============================================================================
123155
# Serialization Functions

vgi/arguments.py

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -530,6 +530,19 @@ def transform(self, batch):
530530
# Validation happens automatically on first access
531531
...
532532
533+
Note:
534+
For named arguments (string position), the Python attribute name should
535+
match the SQL key. This is the standard convention::
536+
537+
format = Arg[str]("format") # Recommended: attribute == key
538+
539+
Avoid using different names::
540+
541+
output_format = Arg[str]("format") # Not recommended
542+
543+
While this works at runtime, it can cause issues with metadata
544+
serialization where only one name is preserved.
545+
533546
"""
534547

535548
__slots__ = (

0 commit comments

Comments
 (0)