|
28 | 28 | {"id":"vgi-python-cd0","title":"Create vgi/argument_spec.py module","description":"## Overview\n\nCreate the core module implementing Arrow-based argument specification serialization.\n\n## File Location\n\n`vgi/argument_spec.py`\n\n## Constants to Define\n\n```python\n# Metadata keys (all bytes for Arrow compatibility)\nVGI_ARG_KEY = b\"vgi_arg\"\nVGI_ARG_NAMED = b\"named\"\n\nVGI_TYPE_KEY = b\"vgi_type\"\nVGI_TYPE_TABLE = b\"table\"\nVGI_TYPE_ANY = b\"any\"\n\nVGI_VARARGS_KEY = b\"vgi_varargs\"\nVGI_VARARGS_TRUE = b\"true\"\n```\n\n## ArgumentSpec Dataclass\n\n```python\n@dataclass(frozen=True)\nclass ArgumentSpec:\n \"\"\"Specification for a single function argument.\"\"\"\n name: str # Python attribute name\n position: int | str # int for positional index, str for named key\n arrow_type: pa.DataType # Arrow type (pa.null() for special types)\n is_table_input: bool = False # Arg[TableInput]\n is_any_type: bool = False # Arg[AnyArrow]\n is_varargs: bool = False # varargs=True\n```\n\n## Functions to Implement\n\n### argument_specs_to_schema(specs: Sequence[ArgumentSpec]) -\u003e pa.Schema\n\nConvert ArgumentSpecs to a single Arrow schema:\n1. Sort specs: positional first (by index), then named\n2. For each spec, create a pa.field with:\n - name = spec.name\n - type = spec.arrow_type (or pa.null() for table/any)\n - metadata = appropriate markers based on flags\n3. Return pa.schema(fields)\n\n### schema_to_argument_specs(schema: pa.Schema) -\u003e list[ArgumentSpec]\n\nConvert schema back to ArgumentSpecs:\n1. Iterate through schema fields in order\n2. Track position index (increments for non-named args)\n3. Check field metadata for markers:\n - `vgi_arg=named` -\u003e position is field name string\n - `vgi_type=table` -\u003e is_table_input=True\n - `vgi_type=any` -\u003e is_any_type=True\n - `vgi_varargs=true` -\u003e is_varargs=True\n4. 
Return list of ArgumentSpec\n\n### extract_argument_specs(cls: type, arg_types: dict[str, pa.DataType]) -\u003e list[ArgumentSpec]\n\nExtract specs from a function class with Arg descriptors:\n1. Walk class MRO to find all Arg descriptors (like extract_parameters in metadata.py)\n2. For each Arg descriptor:\n - Get name from attribute name\n - Get position from arg.position\n - Get arrow_type from arg_types dict\n - Check type hints for TableInput/AnyArrow\n - Check arg.varargs flag\n3. Sort and return list\n\n## Dependencies\n\n- Import `Arg`, `TableInput`, `AnyArrow` from `vgi.arguments`\n- Reference `extract_parameters()` pattern in `vgi/metadata.py`","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T11:18:32.777241-05:00","created_by":"rusty","updated_at":"2026-01-05T11:28:07.227452-05:00","closed_at":"2026-01-05T11:28:07.227452-05:00","close_reason":"Created vgi/argument_spec.py with ArgumentSpec dataclass and serialization functions","dependencies":[{"issue_id":"vgi-python-cd0","depends_on_id":"vgi-python-8ra","type":"blocks","created_at":"2026-01-05T11:19:30.743936-05:00","created_by":"rusty"}]} |
29 | 29 | {"id":"vgi-python-ckg","title":"Add AnyValue sentinel class to vgi/arguments.py","description":"Add AnyValue class similar to TableInput, export in __all__","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T10:41:41.392694-05:00","created_by":"rusty","updated_at":"2026-01-05T11:05:38.37392-05:00","closed_at":"2026-01-05T11:05:38.37392-05:00","close_reason":"Added AnyArrow sentinel class to arguments.py","dependencies":[{"issue_id":"vgi-python-ckg","depends_on_id":"vgi-python-awm","type":"blocks","created_at":"2026-01-05T10:41:52.658405-05:00","created_by":"rusty"}]} |
30 | 30 | {"id":"vgi-python-coi","title":"Update extract_argument_specs() to remove arg_types parameter","description":"In vgi/argument_spec.py:\n1. Remove arg_types parameter from function signature\n2. Update arrow_type resolution logic:\n - Use arg.arrow_type if explicitly set\n - Infer from Python type hint using PYTHON_TO_ARROW\n - Handle TableInput/AnyArrow → pa.null()\n - Warn and default to pa.null() for unknown types\n3. Import PYTHON_TO_ARROW from vgi.arguments","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-05T15:44:38.141157-05:00","created_by":"rusty","updated_at":"2026-01-05T15:44:38.141157-05:00","dependencies":[{"issue_id":"vgi-python-coi","depends_on_id":"vgi-python-cvj","type":"blocks","created_at":"2026-01-05T15:45:13.831745-05:00","created_by":"rusty"},{"issue_id":"vgi-python-coi","depends_on_id":"vgi-python-dv0","type":"blocks","created_at":"2026-01-05T15:45:13.864608-05:00","created_by":"rusty"}]} |
31 | | -{"id":"vgi-python-cvj","title":"Add PYTHON_TO_ARROW type mapping to vgi/arguments.py","description":"Add the Python→Arrow type mapping dict after imports:\n```python\nPYTHON_TO_ARROW: dict[type, pa.DataType] = {\n int: pa.int64(),\n str: pa.utf8(),\n float: pa.float64(),\n bool: pa.bool_(),\n bytes: pa.binary(),\n}\n```\nExport in __all__.","status":"in_progress","priority":2,"issue_type":"task","created_at":"2026-01-05T15:44:37.900421-05:00","created_by":"rusty","updated_at":"2026-01-05T15:46:01.292126-05:00"} |
| 31 | +{"id":"vgi-python-cvj","title":"Add PYTHON_TO_ARROW type mapping to vgi/arguments.py","description":"Add the Python→Arrow type mapping dict after imports:\n```python\nPYTHON_TO_ARROW: dict[type, pa.DataType] = {\n int: pa.int64(),\n str: pa.utf8(),\n float: pa.float64(),\n bool: pa.bool_(),\n bytes: pa.binary(),\n}\n```\nExport in __all__.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T15:44:37.900421-05:00","created_by":"rusty","updated_at":"2026-01-05T15:48:42.422086-05:00","closed_at":"2026-01-05T15:48:42.422086-05:00","close_reason":"PR #18 created"} |
32 | 32 | {"id":"vgi-python-d73","title":"Create docs/argument-serialization.md","description":"## Overview\n\nCreate LLM-friendly documentation explaining the argument specification serialization format. This document should enable future implementors (human or AI) to understand how function argument signatures are serialized to Arrow schemas.\n\n## File Location\n\n`docs/argument-serialization.md`\n\n## Document Structure\n\n### Title and Purpose\n\nExplain that this document describes how VGI function argument specifications are serialized to Apache Arrow schemas for IPC transmission and DuckDB function registration.\n\n### Quick Reference\n\nA concise summary table showing:\n- Metadata keys and their meanings\n- Special type representations\n\n### Schema Format\n\nExplain the single-schema design:\n1. All arguments are fields in one Arrow schema\n2. Positional arguments come first, in order (field index = position index)\n3. Named arguments follow, marked with metadata\n4. Field name = Python attribute name (or argument key for named)\n5. Field type = exact Arrow type\n\n### Metadata Keys Reference\n\nComplete table of all metadata keys:\n\n| Key | Value | Description |\n|-----|-------|-------------|\n| `vgi_arg` | `named` | Field is a named argument, not positional. The field name is the argument key. |\n| `vgi_type` | `table` | Argument receives streaming table input (Arg[TableInput]). Arrow type is pa.null(). |\n| `vgi_type` | `any` | Argument accepts any Arrow type (Arg[AnyArrow]). Arrow type is pa.null(). |\n| `vgi_varargs` | `true` | Argument collects all remaining positional args. Arrow type is the element type. 
|\n\n### Special Type Handling\n\nExplain how special argument types are represented:\n\n#### TableInput\n- Arrow type: `pa.null()`\n- Metadata: `{b\"vgi_type\": b\"table\"}`\n- Meaning: This position receives streaming RecordBatches, not a scalar value\n\n#### AnyArrow\n- Arrow type: `pa.null()`\n- Metadata: `{b\"vgi_type\": b\"any\"}`\n- Meaning: Accepts any valid Arrow scalar type at runtime\n\n#### Varargs\n- Arrow type: The element type (e.g., `pa.int64()` for `Arg[int](..., varargs=True)`)\n- Metadata: `{b\"vgi_varargs\": b\"true\"}`\n- Meaning: Collects all remaining positional arguments from this position onwards\n\n### Examples\n\n#### Example 1: Simple Function\n\n```python\nclass MyFunction(TableInOutFunction):\n count = Arg[int](0) # Positional 0\n name = Arg[str](1) # Positional 1\n verbose = Arg[bool](\"verbose\") # Named\n\n# Serializes to:\nschema = pa.schema([\n pa.field(\"count\", pa.int64()),\n pa.field(\"name\", pa.utf8()),\n pa.field(\"verbose\", pa.bool_(), metadata={b\"vgi_arg\": b\"named\"}),\n])\n```\n\n#### Example 2: Function with Table Input\n\n```python\nclass TransformFunction(TableInOutFunction):\n multiplier = Arg[float](0)\n data = Arg[TableInput](1)\n\n# Serializes to:\nschema = pa.schema([\n pa.field(\"multiplier\", pa.float64()),\n pa.field(\"data\", pa.null(), metadata={b\"vgi_type\": b\"table\"}),\n])\n```\n\n#### Example 3: Function with Varargs\n\n```python\nclass SumFunction(TableInOutFunction):\n columns = Arg[str](0, varargs=True)\n\n# Serializes to:\nschema = pa.schema([\n pa.field(\"columns\", pa.utf8(), metadata={b\"vgi_varargs\": b\"true\"}),\n])\n```\n\n#### Example 4: Complex Function\n\n```python\nclass ComplexFunction(TableInOutFunction):\n count = Arg[int](0)\n data = Arg[TableInput](1)\n extra = Arg[float](2, varargs=True)\n format = Arg[str](\"format\")\n threshold = Arg[AnyArrow](\"threshold\")\n\n# Serializes to:\nschema = pa.schema([\n pa.field(\"count\", pa.int64()),\n pa.field(\"data\", pa.null(), 
metadata={b\"vgi_type\": b\"table\"}),\n pa.field(\"extra\", pa.float64(), metadata={b\"vgi_varargs\": b\"true\"}),\n pa.field(\"format\", pa.utf8(), metadata={b\"vgi_arg\": b\"named\"}),\n pa.field(\"threshold\", pa.null(), metadata={b\"vgi_arg\": b\"named\", b\"vgi_type\": b\"any\"}),\n])\n```\n\n### Serialization Code\n\nShow how to serialize and deserialize:\n\n```python\n# Serialize to bytes\nschema_bytes = schema.serialize().to_pybytes()\n\n# Deserialize from bytes\nschema = pa.ipc.read_schema(pa.py_buffer(schema_bytes))\n```\n\n### Parsing Algorithm\n\nExplain how to parse a schema back to argument specs:\n\n1. Initialize position_index = 0\n2. For each field in schema:\n a. Check if field has `vgi_arg=named` metadata\n b. If named: position = field.name (string)\n c. If positional: position = position_index, then increment position_index\n d. Check for `vgi_type` metadata (table or any)\n e. Check for `vgi_varargs` metadata\n f. Create ArgumentSpec with extracted info\n\n### Not Included\n\nExplicitly state what is NOT serialized:\n- Default values\n- Validation constraints (ge, le, choices, pattern)\n- Documentation strings\n\nThese are Python-side concerns handled by the Arg descriptor at runtime.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T11:19:17.488877-05:00","created_by":"rusty","updated_at":"2026-01-05T11:33:29.168007-05:00","closed_at":"2026-01-05T11:33:29.168007-05:00","close_reason":"Created comprehensive LLM-friendly documentation","dependencies":[{"issue_id":"vgi-python-d73","depends_on_id":"vgi-python-8ra","type":"blocks","created_at":"2026-01-05T11:19:30.820384-05:00","created_by":"rusty"}]} |
33 | 33 | {"id":"vgi-python-dv0","title":"Add arrow_type parameter to Arg class","description":"In vgi/arguments.py:\n1. Add 'arrow_type' to __slots__\n2. Add parameter: arrow_type: pa.DataType | None = None\n3. Store: self.arrow_type = arrow_type\n4. Update __repr__ to include arrow_type if set","status":"open","priority":2,"issue_type":"task","created_at":"2026-01-05T15:44:38.020395-05:00","created_by":"rusty","updated_at":"2026-01-05T15:44:38.020395-05:00","dependencies":[{"issue_id":"vgi-python-dv0","depends_on_id":"vgi-python-cvj","type":"blocks","created_at":"2026-01-05T15:45:13.696822-05:00","created_by":"rusty"}]} |
34 | 34 | {"id":"vgi-python-dvo","title":"Export AnyValue in vgi/__init__.py","description":"Import and add AnyValue to __all__ exports","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-05T10:41:41.65732-05:00","created_by":"rusty","updated_at":"2026-01-05T11:07:09.187969-05:00","closed_at":"2026-01-05T11:07:09.187969-05:00","close_reason":"Exported AnyArrow in vgi/__init__.py","dependencies":[{"issue_id":"vgi-python-dvo","depends_on_id":"vgi-python-ckg","type":"blocks","created_at":"2026-01-05T10:41:48.715634-05:00","created_by":"rusty"}]} |
|