Summary
In opengin/core-api/db/repository/postgres/data_handler.go, StoreTabularData() currently uses a two-pass approach for new tables (schema inference + row validation), and a separate validation path for existing tables.
This issue proposes consolidating this into a single scanner pass that can both infer (when needed) and validate rows, while keeping behavior consistent.
Current Flow (with exact code locations)
File: opengin/core-api/db/repository/postgres/data_handler.go
- Main entrypoint: StoreTabularData(ctx, entityID, attrName, value)
- Current helper calls involved:
  - schema.GenerateSchema(...) (new-table path)
  - validateRowsAgainstSchema(...) (both paths)
  - hasNullOnlyColumns(...) (new-table path)
  - isDateTime(...) and isStructpbNull(...) (validation/type helpers)
  - schemaToColumns(...) (DDL generation)
Current behavior inside StoreTabularData(...)
- If table exists:
  - Load persisted schema from attribute_schemas
  - Call validateRowsAgainstSchema(&tabularStruct, &existingSchema)
- If table does not exist:
  - Call schema.GenerateSchema(value.Value)
  - Call hasNullOnlyColumns(schemaInfo)
  - Call validateRowsAgainstSchema(&tabularStruct, schemaInfo)
  - Create table via schemaToColumns(schemaInfo)
This means new-table writes may traverse row data multiple times (inference + validation).
Problem
- Repeated scans of the same tabular payload in StoreTabularData(...).
- Type inference rules and validation rules are distributed across helpers, increasing chance of drift.
- Date/datetime/null handling policy is not guaranteed to be consistent between inference and validation, although it should be.
Proposed Refactor (specific functions/files)
1) Add combined scanner in:
opengin/core-api/db/repository/postgres/data_handler.go
Proposed function:
scanTabularRows(data *structpb.Struct, existingSchema *schema.SchemaInfo) (*schema.SchemaInfo, error)
Responsibilities in one pass:
- validate row shape,
- classify cell type,
- existing-schema mode: validate each cell against existingSchema,
- inference mode: infer each column's type from its first non-null value, then validate later rows for compatibility with that type.
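The responsibilities above can be sketched as a single loop. This is a minimal illustration, not the proposed implementation: it uses plain Go values (nil, bool, float64, string, as returned by structpb.Value.AsInterface()) instead of *structpb.Struct, and ColType, classify and scanRows are hypothetical names standing in for schema.SchemaInfo and scanTabularRows. Exact-match compatibility is an assumption made to keep the sketch short.

```go
package main

import "fmt"

// ColType is a hypothetical stand-in for the per-column type held in schema.SchemaInfo.
type ColType string

const (
	TypeNull   ColType = "null" // no non-null value seen yet (inference mode only)
	TypeBool   ColType = "bool"
	TypeNumber ColType = "number"
	TypeString ColType = "string"
)

// classify maps one decoded cell to a ColType.
func classify(v any) (ColType, error) {
	switch v.(type) {
	case nil:
		return TypeNull, nil
	case bool:
		return TypeBool, nil
	case float64:
		return TypeNumber, nil
	case string:
		return TypeString, nil
	default:
		return "", fmt.Errorf("unsupported cell type %T", v)
	}
}

// scanRows walks the rows exactly once. With existing == nil it infers column
// types; otherwise it validates every cell against the existing types.
func scanRows(columns []string, rows [][]any, existing map[string]ColType) (map[string]ColType, error) {
	inferring := existing == nil
	types := make(map[string]ColType, len(columns))
	for _, c := range columns {
		if inferring {
			types[c] = TypeNull
			continue
		}
		t, ok := existing[c]
		if !ok {
			return nil, fmt.Errorf("column %q is not in the existing schema", c)
		}
		types[c] = t
	}

	for r, row := range rows {
		// Row shape check.
		if len(row) != len(columns) {
			return nil, fmt.Errorf("row %d: got %d cells, want %d", r, len(row), len(columns))
		}
		for i, cell := range row {
			col := columns[i]
			ct, err := classify(cell)
			if err != nil {
				return nil, fmt.Errorf("row %d, column %q: %w", r, col, err)
			}
			if ct == TypeNull {
				continue // nulls never fix or contradict a column type
			}
			if inferring && types[col] == TypeNull {
				types[col] = ct // first non-null value decides the type
				continue
			}
			if ct != types[col] {
				return nil, fmt.Errorf("row %d, column %q: got %s, want %s", r, col, ct, types[col])
			}
		}
	}
	return types, nil
}

func main() {
	cols := []string{"id", "name"}
	rows := [][]any{{1.0, nil}, {2.0, "b"}}

	// Inference mode: no existing schema.
	inferred, err := scanRows(cols, rows, nil)
	fmt.Println(inferred, err)

	// Validation mode: reuse the inferred types as the "existing" schema.
	_, err = scanRows(cols, [][]any{{"x", "y"}}, inferred)
	fmt.Println(err)
}
```

In inference mode a column that only ever sees nulls keeps TypeNull, which is what lets the caller detect null-only columns without a second pass.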
2) Centralize shared type logic in same file
Use/introduce helpers in data_handler.go such as:
- inferCellType(...)
- compatibility helper (e.g. areInferredTypesCompatible(...))
- value-vs-schema checker (e.g. valueMatchesType(...))
- one canonical date/datetime policy (currently tied to isDateTime(...) semantics)
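One way to keep inference and validation from drifting is to have both go through the same classifier. The sketch below shows that shape only. The signatures, the CellType names, and every policy choice in it are assumptions rather than the current behavior of isDateTime(...): RFC 3339 for datetimes, YYYY-MM-DD for dates, int widening to float, nulls always accepted, and string columns accepting date-like strings.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// CellType is a hypothetical type tag; the real schema package may differ.
type CellType string

const (
	Null        CellType = "null"
	Bool        CellType = "bool"
	Int         CellType = "int"
	Float       CellType = "float"
	Date        CellType = "date"
	DateTime    CellType = "datetime"
	String      CellType = "string"
	Unsupported CellType = "unsupported"
)

// inferCellType is the single classifier shared by inference and validation.
func inferCellType(v any) CellType {
	switch x := v.(type) {
	case nil:
		return Null
	case bool:
		return Bool
	case float64:
		if !math.IsInf(x, 0) && x == math.Trunc(x) {
			return Int
		}
		return Float
	case string:
		// Assumed date policy: RFC 3339 datetimes, then plain YYYY-MM-DD dates.
		if _, err := time.Parse(time.RFC3339, x); err == nil {
			return DateTime
		}
		if _, err := time.Parse("2006-01-02", x); err == nil {
			return Date
		}
		return String
	default:
		return Unsupported
	}
}

// areInferredTypesCompatible merges two observed types for one column.
// Assumed rules: null adopts the other type, int widens to float.
func areInferredTypesCompatible(a, b CellType) (CellType, bool) {
	switch {
	case a == Unsupported || b == Unsupported:
		return Unsupported, false
	case a == b:
		return a, true
	case a == Null:
		return b, true
	case b == Null:
		return a, true
	case (a == Int && b == Float) || (a == Float && b == Int):
		return Float, true
	default:
		return Unsupported, false
	}
}

// valueMatchesType checks one cell against a persisted column type using the
// same classifier and merge rules as inference.
func valueMatchesType(v any, want CellType) bool {
	got := inferCellType(v)
	if got == Null {
		return true // assumption: every column is nullable
	}
	if want == String {
		// Assumption: a string column accepts any string, date-like or not.
		return got == String || got == Date || got == DateTime
	}
	merged, ok := areInferredTypesCompatible(got, want)
	return ok && merged == want
}

func main() {
	for _, v := range []any{nil, true, 3.0, 3.5, "2024-01-15", "2024-01-15T10:00:00Z", "hello"} {
		fmt.Printf("%v -> %s\n", v, inferCellType(v))
	}
	fmt.Println(valueMatchesType(3.0, Float), valueMatchesType(3.5, Int))
}
```

Because valueMatchesType is defined in terms of inferCellType and areInferredTypesCompatible, a change to the date or widening policy is made in one place and applies to both modes.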
3) Rewire StoreTabularData(...)
In opengin/core-api/db/repository/postgres/data_handler.go:
- existing table path: fetch schema -> scanTabularRows(..., &existingSchema) -> insert
- new table path: scanTabularRows(..., nil) -> if inferred schema has any null-only columns, fail -> schemaToColumns(...) -> create + persist schema -> insert
4) Remove redundant helpers if no longer needed
From data_handler.go, remove/simplify:
- validateRowsAgainstSchema(...)
- hasNullOnlyColumns(...)
- any obsolete compatibility helper only used by old flow
Explicit Requirement
For new table creation in StoreTabularData(...): if any column is null in every row (so no concrete type can be inferred), return a clear error that names the affected column(s), and do not create the table or persist the schema.
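A possible shape for this guard, run on the scanner's output before any DDL is issued, is sketched below. The map[string]string representation of the inferred schema, the "null" type tag, and the function names are all assumptions for illustration; the real check would read schema.SchemaInfo.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// typeNull is a hypothetical tag for a column whose cells were all null.
const typeNull = "null"

// nullOnlyColumns returns, in sorted order, the columns that never saw a
// non-null value during the scan.
func nullOnlyColumns(inferred map[string]string) []string {
	var cols []string
	for name, t := range inferred {
		if t == typeNull {
			cols = append(cols, name)
		}
	}
	sort.Strings(cols) // map iteration order is random; sort for a stable message
	return cols
}

// requireConcreteTypes fails before any table or schema is created when at
// least one column has no inferable type.
func requireConcreteTypes(inferred map[string]string) error {
	cols := nullOnlyColumns(inferred)
	if len(cols) == 0 {
		return nil
	}
	return fmt.Errorf("cannot infer a type for all-null column(s): %s; refusing to create table",
		strings.Join(cols, ", "))
}

func main() {
	inferred := map[string]string{"id": "int", "notes": typeNull, "tags": typeNull}
	if err := requireConcreteTypes(inferred); err != nil {
		fmt.Println("error:", err)
		return // caller returns here, before schemaToColumns / CREATE TABLE
	}
	fmt.Println("schema is concrete, safe to create table")
}
```

Naming the columns in the error gives callers enough context to fix the payload without inspecting logs.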
Tests to Update (specific file/functions)
File: opengin/core-api/db/repository/postgres/data_handler_test.go
Add/update tests for:
- existing schema valid rows accepted,
- existing schema invalid rows rejected with row/column context,
- new schema inferred correctly,
- new schema fails when any column is all-null,
- mixed-type incompatibility behavior under unified rules.
File: opengin/core-api/db/repository/postgres/postgres_client_test.go
Adjust integration-style tabular write tests to align with new scanner-driven flow in StoreTabularData(...).
Validation
- go test ./db/repository/postgres -run '^$' (compile check; the quoted pattern matches no test names, so packages are built but no tests run)
- go test ./engine -run '^$' (downstream compile check)
- Run targeted postgres tabular tests (scanner/inference/validation cases)