[PECOBLR-1377] SRID in geospatial column type name#1157
Conversation
Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
    return parser.parseJsonStringToDbStruct(object.toString(), arrowMetadata);
  }

  private static AbstractDatabricksGeospatial convertToGeospatial(

Removing duplicate code; reusing the same logic present in GeospatialConverter.
  long rowSize = executionResult.getRowCount();
  List<String> arrowMetadata = null;
  if (executionResult instanceof ArrowStreamResult) {
    arrowMetadata = ((ArrowStreamResult) executionResult).getArrowMetadata();

((ArrowStreamResult) executionResult).getArrowMetadata() will always be null for cloud fetch mode at this point in the code flow, so we fetch the arrow metadata from TGetResultSetMetadataResp instead.
Will getResultSetMetadata always be present?

Ideally we should have the TGetResultSetMetadataResp object in TFetchResultsResp. In any case, we handle this in the getArrowMetadata method by checking for null values.

Is arrow metadata consistently populated in TGetResultSetMetadataResp?

Not when the result is inline, but that is a limitation of complex data types as well.

Not when the result is inline.
Arrow metadata is always populated from thrift when the format is CSV/JSON_ARRAY or ARROW.
The only time arrow metadata is not present is when the format is HIVE.
Is the inline result in HIVE format?
   * @return true if the type name starts with ARRAY, MAP, STRUCT, GEOMETRY, or GEOGRAPHY, false
   *     otherwise
   */
  private static boolean isComplexType(String typeName) {

Moving this method to DatabricksTypeUtil so it can be used in other classes.
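A minimal sketch of what this helper does, based on the javadoc quoted above (the class and constant names here are assumptions, not the driver's actual code):

```java
public class TypeNameUtil {
    private static final String[] COMPLEX_PREFIXES = {
        "ARRAY", "MAP", "STRUCT", "GEOMETRY", "GEOGRAPHY"
    };

    // Returns true if the type name starts with ARRAY, MAP, STRUCT,
    // GEOMETRY, or GEOGRAPHY (e.g. "ARRAY<INT>", "GEOMETRY(4326)").
    public static boolean isComplexType(String typeName) {
        if (typeName == null) {
            return false;
        }
        String upper = typeName.toUpperCase();
        for (String prefix : COMPLEX_PREFIXES) {
            if (upper.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isComplexType("GEOMETRY(4326)")); // true
        System.out.println(isComplexType("DECIMAL(7,2)"));   // false
    }
}
```

Since the check is a pure prefix test on the type text, centralizing it in DatabricksTypeUtil avoids each result-handling path re-implementing the same list of prefixes.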
  return complexDatatypeSupport ? obj : obj.toString();
}

private Object handleComplexDataTypesForSEAInline(Object obj, String columnName)

Updating this method to always return the complex datatype; if complex mode is disabled, the caller will convert it to a string.
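The division of responsibility described above can be sketched as follows; the method names and the placeholder "complex value" here are illustrative assumptions, not the driver's actual implementation:

```java
public class ComplexResultSketch {
    // Stand-in for the parsed complex value (e.g. a struct or geometry object).
    static Object parseComplex(String raw) {
        return new java.util.AbstractMap.SimpleEntry<>("wkt", raw);
    }

    // Inner layer: always builds and returns the complex object,
    // with no knowledge of the support flag.
    static Object handleComplexInline(String raw) {
        return parseComplex(raw);
    }

    // Caller applies the flag: keep the object, or downgrade to String.
    static Object materialize(String raw, boolean complexDatatypeSupport) {
        Object obj = handleComplexInline(raw);
        return complexDatatypeSupport ? obj : obj.toString();
    }
}
```

Keeping the flag check at the caller means the inline handler has a single behavior, and the string fallback lives in exactly one place.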
  typeText = "STRING";
}

// store base type eg. DECIMAL instead of DECIMAL(7,2) except for geospatial datatypes

For geospatial, we want the SRID info (in parentheses) to be present, e.g. GEOMETRY(4326).

Will there be a case where the SRID info might be missing?

I don't think so; even for SRID 0, we get GEOMETRY(0).
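The rule discussed above can be sketched as a small helper (a hypothetical name, not the driver's actual method): strip the parenthesised precision/scale for ordinary types, but keep the parentheses for geospatial types because they carry the SRID.

```java
public class BaseTypeSketch {
    // DECIMAL(7,2) -> DECIMAL, but GEOMETRY(4326) stays intact
    // because the parentheses hold the SRID, not precision/scale.
    static String baseTypeText(String typeText) {
        String upper = typeText.toUpperCase();
        if (upper.startsWith("GEOMETRY") || upper.startsWith("GEOGRAPHY")) {
            return typeText; // keep SRID info, e.g. GEOMETRY(4326)
        }
        int paren = typeText.indexOf('(');
        return paren < 0 ? typeText : typeText.substring(0, paren);
    }

    public static void main(String[] args) {
        System.out.println(baseTypeText("DECIMAL(7,2)"));   // DECIMAL
        System.out.println(baseTypeText("GEOMETRY(4326)")); // GEOMETRY(4326)
    }
}
```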
      columnIndex++) {
  TColumnDesc columnDesc = resultManifest.getSchema().getColumns().get(columnIndex);

  ColumnInfo columnInfo = getColumnInfoFromTColumnDesc(columnDesc);

TColumnDesc doesn't have complete information about the column metadata, so we enhance it using the arrowMetadata extracted from TGetResultSetMetadataResp. This is required for methods that use ColumnInfo later in the flow (thrift + cloudfetch + complex).
      .columnTypeClassName("java.lang.String")
      .columnType(Types.OTHER)
      .columnTypeText(VARIANT);
} else if (isGeometryColumn(arrowMetadata, columnIndex)

This is handled with updates to the column info; no separate handling is required.
Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
if (arrowMetadata != null && isComplexType(arrowMetadata)) {
  typeText = arrowMetadata;
  if (arrowMetadata.startsWith(GEOMETRY)) {

Do we want to check whether the geospatial flag is enabled?

This is an inner layer, so a flag check is not required here. We check for the geospatial flags in DatabricksResultSetMetaData, which is the interface through which users get metadata info.
        .map(e -> e.get(ARROW_METADATA_KEY))
        .collect(Collectors.toList());
  } catch (IOException e) {
    throw new DatabricksSQLException(

Added error log.
Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
vikrantpuppala
left a comment
Thanks! Can we also run the JDBC comparator in different modes (SEA/Arrow) across different column types to ensure things are consistent?
### Fixed
- Fixed complex types not being returned as objects in SEA Inline mode when `EnableComplexDatatypeSupport=true`.
- Fixed errors with complex data types in Thrift CloudFetch mode.

Can we be more descriptive about what these errors are?

Updated the changelog with more details.
  long rowSize = executionResult.getRowCount();
  List<String> arrowMetadata = null;
  if (executionResult instanceof ArrowStreamResult) {
    arrowMetadata = ((ArrowStreamResult) executionResult).getArrowMetadata();

Is arrow metadata consistently populated in TGetResultSetMetadataResp?
  return parser.parseJsonStringToDbStruct(obj.toString(), columnName);
} else if (columnName.startsWith(GEOMETRY)) {
  return obj;
  return new GeospatialConverter().toDatabricksGeometry(obj);

Why is toDatabricksGeometry not a static method? Why do we need to init an object to use it?

Updated the code to use a cached converter, similar to the other datatypes' converters.
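The cached-converter pattern mentioned above, sketched with a stand-in converter (the GeospatialConverter here is a placeholder, not the driver's real class): hold one reusable instance instead of allocating a new converter for every value.

```java
public class ConverterCache {
    // Placeholder converter; the real one would parse WKB/WKT into a
    // geometry object rather than prefixing a string.
    static final class GeospatialConverter {
        Object toDatabricksGeometry(Object obj) {
            return "GEOMETRY:" + obj;
        }
    }

    // Single shared instance, created once and reused for every row,
    // avoiding a per-value allocation in the hot conversion path.
    private static final GeospatialConverter GEO_CONVERTER = new GeospatialConverter();

    static Object convertGeometry(Object obj) {
        return GEO_CONVERTER.toDatabricksGeometry(obj);
    }
}
```

This only works safely if the converter is stateless or thread-safe, which is presumably why the other datatype converters in the driver follow the same caching approach.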
  long rowSize = executionResult.getRowCount();
  List<String> arrowMetadata = null;
  if (executionResult instanceof ArrowStreamResult) {
    arrowMetadata = ((ArrowStreamResult) executionResult).getArrowMetadata();

Not when the result is inline.
Arrow metadata is always populated from thrift when the format is CSV/JSON_ARRAY or ARROW.
The only time arrow metadata is not present is when the format is HIVE.
Is the inline result in HIVE format?
  return parser.parseJsonStringToDbStruct(obj.toString(), columnName);
} else if (columnName.startsWith(GEOMETRY)) {
  return obj;
  return new GeospatialConverter().toDatabricksGeometry(obj);

Why didn't we handle this before?
What is the impact of not having this in 3.0.6?

Currently complex data types are not supported in the SEA inline flow, so the geospatial type is returned as a string (up to 3.0.7). This is consistent with our documentation. With this change we are supporting complex types for SEA inline mode as well. Will update the documentation accordingly.
  typeText = "STRING";
}

// store base type eg. DECIMAL instead of DECIMAL(7,2) except for geospatial datatypes

 * and EnableComplexDatatypeSupport are enabled AND not in Thrift+Inline mode. Otherwise, returns as
 *     STRING.
 */
public class GeospatialTests {
Let's test GEOMETRY(ANY) as well. Example query:

%sql
SELECT * FROM VALUES
  (ST_GeomFromText('POINT(17 7)', 4326)),
  (ST_GeomFromText('POINT(5 5)', 0)) AS t(geom)

Expected type name: GEOMETRY(ANY); expected SRIDs per row: 4326 and 0. Let's assert on everything, i.e. that both row-level values are correct and the column metadata is correct.

Added the tests for GEOMETRY(ANY).
Signed-off-by: Sreekanth Vadigi <sreekanth.vadigi@databricks.com>
private void setColumnInfo(TGetResultSetMetadataResp resultManifest)
    throws DatabricksSQLException {
  columnInfos = new ArrayList<>();
  List<String> arrowMetadataList = DatabricksThriftUtil.getArrowMetadata(resultManifest);

What happens when this is null?
Some very old Spark servers do not return arrowMetadata; we do not want them to break.

In the method DatabricksThriftUtil.getArrowMetadata, we check whether arrowSchema is present in TGetResultSetMetadataResp and return null if required. At every code point where we access the arrowMetadataList, we check that it is not null and check the size of the list before accessing an element, so existing flows don't break. We have multi-DBR tests to ensure backward compatibility.

In the context of geospatial datatypes, if arrowMetadata is null (which won't happen in the ideal flow, since the driver always gets arrowMetadata on DBR 17.1+, from which geospatial types are supported), the column type is shown as String (since the TColumnDesc contains STRING and the driver doesn't have other information to identify it as GEOMETRY/GEOGRAPHY).
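The null-safety behaviour described above can be sketched as follows; the method and parameter names are illustrative assumptions, but the guard structure (null check plus size check before element access, with a fallback to the TColumnDesc type) mirrors what the comment describes:

```java
public class MetadataFallback {
    // Prefer the Arrow-derived type text (e.g. "GEOMETRY(4326)") when the
    // server supplied it; otherwise fall back to the TColumnDesc type,
    // which is "STRING" for geospatial columns on old servers.
    static String resolveTypeText(java.util.List<String> arrowMetadata,
                                  int columnIndex,
                                  String tColumnDescType) {
        if (arrowMetadata != null
                && columnIndex < arrowMetadata.size()
                && arrowMetadata.get(columnIndex) != null) {
            return arrowMetadata.get(columnIndex);
        }
        return tColumnDescType;
    }
}
```

Because every access is guarded, a server that predates Arrow schema metadata simply degrades to the old STRING behaviour instead of throwing.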
  columnInfos.add(
      com.databricks.jdbc.common.util.DatabricksThriftUtil.getColumnInfoFromTColumnDesc(
          tColumnDesc));
  List<String> arrowMetadata = DatabricksThriftUtil.getArrowMetadata(resultManifest);

What happens when this is null?
Some very old Spark servers do not return arrowMetadata; we do not want them to break.

Do we have tests for when the thrift protocol version is low?

In the method DatabricksThriftUtil.getArrowMetadata, we check whether arrowSchema is present in TGetResultSetMetadataResp and return null if required. At every code point where we access the arrowMetadataList, we check that it is not null and check the size of the list before accessing an element, so existing flows don't break. We have multi-DBR tests to ensure backward compatibility.
if (arrowMetadata != null && isComplexType(arrowMetadata)) {
  typeText = arrowMetadata;
  if (arrowMetadata.startsWith(GEOMETRY)) {
    columnInfoTypeName = ColumnInfoTypeName.GEOMETRY;

Shouldn't this return the column info with the SRID in parentheses, e.g. GEOMETRY(4326)?

columnInfoTypeName is an enum, so we can't have SRID info here. But this is just an intermediate processing step, not the ultimate response that the user receives. The user response includes the SRID info. The E2E behaviour can be found in this test (from line no: 211).
Description
This PR enhances geospatial datatype handling to include SRID (Spatial Reference System Identifier) information in column type names and fixes multiple issues related to complex datatype handling across different result formats.
Key Changes
Geospatial Type Name Enhancement
- GEOMETRY(4326) instead of GEOMETRY

SEA Inline Mode Complex Type Fix
- EnableComplexDatatypeSupport=true

Thrift CloudFetch Metadata Enhancement
- INT from ARRAY<INT>) in Thrift CloudFetch mode
- getColumnInfoFromTColumnDesc() to use Arrow schema metadata alongside TColumnDesc
- Arrow metadata contains the full type (e.g., ARRAY<INT>) while TColumnDesc only contains base type (e.g., ARRAY)

Arrow Metadata Extraction
- DatabricksThriftUtil.getArrowMetadata() to deserialize Arrow schema from TGetResultSetMetadataResp
- DatabricksResultSet constructor for Thrift CloudFetch mode

Testing
- Unit Tests
- Integration Tests
- GeospatialTests.java - Comprehensive E2E integration test

Additional Notes to the Reviewer
Other required details are mentioned in comments in the diff.