[analytics-engine] Skip unsupported field types when building Calcite schema #21698
… schema

OpenSearchSchemaBuilder.buildSchema previously aborted with IllegalArgumentException whenever an index mapping contained a field type the analytics-engine cannot scan (e.g. geo_point, geo_shape). This blocked any query against such an index even when the query did not reference the unsupported column.

Drop the unsupported column from the Calcite row type instead. Queries that reference the dropped column now fail at validation with a clean "column not found" error; queries that do not reference it plan and execute normally.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
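The behavior change described above can be illustrated with a minimal, Calcite-free sketch. It is not the code from this PR: the class name `SchemaSketch`, the methods `buildColumns` and `validate`, and the contents of the `SUPPORTED` set are all hypothetical stand-ins, and a plain map stands in for the Calcite row type. It only shows the skip-instead-of-throw idea and the resulting "column not found" failure for queries that reference a dropped column.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SchemaSketch {
    // Hypothetical list of scannable types; the real set is defined by the analytics-engine.
    static final Set<String> SUPPORTED = Set.of("keyword", "long", "double", "boolean", "date");

    /** Builds the column list, skipping fields whose type is not in SUPPORTED instead of throwing. */
    static Map<String, String> buildColumns(Map<String, String> mapping) {
        Map<String, String> columns = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : mapping.entrySet()) {
            if (SUPPORTED.contains(field.getValue())) {
                columns.put(field.getKey(), field.getValue());
            }
            // Unsupported types (e.g. geo_point) are silently left out of the row type.
        }
        return columns;
    }

    /** Stand-in for validation: referencing a column that is not in the row type fails. */
    static void validate(Map<String, String> columns, List<String> referenced) {
        for (String name : referenced) {
            if (!columns.containsKey(name)) {
                throw new IllegalArgumentException("column not found: " + name);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("name", "keyword");
        mapping.put("location", "geo_point"); // unsupported in this sketch
        mapping.put("age", "long");

        Map<String, String> columns = buildColumns(mapping);
        System.out.println(columns.keySet()); // prints [name, age]

        validate(columns, List.of("name", "age")); // passes: no dropped column referenced
        try {
            validate(columns, List.of("location")); // references the dropped column
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints column not found: location
        }
    }
}
```

In this sketch an index whose fields are all unsupported simply yields an empty column map rather than an exception.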
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #21698     +/-   ##
============================================
+ Coverage     73.36%    73.44%   +0.07%
- Complexity    74672     74753      +81
============================================
  Files          5991      5991
  Lines        339374    339374
  Branches      48921     48921
============================================
+ Hits         248980    249247     +267
+ Misses        70522     70295     -227
+ Partials      19872     19832      -40
The sandbox-check failure is in CoordinatorResilienceIT.testShardHostNodeKillAndRestart and |
8905bb3 to 2cfa721
Problem
Queries against any index that contains a field type the analytics-engine cannot scan (e.g. geo_point, geo_shape, alias) fail at planning time with IllegalArgumentException: Unsupported OpenSearch field type: <type>, even when the query never references the unsupported column.

Solution
Drop the unsupported column from the Calcite row type during schema build instead of throwing.
- Queries that do not reference the dropped column plan and execute normally.
- Queries that reference the dropped column fail at validation with a clean "column not found" validator error.

Verification
Unit tests (16/16 green, 9 new): cover Calcite end-to-end validator behavior on dropped columns, mixed mappings, nested objects with unsupported leaves, all-unsupported indices, and the full type catalogue.
Same-cluster A/B IT regression sweep on a force-routed analytics-engine cluster, identical 4872-test run with fix OFF vs fix ON:
- Unsupported field type: geo_point
- Unsupported OpenSearch field type throws (raw, JVM-level)

The 214 unblocked tests span CalciteDataTypeIT, CalciteDateTimeComparisonIT, CalciteGeoPointFormatsIT, CalciteMultiValueStatsIT, CalcitePPLCastFunctionIT. The pass-delta is small (+4) because most unblocked tests immediately re-fail on the next engine gap (TIMESTAMP / IS_NULL / IS_NOT_NULL scalar functions, etc.); this fix is the necessary prerequisite for those tests to ever reach those gaps in the first place. The load-bearing metric is the 443 → 0 / 214 → 0 elimination, not the pass-rate %.

Per-class pass wins: CalciteGeoPointFormatsIT 0 → 2, CalcitePPLStringBuiltinFunctionIT 44 → 46.

Test plan
- :sandbox:libs:analytics-api:check
- :sandbox:plugins:analytics-engine:check
- spotlessCheck on both modules
- Frameworks.getPlanner