[Feature] PPL Command: MvExpand #5144
[Feature] PPL Command: MvExpand #5144srikanthpadakanti wants to merge 1 commit intoopensearch-project:mainfrom
Conversation
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis PR introduces native PPL support for the Changes
Sequence DiagramsequenceDiagram
participant User
participant Parser
participant ASTBuilder
participant Analyzer
participant Calcite as CalciteRelNodeVisitor
participant Expand as Expand Machinery
participant Result
User->>Parser: mvexpand skills limit=5
Parser->>ASTBuilder: mvexpandCommand(field, limit)
ASTBuilder->>ASTBuilder: parse field expression & limit
ASTBuilder->>Result: create MvExpand AST node
Result->>Analyzer: analyze MvExpand node
Analyzer->>Analyzer: visitMvExpand (Calcite-only)
Analyzer->>Calcite: delegate to Calcite
Calcite->>Calcite: visitMvExpand
Calcite->>Calcite: validate field is array type
Calcite->>Calcite: extract per-doc limit (5)
Calcite->>Expand: buildExpandRelNode(field, limit)
Expand->>Expand: apply UnnestRel with limit
Expand->>Expand: construct projection with per-doc limit
Expand->>Result: return expanded rows
Result->>User: one row per array element (max 5 per doc)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 5 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ast-grep (0.40.5)core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java[] Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In `@core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java`:
- Around line 480-482: Add a JavaDoc block to the public method
visitMvExpand(MvExpand node, C context) in AbstractNodeVisitor explaining that
it visits an MvExpand node and delegates to visitChildren(node, context);
include `@param` node the MvExpand AST node being visited, `@param` context the
visitor context, and `@return` the result of type T; add an `@throws` clause only if
this method can propagate checked exceptions (otherwise omit it).
In `@core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java`:
- Around line 481-483: The public DSL helper method mvexpand in class AstDSL
lacks JavaDoc and unit tests; add a JavaDoc block above the
mvexpand(UnresolvedPlan input, Field field, Integer limit) method describing the
method purpose and include `@param` tags for input, field, and limit, an `@return`
describing the UnresolvedPlan result, and any `@throws` if the method can
propagate exceptions (or state none). Then add unit tests under
core/src/test/java that exercise AstDSL.mvexpand: verify it returns an MvExpand
attached to the provided input, confirm the field and limit are set correctly,
and include edge cases (null limit and invalid inputs) to satisfy coverage and
coding guidelines.
In `@docs/user/ppl/index.md`:
- Around line 84-85: The commands table has inconsistent version metadata: the
mvcombine row shows "Version Introduced 3.5" but "stable (since 3.4)"; update
the mvcombine entry so both "Version Introduced" and "stable (since ...)" refer
to the same release (choose the correct release and apply it consistently) and
confirm mvexpand’s version numbers match its actual introduction/stability
release; edit the table rows for mvcombine and mvexpand (the entries referencing
mvcombine command and mvexpand command) so the two columns are consistent and
follow the project's version-format style.
In
`@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java`:
- Around line 2533-2539: The test method testMvexpandExplain loads an expected
plan into the variable expected but never asserts it; update the test to capture
the explain output from explainQueryYaml (or the method it returns) and assert
equality with expected (e.g., use Assert.assertEquals or equivalent),
referencing the existing methods loadExpectedPlan and explainQueryYaml to locate
where to insert the comparison so the test actually validates the explain plan.
In
`@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java`:
- Around line 225-248: The test method testMvexpandLargeArrayElements assumes
numeric ordering but the query uses "sort skills.name" which sorts strings
lexicographically; update the expectations in testMvexpandLargeArrayElements
(the verifyDataRows call) to match lexicographic order (s1, s10, s2, s3, ...,
s9) or change the query to sort numerically (e.g., sort on a numeric
extraction/cast of skills.name) or alter test data to zero-pad names (s01..s10);
locate changes around the executeQuery/verifyDataRows calls in
testMvexpandLargeArrayElements and adjust either the expected rows, the query's
sort expression, or the sample data accordingly.
In `@integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java`:
- Around line 258-269: Extend testMvExpandCommand to assert MVEXPAND semantics
rather than only non-empty output: after executeQuery(String.format("search
source=%s | mvexpand skills", TEST_INDEX_MVEXPAND_EDGE_CASES)) use verifyQuery
plus explicit assertions on expected row count and a few representative
rows/columns to validate element expansion; add separate tests (new methods)
that call executeQuery and catch ResponseException like testMvExpandCommand but
assert correct behavior for NULL inputs (e.g., field missing or null),
empty-array inputs, and non-array inputs (expect either single-row passthrough
or a specific error), and add boundary tests for limit values (min=0, large
limits, and invalid negative limits) and a multi-document test to ensure
per-document expansion; reference test method testMvExpandCommand, helper
verifyQuery, index TEST_INDEX_MVEXPAND_EDGE_CASES and the MVEXPAND operator when
adding these assertions and new test methods.
In
`@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java`:
- Around line 120-167: Add a new unit test in CalcitePPLMvExpandTest (alongside
testMvExpandWithLimitParameter and others) named testMvExpandInvalidLimit that
calls getRelNode with invalid mvexpand limit values (e.g., "source=DEPT |
mvexpand EMPNOS limit=0" and "source=DEPT | mvexpand EMPNOS limit=-1") and
asserts that the planner/parser rejects them by expecting an exception (use the
same test framework helper as other tests, e.g., assertThrows or equivalent) or
checks for an error result; ensure the new test references mvexpand and
getRelNode so it verifies validation of zero and negative limits.
🧹 Nitpick comments (3)
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1)
1111-1114: Add a brief comment explaining the purpose of this second INTERNAL_ITEM overload.The first registration (lines 1103-1110) includes detailed documentation explaining the custom type checker. This second registration (lines 1111-1114) using
(IGNORE, CHARACTER)should have a comment as well to clarify its intended use case—specifically for struct/record field access by string name (as used byQualifiedNameResolverand elsewhere for accessing nested fields in MVExpand results).core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (1)
550-553: Minor inconsistency in exception message casing.The exception message uses
"MvExpand"while similar Calcite-only commands use lowercase (e.g.,"mvcombine"at line 548,"addtotals"at line 538).Suggested fix for consistency
`@Override` public LogicalPlan visitMvExpand(MvExpand node, AnalysisContext context) { - throw getOnlyForCalciteException("MvExpand"); + throw getOnlyForCalciteException("mvexpand"); }docs/user/ppl/cmd/mvexpand.md (1)
7-10: Add language identifier to syntax code block.The static analysis tool flagged the missing language identifier. Adding
pplmaintains consistency with other code blocks in this file.Suggested fix
## Syntax -``` +```ppl mvexpand <field> [limit=<int>]</details> </blockquote></details> </blockquote></details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
| public T visitMvExpand(MvExpand node, C context) { | ||
| return visitChildren(node, context); | ||
| } |
There was a problem hiding this comment.
Add JavaDoc for the new visitMvExpand public method.
This new public method in core/src/main/java should include @param/@return (and @throws if applicable).
📝 JavaDoc skeleton
+ /**
+ * Visit an MvExpand node.
+ *
+ * `@param` node MvExpand node
+ * `@param` context visitor context
+ * `@return` visitor result
+ */
public T visitMvExpand(MvExpand node, C context) {
return visitChildren(node, context);
}As per coding guidelines, "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| public T visitMvExpand(MvExpand node, C context) { | |
| return visitChildren(node, context); | |
| } | |
| /** | |
| * Visit an MvExpand node. | |
| * | |
| * `@param` node MvExpand node | |
| * `@param` context visitor context | |
| * `@return` visitor result | |
| */ | |
| public T visitMvExpand(MvExpand node, C context) { | |
| return visitChildren(node, context); | |
| } |
🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java` around
lines 480 - 482, Add a JavaDoc block to the public method visitMvExpand(MvExpand
node, C context) in AbstractNodeVisitor explaining that it visits an MvExpand
node and delegates to visitChildren(node, context); include `@param` node the
MvExpand AST node being visited, `@param` context the visitor context, and `@return`
the result of type T; add an `@throws` clause only if this method can propagate
checked exceptions (otherwise omit it).
| public static UnresolvedPlan mvexpand(UnresolvedPlan input, Field field, Integer limit) { | ||
| return new MvExpand(field, limit).attach(input); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, locate the AstDSL.java file and examine the mvexpand method
find . -name "AstDSL.java" -type f | head -5Repository: opensearch-project/sql
Length of output: 125
🏁 Script executed:
# Search for test files that might cover mvexpand or MvExpand
rg -l "mvexpand|MvExpand" --type=java | grep -i testRepository: opensearch-project/sql
Length of output: 770
🏁 Script executed:
# Look at the actual implementation around line 481-483
fd "AstDSL.java" | xargs -I {} sh -c 'echo "=== {} ===" && sed -n "475,490p" {}'Repository: opensearch-project/sql
Length of output: 659
🏁 Script executed:
# Look for AstDSLTest.java and check if it covers mvexpand
fd "AstDSLTest.java" -o "AstDslTest.java" | xargs -I {} sh -c 'echo "=== {} ===" && cat {}'Repository: opensearch-project/sql
Length of output: 215
🏁 Script executed:
# Check if mvexpand appears in test files related to AstDSL
rg -n "mvexpand" --type=java -B 3 -A 3 core/src/test/java 2>/dev/null || echo "No tests found in core/src/test/java"Repository: opensearch-project/sql
Length of output: 102
🏁 Script executed:
# Examine CalcitePPLMvExpandTest.java to see test coverage
fd "CalcitePPLMvExpandTest.java" | xargs -I {} sh -c 'wc -l {} && head -60 {}'Repository: opensearch-project/sql
Length of output: 2192
🏁 Script executed:
# Check CalciteMvExpandCommandIT.java as well
fd "CalciteMvExpandCommandIT.java" | xargs -I {} sh -c 'wc -l {} && head -60 {}'Repository: opensearch-project/sql
Length of output: 2141
🏁 Script executed:
# Search for direct unit tests of AstDSL.mvexpand method
rg "AstDSL\.mvexpand|new MvExpand" --type=java core/src/test/java -A 2 2>/dev/null || echo "Searching in core tests..."Repository: opensearch-project/sql
Length of output: 92
🏁 Script executed:
# Find all test files in core module
fd -e java "Test\.java$" core/src/test/java | head -20Repository: opensearch-project/sql
Length of output: 1553
🏁 Script executed:
# Check if there's any reference to AstDSL in test files across the project
rg "class.*AstDSL.*Test|AstDSL.*mvexpand" --type=javaRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Look at the actual file again with more context - check if JavaDoc exists above line 481
fd "AstDSL.java" | xargs -I {} sh -c 'sed -n "470,490p" {}'Repository: opensearch-project/sql
Length of output: 745
🏁 Script executed:
# Check for any existing JavaDoc patterns in AstDSL.java
rg "^\s*\*\s*@param|^\s*\*\s*@return|^\s*/\*\*" core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java -cRepository: opensearch-project/sql
Length of output: 68
🏁 Script executed:
# Check if AstDSL methods generally have JavaDoc
fd "AstDSL.java" | xargs -I {} sh -c 'sed -n "400,450p" {}'Repository: opensearch-project/sql
Length of output: 1445
🏁 Script executed:
# Look for patterns showing how other public static methods are documented in AstDSL
fd "AstDSL.java" | xargs -I {} sh -c 'sed -n "200,250p" {}'Repository: opensearch-project/sql
Length of output: 1425
🏁 Script executed:
# Check if there's an AstDSL test file or if DSL methods are tested elsewhere
fd ".*Test.*\.java" core/src/test/java | xargs rg -l "AstDSL" 2>/dev/nullRepository: opensearch-project/sql
Length of output: 1653
🏁 Script executed:
# Comprehensive check - look for any test that directly uses or tests AstDSL.mvexpand
rg "AstDSL\s*\.\s*mvexpand" --type=javaRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Count total lines in AstDSL.java and see the overall JavaDoc coverage
fd "AstDSL.java" | xargs -I {} sh -c 'wc -l {} && echo "---" && grep -c "public static" {}'Repository: opensearch-project/sql
Length of output: 134
🏁 Script executed:
# Check the actual structure - look for any method with JavaDoc
fd "AstDSL.java" | xargs -I {} sh -c 'grep -B 3 "public static.*mvexpand\|public static.*mvcombine" {} | head -20'Repository: opensearch-project/sql
Length of output: 424
🏁 Script executed:
# Check if similar DSL helper methods (like mvcombine right above) have JavaDoc
fd "AstDSL.java" | xargs -I {} sh -c 'sed -n "475,490p" {}'Repository: opensearch-project/sql
Length of output: 593
🏁 Script executed:
# Look for how MvExpand itself is used in the codebase
rg "new MvExpand|\.mvexpand\(" --type=java -B 2 -A 2 | head -60Repository: opensearch-project/sql
Length of output: 991
🏁 Script executed:
# Check if this method is actually part of the commit or existing
git log --oneline -S "mvexpand" -- core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java 2>/dev/null | head -5 || echo "Git log not available"Repository: opensearch-project/sql
Length of output: 125
Add JavaDoc and unit tests for the new mvexpand DSL helper.
This new public method in core/src/main/java is missing required JavaDoc documentation and unit tests. Per coding guidelines, both are mandatory for this location.
The method needs:
- JavaDoc with
@param(input, field, limit),@return, and@throwsif applicable - Unit tests in
core/src/test/java(in the same commit)
📝 Suggested JavaDoc
+ /**
+ * Build an MVEXPAND plan node and attach it to the input plan.
+ *
+ * `@param` input input plan
+ * `@param` field field to expand
+ * `@param` limit optional per-document limit
+ * `@return` MvExpand plan attached to the input
+ */
public static UnresolvedPlan mvexpand(UnresolvedPlan input, Field field, Integer limit) {
return new MvExpand(field, limit).attach(input);
}🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java` around lines 481 -
483, The public DSL helper method mvexpand in class AstDSL lacks JavaDoc and
unit tests; add a JavaDoc block above the mvexpand(UnresolvedPlan input, Field
field, Integer limit) method describing the method purpose and include `@param`
tags for input, field, and limit, an `@return` describing the UnresolvedPlan
result, and any `@throws` if the method can propagate exceptions (or state none).
Then add unit tests under core/src/test/java that exercise AstDSL.mvexpand:
verify it returns an MvExpand attached to the provided input, confirm the field
and limit are set correctly, and include edge cases (null limit and invalid
inputs) to satisfy coverage and coding guidelines.
| | [mvcombine command](cmd/mvcombine.md) | 3.5 | stable (since 3.4) | Combines values of a specified field across rows identical on all other fields. | | ||
| | [mvexpand command](cmd/mvexpand.md) | 3.6 | stable (since 3.6) | Expand a multi-valued field into separate documents (one per value). | |
There was a problem hiding this comment.
Reconcile mvcombine version info in the commands table.
The mvcombine row lists “Version Introduced 3.5” but “stable (since 3.4)”. Please align these to a single release (and confirm mvexpand’s version while you’re here).
As per coding guidelines, "Check for consistent formatting and style."
🤖 Prompt for AI Agents
In `@docs/user/ppl/index.md` around lines 84 - 85, The commands table has
inconsistent version metadata: the mvcombine row shows "Version Introduced 3.5"
but "stable (since 3.4)"; update the mvcombine entry so both "Version
Introduced" and "stable (since ...)" refer to the same release (choose the
correct release and apply it consistently) and confirm mvexpand’s version
numbers match its actual introduction/stability release; edit the table rows for
mvcombine and mvexpand (the entries referencing mvcombine command and mvexpand
command) so the two columns are consistent and follow the project's
version-format style.
| @Test | ||
| public void testMvexpandExplain() throws IOException { | ||
| // mvexpand explain plan validation | ||
| String expected = loadExpectedPlan("explain_mvexpand.yaml"); | ||
| explainQueryYaml( | ||
| "source=mvexpand_edge_cases | eval skills_arr = array(1, 2, 3) | mvexpand skills_arr"); | ||
| } |
There was a problem hiding this comment.
testMvexpandExplain doesn’t assert the plan.
expected is loaded but never compared; the test currently validates nothing. Please assert the explain output.
✅ Suggested fix
`@Test`
public void testMvexpandExplain() throws IOException {
// mvexpand explain plan validation
String expected = loadExpectedPlan("explain_mvexpand.yaml");
- explainQueryYaml(
- "source=mvexpand_edge_cases | eval skills_arr = array(1, 2, 3) | mvexpand skills_arr");
+ String actual =
+ explainQueryYaml(
+ "source=mvexpand_edge_cases | eval skills_arr = array(1, 2, 3) | mvexpand skills_arr");
+ assertYamlEqualsIgnoreId(expected, actual);
}As per coding guidelines, "integ-test//*IT.java: Check test assertions are meaningful and specific" and "/calcite/**/*Test.java: Test SQL generation and optimization paths in Calcite integration."
🤖 Prompt for AI Agents
In
`@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java`
around lines 2533 - 2539, The test method testMvexpandExplain loads an expected
plan into the variable expected but never asserts it; update the test to capture
the explain output from explainQueryYaml (or the method it returns) and assert
equality with expected (e.g., use Assert.assertEquals or equivalent),
referencing the existing methods loadExpectedPlan and explainQueryYaml to locate
where to insert the comparison so the test actually validates the explain plan.
| @Test | ||
| public void testMvexpandLargeArrayElements() throws Exception { | ||
| String query = | ||
| String.format( | ||
| "source=%s | mvexpand skills | where username='large' | fields username, skills.name |" | ||
| + " sort skills.name", | ||
| INDEX); | ||
| JSONObject result = executeQuery(query); | ||
|
|
||
| verifyNumOfRows(result, 10); | ||
|
|
||
| verifyDataRows( | ||
| result, | ||
| rows("large", "s1"), | ||
| rows("large", "s2"), | ||
| rows("large", "s3"), | ||
| rows("large", "s4"), | ||
| rows("large", "s5"), | ||
| rows("large", "s6"), | ||
| rows("large", "s7"), | ||
| rows("large", "s8"), | ||
| rows("large", "s9"), | ||
| rows("large", "s10")); | ||
| } |
There was a problem hiding this comment.
Incorrect expected row order due to string lexicographic sorting.
The query uses sort skills.name which performs lexicographic string sorting. The expected order s1, s2, s3, ..., s10 assumes numeric sorting, but string sorting would produce: s1, s10, s2, s3, s4, s5, s6, s7, s8, s9.
🐛 Proposed fix for expected row order
verifyDataRows(
result,
rows("large", "s1"),
+ rows("large", "s10"),
rows("large", "s2"),
rows("large", "s3"),
rows("large", "s4"),
rows("large", "s5"),
rows("large", "s6"),
rows("large", "s7"),
rows("large", "s8"),
- rows("large", "s9"),
- rows("large", "s10"));
+ rows("large", "s9"));Alternatively, consider using zero-padded naming in test data (s01, s02, ..., s10) to get intuitive ordering.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @Test | |
| public void testMvexpandLargeArrayElements() throws Exception { | |
| String query = | |
| String.format( | |
| "source=%s | mvexpand skills | where username='large' | fields username, skills.name |" | |
| + " sort skills.name", | |
| INDEX); | |
| JSONObject result = executeQuery(query); | |
| verifyNumOfRows(result, 10); | |
| verifyDataRows( | |
| result, | |
| rows("large", "s1"), | |
| rows("large", "s2"), | |
| rows("large", "s3"), | |
| rows("large", "s4"), | |
| rows("large", "s5"), | |
| rows("large", "s6"), | |
| rows("large", "s7"), | |
| rows("large", "s8"), | |
| rows("large", "s9"), | |
| rows("large", "s10")); | |
| } | |
| `@Test` | |
| public void testMvexpandLargeArrayElements() throws Exception { | |
| String query = | |
| String.format( | |
| "source=%s | mvexpand skills | where username='large' | fields username, skills.name |" | |
| " sort skills.name", | |
| INDEX); | |
| JSONObject result = executeQuery(query); | |
| verifyNumOfRows(result, 10); | |
| verifyDataRows( | |
| result, | |
| rows("large", "s1"), | |
| rows("large", "s10"), | |
| rows("large", "s2"), | |
| rows("large", "s3"), | |
| rows("large", "s4"), | |
| rows("large", "s5"), | |
| rows("large", "s6"), | |
| rows("large", "s7"), | |
| rows("large", "s8"), | |
| rows("large", "s9")); | |
| } |
🤖 Prompt for AI Agents
In
`@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java`
around lines 225 - 248, The test method testMvexpandLargeArrayElements assumes
numeric ordering but the query uses "sort skills.name" which sorts strings
lexicographically; update the expectations in testMvexpandLargeArrayElements
(the verifyDataRows call) to match lexicographic order (s1, s10, s2, s3, ...,
s9) or change the query to sort numerically (e.g., sort on a numeric
extraction/cast of skills.name) or alter test data to zero-pad names (s01..s10);
locate changes around the executeQuery/verifyDataRows calls in
testMvexpandLargeArrayElements and adjust either the expected rows, the query's
sort expression, or the sample data accordingly.
| @Test | ||
| public void testMvExpandCommand() throws IOException { | ||
| JSONObject result; | ||
| try { | ||
| result = | ||
| executeQuery( | ||
| String.format("search source=%s | mvexpand skills", TEST_INDEX_MVEXPAND_EDGE_CASES)); | ||
| } catch (ResponseException e) { | ||
| result = new JSONObject(TestUtils.getResponseBody(e.getResponse())); | ||
| } | ||
| verifyQuery(result); | ||
| } |
There was a problem hiding this comment.
Add specific assertions and edge‑case coverage for MVEXPAND.
testMvExpandCommand only checks for non‑empty output. For a new per‑document command, add assertions that validate expansion semantics (e.g., expected row count or representative rows), plus coverage for null/empty/non‑array inputs and limit boundary/error cases.
As per coding guidelines, "**/+(Test|IT).java: NULL input tests must be included for all new functions" and "Include boundary condition tests (min/max values, empty inputs) for all new functions," "Include error condition tests (invalid inputs, exceptions) for all new functions," and "Include multi-document tests for per-document operations."
🤖 Prompt for AI Agents
In `@integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java`
around lines 258 - 269, Extend testMvExpandCommand to assert MVEXPAND semantics
rather than only non-empty output: after executeQuery(String.format("search
source=%s | mvexpand skills", TEST_INDEX_MVEXPAND_EDGE_CASES)) use verifyQuery
plus explicit assertions on expected row count and a few representative
rows/columns to validate element expansion; add separate tests (new methods)
that call executeQuery and catch ResponseException like testMvExpandCommand but
assert correct behavior for NULL inputs (e.g., field missing or null),
empty-array inputs, and non-array inputs (expect either single-row passthrough
or a specific error), and add boundary tests for limit values (min=0, large
limits, and invalid negative limits) and a multi-document test to ensure
per-document expansion; reference test method testMvExpandCommand, helper
verifyQuery, index TEST_INDEX_MVEXPAND_EDGE_CASES and the MVEXPAND operator when
adding these assertions and new test methods.
| @Test | ||
| public void testMvExpandWithLimitParameter() { | ||
| String ppl = "source=DEPT | mvexpand EMPNOS limit=2"; | ||
| RelNode root = getRelNode(ppl); | ||
|
|
||
| assertContains(root, "LogicalCorrelate"); | ||
| assertContains(root, "Uncollect"); | ||
|
|
||
| assertAnyContains(root, "fetch=", "LIMIT", "RowNumber", "Window"); | ||
| } | ||
|
|
||
| @Test | ||
| public void testMvExpandProjectNested() { | ||
| String ppl = "source=DEPT | mvexpand EMPNOS | fields DEPTNO, EMPNOS"; | ||
| RelNode root = getRelNode(ppl); | ||
|
|
||
| assertContains(root, "LogicalCorrelate"); | ||
| assertContains(root, "Uncollect"); | ||
| assertContains(root, "LogicalProject"); | ||
| } | ||
|
|
||
| @Test | ||
| public void testMvExpandEmptyOrNullArray() { | ||
| RelNode root = getRelNode("source=DEPT | where isnull(EMPNOS) | mvexpand EMPNOS"); | ||
| assertContains(root, "LogicalCorrelate"); | ||
| assertContains(root, "Uncollect"); | ||
| } | ||
|
|
||
| @Test | ||
| public void testMvExpandWithDuplicates() { | ||
| RelNode root = getRelNode("source=DEPT | where DEPTNO in (10, 10, 20) | mvexpand EMPNOS"); | ||
| assertContains(root, "LogicalCorrelate"); | ||
| assertContains(root, "Uncollect"); | ||
| } | ||
|
|
||
| @Test | ||
| public void testMvExpandLargeArray() { | ||
| RelNode root = getRelNode("source=DEPT | where DEPTNO = 999 | mvexpand EMPNOS"); | ||
| assertContains(root, "LogicalCorrelate"); | ||
| assertContains(root, "Uncollect"); | ||
| } | ||
|
|
||
| @Test | ||
| public void testMvExpandPrimitiveArray() { | ||
| RelNode root = getRelNode("source=DEPT | mvexpand EMPNOS"); | ||
| assertContains(root, "LogicalCorrelate"); | ||
| assertContains(root, "Uncollect"); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, find the full test file and check its size
wc -l ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.javaRepository: opensearch-project/sql
Length of output: 146
🏁 Script executed:
# Read the full test file
cat -n ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.javaRepository: opensearch-project/sql
Length of output: 7503
🏁 Script executed:
# Search for mvexpand tests with "invalid", "non.array", "error", "negative", "zero"
rg -n --type=java -i '(invalid|non[-_]?array|error|negative|zero).*mvexpand|mvexpand.*(invalid|non[-_]?array|error|negative|zero)' ppl/Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Search for limit-related error tests across PPL tests
rg -n --type=java 'limit.*[=<>].*[0-9]|limit.*invalid|limit.*negative' ppl/src/test/ -A 2 -B 2Repository: opensearch-project/sql
Length of output: 16436
🏁 Script executed:
# Search for mvexpand validation tests and error handling
rg -n --type=java 'mvexpand.*limit\s*=\s*[0-9-]|mvexpand.*non.*array|mvexpand.*invalid|mvexpand.*error' ppl/Repository: opensearch-project/sql
Length of output: 346
🏁 Script executed:
# Look for integration tests related to mvexpand
fd -e java -e IT ppl/ | grep -i mvexpandRepository: opensearch-project/sql
Length of output: 387
🏁 Script executed:
# Search for mvexpand in integration tests
rg -n 'mvexpand' ppl/src/test/java --type=java -l | grep -i 'it\.java|integration'Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Check if there are any negative limit validation tests for mvexpand or expand
rg -n --type=java 'limit.*[=-]\s*-?[0-9]|negative.*limit|limit.*invalid' ppl/src/test/ -A 2 -B 2Repository: opensearch-project/sql
Length of output: 14716
🏁 Script executed:
# Search for mvexpand implementation and limit validation
rg -n 'class.*MvExpand|mvexpand.*limit' ppl/src/main --type=java -A 3Repository: opensearch-project/sql
Length of output: 546
🏁 Script executed:
# Search for limit validation in mvexpand parser
fd -type f -name '*MvExpand*.java' ppl/src/mainRepository: opensearch-project/sql
Length of output: 236
🏁 Script executed:
# Check AST builder for mvexpand limit validation
rg -n 'mvexpand|MvExpand' ppl/src/main --type=java -l | head -10Repository: opensearch-project/sql
Length of output: 204
🏁 Script executed:
# Check AstBuilder for mvexpand command parsing and limit validation
rg -n 'mvexpand|MvExpand' ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java -A 5 -B 2Repository: opensearch-project/sql
Length of output: 829
🏁 Script executed:
# Search for limit parameter validation in mvexpand
rg -n 'visitMvExpandCommand|mvExpandCommand' ppl/src/main --type=java -A 10Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Check if there's validation for non-array fields in mvexpand
rg -n 'array|Array' ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java -C 3 | grep -A 5 -B 5 mvexpandRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Check the MvExpand class definition for validation
rg -n 'class MvExpand' ppl/src/main --type=java -A 20Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Verify the full mvexpand command parsing to see if validation happens elsewhere
rg -n 'MvExpand|limit' ppl/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java -A 5 -B 2Repository: opensearch-project/sql
Length of output: 165
🏁 Script executed:
# Find the MvExpand class file
fd -type f -name 'MvExpand.java'Repository: opensearch-project/sql
Length of output: 236
🏁 Script executed:
# Search for MvExpand class definition across the repo
rg -n 'class MvExpand' --type=javaRepository: opensearch-project/sql
Length of output: 176
🏁 Script executed:
# Read the MvExpand class definition
cat -n core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.javaRepository: opensearch-project/sql
Length of output: 1616
Add tests for invalid mvexpand limit values (0 and negative).
The suite includes testMvExpandEmptyOrNullArray() which covers NULL/empty array boundary conditions. However, it lacks tests for invalid limit parameters—specifically limit=0 and negative limit values. Per coding guidelines, include error condition tests for all new functions. Add a test (e.g., testMvExpandInvalidLimit()) to verify that mvexpand rejects invalid limit values, or document where such validation is covered in integration tests.
🤖 Prompt for AI Agents
In `@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java`
around lines 120 - 167, Add a new unit test in CalcitePPLMvExpandTest (alongside
testMvExpandWithLimitParameter and others) named testMvExpandInvalidLimit that
calls getRelNode with invalid mvexpand limit values (e.g., "source=DEPT |
mvexpand EMPNOS limit=0" and "source=DEPT | mvexpand EMPNOS limit=-1") and
asserts that the planner/parser rejects them by expecting an exception (use the
same test framework helper as other tests, e.g., assertThrows or equivalent) or
checks for an error result; ensure the new test references mvexpand and
getRelNode so it verifies validation of zero and negative limits.
Description
This pull request adds native support for the mvexpand command in PPL to OpenSearch SQL, enabling users to expand multivalue fields (arrays) into separate rows directly within queries. This functionality is analogous to Splunk's mvexpand command and streamlines analytics, dashboarding, and data preparation involving arrays or multivalue fields.
Key features introduced:
Native mvexpand command for PPL queries to expand array fields into separate rows/events.
Optional limit parameter to restrict the number of expanded values per event/document.
Robust handling of empty/null arrays, large arrays (with memory/resource limits), and non-array fields.
Streaming/distributable execution for performance and scalability.
Comprehensive documentation and edge case coverage.
This feature makes OpenSearch SQL more powerful and user-friendly for log analytics, data exploration, and migration from platforms like Splunk.
Related Issues
Resolves #4439
#4439
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.