Original file line number Diff line number Diff line change
@@ -81,6 +81,7 @@
import org.opensearch.sql.ast.tree.ML;
import org.opensearch.sql.ast.tree.Multisearch;
import org.opensearch.sql.ast.tree.MvCombine;
import org.opensearch.sql.ast.tree.MvExpand;
import org.opensearch.sql.ast.tree.Paginate;
import org.opensearch.sql.ast.tree.Parse;
import org.opensearch.sql.ast.tree.Patterns;
@@ -546,6 +547,11 @@ public LogicalPlan visitMvCombine(MvCombine node, AnalysisContext context) {
throw getOnlyForCalciteException("mvcombine");
}

@Override
public LogicalPlan visitMvExpand(MvExpand node, AnalysisContext context) {
throw getOnlyForCalciteException("MvExpand");
}

/** Build {@link ParseExpression} to context and skip to child nodes. */
@Override
public LogicalPlan visitParse(Parse node, AnalysisContext context) {
Original file line number Diff line number Diff line change
@@ -69,6 +69,7 @@
import org.opensearch.sql.ast.tree.ML;
import org.opensearch.sql.ast.tree.Multisearch;
import org.opensearch.sql.ast.tree.MvCombine;
import org.opensearch.sql.ast.tree.MvExpand;
import org.opensearch.sql.ast.tree.Paginate;
import org.opensearch.sql.ast.tree.Parse;
import org.opensearch.sql.ast.tree.Patterns;
@@ -475,4 +476,8 @@ public T visitAddColTotals(AddColTotals node, C context) {
public T visitMvCombine(MvCombine node, C context) {
return visitChildren(node, context);
}

public T visitMvExpand(MvExpand node, C context) {
return visitChildren(node, context);
}
Comment on lines +480 to +482

⚠️ Potential issue | 🟡 Minor

Add JavaDoc for the new visitMvExpand public method.

This new public method in core/src/main/java should include @param/@return (and @throws if applicable).

📝 JavaDoc skeleton
+  /**
+   * Visit an MvExpand node.
+   *
+   * `@param` node MvExpand node
+   * `@param` context visitor context
+   * `@return` visitor result
+   */
   public T visitMvExpand(MvExpand node, C context) {
     return visitChildren(node, context);
   }

As per coding guidelines, "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public T visitMvExpand(MvExpand node, C context) {
return visitChildren(node, context);
}
/**
* Visit an MvExpand node.
*
* `@param` node MvExpand node
* `@param` context visitor context
* `@return` visitor result
*/
public T visitMvExpand(MvExpand node, C context) {
return visitChildren(node, context);
}
🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java` around
lines 480 - 482, Add a JavaDoc block to the public method visitMvExpand(MvExpand
node, C context) in AbstractNodeVisitor explaining that it visits an MvExpand
node and delegates to visitChildren(node, context); include `@param` node the
MvExpand AST node being visited, `@param` context the visitor context, and `@return`
the result of type T; add an `@throws` clause only if this method can propagate
checked exceptions (otherwise omit it).

}
5 changes: 5 additions & 0 deletions core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
Original file line number Diff line number Diff line change
@@ -63,6 +63,7 @@
import org.opensearch.sql.ast.tree.Limit;
import org.opensearch.sql.ast.tree.MinSpanBin;
import org.opensearch.sql.ast.tree.MvCombine;
import org.opensearch.sql.ast.tree.MvExpand;
import org.opensearch.sql.ast.tree.Parse;
import org.opensearch.sql.ast.tree.Patterns;
import org.opensearch.sql.ast.tree.Project;
@@ -477,6 +478,10 @@ public static MvCombine mvcombine(Field field, String delim) {
return new MvCombine(field, delim);
}

public static UnresolvedPlan mvexpand(UnresolvedPlan input, Field field, Integer limit) {
return new MvExpand(field, limit).attach(input);
}
Comment on lines +481 to +483

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate the AstDSL.java file and examine the mvexpand method
find . -name "AstDSL.java" -type f | head -5

Repository: opensearch-project/sql

Length of output: 125


🏁 Script executed:

# Search for test files that might cover mvexpand or MvExpand
rg -l "mvexpand|MvExpand" --type=java | grep -i test

Repository: opensearch-project/sql

Length of output: 770


🏁 Script executed:

# Look at the actual implementation around line 481-483
fd "AstDSL.java" | xargs -I {} sh -c 'echo "=== {} ===" && sed -n "475,490p" {}'

Repository: opensearch-project/sql

Length of output: 659


🏁 Script executed:

# Look for AstDSLTest.java and check if it covers mvexpand
fd "AstDSLTest.java" -o "AstDslTest.java" | xargs -I {} sh -c 'echo "=== {} ===" && cat {}'

Repository: opensearch-project/sql

Length of output: 215


🏁 Script executed:

# Check if mvexpand appears in test files related to AstDSL
rg -n "mvexpand" --type=java -B 3 -A 3 core/src/test/java 2>/dev/null || echo "No tests found in core/src/test/java"

Repository: opensearch-project/sql

Length of output: 102


🏁 Script executed:

# Examine CalcitePPLMvExpandTest.java to see test coverage
fd "CalcitePPLMvExpandTest.java" | xargs -I {} sh -c 'wc -l {} && head -60 {}'

Repository: opensearch-project/sql

Length of output: 2192


🏁 Script executed:

# Check CalciteMvExpandCommandIT.java as well
fd "CalciteMvExpandCommandIT.java" | xargs -I {} sh -c 'wc -l {} && head -60 {}'

Repository: opensearch-project/sql

Length of output: 2141


🏁 Script executed:

# Search for direct unit tests of AstDSL.mvexpand method
rg "AstDSL\.mvexpand|new MvExpand" --type=java core/src/test/java -A 2 2>/dev/null || echo "Searching in core tests..."

Repository: opensearch-project/sql

Length of output: 92


🏁 Script executed:

# Find all test files in core module
fd -e java "Test\.java$" core/src/test/java | head -20

Repository: opensearch-project/sql

Length of output: 1553


🏁 Script executed:

# Check if there's any reference to AstDSL in test files across the project
rg "class.*AstDSL.*Test|AstDSL.*mvexpand" --type=java

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

# Look at the actual file again with more context - check if JavaDoc exists above line 481
fd "AstDSL.java" | xargs -I {} sh -c 'sed -n "470,490p" {}'

Repository: opensearch-project/sql

Length of output: 745


🏁 Script executed:

# Check for any existing JavaDoc patterns in AstDSL.java
rg "^\s*\*\s*@param|^\s*\*\s*@return|^\s*/\*\*" core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java -c

Repository: opensearch-project/sql

Length of output: 68


🏁 Script executed:

# Check if AstDSL methods generally have JavaDoc
fd "AstDSL.java" | xargs -I {} sh -c 'sed -n "400,450p" {}'

Repository: opensearch-project/sql

Length of output: 1445


🏁 Script executed:

# Look for patterns showing how other public static methods are documented in AstDSL
fd "AstDSL.java" | xargs -I {} sh -c 'sed -n "200,250p" {}'

Repository: opensearch-project/sql

Length of output: 1425


🏁 Script executed:

# Check if there's an AstDSL test file or if DSL methods are tested elsewhere
fd ".*Test.*\.java" core/src/test/java | xargs rg -l "AstDSL" 2>/dev/null

Repository: opensearch-project/sql

Length of output: 1653


🏁 Script executed:

# Comprehensive check - look for any test that directly uses or tests AstDSL.mvexpand
rg "AstDSL\s*\.\s*mvexpand" --type=java

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

# Count total lines in AstDSL.java and see the overall JavaDoc coverage
fd "AstDSL.java" | xargs -I {} sh -c 'wc -l {} && echo "---" && grep -c "public static" {}'

Repository: opensearch-project/sql

Length of output: 134


🏁 Script executed:

# Check the actual structure - look for any method with JavaDoc
fd "AstDSL.java" | xargs -I {} sh -c 'grep -B 3 "public static.*mvexpand\|public static.*mvcombine" {} | head -20'

Repository: opensearch-project/sql

Length of output: 424


🏁 Script executed:

# Check if similar DSL helper methods (like mvcombine right above) have JavaDoc
fd "AstDSL.java" | xargs -I {} sh -c 'sed -n "475,490p" {}'

Repository: opensearch-project/sql

Length of output: 593


🏁 Script executed:

# Look for how MvExpand itself is used in the codebase
rg "new MvExpand|\.mvexpand\(" --type=java -B 2 -A 2 | head -60

Repository: opensearch-project/sql

Length of output: 991


🏁 Script executed:

# Check if this method is actually part of the commit or existing
git log --oneline -S "mvexpand" -- core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java 2>/dev/null | head -5 || echo "Git log not available"

Repository: opensearch-project/sql

Length of output: 125


Add JavaDoc and unit tests for the new mvexpand DSL helper.

This new public method in core/src/main/java is missing required JavaDoc documentation and unit tests. Per coding guidelines, both are mandatory for this location.

The method needs:

  1. JavaDoc with @param (input, field, limit), @return, and @throws if applicable
  2. Unit tests in core/src/test/java (in the same commit)
📝 Suggested JavaDoc
+  /**
+   * Build an MVEXPAND plan node and attach it to the input plan.
+   *
+   * `@param` input input plan
+   * `@param` field field to expand
+   * `@param` limit optional per-document limit
+   * `@return` MvExpand plan attached to the input
+   */
   public static UnresolvedPlan mvexpand(UnresolvedPlan input, Field field, Integer limit) {
     return new MvExpand(field, limit).attach(input);
   }
🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java` around lines 481 -
483, The public DSL helper method mvexpand in class AstDSL lacks JavaDoc and
unit tests; add a JavaDoc block above the mvexpand(UnresolvedPlan input, Field
field, Integer limit) method describing the method purpose and include `@param`
tags for input, field, and limit, an `@return` describing the UnresolvedPlan
result, and any `@throws` if the method can propagate exceptions (or state none).
Then add unit tests under core/src/test/java that exercise AstDSL.mvexpand:
verify it returns an MvExpand attached to the provided input, confirm the field
and limit are set correctly, and include edge cases (null limit and invalid
inputs) to satisfy coverage and coding guidelines.


public static List<Argument> sortOptions() {
return exprList(argument("desc", booleanLiteral(false)));
}
46 changes: 46 additions & 0 deletions core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.ast.tree;

import com.google.common.collect.ImmutableList;
import java.util.List;
import javax.annotation.Nullable;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.ToString;
import org.opensearch.sql.ast.AbstractNodeVisitor;
import org.opensearch.sql.ast.expression.Field;

/** AST node representing the {@code mvexpand} PPL command: {@code mvexpand <field> [limit=N]}. */
@ToString
@EqualsAndHashCode(callSuper = false)
public class MvExpand extends UnresolvedPlan {

  private UnresolvedPlan child;
  @Getter private final Field field;
  @Getter @Nullable private final Integer limit;

  public MvExpand(Field field, @Nullable Integer limit) {
    this.field = field;
    this.limit = limit;
  }

  @Override
  public MvExpand attach(UnresolvedPlan child) {
    this.child = child;
    return this;
  }

  @Override
  public List<UnresolvedPlan> getChild() {
    return this.child == null ? ImmutableList.of() : ImmutableList.of(this.child);
  }

  @Override
  public <T, C> T accept(AbstractNodeVisitor<T, C> nodeVisitor, C context) {
    return nodeVisitor.visitMvExpand(this, context);
  }
}
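
The node above relies on double dispatch: `accept` calls `visitMvExpand`, and a visitor that does not override that method falls back to `visitChildren`. The following is a minimal, self-contained sketch of that pattern, not the repository's code. `SketchNode`, `SketchSource`, `SketchMvExpand`, `SketchVisitor`, and `PplPrinter` are hypothetical stand-ins for the real AST and visitor classes, and the field is modeled as a plain `String`.

```java
import java.util.List;

/** Entry point for the sketch; every type in this file is a hypothetical stand-in. */
final class MvExpandDispatchSketch {
  static String render(SketchNode plan) {
    return plan.accept(new PplPrinter(), null);
  }

  public static void main(String[] args) {
    SketchNode plan = new SketchMvExpand("tags", 3).attach(new SketchSource("people"));
    System.out.println(render(plan));
  }
}

/** Stand-in for the abstract visitor: each visit method defaults to visitChildren. */
abstract class SketchVisitor<T, C> {
  T visitChildren(SketchNode node, C context) {
    T result = null;
    for (SketchNode child : node.getChild()) {
      result = child.accept(this, context);
    }
    return result;
  }

  T visitSource(SketchSource node, C context) {
    return visitChildren(node, context);
  }

  T visitMvExpand(SketchMvExpand node, C context) {
    return visitChildren(node, context);
  }
}

/** Stand-in for the plan node base class. */
abstract class SketchNode {
  abstract List<SketchNode> getChild();

  abstract <T, C> T accept(SketchVisitor<T, C> visitor, C context);
}

/** Leaf node that names the source index. */
final class SketchSource extends SketchNode {
  final String name;

  SketchSource(String name) {
    this.name = name;
  }

  @Override
  List<SketchNode> getChild() {
    return List.of();
  }

  @Override
  <T, C> T accept(SketchVisitor<T, C> visitor, C context) {
    return visitor.visitSource(this, context);
  }
}

/** Mirrors the shape of MvExpand: a field, a nullable limit, and an attachable child. */
final class SketchMvExpand extends SketchNode {
  private SketchNode child;
  final String field;
  final Integer limit; // null means "no limit"

  SketchMvExpand(String field, Integer limit) {
    this.field = field;
    this.limit = limit;
  }

  SketchMvExpand attach(SketchNode child) {
    this.child = child;
    return this;
  }

  @Override
  List<SketchNode> getChild() {
    return child == null ? List.of() : List.of(child);
  }

  @Override
  <T, C> T accept(SketchVisitor<T, C> visitor, C context) {
    return visitor.visitMvExpand(this, context);
  }
}

/** Example visitor that overrides both visit methods to print a PPL-like pipeline. */
final class PplPrinter extends SketchVisitor<String, Void> {
  @Override
  String visitSource(SketchSource node, Void context) {
    return "source=" + node.name;
  }

  @Override
  String visitMvExpand(SketchMvExpand node, Void context) {
    String input = visitChildren(node, context);
    String prefix = input == null ? "" : input + " | ";
    String limit = node.limit == null ? "" : " limit=" + node.limit;
    return prefix + "mvexpand " + node.field + limit;
  }
}
```

Because `attach` returns the node itself, a plan can be built in one expression, which is the same shape the `mvexpand` DSL helper in this PR uses.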
Original file line number Diff line number Diff line change
@@ -126,6 +126,7 @@
import org.opensearch.sql.ast.tree.ML;
import org.opensearch.sql.ast.tree.Multisearch;
import org.opensearch.sql.ast.tree.MvCombine;
import org.opensearch.sql.ast.tree.MvExpand;
import org.opensearch.sql.ast.tree.Paginate;
import org.opensearch.sql.ast.tree.Parse;
import org.opensearch.sql.ast.tree.Patterns;
@@ -920,7 +921,11 @@ public RelNode visitPatterns(Patterns node, CalcitePlanContext context) {
.toList();
context.relBuilder.aggregate(context.relBuilder.groupKey(groupByList), aggCall);
buildExpandRelNode(
- context.relBuilder.field(node.getAlias()), node.getAlias(), node.getAlias(), context);
+ context.relBuilder.field(node.getAlias()),
+ node.getAlias(),
+ node.getAlias(),
+ null,
+ context);
flattenParsedPattern(
node.getAlias(),
context.relBuilder.field(node.getAlias()),
@@ -3117,7 +3122,7 @@ public RelNode visitExpand(Expand expand, CalcitePlanContext context) {
RexInputRef arrayFieldRex = (RexInputRef) rexVisitor.analyze(arrayField, context);
String alias = expand.getAlias();

- buildExpandRelNode(arrayFieldRex, arrayField.getField().toString(), alias, context);
+ buildExpandRelNode(arrayFieldRex, arrayField.getField().toString(), alias, null, context);

return context.relBuilder.peek();
}
@@ -3290,6 +3295,61 @@ private void restoreColumnOrderAfterArrayAgg(
relBuilder.project(projections, projectionNames, /* force= */ true);
}

  /**
   * MVExpand command visitor.
   *
   * <p>Expands a multi-value (array) field into separate rows using Calcite's CORRELATE join with
   * UNCOLLECT. Each element of the array becomes a separate row while preserving all other fields
   * from the original row.
   *
   * <p>Implementation uses {@link #buildExpandRelNode} to create a correlate join between the
   * original relation and an uncollected (unnested) version of the target array field.
   *
   * <p>Behavior:
   *
   * <ul>
   *   <li>Array fields: Each array element is expanded into a separate row
   *   <li>Non-array fields: Treated as single-element arrays (returns original row unchanged)
   *   <li>Missing fields: Throws {@link SemanticCheckException}
   *   <li>Optional limit parameter: Limits the number of expanded elements per document
   * </ul>
   *
   * @param mvExpand MVExpand command containing the field to expand and optional limit
   * @param context CalcitePlanContext containing the RelBuilder and planning context
   * @return RelNode representing the relation with the expanded multi-value field
   * @throws SemanticCheckException if the target field does not exist in the schema
   */
  @Override
  public RelNode visitMvExpand(MvExpand mvExpand, CalcitePlanContext context) {
    visitChildren(mvExpand, context);

    final RelBuilder relBuilder = context.relBuilder;
    final Field field = mvExpand.getField();
    final String fieldName = field.getField().toString();

    final RelDataType inputType = relBuilder.peek().getRowType();
    final RelDataTypeField inputField =
        inputType.getField(fieldName, /*caseSensitive*/ true, /*elideRecord*/ false);

    if (inputField == null) {
      throw new SemanticCheckException(
          String.format("Field '%s' not found in the schema", fieldName));
    }

    final RexInputRef arrayFieldRex = (RexInputRef) rexVisitor.analyze(field, context);

    final SqlTypeName actual = arrayFieldRex.getType().getSqlTypeName();
    if (actual != SqlTypeName.ARRAY) {
      // For non-array fields (scalars), mvexpand just returns the field unchanged.
      // This treats single-value fields as if they were arrays with one element.
      return relBuilder.peek();
    }

    buildExpandRelNode(arrayFieldRex, fieldName, fieldName, mvExpand.getLimit(), context);

    return relBuilder.peek();
  }

@Override
public RelNode visitValues(Values values, CalcitePlanContext context) {
if (values.getValues() == null || values.getValues().isEmpty()) {
@@ -3534,7 +3594,11 @@ private void flattenParsedPattern(
}

private void buildExpandRelNode(
- RexInputRef arrayFieldRex, String arrayFieldName, String alias, CalcitePlanContext context) {
+ RexInputRef arrayFieldRex,
+ String arrayFieldName,
+ String alias,
+ @Nullable Integer perDocLimit,
+ CalcitePlanContext context) {
// 3. Capture the outer row in a CorrelationId
Holder<RexCorrelVariable> correlVariable = Holder.empty();
context.relBuilder.variable(correlVariable::set);
@@ -3549,14 +3613,17 @@ private void buildExpandRelNode(
RelNode leftNode = context.relBuilder.build();

// 5. Build join right node and expand the array field using uncollect
- RelNode rightNode =
- context
- .relBuilder
- // fake input, see convertUnnest and convertExpression in Calcite SqlToRelConverter
- .push(LogicalValues.createOneRow(context.relBuilder.getCluster()))
- .project(List.of(correlArrayFieldAccess), List.of(arrayFieldName))
- .uncollect(List.of(), false)
- .build();
+ context
+ .relBuilder
+ // fake input, see convertUnnest and convertExpression in Calcite SqlToRelConverter
+ .push(LogicalValues.createOneRow(context.relBuilder.getCluster()))
+ .project(List.of(correlArrayFieldAccess), List.of(arrayFieldName))
+ .uncollect(List.of(), false);
+
+ if (perDocLimit != null) {
+ context.relBuilder.limit(0, perDocLimit);
+ }
+ RelNode rightNode = context.relBuilder.build();

// 6. Perform a nested-loop join (correlate) between the original table and the expanded
// array field.
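
Because the limit is applied to the right-hand (uncollected) side before the correlate, it caps the number of elements per outer row rather than the total row count. The sketch below simulates the behavior described in the `visitMvExpand` Javadoc at row level in plain Java; it is an illustration under stated assumptions, not the Calcite plan. The `Map`-based row model, the `row` helper, and the use of `IllegalArgumentException` in place of `SemanticCheckException` are all hypothetical. The sketch also checks the value type per row, whereas the real code checks the column type once at planning time, and it emits no rows for an empty array, which is an assumption because the join type is not visible in this hunk.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MvExpandSemanticsSketch {

  /** Hypothetical helper: builds an ordered row from alternating column names and values. */
  static Map<String, Object> row(Object... keyValues) {
    Map<String, Object> row = new LinkedHashMap<>();
    for (int i = 0; i + 1 < keyValues.length; i += 2) {
      row.put((String) keyValues[i], keyValues[i + 1]);
    }
    return row;
  }

  /**
   * Row-level simulation of mvexpand. A List value plays the role of an ARRAY column.
   *
   * @param schema column names of the input relation
   * @param rows input rows
   * @param field column to expand
   * @param perDocLimit maximum elements per input row, or null for no limit
   */
  static List<Map<String, Object>> mvexpand(
      List<String> schema, List<Map<String, Object>> rows, String field, Integer perDocLimit) {
    // Missing field: the real visitor throws SemanticCheckException here.
    if (!schema.contains(field)) {
      throw new IllegalArgumentException(
          String.format("Field '%s' not found in the schema", field));
    }

    List<Map<String, Object>> out = new ArrayList<>();
    for (Map<String, Object> outer : rows) {
      Object value = outer.get(field);

      // Non-array value: the row passes through unchanged.
      if (!(value instanceof List<?>)) {
        out.add(outer);
        continue;
      }

      // "Right side" of the correlate: unnest this row's array, then apply the limit.
      List<?> elements = (List<?>) value;
      int count = elements.size();
      if (perDocLimit != null) {
        count = Math.max(0, Math.min(perDocLimit, count));
      }

      // Join each kept element back to the outer row under the same field name.
      for (Object element : elements.subList(0, count)) {
        Map<String, Object> joined = new LinkedHashMap<>(outer);
        joined.put(field, element);
        out.add(joined);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Map<String, Object>> rows =
        List.of(row("id", 1, "ids", List.of(1, 2, 3, 4, 5)), row("id", 2, "ids", List.of(9)));
    for (Map<String, Object> r : mvexpand(List.of("id", "ids"), rows, "ids", 3)) {
      System.out.println(r);
    }
  }
}
```

With `perDocLimit = 3`, the first row contributes three output rows and the second contributes one, which is why a global `head` after `mvexpand` is not equivalent to `limit=`.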
Original file line number Diff line number Diff line change
@@ -1108,6 +1108,10 @@ void populate() {
OperandTypes.family(SqlTypeFamily.ARRAY, SqlTypeFamily.INTEGER)
.or(OperandTypes.family(SqlTypeFamily.MAP, SqlTypeFamily.ANY)),
false));
registerOperator(
INTERNAL_ITEM,
SqlStdOperatorTable.ITEM,
PPLTypeChecker.family(SqlTypeFamily.IGNORE, SqlTypeFamily.CHARACTER));
registerOperator(
XOR,
SqlStdOperatorTable.NOT_EQUALS,
1 change: 1 addition & 0 deletions docs/category.json
Original file line number Diff line number Diff line change
@@ -25,6 +25,7 @@
"user/ppl/cmd/join.md",
"user/ppl/cmd/lookup.md",
"user/ppl/cmd/mvcombine.md",
"user/ppl/cmd/mvexpand.md",
"user/ppl/cmd/parse.md",
"user/ppl/cmd/patterns.md",
"user/ppl/cmd/rare.md",
141 changes: 141 additions & 0 deletions docs/user/ppl/cmd/mvexpand.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,141 @@
# mvexpand

## Description
The `mvexpand` command expands a multivalue (array) field into one row per element. All other fields of the original document are preserved on each resulting row.


## Syntax
```
mvexpand <field> [limit=<int>]
```

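- `<field>`: The multivalue (array) field to expand. (Required) A field that is not an array is treated as a single-element array, so its row is returned unchanged.
- `limit`: Maximum number of values to expand per document. (Optional) When omitted, all values are expanded.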


### Output field naming
After `mvexpand`, the expanded value remains under the same field name (for example, `tags` or `ids`).
If the array contains objects, you can reference subfields (for example, `skills.name`).
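
As a rough illustration of the naming rule, the sketch below expands an array of objects and then reads a subfield through a dotted path. It is a plain-Java model under stated assumptions, not how the engine resolves fields: `resolve`, `expandAndProject`, and the sample `skills` values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class SubfieldAfterExpandSketch {

  /** Walks a dotted path such as "skills.name" through nested maps; null if a step is absent. */
  static Object resolve(Map<String, Object> row, String path) {
    Object current = row;
    for (String part : path.split("\\.")) {
      if (!(current instanceof Map<?, ?>)) {
        return null;
      }
      current = ((Map<?, ?>) current).get(part);
    }
    return current;
  }

  /** Expands doc[arrayField] into one row per element, then projects the given path. */
  static List<Object> expandAndProject(Map<String, Object> doc, String arrayField, String path) {
    List<Object> values = new ArrayList<>();
    for (Object element : (List<?>) doc.get(arrayField)) {
      // The expanded row keeps the same field name, now holding a single element.
      Map<String, Object> expanded = Map.of(arrayField, element);
      values.add(resolve(expanded, path));
    }
    return values;
  }

  public static void main(String[] args) {
    // Hypothetical sample document with an array of objects.
    Map<String, Object> doc =
        Map.of("skills", List.of(Map.of("name", "search"), Map.of("name", "sql")));
    System.out.println(expandAndProject(doc, "skills", "skills.name"));
  }
}
```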


## Examples

### Example 1: Basic Expansion (single document)
Input document (case "basic") contains three tag values.

PPL query:
```ppl
source=people
| eval tags = array('error', 'warning', 'info')
| fields tags
| head 1
| mvexpand tags
| fields tags
```

Expected output:
```text
fetched rows / total rows = 3/3
+---------+
| tags |
|---------|
| error |
| warning |
| info |
+---------+
```

### Example 2: Expansion with Limit
The input document (case "ids") contains an array of five integers; `limit=3` expands only the first three.

PPL query:
```ppl
source=people
| eval ids = array(1, 2, 3, 4, 5)
| fields ids
| head 1
| mvexpand ids limit=3
| fields ids
```

Expected output:
```text
fetched rows / total rows = 3/3
+-----+
| ids |
|-----|
| 1 |
| 2 |
| 3 |
+-----+
```

### Example 3: Expand projects
This example demonstrates expanding a multivalue `projects` field into one row per project.

PPL query:
```ppl
source=people
| head 1
| fields projects
| mvexpand projects
| fields projects.name
```

Expected output:
```text
fetched rows / total rows = 3/3
+--------------------------------+
| projects.name |
|--------------------------------|
| AWS Redshift Spectrum querying |
| AWS Redshift security |
| AWS Aurora security |
+--------------------------------+
```

### Example 4: Single-value array (case "single")
A single-element array expands to one row.

PPL query:
```ppl
source=people
| eval tags = array('error')
| fields tags
| head 1
| mvexpand tags
| fields tags
```

Expected output:
```text
fetched rows / total rows = 1/1
+-------+
| tags |
|-------|
| error |
+-------+
```

### Example 5: Missing Field
If the field does not exist in the input schema (for example, it is not mapped or was projected out earlier), `mvexpand` throws a `SemanticCheckException`.

PPL query:
```ppl
source=people
| eval some_field = 'x'
| fields some_field
| head 1
| mvexpand tags
| fields tags
```

Expected output:
```text
{'reason': 'Invalid Query', 'details': "Field 'tags' not found in the schema", 'type': 'SemanticCheckException'}
Error: Query returned no data
```

## Notes about these doctests
- Examples 1, 2, and 4 above generate deterministic multivalue fields using `eval` + `array()` so that the doctests are stable.
- All examples run against a single source index (`people`) and use `head 1` to keep output predictable.