
Extend Support for RaggedTensorToTensor#613

Merged
apaniukov merged 4 commits into openvinotoolkit:master from apaniukov:extend-ragged-to-tensor
Mar 24, 2026

Conversation

@apaniukov
Collaborator

apaniukov commented Feb 20, 2026

apaniukov changed the title from [WiP] Extend Support for RaggedTensorToTensor to Extend Support for RaggedTensorToTensor, Mar 23, 2026
apaniukov requested a review from Copilot, March 23, 2026 19:07
Contributor

Copilot AI left a comment


Pull request overview

Extends the TensorFlow frontend’s RaggedTensorToTensor translation to support an additional ragged representation (two ragged dimensions) and updates internal ragged ops to handle non-1D data tensors during densification.

Changes:

  • Added support for row_partition_types == {"ROW_SPLITS", "VALUE_ROWIDS"} in translate_ragged_tensor_to_tensor (3D densification via two RaggedToDense passes).
  • Updated RaggedToDense to accept data tensors with rank ≥ 1 and to propagate trailing dense dimensions into output shape + runtime evaluation.
  • Adjusted RaggedToRagged type/shape inference and added a new ragged-input validator helper.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Summary per file:
  src/utils.hpp: Declares new helper for validating ragged inputs with any-rank data.
  src/utils.cpp: Implements the new ragged validation helper (rank ≥ 1 for data).
  src/tensorflow_translators.cpp: Extends RaggedTensorToTensor translation to handle {ROW_SPLITS, VALUE_ROWIDS} and reshapes the pad value to a scalar.
  src/ragged_to_ragged.hpp: Adds evaluate_lower/upper overrides returning false.
  src/ragged_to_ragged.cpp: Changes output shape inference logic based on first_dim_size.
  src/ragged_to_dense.cpp: Broadens data rank support and updates runtime evaluation to handle trailing dense dimensions.
Comments suppressed due to low confidence (1)

src/ragged_to_dense.cpp:66

  • pad_right input validation allows any integral element type (is_integral()), but evaluate() reads it as bool (inputs[5].data<bool>()). If a caller passes an integral (e.g., i32) scalar, this will interpret the buffer as bool and can produce incorrect results/UB. Tighten the check to require boolean element type (or read as the validated type and convert to bool).
    if (get_input_size() == 3 + 1 + 1 + 1) {
        OPENVINO_ASSERT(get_input_partial_shape(5).is_dynamic() || get_input_partial_shape(5).is_static() && get_input_partial_shape(5).rank().get_length() == 0,
                        "RaggedToDense: pad_right should be a boolean scalar.");

        OPENVINO_ASSERT(get_input_element_type(5).is_integral(),
                        "RaggedToDense: pad_right should be a boolean value.");
    }


Comment thread src/ragged_to_ragged.cpp
Comment on lines +22 to +35
// Check whether input 1 is a Constant node, otherwise fall back to the lower-bound tensor
PartialShape out_shape({ Dimension::dynamic() });
auto infer_from_int32_tensor = [&](const ov::Tensor& t) {
    if (t && t.get_element_type() == element::i32 && t.get_size() >= 1) {
        out_shape = PartialShape({ static_cast<Dimension::value_type>(t.data<const int32_t>()[0]) });
    }
};
if (const auto* first_dim_const = dynamic_cast<const Constant*>(get_input_node_ptr(1))) {
    auto vals = first_dim_const->cast_vector<int32_t>();
    if (!vals.empty()) {
        out_shape = PartialShape({ static_cast<Dimension::value_type>(vals[0]) });
    }
} else {
    infer_from_int32_tensor(get_input_tensor(1).get_lower_value());

Copilot AI Mar 23, 2026


In validate_and_infer_types(), inferring a static output length from get_input_tensor(1).get_lower_value() can over-constrain the output shape (e.g., lower bound may be 0/unknown while runtime first_dim_size is larger). This can lead to incorrect compile-time shapes and downstream validation failures. Safer approach: only set a concrete out_shape when input 1 is a Constant; otherwise keep it dynamic (or only specialize when lower==upper and is known).

Suggested change
- // Check whether input 1 is a Constant node, otherwise fall back to the lower-bound tensor
- PartialShape out_shape({ Dimension::dynamic() });
- auto infer_from_int32_tensor = [&](const ov::Tensor& t) {
-     if (t && t.get_element_type() == element::i32 && t.get_size() >= 1) {
-         out_shape = PartialShape({ static_cast<Dimension::value_type>(t.data<const int32_t>()[0]) });
-     }
- };
- if (const auto* first_dim_const = dynamic_cast<const Constant*>(get_input_node_ptr(1))) {
-     auto vals = first_dim_const->cast_vector<int32_t>();
-     if (!vals.empty()) {
-         out_shape = PartialShape({ static_cast<Dimension::value_type>(vals[0]) });
-     }
- } else {
-     infer_from_int32_tensor(get_input_tensor(1).get_lower_value());
+ // Infer output shape: only specialize when input 1 is a Constant; otherwise keep it dynamic.
+ PartialShape out_shape({ Dimension::dynamic() });
+ if (const auto* first_dim_const = dynamic_cast<const Constant*>(get_input_node_ptr(1))) {
+     auto vals = first_dim_const->cast_vector<int32_t>();
+     if (!vals.empty()) {
+         out_shape = PartialShape({ static_cast<Dimension::value_type>(vals[0]) });
+     }
Comment thread src/ragged_to_dense.cpp
Comment on lines 80 to +83

  // Suppose validate was called and set correct output shape
  // Take a target shape value for ragged dimension
- size_t target_dim = inputs[3].data<const int32_t>()[0];
+ const size_t target_dim = static_cast<size_t>(inputs[3].data<const int32_t>()[0]);


Copilot AI Mar 23, 2026


padding_size/target_dim is validated as any integral type (get_input_element_type(3).is_integral_number()), but evaluate() unconditionally reads it as int32_t. This is inconsistent and will misread i64/u64 inputs. Either restrict input 3 to i32 in validate_and_infer_types() or handle other integral element types in evaluate() (e.g., via a type switch and conversion to size_t).

Comment thread src/ragged_to_dense.cpp
Comment on lines +130 to 137
for (size_t i = 0; i < nelems; ++i) {
    const size_t data_len = static_cast<size_t>(ends[i] - begins[i]);
    size_t target_len = (std::min(data_len, target_dim) * (1 - m_pad_max_length) +
                         target_dim * m_pad_max_length);

    const auto begin = elems + elem_size * inner_elems * static_cast<size_t>(begins[i]);
    const auto end = begin + elem_size * inner_elems * target_len;
    out_elems = std::copy(begin, end, out_elems);

Copilot AI Mar 23, 2026


When m_pad_max_length is true, target_len becomes target_dim regardless of data_len. If data_len < target_dim, the subsequent std::copy(begin, end, ...) will read past the end of inputs[2] (OOB). This is especially likely when padding to a fixed max length (see pad_max_length usage). Consider always using copy_len = min(data_len, target_dim) for the copy, and use default-value padding to reach target_dim when data_len < target_dim.

Comment thread src/ragged_to_dense.cpp
Comment on lines +150 to 166
  for (size_t i = 0; i < nelems; ++i) {
      const size_t data_len = static_cast<size_t>(ends[i] - begins[i]);
      size_t target_len = (std::min(data_len, target_dim) * (1 - m_pad_max_length) +
                           target_dim * m_pad_max_length);
      const size_t pad_len = target_dim - target_len;

      for (size_t j = 0; j < pad_len; ++j) {
-         out_elems = std::copy(default_value, default_value + elem_size, out_elems);
+         fill_default_block(out_elems);
      }
      // fill actual values
-     auto begin = elems + elem_size * begins[i];
-     auto end = begin + elem_size * target_len;
+     const auto begin = elems + elem_size * inner_elems * static_cast<size_t>(begins[i]);
+     const auto end = begin + elem_size * inner_elems * target_len;
      out_elems = std::copy(begin, end, out_elems);

      // construct padding mask
-     out_mask = std::fill_n(out_mask, pad_len, char(0));
-     out_mask = std::fill_n(out_mask, target_dim - pad_len, char(1));
+     mask_fill(out_mask, pad_len * inner_elems, char(0));
+     mask_fill(out_mask, (target_dim - pad_len) * inner_elems, char(1));
  }

Copilot AI Mar 23, 2026


Same m_pad_max_length/target_len issue exists in the left-padding branch: if m_pad_max_length is true and data_len < target_dim, target_len becomes target_dim and the copy uses target_len, which can read past the input buffer. Align the copy length with available data_len and pad with default_value to reach target_dim.

Comment thread src/ragged_to_dense.cpp
Comment on lines +97 to +103
// Number of dense elements per one ragged element (trailing dense dimensions).
// For 1D data, this equals 1.
const auto& data_shape = inputs[2].get_shape();
size_t inner_elems = 1;
for (size_t idx = 1; idx < data_shape.size(); ++idx) {
    inner_elems *= data_shape[idx];
}

Copilot AI Mar 23, 2026


This change adds support for data tensors with rank > 1 (via inner_elems and appending trailing data_shape dims). There are existing tests for RaggedToDense, but they appear to only cover 1D data. Add test coverage for multi-rank data (e.g., values shaped [N, D] and [N, D1, D2]) and for both pad_right settings to ensure padding/mask behavior is correct across trailing dims.

@apaniukov apaniukov merged commit 6a170d2 into openvinotoolkit:master Mar 24, 2026
30 checks passed