Skip to content

[tmva][sofie] Conv and Pool SAME_UPPER/SAME_LOWER auto-padding computes wrong output shape #21737

@harz05

Description

@harz05

Check duplicate issues.

  • Checked for duplicates

Description

In tmva/sofie/inc/TMVA/ROperator_Conv.hxx and tmva/sofie/inc/TMVA/ROperator_Pool.hxx, the SAME_UPPER and SAME_LOWER auto-padding computation in DoShapeInference() is incorrect. I could find that the implementation has three bugs in the same block (which is copy-pasted across both files, each with an existing // need to check this! comment):

Bug 1: wrong base padding formula

The code initializes each side with kernel / 2:

fAttrPads = {fAttrKernelShape[0] / 2, fAttrKernelShape[1] / 2,
             fAttrKernelShape[0] / 2, fAttrKernelShape[1] / 2};

For even kernels (e.g. kernel=4), this gives total_pad as 4 instead of the correct kernel - 1 = 3.

Bug 2: extra-increment condition is inverted

if (fAttrKernelShape[0] % 2 == 1) { ... }  // fires for odd kernels

Asymmetric padding is needed when total_pad is odd, which happens when the kernel is even i.e. kernel - 1 is odd.

Bug 3: SAME_UPPER and SAME_LOWER directions are swapped

(fAttrAutopad == "SAME_UPPER") ? fAttrPads[0]++ : fAttrPads[i1]++;

As per the ONNX spec, SAME_UPPER places the extra padding at the end and SAME_LOWER at the beginning, but here the code does the opposite.

Reproducer

import numpy as np
import onnx
from onnx import TensorProto, helper
import onnxruntime as rt
import tempfile, os

# Conv2D: input (1,1,5,5), kernel (1,1,4,4), SAME_UPPER, stride=1
# Correct output shape: (1,1,5,5). SOFIE produces (1,1,6,6).
input_data = np.zeros((1, 1, 5, 5), dtype=np.float32)
input_data[0, 0, 0, 0] = 1.0
weight_data = np.zeros((1, 1, 4, 4), dtype=np.float32)
weight_data[0, 0, 0, 0] = 1.0

graph = helper.make_graph(
    [helper.make_node('Conv', inputs=['X', 'W'], outputs=['Y'],
        auto_pad='SAME_UPPER', strides=[1, 1])],
    'test_same_upper',
    inputs=[helper.make_tensor_value_info('X', TensorProto.FLOAT, [1, 1, 5, 5]),
            helper.make_tensor_value_info('W', TensorProto.FLOAT, [1, 1, 4, 4])],
    outputs=[helper.make_tensor_value_info('Y', TensorProto.FLOAT, None)])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid('', 11)])
model.ir_version = 7

tmpdir = tempfile.mkdtemp()
onnx_path = os.path.join(tmpdir, 'test_same_upper.onnx')
onnx.save(model, onnx_path)

# onnxruntime reference
rt_out = rt.InferenceSession(onnx_path).run(None, {'X': input_data, 'W': weight_data})[0]
print("onnxruntime output shape:", rt_out.shape)   # (1, 1, 5, 5)

# SOFIE
import ROOT
ROOT.gSystem.Load('libROOTTMVASofieParser')
parser = ROOT.TMVA.Experimental.SOFIE.RModelParser_ONNX()
rmodel = parser.Parse(onnx_path)
rmodel.Generate()
rmodel.OutputGenerated(os.path.join(tmpdir, 'test_same_upper.hxx'))
ROOT.gInterpreter.Declare(f'#include "{tmpdir}/test_same_upper.hxx"')
sess = ROOT.TMVA_SOFIE_test_same_upper.Session(f'{tmpdir}/test_same_upper.dat')

flat_x = np.ascontiguousarray(input_data.flatten(), dtype=np.float32)
flat_w = np.ascontiguousarray(weight_data.flatten(), dtype=np.float32)
sofie_out = np.array(sess.infer(flat_x, flat_w))
print("SOFIE output size:", sofie_out.size)   # 36 (wrong, should be 25)
print("Shape mismatch:", sofie_out.size != rt_out.size)

Run with: python3 reproducer.py (requires ROOT with SOFIE, onnx, onnxruntime Python packages)
Result: we get onnxruntime output shape as (1,1,5,5) which has 25 elements. SOFIE output size is 36(6x6) which is incorrect shape.

ROOT version

6.39.01

Installation method

Built from source with -Dtmva-sofie=ON -Dtmva-pymva=ON -Dbuiltin_protobuf=ON

Operating system

Linux (Ubuntu 22.04)

Additional context

Additionally, fAttrStrides is initialized after the padding block in both files, so when SAME_UPPER/SAME_LOWER padding is computed, strides may be empty (defaulting to uninitialized values). The fix should move strides initialization before the padding block.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions