Check duplicate issues.
- Checked for duplicates
Description
In tmva/sofie/inc/TMVA/ROperator_Conv.hxx and tmva/sofie/inc/TMVA/ROperator_Pool.hxx, the SAME_UPPER and SAME_LOWER auto-padding computation in DoShapeInference() is incorrect. The implementation has three bugs in the same block, which is copy-pasted across both files and already carries a // need to check this! comment in each:
Bug 1: wrong base padding formula
The code initializes each side with kernel / 2:
fAttrPads = {fAttrKernelShape[0] / 2, fAttrKernelShape[1] / 2,
fAttrKernelShape[0] / 2, fAttrKernelShape[1] / 2};
For even kernels (e.g. kernel = 4), this gives a total padding of 4 instead of the correct kernel - 1 = 3.
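For stride 1, the ONNX SAME convention requires a total padding of kernel - 1 per axis (more generally, total_pad = (ceil(in/stride) - 1)*stride + kernel - in). A minimal sketch contrasting the spec formula with the kernel/2-per-side initialization above (function names here are illustrative, not SOFIE's):

```python
import math

def total_pad_spec(in_size, kernel, stride):
    # ONNX SAME_*: output size is ceil(in / stride)
    out = math.ceil(in_size / stride)
    return max((out - 1) * stride + kernel - in_size, 0)

def total_pad_buggy(kernel):
    # kernel/2 on each side, as in the current SOFIE code (stride ignored)
    return 2 * (kernel // 2)

for k in (3, 4, 5):
    print(k, total_pad_spec(5, k, 1), total_pad_buggy(k))
# kernel=3: spec 2, buggy 2 (agree); kernel=4: spec 3, buggy 4 (disagree)
```

The two agree for odd kernels, which is why the bug only shows up with even kernel sizes.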
Bug 2: extra-increment condition is inverted
if (fAttrKernelShape[0] % 2 == 1) { ... } // fires for odd kernels
Asymmetric padding is needed when total_pad is odd, which (for stride 1) happens when the kernel is even, since kernel - 1 is then odd. The condition should therefore fire for even kernels, not odd ones.
Bug 3: SAME_UPPER and SAME_LOWER directions are swapped
(fAttrAutopad == "SAME_UPPER") ? fAttrPads[0]++ : fAttrPads[i1]++;
As per the ONNX spec, SAME_UPPER places the extra padding at the end and SAME_LOWER at the beginning, but here the code does the opposite.
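The spec-correct split of an odd total padding can be sketched as follows (a standalone illustration; the function name is hypothetical):

```python
def split_same_padding(total_pad, auto_pad):
    # ONNX: SAME_UPPER puts the extra cell at the end of the axis,
    # SAME_LOWER at the beginning.
    half = total_pad // 2
    if auto_pad == "SAME_UPPER":
        return half, total_pad - half   # (pad_begin, pad_end)
    elif auto_pad == "SAME_LOWER":
        return total_pad - half, half
    raise ValueError(f"unexpected auto_pad: {auto_pad}")

print(split_same_padding(3, "SAME_UPPER"))  # (1, 2)
print(split_same_padding(3, "SAME_LOWER"))  # (2, 1)
```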
Reproducer
import numpy as np
import onnx
from onnx import TensorProto, helper
import onnxruntime as rt
import tempfile, os
# Conv2D: input (1,1,5,5), kernel (1,1,4,4), SAME_UPPER, stride=1
# Correct output shape: (1,1,5,5). SOFIE produces (1,1,6,6).
input_data = np.zeros((1, 1, 5, 5), dtype=np.float32)
input_data[0, 0, 0, 0] = 1.0
weight_data = np.zeros((1, 1, 4, 4), dtype=np.float32)
weight_data[0, 0, 0, 0] = 1.0
graph = helper.make_graph(
    [helper.make_node('Conv', inputs=['X', 'W'], outputs=['Y'],
                      auto_pad='SAME_UPPER', strides=[1, 1])],
    'test_same_upper',
    inputs=[helper.make_tensor_value_info('X', TensorProto.FLOAT, [1, 1, 5, 5]),
            helper.make_tensor_value_info('W', TensorProto.FLOAT, [1, 1, 4, 4])],
    outputs=[helper.make_tensor_value_info('Y', TensorProto.FLOAT, None)])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid('', 11)])
model.ir_version = 7
tmpdir = tempfile.mkdtemp()
onnx_path = os.path.join(tmpdir, 'test_same_upper.onnx')
onnx.save(model, onnx_path)
# onnxruntime reference
rt_out = rt.InferenceSession(onnx_path).run(None, {'X': input_data, 'W': weight_data})[0]
print("onnxruntime output shape:", rt_out.shape) # (1, 1, 5, 5)
# SOFIE
import ROOT
ROOT.gSystem.Load('libROOTTMVASofieParser')
parser = ROOT.TMVA.Experimental.SOFIE.RModelParser_ONNX()
rmodel = parser.Parse(onnx_path)
rmodel.Generate()
rmodel.OutputGenerated(os.path.join(tmpdir, 'test_same_upper.hxx'))
ROOT.gInterpreter.Declare(f'#include "{tmpdir}/test_same_upper.hxx"')
sess = ROOT.TMVA_SOFIE_test_same_upper.Session(f'{tmpdir}/test_same_upper.dat')
flat_x = np.ascontiguousarray(input_data.flatten(), dtype=np.float32)
flat_w = np.ascontiguousarray(weight_data.flatten(), dtype=np.float32)
sofie_out = np.array(sess.infer(flat_x, flat_w))
print("SOFIE output size:", sofie_out.size) # 36 (wrong, should be 25)
print("Shape mismatch:", sofie_out.size != rt_out.size)

Run with: python3 reproducer.py (requires ROOT with SOFIE, and the onnx and onnxruntime Python packages)
Result: onnxruntime reports output shape (1, 1, 5, 5), i.e. 25 elements, while SOFIE produces 36 elements (6x6), which is the wrong shape.
ROOT version
6.39.01
Installation method
Built from source with -Dtmva-sofie=ON -Dtmva-pymva=ON -Dbuiltin_protobuf=ON
Operating system
Linux (Ubuntu 22.04)
Additional context
Additionally, fAttrStrides is initialized after the padding block in both files, so when SAME_UPPER/SAME_LOWER padding is computed the strides vector may still be empty and is read uninitialized. The fix should move the strides initialization before the padding block.
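Putting the three fixes together, the intended per-axis computation can be sketched as follows (a Python sketch of the corrected C++ logic, assuming strides are initialized before the padding block; the function name is illustrative, not SOFIE's API):

```python
def same_auto_pad(in_size, kernel, stride, auto_pad):
    # Stride must be known here (i.e. fAttrStrides initialized first).
    out = -(-in_size // stride)                       # ceil(in / stride)
    total = max((out - 1) * stride + kernel - in_size, 0)
    half = total // 2
    if auto_pad == "SAME_UPPER":                      # extra cell at the end
        begin, end = half, total - half
    else:                                             # SAME_LOWER: extra at the beginning
        begin, end = total - half, half
    return begin, end

# kernel=4, stride=1, input=5: total pad is 3, split asymmetrically
print(same_auto_pad(5, 4, 1, "SAME_UPPER"))  # (1, 2)
print(same_auto_pad(5, 4, 1, "SAME_LOWER"))  # (2, 1)
```

With this logic the reproducer's Conv yields a (1, 1, 5, 5) output, matching onnxruntime.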