Check duplicate issues.
- Checked for duplicates
Description
In tmva/sofie/inc/TMVA/ROperator_BasicBinary.hxx, the Func() method of both the Mod and FMod operator traits incorrectly calls std::pow(t1, t2) instead of performing the corresponding modulo operation. This looks like a copy-paste error from the Pow trait directly above them.
Func() is invoked during ONNX parsing when both inputs to a Mod/FMod node are ONNX Constant nodes: in that case SOFIE constant-folds the result at parse time using Func() rather than emitting runtime code, so the constant-folded output is completely wrong.
Note: Op() in both traits is correct (it generates t1 % t2 and std::fmod(t1, t2) respectively). Only Func() is broken.
Buggy code in tmva/sofie/inc/TMVA/ROperator_BasicBinary.hxx:

```cpp
// Lines 54-58: Mod trait
template <typename T>
struct BinaryOperatorTrait<T, Mod> {
   static const std::string Name() { return "Mod"; }
   static std::string Op(const std::string &t1, const std::string t2) { return "(" + t1 + " % " + t2 + ")"; }
   static T Func(T t1, T t2) { return std::pow(t1, t2); } // BUG: should be t1 % t2
};

// Lines 59-64: FMod trait
template <typename T>
struct BinaryOperatorTrait<T, FMod> {
   static const std::string Name() { return "FMod"; }
   static std::string Op(const std::string &t1, const std::string t2) { return "std::fmod(" + t1 + "," + t2 + ")"; }
   static T Func(T t1, T t2) { return std::pow(t1, t2); } // BUG: should be std::fmod(t1, t2)
};
```
Reproducer
Build an ONNX model where both inputs to a Mod node (with fmod=1) are ONNX Constant nodes (not initializers; they must be Constant nodes to trigger Func()):
```python
import numpy as np
import onnx
from onnx import TensorProto, helper, numpy_helper
import tempfile, os

X_data = np.array([10.0, 7.0, 5.0], dtype=np.float32)
D_data = np.array([3.0, 3.0, 3.0], dtype=np.float32)

X_const = helper.make_node('Constant', inputs=[], outputs=['X'],
                           value=numpy_helper.from_array(X_data, name='X'))
D_const = helper.make_node('Constant', inputs=[], outputs=['D'],
                           value=numpy_helper.from_array(D_data, name='D'))
fmod_node = helper.make_node('Mod', inputs=['X', 'D'], outputs=['Y'], fmod=1)

graph = helper.make_graph([X_const, D_const, fmod_node], 'test_fmod',
                          inputs=[],
                          outputs=[helper.make_tensor_value_info('Y', TensorProto.FLOAT, [3])])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid('', 13)])

tmpdir = tempfile.mkdtemp()
onnx_path = os.path.join(tmpdir, 'test_fmod.onnx')
onnx.save(model, onnx_path)

import ROOT
ROOT.gSystem.Load("libROOTTMVASofie")
parser = ROOT.TMVA.Experimental.SOFIE.RModelParser_ONNX()
rmodel = parser.Parse(onnx_path)
rmodel.Generate()
rmodel.OutputGenerated(os.path.join(tmpdir, 'test_fmod.hxx'))

ROOT.gInterpreter.Declare(f'#include "{tmpdir}/test_fmod.hxx"')
sess = ROOT.TMVA_SOFIE_test_fmod.Session(f'{tmpdir}/test_fmod.dat')
result = list(sess.infer())
print("Got:     ", result)          # [1000.0, 343.0, 125.0] (std::pow applied)
print("Expected:", [1.0, 1.0, 2.0]) # fmod([10, 7, 5], [3, 3, 3])
```

Run with: `python3 reproducer.py` (requires ROOT built with SOFIE and the onnx Python package).
Result:
- Expected: `[1.0, 1.0, 2.0]`
- Got (buggy): `[1000.0, 343.0, 125.0]` (i.e. `std::pow(10, 3)`, `std::pow(7, 3)`, `std::pow(5, 3)`)
ROOT version
6.39.01 (built from source)
Installation method
Built from source with -Dtmva-sofie=ON -Dtmva-pymva=ON -Dbuiltin_protobuf=ON
Operating system
Linux (Ubuntu 22.04)
Additional context
The trigger condition for this bug is narrow: it requires both inputs to the Mod/FMod node to be ONNX Constant nodes (not regular initializers). When that happens, however, SOFIE silently produces wrong output with no error or warning.