Skip to content

[tmva][sofie] Mod/FMod constant-folding uses std::pow instead of modulo #21734

@harz05

Description

@harz05

Check duplicate issues.

  • Checked for duplicates

Description

In tmva/sofie/inc/TMVA/ROperator_BasicBinary.hxx, the Func() method for both the Mod and FMod operator traits incorrectly calls std::pow(t1, t2) instead of the correct modulo operation. This is a copy-paste error from the Pow trait directly above them.

Func() is called during ONNX parsing when both inputs to a Mod/FMod node are ONNX Constant nodes,
SOFIE constant-folds the result at parse time using Func() rather than emitting runtime code. As a result, the constant-folded output is completely wrong.

Note: Op() in both traits is correct (generates t1 % t2 and std::fmod(t1, t2) respectively). Only Func() is broken.

Buggy code in tmva/sofie/inc/TMVA/ROperator_BasicBinary.hxx:

// Line 54-58: Mod trait
template <typename T>
struct BinaryOperatorTrait<T, Mod> {
   static const std::string Name() { return "Mod"; }
   static std::string Op(const std::string &t1, const std::string t2) { return "(" + t1 + " % " + t2 + ")"; }
   static T Func(T t1, T t2) { return std::pow(t1, t2); }  // BUG: should be t1 % t2
};

// Line 59-64: FMod trait
template <typename T>
struct BinaryOperatorTrait<T, FMod> {
   static const std::string Name() { return "FMod"; }
   static std::string Op(const std::string &t1, const std::string t2) { return "std::fmod(" + t1 + "," + t2 + ")"; }
   static T Func(T t1, T t2) { return std::pow(t1, t2); }  // BUG: should be std::fmod(t1, t2)
};

Reproducer

Build an ONNX model where both inputs to a Mod node (with fmod=1) are ONNX Constant nodes (not initializers, it must be Constant to trigger Func()):

import numpy as np
import onnx
from onnx import TensorProto, helper, numpy_helper
import tempfile, os

X_data = np.array([10.0, 7.0, 5.0], dtype=np.float32)
D_data = np.array([3.0, 3.0, 3.0], dtype=np.float32)

X_const = helper.make_node('Constant', inputs=[], outputs=['X'],
    value=numpy_helper.from_array(X_data, name='X'))
D_const = helper.make_node('Constant', inputs=[], outputs=['D'],
    value=numpy_helper.from_array(D_data, name='D'))
fmod_node = helper.make_node('Mod', inputs=['X', 'D'], outputs=['Y'], fmod=1)

graph = helper.make_graph([X_const, D_const, fmod_node], 'test_fmod',
    inputs=[],
    outputs=[helper.make_tensor_value_info('Y', TensorProto.FLOAT, [3])])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid('', 13)])

tmpdir = tempfile.mkdtemp()
onnx_path = os.path.join(tmpdir, 'test_fmod.onnx')
onnx.save(model, onnx_path)

import ROOT
ROOT.gSystem.Load("libROOTTMVASofie")
parser = ROOT.TMVA.Experimental.SOFIE.RModelParser_ONNX()
rmodel = parser.Parse(onnx_path)
rmodel.Generate()
rmodel.OutputGenerated(os.path.join(tmpdir, 'test_fmod.hxx'))

ROOT.gInterpreter.Declare(f'#include "{tmpdir}/test_fmod.hxx"')
sess = ROOT.TMVA_SOFIE_test_fmod.Session(f'{tmpdir}/test_fmod.dat')
result = list(sess.infer())
print("Got:     ", result)           # [1000.0, 343.0, 125.0]  (std::pow applied)
print("Expected:", [1.0, 1.0, 2.0])  # fmod([10,7,5], [3,3,3])

Run with: python3 reproducer.py (requires ROOT with SOFIE, onnx Python package)

Result:

  • Expected: [1.0, 1.0, 2.0]
  • Got (buggy): [1000.0, 343.0, 125.0] (std::pow(10,3), std::pow(7,3), std::pow(5,3))

ROOT version

6.39.01 (built from source)

Installation method

Built from source with -Dtmva-sofie=ON -Dtmva-pymva=ON -Dbuiltin_protobuf=ON

Operating system

Linux (Ubuntu 22.04)

Additional context

However, I think that the trigger condition of this bug is narrow. This bug triggers when both inputs to the Mod/FMod node are ONNX Constant nodes (not regular initializers). When that happens, SOFIE silently produces wrong output with no error.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions