Add ONNX rewriter #82
# The number of Matmul+Gemm has to be less compared to the model pre-transformation
# This is not zero since there are matmul that are not linear layers so they are not replaced
# and some linear layers can be excluded from quantization
assert matmul_gemm_counter <= original_matmul_gemm_counter
A better test would check exactly how many MatMul and Gemm nodes we expect before and after the transformations, and likewise for MatMulInteger.
Given all the other changes in progress and the new tests we will be adding, we could defer this stricter check until things are more stable.
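As a rough illustration of the stricter check described above, the sketch below counts nodes per op type before and after a rewrite. The helper name `count_ops` and the stand-in node lists are hypothetical and not taken from this PR; with the `onnx` package, the nodes would presumably come from `onnx.load(path).graph.node`, where each node exposes an `op_type` field.

```python
from collections import Counter
from types import SimpleNamespace


def count_ops(nodes, op_types=("MatMul", "Gemm", "MatMulInteger")):
    """Count how many nodes of each listed op type appear in `nodes`.

    Hypothetical helper: `nodes` is any iterable of objects with an
    `op_type` attribute, such as the nodes of an ONNX graph.
    """
    counts = Counter(node.op_type for node in nodes)
    return {op: counts.get(op, 0) for op in op_types}


# Stand-in nodes so the sketch runs without a model file (assumption:
# a real test would read them from the exported ONNX graphs instead).
original_nodes = [
    SimpleNamespace(op_type=t) for t in ("MatMul", "Gemm", "MatMul", "Add")
]
rewritten_nodes = [
    SimpleNamespace(op_type=t)
    for t in ("MatMulInteger", "MatMulInteger", "MatMul", "Add")
]

before = count_ops(original_nodes)
after = count_ops(rewritten_nodes)

# Exact expectations instead of a loose "<=" comparison.
assert before == {"MatMul": 2, "Gemm": 1, "MatMulInteger": 0}
assert after == {"MatMul": 1, "Gemm": 0, "MatMulInteger": 2}
```

The exact expected counts would have to be worked out per model, which is part of why deferring this until the rewriter stabilizes seems reasonable.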
It seems that
@Giuseppe5 for tests, could you use Python >= 3.9? It should work with this
Python 3.8 is still widely used (see https://pypistats.org/packages/transformers), although its EOL is in a few months. We thus probably don't want to have
# The number of Matmul+Gemm has to be less compared to the model pre-transformation
# This is not zero since there are matmul that are not linear layers so they are not replaced
# and some linear layers can be excluded from quantization
assert matmul_gemm_counter <= original_matmul_gemm_counter
Suggested change:
- assert matmul_gemm_counter <= original_matmul_gemm_counter
+ self.assertTrue(matmul_gemm_counter <= original_matmul_gemm_counter)
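One possible variant of the suggestion, not proposed in this thread: `unittest.TestCase` also provides `assertLessEqual`, which reports both values on failure, whereas `assertTrue` only reports that the expression was false. A minimal sketch, with a hypothetical test class name and made-up counter values:

```python
import unittest


class MatmulGemmCountTest(unittest.TestCase):
    def test_count_does_not_increase(self):
        # Made-up values for illustration; the real test would compute
        # them from the model before and after the rewrite.
        original_matmul_gemm_counter = 12
        matmul_gemm_counter = 4
        # On failure this prints both operands, e.g. "13 not less than or equal to 12".
        self.assertLessEqual(matmul_gemm_counter, original_matmul_gemm_counter)


# Run the single test case programmatically so the sketch is self-contained.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MatmulGemmCountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```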
Depends on #110