40b, 1b models have significantly lower performance when `use_fp8_input_projections` is set to false