Is your feature request related to a problem? Please describe.
There are other speedup methods for transformers, such as FasterTransformer.
Describe the solution you'd like
Could you describe how your method compares to FasterTransformer and whether the two can be combined, ideally with an example?