Commit b1ff83b

hexagon: further optimization and tuning of matmul and dot kernels (ggml-org#19407)
* ggml-hexagon: implement 2x2 matmul kernel
* hexmm: implement vec_dot_rx2x2 for Q8_0 and MXFP4
* hexagon: fix editor config failures
* hexagon: refactor matmul ops to use context struct and remove wrappers
  Also implement vec_dot_f16 2x2
* hexagon: refactor dyn quantizers to use mmctx
* hexagon: remove mm fastdiv from op_ctx
* hexagon: refactor matmul entry point to reduce code duplication

---------

Co-authored-by: Trivikram Reddy <tamarnat@qti.qualcomm.com>
1 parent 4ae1b75 commit b1ff83b
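
For readers unfamiliar with the term, a "2x2" dot kernel typically produces a 2x2 block of outputs per pass, so each loaded element of the two rows and two columns is reused twice. The scalar sketch below only illustrates that idea. `dot_2x2_f32` is a hypothetical helper written for this note; it is not the commit's `vec_dot_rx2x2`, whose signature, HVX usage, and quantized formats (Q8_0, MXFP4) are not shown on this page.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical illustration only (not the commit's vec_dot_rx2x2).
// Computes out[i][j] = dot(a_i, b_j) for two rows (a0, a1) and two
// columns (b0, b1), all of length n, in a single pass over k.
static void dot_2x2_f32(size_t n,
                        const float * a0, const float * a1,
                        const float * b0, const float * b1,
                        float out[2][2]) {
    assert(a0 && a1 && b0 && b1 && out);

    float s00 = 0.0f, s01 = 0.0f, s10 = 0.0f, s11 = 0.0f;

    for (size_t k = 0; k < n; k++) {
        // Four loads feed four multiply-accumulates; a 1x1 kernel would
        // need eight loads for the same work.
        const float x0 = a0[k], x1 = a1[k];
        const float y0 = b0[k], y1 = b1[k];

        s00 += x0 * y0;
        s01 += x0 * y1;
        s10 += x1 * y0;
        s11 += x1 * y1;
    }

    out[0][0] = s00;
    out[0][1] = s01;
    out[1][0] = s10;
    out[1][1] = s11;
}
```

A vectorized version would presumably keep the four accumulators in vector registers and reduce them at the end; that part is an assumption and is not confirmed by the diff shown here.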

2 files changed

Lines changed: 847 additions & 671 deletions

ggml/src/ggml-hexagon/htp/htp-ops.h

Lines changed: 0 additions & 13 deletions
@@ -64,25 +64,12 @@ struct htp_ops_context {
     struct fastdiv_values broadcast_rv2;
     struct fastdiv_values broadcast_rv3;
 
-    struct fastdiv_values mm_div_ne12_ne1; // fastdiv values for ne12 * ne1
-    struct fastdiv_values mm_div_ne1; // fastdiv values for ne1
-    struct fastdiv_values mm_div_r2; // fastdiv values for ne12 / ne02
-    struct fastdiv_values mm_div_r3; // fastdiv values for ne13 / ne03
-
     struct fastdiv_values set_rows_div_ne12; // fastdiv values for ne12
     struct fastdiv_values set_rows_div_ne11; // fastdiv values for ne11
 
     struct fastdiv_values get_rows_div_ne10; // fastdiv values for ne10
     struct fastdiv_values get_rows_div_ne10_ne11; // fastdiv values for ne10 * ne11
 
-    struct fastdiv_values cpy_div_ne01; // fastdiv values for ne01
-    struct fastdiv_values cpy_div_ne02; // fastdiv values for ne02
-    struct fastdiv_values cpy_div_ne03; // fastdiv values for ne03
-
-    struct fastdiv_values cpy_rshp_div_n0; // fastdiv values for ne00
-    struct fastdiv_values cpy_rshp_div_n1n0; // fastdiv values for ne00*ne01
-    struct fastdiv_values cpy_rshp_div_n2n1n0; // fastdiv values for ne00*ne01*ne02
-
     uint32_t flags;
 };
 
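
The fields in this struct cache precomputed "fastdiv" constants, which let a kernel divide by a loop-invariant dimension (for example `ne1` or `ne12 * ne1`) with a multiply and a shift instead of a hardware divide. The definition of `struct fastdiv_values` is not part of this diff, so the following is only a sketch under assumptions: `struct fastdiv_sketch` and its helpers are hypothetical names, and the multiply-and-shift scheme shown is the common one for 32-bit unsigned division, not necessarily the one the backend uses.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical stand-in for struct fastdiv_values (real layout not shown in this diff).
struct fastdiv_sketch {
    uint32_t mp;  // precomputed "magic" multiplier
    uint32_t l;   // shift amount, ceil(log2(d))
    uint32_t d;   // original divisor, kept for the modulo helper
};

// Precompute the constants once per divisor (e.g. once per op, outside the hot loop).
static struct fastdiv_sketch fastdiv_sketch_init(uint32_t d) {
    assert(d != 0);

    uint32_t l = 0;
    while (l < 32 && ((uint64_t) 1 << l) < d) {
        l++;
    }

    // mp = floor(2^32 * (2^l - d) / d) + 1, which always fits in 32 bits.
    const uint64_t mp = (((uint64_t) 1 << 32) * (((uint64_t) 1 << l) - d)) / d + 1;

    struct fastdiv_sketch v = { (uint32_t) mp, l, d };
    return v;
}

// n / d using one 32x32 -> 64 multiply, one add, and one shift.
static uint32_t fastdiv_sketch_div(uint32_t n, const struct fastdiv_sketch * v) {
    const uint32_t hi = (uint32_t) (((uint64_t) n * v->mp) >> 32);
    // The sum is formed in 64 bits so it cannot overflow for large n.
    return (uint32_t) (((uint64_t) hi + n) >> v->l);
}

// n % d derived from the quotient.
static uint32_t fastdiv_sketch_mod(uint32_t n, const struct fastdiv_sketch * v) {
    return n - fastdiv_sketch_div(n, v) * v->d;
}
```

With constants like these, a flat row index can be split into batch and row coordinates inside a kernel loop without issuing integer divides, which is the kind of use the removed `mm_div_*` comments describe.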
