Performance enhancement via matmul #31

@rafaelha

Description

Scalar evaluation is essentially done via binary matmuls. This is a bit hard to see in the code, but it is effectively what happens after the VMAP transformation.

Matmul kernels are generally not optimized for binary inputs, so one could consider using floating-point matmul instead.
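A minimal sketch of the floating-point route, assuming that "binary matmul" here means a matrix product over GF(2), i.e. an integer product reduced mod 2. The function names `binary_matmul_int` and `binary_matmul_float` are hypothetical and not taken from the codebase. The idea relies on float32 representing integers exactly up to 2^24, so the result should stay exact as long as the inner dimension is below that bound.

```python
import jax
import jax.numpy as jnp


def binary_matmul_int(a, b):
    """Reference: integer matmul, then reduce mod 2 (assumed GF(2) semantics)."""
    return (a.astype(jnp.int32) @ b.astype(jnp.int32)) % 2


def binary_matmul_float(a, b):
    """Same product routed through a float32 matmul kernel.

    Each entry of the product is an integer of at most k (the inner dimension),
    which float32 holds exactly for k < 2**24.
    """
    prod = jnp.matmul(
        a.astype(jnp.float32),
        b.astype(jnp.float32),
        precision=jax.lax.Precision.HIGHEST,  # avoid reduced-precision modes
    )
    return jnp.mod(prod, 2.0).astype(jnp.uint8)


# Hypothetical usage on random 0/1 matrices.
ka, kb = jax.random.split(jax.random.PRNGKey(0))
a = jax.random.bernoulli(ka, 0.5, (64, 128)).astype(jnp.uint8)
b = jax.random.bernoulli(kb, 0.5, (128, 32)).astype(jnp.uint8)
c = binary_matmul_float(a, b)
```

Whether this is actually faster than the current path depends on the backend and the matrix shapes, so it would need to be measured.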

An additional strategy is bit-packing, which would also save a significant amount of memory.
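A rough sketch of what bit-packing could look like, again assuming GF(2) semantics for the binary matmul. Bits are packed eight per byte along the inner dimension with `jnp.packbits`, and each output entry is the parity of the popcount of the AND of two packed rows. The helper names `pack_bits` and `packed_binary_matmul` are hypothetical. This version materializes an `(m, n, k/8)` intermediate, so it only illustrates the idea and is not a tuned kernel.

```python
import jax
import jax.numpy as jnp


def pack_bits(x):
    """Pack a 0/1 matrix along its last axis, 8 bits per uint8 (zero-padded)."""
    return jnp.packbits(x.astype(jnp.uint8), axis=-1)


def packed_binary_matmul(a_packed, bt_packed):
    """GF(2) product from packed rows of A and packed rows of B transposed.

    a_packed:  (m, ceil(k / 8)) uint8
    bt_packed: (n, ceil(k / 8)) uint8
    returns:   (m, n) uint8 with entries in {0, 1}
    """
    # AND selects positions where both bits are 1; zero padding contributes nothing.
    anded = a_packed[:, None, :] & bt_packed[None, :, :]
    counts = jax.lax.population_count(anded).astype(jnp.uint32)
    # Parity of the total number of overlapping ones is the dot product mod 2.
    return (jnp.sum(counts, axis=-1) % 2).astype(jnp.uint8)


# Hypothetical usage: the packed operands take 1/8 of the uint8 storage.
ka, kb = jax.random.split(jax.random.PRNGKey(0))
a = jax.random.bernoulli(ka, 0.5, (16, 100)).astype(jnp.uint8)
b = jax.random.bernoulli(kb, 0.5, (100, 12)).astype(jnp.uint8)
c = packed_binary_matmul(pack_bits(a), pack_bits(b.T))
```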

Some profiling should definitely be done for this ticket.
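As a starting point for the profiling, here is a small timing harness sketch. It compares a jitted integer matmul against a jitted float32 matmul on random 0/1 inputs. The helper `time_fn` and the matrix sizes are assumptions made for illustration. It calls `block_until_ready()` so that JAX's asynchronous dispatch does not hide the actual compute time, and it runs one warm-up call so compilation is excluded from the measurement.

```python
import time

import jax
import jax.numpy as jnp


def time_fn(fn, *args, repeats=10):
    """Average wall-clock seconds per call, excluding the first (compiling) call."""
    fn(*args).block_until_ready()  # warm-up: triggers jit compilation
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args).block_until_ready()
    return (time.perf_counter() - start) / repeats


@jax.jit
def matmul_int(a, b):
    return (a.astype(jnp.int32) @ b.astype(jnp.int32)) % 2


@jax.jit
def matmul_float(a, b):
    prod = a.astype(jnp.float32) @ b.astype(jnp.float32)
    return jnp.mod(prod, 2.0).astype(jnp.uint8)


ka, kb = jax.random.split(jax.random.PRNGKey(0))
a = jax.random.bernoulli(ka, 0.5, (256, 256)).astype(jnp.uint8)
b = jax.random.bernoulli(kb, 0.5, (256, 256)).astype(jnp.uint8)

t_int = time_fn(matmul_int, a, b)
t_float = time_fn(matmul_float, a, b)
print(f"int32 matmul:   {t_int * 1e3:.3f} ms")
print(f"float32 matmul: {t_float * 1e3:.3f} ms")
```

The shapes used here are placeholders; meaningful numbers would need the shapes and batch sizes that actually occur after the VMAP transformation, on the target hardware.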
