HMLP: High-performance Machine Learning Primitives
This kernel takes opkernel, op1 and op2 to implement an MR-by-NR GKMM operation.
#include <fused_mrxnr.hpp>
Public Member Functions | |
void | operator() (int k, TA *a, TB *b, TC *c, int ldc, TV *v, int ldv, aux_s< TA, TB, TC, TV > *aux) const |
void | operator() (int k, TA *a, TB *b, TV *v, int rs_c, int cs_c, aux_s< TA, TB, TC, TV > *aux) const |
Public Attributes | |
OPKERNEL | opkernel |
OP1 | op1 |
OP2 | op2 |
TV | initV |
Static Public Attributes | |
static const size_t | mr = MR |
static const size_t | nr = NR |
static const size_t | pack_mr = MR |
static const size_t | pack_nr = NR |
static const size_t | align_size = 32 |
This kernel takes opkernel, op1 and op2 to implement an MR-by-NR GKMM operation.
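The following is a minimal, self-contained sketch of how three functors of this kind might compose into an MR-by-NR micro-kernel. It is not the HMLP implementation. The struct name `fused_mrxnr_sketch`, the helper `sketch_example`, the `Square` functor, the simplified `operator()` signature (no `aux_s`, no `v`/`ldv`), and the packed panel layout are all hypothetical. It also assumes that op2 combines one element of a with one element of b, that op1 accumulates those results starting from initV, and that opkernel is applied to each accumulated value before it is stored in c. This page does not confirm those roles.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical stand-in for the documented kernel, NOT the HMLP code.
// Assumed roles: op2 = element-wise "multiply", op1 = "accumulate",
// opkernel = transformation applied to each accumulated entry.
template<int MR, int NR, typename OPKERNEL, typename OP1, typename OP2, typename T>
struct fused_mrxnr_sketch
{
  static const std::size_t mr = MR;
  static const std::size_t nr = NR;

  OPKERNEL opkernel;
  OP1 op1;
  OP2 op2;
  T initV;

  // a: packed k-by-MR panel (a[p * MR + i]), assumed layout.
  // b: packed k-by-NR panel (b[p * NR + j]), assumed layout.
  // c: MR-by-NR output tile, column-major with leading dimension ldc.
  void operator()(int k, const T *a, const T *b, T *c, int ldc) const
  {
    assert(k >= 0 && ldc >= MR);

    // Accumulators for the MR-by-NR tile, initialized to initV.
    T v[MR * NR];
    for (int x = 0; x < MR * NR; ++x) v[x] = initV;

    // Rank-k update using op1/op2 in place of + and *.
    for (int p = 0; p < k; ++p)
      for (int j = 0; j < NR; ++j)
        for (int i = 0; i < MR; ++i)
          v[j * MR + i] = op1(v[j * MR + i], op2(a[p * MR + i], b[p * NR + j]));

    // Fused step: transform each accumulated value while writing it to c.
    for (int j = 0; j < NR; ++j)
      for (int i = 0; i < MR; ++i)
        c[j * ldc + i] = opkernel(v[j * MR + i]);
  }
};

// Illustrative kernel function: squares each accumulated inner product.
struct Square
{
  double operator()(double x) const { return x * x; }
};

// Runs the sketch on a 2-by-2 tile with k = 2 and returns the tile.
inline std::array<double, 4> sketch_example()
{
  fused_mrxnr_sketch<2, 2, Square, std::plus<double>, std::multiplies<double>, double>
      kernel{Square{}, std::plus<double>{}, std::multiplies<double>{}, 0.0};

  const double a[4] = {1.0, 2.0, 3.0, 4.0};  // rows p = 0, 1 of the a panel
  const double b[4] = {5.0, 6.0, 7.0, 8.0};  // rows p = 0, 1 of the b panel
  std::array<double, 4> c{};

  kernel(2, a, b, c.data(), 2);
  return c;
}
```

With `std::plus` and `std::multiplies` the accumulation reduces to an ordinary inner product, so each entry of the tile is the square of a dot product. Swapping in other functors (for example a minimum for op1 and a sum for op2) changes the semiring without touching the loop structure, which appears to be the motivation for making all three operations template parameters.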