HMLP: High-performance Machine Learning Primitives
Public Member Functions
inline void operator() (kernel_s< T, T > *kernel, int k, int nrhs, T *u, T *a, T *a2, T *b, T *b2, T *w, T *c, int ldc, aux_s< T, T, T, T > *aux) const
Uses an MR-by-NR static buffer.
Performs the rank-k update.
Accumulates the previous rank-k update.
Performs the matrix-vector multiplication.
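
The four steps listed above can be illustrated with a minimal reference sketch. It is not the HMLP implementation. The function name `ref_microkernel_sketch`, the tile sizes `MR = NR = 4`, the packed layout of `a` and `b`, the column-major layout of `c`, and the `accumulate` flag are all assumptions made for illustration. The sketch also ignores `kernel`, `nrhs`, `a2`, `b2`, and `aux`, whose roles are not described on this page.

```cpp
// Hypothetical tile sizes; the real MR and NR are chosen by the library.
constexpr int MR = 4;
constexpr int NR = 4;

// Sketch only. Assumed layouts (not confirmed by the documentation):
//   a[p * MR + i] : element i of the p-th packed slice of A (k slices)
//   b[p * NR + j] : element j of the p-th packed slice of B (k slices)
//   c[j * ldc + i]: column-major MR-by-NR block from a previous update
//   w[j], u[i]    : input vector of length NR, output vector of length MR
template <typename T>
void ref_microkernel_sketch(int k, const T *a, const T *b, const T *w,
                            T *u, const T *c, int ldc, bool accumulate)
{
    // Step 1: use an MR-by-NR fixed-size buffer (column-major, zeroed).
    T buf[MR * NR] = {};

    // Step 2: rank-k update, buf += A^T * B summed over the k slices.
    for (int p = 0; p < k; ++p)
        for (int j = 0; j < NR; ++j)
            for (int i = 0; i < MR; ++i)
                buf[j * MR + i] += a[p * MR + i] * b[p * NR + j];

    // Step 3: accumulate the previous rank-k update stored in c.
    if (accumulate)
        for (int j = 0; j < NR; ++j)
            for (int i = 0; i < MR; ++i)
                buf[j * MR + i] += c[j * ldc + i];

    // Step 4: matrix-vector multiplication, u += buf * w.
    for (int j = 0; j < NR; ++j)
        for (int i = 0; i < MR; ++i)
            u[i] += buf[j * MR + i] * w[j];
}
```

Keeping the MR-by-NR block in a small local buffer lets the three phases run on the same tile without writing intermediate results back to `c`. Only the length-MR vector `u` is updated in memory.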