`?GEADD` has virtually no utility over `?AXPBY`, which is itself unoptimizable (i.e., the equivalent loops, optimized by a compiler, will perform at least as well in all cases).
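For reference, here is a minimal sketch of the kind of plain loop meant above. It assumes `?GEADD` computes `C := alpha*A + beta*C` on column-major matrices; `naive_dgeadd` is a hypothetical name used only for illustration, not an OpenBLAS function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference loop (not part of OpenBLAS):
 * C := alpha*A + beta*C for m-by-n column-major matrices.
 * lda and ldc are the leading dimensions of A and C. */
static void naive_dgeadd(size_t m, size_t n,
                         double alpha, const double *a, size_t lda,
                         double beta, double *c, size_t ldc)
{
    assert(lda >= m && ldc >= m);
    for (size_t j = 0; j < n; ++j) {
        const double *aj = a + j * lda;  /* column j of A */
        double *cj = c + j * ldc;        /* column j of C */
        /* Unit-stride inner loop: a compiler can vectorize this directly. */
        for (size_t i = 0; i < m; ++i)
            cj[i] = alpha * aj[i] + beta * cj[i];
    }
}
```

Both operands are read with unit stride, so there is little left for a hand-tuned kernel to improve on.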
Both `appleblas_?geadd` and `cublas?geam` support transposition, which is useful in part because transposition is nontrivial to optimize.
`cblas_?omatcopy`, which is inspired by MKL, supports transposition but not the general accumulation that those routines offer.
It would be nice to have a GEAM-style routine in OpenBLAS.
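To make the request concrete, here is a rough reference sketch of GEAM-style semantics, `C := alpha*op(A) + beta*op(B)`, where `op(X)` is either `X` or its transpose. The names `ref_dgeam` and `op_at`, the integer transpose flags, and the argument order are all assumptions for illustration; this is not a proposed OpenBLAS signature, and it assumes column-major storage with `C` not aliasing `A` or `B`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: entry (i, j) of op(X) for a column-major X.
 * trans != 0 selects the transpose, i.e. X(j, i). */
static double op_at(const double *x, size_t ldx, int trans,
                    size_t i, size_t j)
{
    return trans ? x[j + i * ldx] : x[i + j * ldx];
}

/* Hypothetical reference routine (not an existing OpenBLAS API):
 * C := alpha*op(A) + beta*op(B), where C is m-by-n and column-major.
 * C must not overlap A or B in this sketch. */
static void ref_dgeam(int transa, int transb, size_t m, size_t n,
                      double alpha, const double *a, size_t lda,
                      double beta, const double *b, size_t ldb,
                      double *c, size_t ldc)
{
    assert(ldc >= m);
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < m; ++i)
            c[i + j * ldc] = alpha * op_at(a, lda, transa, i, j)
                           + beta  * op_at(b, ldb, transb, i, j);
}
```

When a transpose flag is set, the corresponding operand is read with stride `ldx` in the inner loop, which is cache-unfriendly for large matrices. An optimized version would presumably tile the transposed accesses, and that is the part that is hard to get from a compiler alone.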