Matrix multiplication: Armadillo vs Eigen3 timing difference
My hope is that this discussion might help anyone else having issues with Armadillo and Eigen3.
I've written a wrapper class, mat, that wraps either an arma::mat from the Armadillo library or an Eigen::Matrix from the Eigen3 library. Which one is used is controlled by a flag at compile time.
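A compile-time switch of this kind could look roughly like the sketch below. The macro name SKETCH_USE_EIGEN, the sketch namespace, and the DenseStorage helper are assumptions made up for illustration. The two backend structs are stand-ins that share a plain std::vector, whereas the real wrappers would presumably hold an arma::Mat<T> or an Eigen::Matrix.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical shared column-major storage so the example compiles without
// Armadillo or Eigen installed.
template <typename T>
class DenseStorage {
public:
    DenseStorage(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, T()) {}

    T& operator()(std::size_t i, std::size_t j) { return data_[i + j * rows_]; }
    const T& operator()(std::size_t i, std::size_t j) const { return data_[i + j * rows_]; }

    std::size_t n_rows() const { return rows_; }
    std::size_t n_cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> data_;
};

// Stand-in for the Armadillo-backed wrapper.
template <typename T>
struct ArmaMat : DenseStorage<T> {
    using DenseStorage<T>::DenseStorage;  // inherit the (rows, cols) constructor
    static std::string backend_name() { return "armadillo"; }
};

// Stand-in for the Eigen-backed wrapper.
template <typename T>
struct EigenMat : DenseStorage<T> {
    using DenseStorage<T>::DenseStorage;
    static std::string backend_name() { return "eigen"; }
};

// Assumed flag: compile with -DSKETCH_USE_EIGEN to select the Eigen stand-in.
#ifdef SKETCH_USE_EIGEN
template <typename T> using Mat = EigenMat<T>;
#else
template <typename T> using Mat = ArmaMat<T>;
#endif

}  // namespace sketch
```

With an alias like this the choice costs nothing at run time, so any timing difference between backends has to come from the wrapped library or from what the wrapper does around each call.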
Additionally, I've written a tensor class that uses mat as its storage. The primary feature of this class is the use of Voigt notation to condense higher-order tensors so that they can be stored in a matrix.
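For a symmetric 2nd order tensor, Voigt notation stores only the independent components. The sketch below shows the idea for the 2D case. The function names are hypothetical, and the ordering (xx, yy, xy) is an assumption inferred from the operator* shown later in the question; the actual tensor class may order components differently, especially in 3D, where conventions vary.

```cpp
#include <cassert>
#include <cstddef>

// Number of independent components of a symmetric 2nd order tensor
// in `dim` spatial dimensions: 3 in 2D, 6 in 3D.
constexpr std::size_t voigt_size(std::size_t dim) {
    return dim * (dim + 1) / 2;
}

// Map a (row, column) pair of a symmetric 2-by-2 tensor to its Voigt slot.
// Assumed ordering: xx -> 0, yy -> 1, xy = yx -> 2.
inline std::size_t voigt_index_2d(std::size_t i, std::size_t j) {
    assert(i < 2 && j < 2);
    if (i == j) {
        return i;  // diagonal entries come first
    }
    return 2;      // the single off-diagonal entry is stored once
}
```

For example, `voigt_index_2d(0, 1)` and `voigt_index_2d(1, 0)` both return 2, which is why the symmetric 2-by-2 case needs only three stored values.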
Finally, I've written a test that multiplies a 2nd order tensor (i.e. a matrix) and a 1st order tensor (i.e. a vector) multiple times and records the time it takes to complete the operations. I do this with my mat class and with my tensor class.
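A timing loop of this kind can be sketched with std::chrono. The helper name time_repeated is hypothetical and this is not the test code from the question, only one plausible shape for it.

```cpp
#include <chrono>
#include <cstddef>

// Runs `op` `reps` times and returns the elapsed wall-clock time in seconds.
// steady_clock is used because it is monotonic, so clock adjustments
// during the run cannot skew the measurement.
template <typename Op>
double time_repeated(Op op, std::size_t reps) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t r = 0; r < reps; ++r) {
        op();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A call might look like `time_repeated([&] { sink += multiply_once(); }, 1000000)`, where `multiply_once` and `sink` are placeholders. Accumulating into a value that is used afterwards matters for a product this small, because an optimizing compiler may otherwise remove the work being timed.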
Because tensor wraps mat, I would expect its time to be larger. This is the case with Armadillo, by close to 20% on average. However, when using Eigen, using tensor is faster, which makes absolutely no sense to me.
Does anything stick out to anyone?
Edit: providing more details.
I've first wrapped arma::mat in myown::armamat and Eigen::Matrix in myown::eigenmat. Both of these just wrap Armadillo's and Eigen's APIs in a common framework. Finally, based on a compiler flag, myown::mat wraps either armamat or eigenmat. I'm not sure which optimization flags we have turned on.
As described above, myown::tensor uses myown::mat for storage. Because of the physical applications I'll be using the tensor class for, it is templated to be 2D (i.e. 2-by-2 if it's 2nd order) or 3D (i.e. 3-by-3). (In contrast, mat can be of any size.)
The operation I'm using for timing purposes is a 2-by-2 matrix (2nd order tensor) times a 2-by-1 matrix (1st order tensor). When using only mat, I'm essentially using Armadillo's or Eigen's expression templates.
When using my tensor class, I'm overloading operator* as follows:
    template< typename t1, bool sym >
    moris::mat< t1 >
    operator*( moris::tensor< t1, 2, 2, true > const & atensor1,
               moris::tensor< t1, 1, 2, sym > const & atensor2 )
    {
        // result is a 2-by-1 matrix
        moris::mat< t1 > tvector(2, 1);

        // atensor1 is symmetric; the products below use [0] and [1] as the
        // diagonal entries and [2] as the shared off-diagonal entry
        tvector(0) = atensor1[0]*atensor2[0] + atensor1[2]*atensor2[1];
        tvector(1) = atensor1[2]*atensor2[0] + atensor1[1]*atensor2[1];

        return tvector;
    }

The [] operator on tensor accesses data from the underlying storage mat (via the Voigt convention).
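To experiment with this product in isolation, a self-contained version can be written with std::array in place of the moris types. The aliases SymTensor2 and Vector2 and the function name multiply are assumptions for illustration, and the storage order (xx, yy, xy) is inferred from the operator above.

```cpp
#include <array>

// Stand-in for a symmetric 2D 2nd order tensor in Voigt storage: {xx, yy, xy}.
using SymTensor2 = std::array<double, 3>;
// Stand-in for a 2D 1st order tensor (vector): {x, y}.
using Vector2 = std::array<double, 2>;

// Computes [[xx, xy], [xy, yy]] * [x, y] directly from the Voigt components.
inline Vector2 multiply(const SymTensor2& a, const Vector2& b) {
    Vector2 out{{
        a[0] * b[0] + a[2] * b[1],  // first row:  xx*x + xy*y
        a[2] * b[0] + a[1] * b[1]   // second row: xy*x + yy*y
    }};
    return out;
}
```

With a = {2, 3, 1} and b = {4, 5}, this gives {13, 19}, matching the dense product of [[2, 1], [1, 3]] with [4, 5]. Everything here is fixed-size and lives on the stack. Whether the moris::mat(2, 1) result in the original allocates dynamically depends on the wrapped backend, and that may be worth checking when comparing timings.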
"It's complicated."
We offer bindings for both Armadillo and Eigen to R via the add-on packages RcppArmadillo and RcppEigen, so the comparison and horse-race question comes up a lot.
And I don't think there is a clear answer. To make matters "worse", Armadillo defers to whichever LAPACK/BLAS you have installed, so you could be using multicore parallelism, whereas Eigen tends to prefer its own routines. When preparing my Rcpp book I did some timings and found some counter-intuitive results.
At the end of the day you may just need to profile your problem.