
WG211/M23Kiselyov

From WG 2.11

Latest revision as of 08:44, 11 March 2024

The Mysteries of AXPY (Oleg Kiselyov)

AXPY is one of the Basic Linear Algebra Subprograms (BLAS) vector operations: scaled vector addition, Y := aX + Y. It is a perfect target for classical optimizations such as partial loop unrolling and scalar promotion. (AXPY is also embarrassingly parallel; however, this talk focuses on single-thread performance.) These optimizations are indeed carried out -- by hand -- in OpenBLAS, regarded as one of the two fastest BLAS implementations. One can make a case for automatic code generation to reduce the tedium of applying such optimizations, given that there are many platforms and several AXPY varieties to optimize (SAXPY, DAXPY, CAXPY). This is the traditional elevator pitch for metaprogramming in HPC.
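To make the two optimizations concrete, here is a minimal sketch in C of a double-precision AXPY, first as a plain loop and then unrolled by hand. The function names `daxpy_ref` and `daxpy_unrolled`, the unrolling factor of 4, and the simplified signature (unit stride only) are assumptions chosen for illustration; the code is not taken from OpenBLAS and makes no claim about how its kernels are written.

```c
#include <assert.h>
#include <stddef.h>

/* Reference DAXPY: y[i] := a * x[i] + y[i] for 0 <= i < n.
   Hypothetical name and signature (unit stride only). */
void daxpy_ref(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* The same computation with the loop partially unrolled by 4 (an
   assumed factor). Array elements are first loaded into local scalars
   (scalar promotion), so each element is read once and can live in a
   register. `restrict` promises that x and y do not overlap. */
void daxpy_unrolled(size_t n, double a,
                    const double *restrict x, double *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));

    size_t i = 0;

    /* Main loop: four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        const double x0 = x[i],     x1 = x[i + 1];
        const double x2 = x[i + 2], x3 = x[i + 3];
        const double y0 = y[i],     y1 = y[i + 1];
        const double y2 = y[i + 2], y3 = y[i + 3];

        y[i]     = a * x0 + y0;
        y[i + 1] = a * x1 + y1;
        y[i + 2] = a * x2 + y2;
        y[i + 3] = a * x3 + y3;
    }

    /* Remainder loop: the last n mod 4 elements. */
    for (; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Both functions perform the same arithmetic on each element; only the loop structure differs. Writing out such a variant by hand for every element type and every platform is the tedium that the case for code generation refers to.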

How does it correspond to real life, in this day and age?