WG211/M10Kelly

From WG 2.11

Latest revision as of 12:06, 12 December 2011



Using DSLs to open and expand the parallel code synthesis design space, by Paul Kelly

What is the right code to generate for a given hardware platform, and how does the answer change as problem parameters change? This talk presents recent work in progress in the finite-element fluid dynamics domain, showing some of the fruits of our attempt to map out the design space. Our goal is to build tools that automatically synthesise the optimal implementation. By getting the abstraction right, we can capture design choices far beyond the reach of a conventional compiler. For example, we show that in low-order finite-element formulations, assembling the global sparse system matrix is efficient on CPUs, whereas on GPUs the balance shifts in favour of a different, local assembly algorithm with a better memory access pattern. In high-order problems, local assembly turns out to be attractive on CPUs as well. The choice between high and low order is itself a tunable parameter, giving a rich space of implementation alternatives with different accuracy-performance characteristics on different hardware.
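
The contrast between global and local assembly can be illustrated with a minimal sketch. This is not the code from the talk: it assumes a 1D Laplacian discretised with linear elements on a uniform mesh, uses plain Python lists and dictionaries instead of a real sparse-matrix library, and all function names are hypothetical. It shows two ways of computing the same matrix-vector product, one that first assembles the global sparse matrix and one that works element by element without ever forming it.

```python
def element_matrix(h):
    """Stiffness matrix of one linear element of length h (assumed 1D Laplacian)."""
    k = 1.0 / h
    return [[k, -k], [-k, k]]


def assemble_global(n_elements, h):
    """Global assembly: scatter every element matrix into one sparse matrix.

    The sparse matrix is stored as {row: {col: value}}.
    """
    A = {i: {} for i in range(n_elements + 1)}
    Ke = element_matrix(h)
    for e in range(n_elements):
        nodes = (e, e + 1)  # global node numbers of element e
        for a in range(2):
            for b in range(2):
                row, col = nodes[a], nodes[b]
                A[row][col] = A[row].get(col, 0.0) + Ke[a][b]
    return A


def matvec_global(A, x):
    """y = A x using the assembled sparse matrix."""
    return [sum(v * x[col] for col, v in A[row].items())
            for row in range(len(x))]


def matvec_local(n_elements, h, x):
    """y = A x without forming A: gather, multiply locally, scatter-add."""
    y = [0.0] * (n_elements + 1)
    Ke = element_matrix(h)
    for e in range(n_elements):
        nodes = (e, e + 1)
        x_local = [x[n] for n in nodes]  # gather element-local values
        for a in range(2):
            # small dense product, accumulated into the global vector
            y[nodes[a]] += sum(Ke[a][b] * x_local[b] for b in range(2))
    return y
```

Calling `matvec_global(assemble_global(4, 0.25), x)` and `matvec_local(4, 0.25, x)` with the same vector `x` of length 5 should give the same result. The global variant pays for assembly once and then reads an irregular sparse structure; the local variant repeats small dense products with regular access per element, which is the kind of trade-off the abstract describes.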