<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://mw.hh.se/wg211/index.php?action=history&amp;feed=atom&amp;title=WG211%2FM7Kelly</id>
	<title>WG211/M7Kelly - Revision history</title>
	<link rel="self" type="application/atom+xml" href="http://mw.hh.se/wg211/index.php?action=history&amp;feed=atom&amp;title=WG211%2FM7Kelly"/>
	<link rel="alternate" type="text/html" href="http://mw.hh.se/wg211/index.php?title=WG211/M7Kelly&amp;action=history"/>
	<updated>2026-04-05T20:57:51Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.43.5</generator>
	<entry>
		<id>http://mw.hh.se/wg211/index.php?title=WG211/M7Kelly&amp;diff=231&amp;oldid=prev</id>
		<title>Admin: 1 revision</title>
		<link rel="alternate" type="text/html" href="http://mw.hh.se/wg211/index.php?title=WG211/M7Kelly&amp;diff=231&amp;oldid=prev"/>
		<updated>2011-12-12T10:06:27Z</updated>

		<summary type="html">&lt;p&gt;1 revision&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;[[Category:WG211]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h1&amp;gt;SIMD and SIMT Code Generation for Visual Effects Using Indexed Dependence Metadata&amp;lt;/h1&amp;gt;&lt;br /&gt;
&amp;lt;h3&amp;gt;Paul H J Kelly&amp;lt;/h3&amp;gt; Imperial College London, UK&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
This talk is about a project to build a domain-specific library for film&lt;br /&gt;
post-production visual effects (VFX).  Visual effects algorithms are&lt;br /&gt;
structured using &amp;quot;indexed functors&amp;quot;, which characterise image operations&lt;br /&gt;
with dependence metadata.  Our source-to-source&lt;br /&gt;
code generator targets SSE and CUDA devices from a single-source&lt;br /&gt;
representation and implements an array of parallelism- and&lt;br /&gt;
bandwidth-enhancing optimisations, producing C++ and CUDA as output for&lt;br /&gt;
vendor compilers. Metadata captures and communicates&lt;br /&gt;
high-level dependence and data access pattern information to the code&lt;br /&gt;
generator, eliminating the need for interprocedural dataflow and&lt;br /&gt;
alias analyses to recover this information from the code. Our kernel&lt;br /&gt;
transformations rely on metadata for safety, which greatly simplifies&lt;br /&gt;
their implementation and enables more complex optimisations to be applied.&lt;br /&gt;
We use the polyhedral model for schedule&lt;br /&gt;
optimisation to manage the large, fused and fragmented loop structures that are&lt;br /&gt;
key to improving temporal locality on CPUs. In our evaluation&lt;br /&gt;
we demonstrate 1.2-7.0x speed-ups on Intel and AMD multicore CPUs and&lt;br /&gt;
1.2-6.6x speed-ups on CUDA GPUs.&lt;br /&gt;
&lt;br /&gt;
(Joint work with Jay Cornwall and Lee Howes at Imperial, and with Bruno Nicoletti&lt;br /&gt;
and Phil Parsonage at The Foundry Ltd.)&lt;br /&gt;
&lt;br /&gt;
* [[Media:kelly-wg09.ppt | kelly-wg09.ppt ]]: Slides&lt;br /&gt;
&lt;br /&gt;
&lt;/div&gt;</summary>
		<author><name>Admin</name></author>
	</entry>
</feed>