Battling Memory Requirements of Array Programming through Streaming

Activity: Talk or presentation - Lecture and oral contribution

James Emil Avery - Speaker

  • Department of Computer Science (Datalogisk Institut)
A barrier to efficient array programming, for example in Python/NumPy, is that fully vectorized algorithms can cause memory use to explode. This paper presents a solution to the problem using array streaming, implemented in Bohrium, a high-performance framework for automatic parallelization. Streaming makes it possible to use array programming directly in Python/NumPy code even when the apparent memory requirement exceeds the machine's capacity, because the automatic streaming eliminates the overhead of temporary arrays by performing the calculations in per-thread registers. Using Bohrium, we automatically fuse, JIT-compile, and execute NumPy array operations on GPGPUs without modifying the user programs. We present performance evaluations of three benchmarks, all of which show dramatic reductions in memory use from streaming, with corresponding improvements in speed and in utilization of GPGPU cores. The streaming-enabled Bohrium runs programs on input sizes far beyond those at which pure NumPy crashes from exhausting system memory.
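The memory problem described in the abstract can be illustrated with a small hand-written sketch. This is not Bohrium's implementation: Bohrium is described as fusing and streaming the operations automatically, whereas the second function below does the blocking by hand in plain NumPy. The function names, the example expression, and the block size are assumptions chosen only for illustration.

```python
import numpy as np


def norm_sum_vectorized(n):
    """Fully vectorized: every intermediate result is a full array of n elements."""
    i = np.arange(n, dtype=np.float64)  # n elements
    x = i / (n - 1)                     # n elements
    y = 1.0 + x                         # n elements
    # x * x, y * y, their sum, and the square root are each further
    # full-size temporaries that exist only to be reduced to one number.
    return float(np.sqrt(x * x + y * y).sum())


def norm_sum_streamed(n, block=1 << 16):
    """Hand-rolled streaming: the same expression, evaluated block by block.

    Peak memory depends on `block` rather than on n, because each
    temporary is discarded before the next block is generated.
    """
    total = 0.0
    for start in range(0, n, block):
        stop = min(start + block, n)
        i = np.arange(start, stop, dtype=np.float64)
        x = i / (n - 1)
        y = 1.0 + x
        total += float(np.sqrt(x * x + y * y).sum())
    return total


if __name__ == "__main__":
    n = 1_000_000
    print(norm_sum_vectorized(n), norm_sum_streamed(n))
```

Both functions compute the same per-element values, so the results agree up to floating-point summation order. The streamed version trades a Python-level loop for bounded memory; the abstract's claim is that a fusing framework can obtain the same effect without the user rewriting the vectorized form.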
23 Jun 2016

Event (Conference)

Title: International Supercomputing Conference 2016
Short title: ISC
Venue: Hotel Marriot

ID: 163099361