The parallelism of multi-core architectures can only be fully exploited by concurrent applications. This state of affairs has pushed productivity to the forefront of language design concerns. The community is demanding new solutions for the design, compilation, and implementation of concurrent languages, making this a research area of great importance and impact. To that end, this paper proposes expressing data parallelism at the subroutine level. In this context, calling a subroutine spawns several execution flows, each operating on a distinct partition of the input dataset. Such computations are expressed simply by annotating sequential subroutines with data distribution and reduction policies, delegating the management of the parallel execution to a dedicated runtime system. The paper gives an overview of the key concepts of the model, illustrates them with small programming examples, and describes a Java implementation built on top of the X10 runtime system. A performance evaluation indicates that this approach can deliver good performance gains without burdening the programmer with writing specialized code.
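The idea of running one sequential subroutine over distinct partitions and then reducing the partial results can be approximated in plain Java. The sketch below is only an illustration under stated assumptions: it uses the standard `java.util.concurrent` executor rather than the X10 runtime, it hard-codes a block distribution and a sum reduction where the paper's model would take them from annotations, and every name in it (`SubroutineParallelismSketch`, `sum`, `parallelSum`) is hypothetical and does not come from the paper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: none of these names come from the paper or from X10.
class SubroutineParallelismSketch {

    // The sequential subroutine, written with no knowledge of parallelism:
    // it sums the elements of data in the half-open range [from, to).
    static long sum(int[] data, int from, int to) {
        long total = 0;
        for (int i = from; i < to; i++) {
            total += data[i];
        }
        return total;
    }

    // Stand-in for the dedicated runtime system (assumption: block
    // distribution of the input and a sum reduction of the partial results).
    static long parallelSum(int[] data, int partitions) {
        if (partitions < 1) {
            throw new IllegalArgumentException("partitions must be at least 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        try {
            // Ceiling division so that every element falls into some block.
            int chunk = (data.length + partitions - 1) / partitions;
            List<Future<Long>> partials = new ArrayList<>();
            for (int p = 0; p < partitions; p++) {
                final int from = Math.min(p * chunk, data.length);
                final int to = Math.min(from + chunk, data.length);
                // One execution flow per partition, each calling the same subroutine.
                Callable<Long> flow = () -> sum(data, from, to);
                partials.add(pool.submit(flow));
            }
            // Reduction step: combine the partial results into one value.
            long result = 0;
            for (Future<Long> partial : partials) {
                result += partial.get();
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for partitions", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a partition failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data, 4)); // prints 500500
    }
}
```

Here the distribution and reduction logic is written out by hand in `parallelSum`. In the model described in the abstract, that part would instead be declared through annotations on the sequential subroutine and handled by the runtime system.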
Published: 1 Jan 2012
Conference: IEEE International Conference on High Performance Computing and Communications
Duration: 1 Jan 2012 → …