LIRA: Adaptive Contention-Aware Thread Placement for Parallel Runtime Systems

Alexander Collins, Tim Harris, Murray Cole, Christian Fensch

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Running multiple parallel programs on multi-socket, multi-core commodity machines is increasingly common for data analytics and cluster workloads. These workloads exhibit bursty behavior and are rarely tuned to specific hardware. The result is poor performance caused by suboptimal scheduling decisions, such as poor choices of which programs run on the same socket. Consequently, it is once again important for schedulers to consider the structure of the machine alongside the dynamic behavior of workloads.

This paper introduces LIRA, a spatial-scheduling heuristic for selecting which parallel applications should run on the same socket in a multi-socket machine. We devise two flavors of scheduler using this heuristic: (i) LIRA-static, which collects performance data in an offline profiling step to decide the schedule when a program starts, and (ii) LIRA-adaptive, which operates dynamically based on hardware performance counters available on off-the-shelf hardware. LIRA-adaptive does not require separate, offline workload characterization runs, and it accommodates a dynamically changing mix of applications, including those with phase changes.
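
To make the idea of choosing which programs share a socket concrete, the following is a minimal sketch and not the LIRA heuristic itself, which the abstract does not specify. It assumes each program has a single hypothetical counter-derived rate (`rates`), and it uses a simple swapped-in rule: pair the most demanding program with the least demanding one. The function name `pair_by_intensity` and the sample values are illustrative assumptions.

```python
def pair_by_intensity(rates):
    """Group programs two per socket, pairing heaviest with lightest.

    `rates` maps a program name to a hypothetical per-program rate sampled
    from hardware performance counters (an assumption; the abstract does not
    name the metric LIRA uses).
    """
    # Sort program names from most to least demanding.
    ordered = sorted(rates, key=rates.get, reverse=True)
    sockets = []
    while ordered:
        group = [ordered.pop(0)]         # heaviest remaining program
        if ordered:
            group.append(ordered.pop())  # lightest remaining program
        sockets.append(group)
    return sockets


# Illustrative values only, not measurements from the paper.
rates = {"A": 9.0, "B": 7.5, "C": 2.0, "D": 1.0}
placement = pair_by_intensity(rates)
```

An adaptive scheduler in this style would presumably re-sample the rates periodically and recompute the placement when a program changes phase; a static one would compute it once from profiling data.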

We evaluate LIRA-static and LIRA-adaptive using programs from SPEC OMP and two graph analytics projects. We compare our approaches to three baselines: the best performance obtained across all static mappings of 4 programs to 2 sockets; libgomp, the OpenMP runtime that comes with GCC; and Callisto, a state-of-the-art scheduler. LIRA-static improves system throughput by 10% compared to libgomp, and LIRA-adaptive by 13%. Compared to Callisto, LIRA-adaptive improves performance in 30 of the 32 combinations tested, improving system throughput by up to 7% and by 3% on average across all 32 combinations.
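
The "best static mapping" baseline can be illustrated with a small exhaustive search. This is a sketch under stated assumptions: it assumes each socket hosts exactly two of the four programs and that the two sockets are interchangeable, neither of which the abstract states. The `score` function is a hypothetical stand-in; in an evaluation it would be a measured system throughput for each mapping.

```python
from itertools import combinations


def balanced_mappings(programs):
    """Yield each split of `programs` into two equal-size socket groups.

    The first program is pinned to socket 0 so that mirror-image splits
    (the same groups with sockets swapped) are not produced twice.
    """
    first, rest = programs[0], programs[1:]
    half = len(programs) // 2
    for mates in combinations(rest, half - 1):
        socket0 = (first,) + mates
        socket1 = tuple(p for p in rest if p not in mates)
        yield socket0, socket1


# Hypothetical per-program demand values, for illustration only.
rates = {"A": 9.0, "B": 7.5, "C": 2.0, "D": 1.0}


def score(mapping):
    # Stand-in objective: prefer the mapping whose busiest socket is least
    # loaded. A real baseline would use measured throughput instead.
    return -max(sum(rates[p] for p in socket) for socket in mapping)


mappings = list(balanced_mappings(["A", "B", "C", "D"]))
best = max(mappings, key=score)
```

Under these assumptions there are only three distinct mappings of 4 programs to 2 sockets, so exhaustive search is cheap at this scale; the number of mappings grows quickly with more programs or sockets.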

Original language: English
Title of host publication: ROSS '15 Proceedings of the 5th International Workshop on Runtime and Operating Systems for Supercomputers
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery
Pages: 1-8
Number of pages: 8
ISBN (Electronic): 9781450336062
Publication status: Published - 16 Jun 2015

Keywords

  • multi-core
  • multi-socket
  • thread placement
  • adaptive scheduling
