Analysis of Pivot Sampling in Dual-Pivot Quicksort: A Holistic Analysis of Yaroslavskiy's Partitioning Scheme



Nebel, Markus E, Wild, Sebastian ORCID: 0000-0002-6061-9177 and Martinez, Conrado
(2016) Analysis of Pivot Sampling in Dual-Pivot Quicksort: A Holistic Analysis of Yaroslavskiy's Partitioning Scheme. Algorithmica: an international journal in computer science, 75 (4). 632 - 683.

[img] Text
nebel-wild-martinez-2016.pdf - Accepted Version

Download (916kB) | Preview

Abstract

The new dual-pivot Quicksort by Vladimir Yaroslavskiy—used in Oracle’s Java runtime library since version 7—features intriguing asymmetries. They make a basic variant of this algorithm use less comparisons than classic single-pivot Quicksort. In this paper, we extend the analysis to the case where the two pivots are chosen as fixed order statistics of a random sample. Surprisingly, dual-pivot Quicksort then needs more comparisons than a corresponding version of classic Quicksort, so it is clear that counting comparisons is not sufficient to explain the running time advantages observed for Yaroslavskiy’s algorithm in practice. Consequently, we take a more holistic approach and give also the precise leading term of the average number of swaps, the number of executed Java Bytecode instructions and the number of scanned elements, a new simple cost measure that approximates I/O costs in the memory hierarchy. We determine optimal order statistics for each of the cost measures. It turns out that the asymmetries in Yaroslavskiy’s algorithm render pivots with a systematic skew more efficient than the symmetric choice. Moreover, we finally have a convincing explanation for the success of Yaroslavskiy’s algorithm in practice: compared with corresponding versions of classic single-pivot Quicksort, dual-pivot Quicksort needs significantly less I/Os, both with and without pivot sampling.

Item Type: Article
Uncontrolled Keywords: Quicksort, Dual-pivot, Yaroslavskiy's partitioning method, Median of three, Average-case analysis, I/O operations, External-memory model
Depositing User: Symplectic Admin
Date Deposited: 21 Oct 2019 14:05
Last Modified: 13 Aug 2022 05:18
DOI: 10.1007/s00453-015-0041-7
Related URLs:
URI: https://livrepository.liverpool.ac.uk/id/eprint/3058958