Marius, Kuhrt, Bernhard, Seeger, Wild, Sebastian
ORCID: 0000-0002-6061-9177 and Goetz, Graefe
(2025)
Adaptive sorting for large keys, strings, and database rows.
In: Datenbanksysteme für Business, Technologie und Web (BTW 2025).
Abstract
Sorting a database table may require expensive comparisons, e.g., due to column count or column types such as long or international strings. Therefore, optimizing the count and cost of comparisons is important. Adaptive sort algorithms avoid comparisons by exploiting pre-existing order in the input. If $N$ keys happen to be sorted or reverse-sorted, $N-1$ comparisons suffice, in contrast to $\log_2(N!)$ comparisons in the expected and worst cases. Ideally, adaptive sorting ensures graceful degradation from the best case to the expected case, e.g., by merging sorted runs pre-existing in the input. Adaptive sorting has proven successful for integer keys but not for large keys, e.g., when comparing symbols or characters in text strings, bytes or words in binary strings, or column values in database rows. On the other hand, using longest common prefixes or offset-value codes, sorting $N$ strings with $K$ characters, bytes, or columns can be limited to $N \times K$ comparisons. If those comparisons are the dominant cost of sorting, e.g., due to interpreted execution of predicates etc., the cost of sorting is linear in the input size and, in fact, equal to the cost of verifying a claimed and correct sort order. By leveraging and combining a variety of techniques, some old and proven in practice, some new yet equally sound, we introduce sorting techniques that are efficient, scalable, and adaptive; that degrade gracefully from $N-1$ to $\log_2(N!)$ comparisons of strings or of database rows; and that guarantee at most $1.042 N \times K$ comparisons of bytes or of column values in the worst case. We offer algorithm analyses and experimental results to test our hypotheses and support our claims.
| Item Type: | Conference Item (Unspecified) |
|---|---|
| Divisions: | Faculty of Science and Engineering Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science |
| Depositing User: | Symplectic Admin |
| Date Deposited: | 31 Jul 2025 10:34 |
| Last Modified: | 31 Jul 2025 10:34 |
| DOI: | 10.18420/BTW2025-10 |
| Open Access URL: | https://doi.org/10.18420/BTW2025-10 |
| URI: | https://livrepository.liverpool.ac.uk/id/eprint/3193817 |
Altmetric
Altmetric