SDCPPMU presentation

Customizing the three-way comparison operator

Default comparison order

Given a class C, a subobject list is formed by the following subjects in order:

  • The direct base class subobjects of C, in declaration order.
  • The the non-static data members of C, in declaration order.
    • If any member subobject is of array type, it is expanded to the sequence of its elements, in the order of increasing subscript. The expansion is recursive: array elements of array types will be expanded again until there is no subobject of array type.

Reference: https://en.cppreference.com/w/cpp/language/default_comparisons#Default_comparison_order

Change comparison order

When order of non-static data members are changed the default operator⇔ also changes.

Change comparison order

  • Fix by restoring the comparison order, and declare operator==
  • A defaulted operator⇔ will also default operator==,
  • when operator⇔ is user defined, operator== is not implicitly defaulted

Change comparison order

Use std::tuple

Transform data member before comparison

reverse comparison

Swap arguments

Change the default three way comparison function for a data member

  • ⇔ on floats returns a std::partial_ordering object
  • But what if we want to use std::strong_order (totalOrder in ISO/IEC/IEEE 60559)?
  • The values are ordered in the following sequence:
  • Another example: case insensitive string compare

Change the default three way comparison function

  • One possible fix will be to write a wrapper class with a desired operator⇔
  • But this will require a wrapper class for every data member that needs a custom operator⇔

Change the default three way comparison function, reusable code

Use templates for reusable code

Summary

For a 3-way comparison for a class, we can

  • change the comparison order
  • transform a data member before comparing
  • reverse comparisons
  • change the default 3-way comparison function for a data member

Caveats of using std::tuple's operator<=>

Why do we need operator==

If operator== is removed for strong_order_float, the return type of operator<=> for E5 is std::weak_ordering

synth-three-way

class's defaulted operator<=> std::tuple's operator<=>

From cppreference.com:

  1. operator<=>(std::tuple), item (7)
  2. synth-three-way
  3. std::three_way_comparable_with, item (2)
  4. __WeaklyEqualityComparableWith, item (3) requires operator==

synth-three-way without operator==

Surprising behaviour

  • std::partial_ordering for operator<=>
  • std::weak_ordering for synth-three-way

std::partial_ordering values cannot be converted to std::weak_ordering values

std::compare_three_way

  • The algorithm std::lexicographical_compare_three_way uses function object std::compare_three_way as the default comparator
  • std::compare_three_way will not compile on classes without operator== as it needs to satisfy std::three_way_comparable_with concept
  • Simalar to synth-three-way, std::compare_three_way never executes code in the class's operator==

Summary

  • With operator<=>, if operator== is missing
    defaulted <=> of a class std::tuple and some other std containers std::compare_three_way
    <=> on each data member synth-three-way is used on each element, converts to std::weak_ordering fails to compile
  • for a class involved in a synth-three-way comparison, operator== for that class will be needed for synth-three-way to directly use the class's operator<=>

A more generic equal_range

lower_bound, upper_bound and equal_range

Binary searches defined in header <algorithm>
From cppreference:

lower_bound returns an iterator to the first element not less than the given value
upper_bound returns an iterator to the first element greater than a certain value
equal_range returns range of elements matching a specific key

For value=4,

lower_bound, upper_bound and equal_range

Note

This section will focus on integers and floats (without NaNs and infinities) as datatypes but can be extended to user-defined types.

For example, with comp as a comparator function:
a<b can be extended to comp(a,b)
a<=b can be extended to !comp(b,a)

Problem

  • In a sorted array, find the range that satisfies the condition
  • lower ≤ x ≤ upper, for all elements x in the range
  • If lower = upper, we can use equal_range
  • If lower<upper, we can use lower_bound with upper_bound
  • Multiple accesses to the same elements in the array

Optimize

or
  • If first2 is near first, not much reduction in search space
  • Arbitrary choice between the 2 alternatives

Make equal_range more generic

  • Main idea: use equal_range with a custom comparator that derives from the sorting comparator
  • equal_range requires all elements in the target range to be “equal” to a particular value
  • Write a comparator function that treats all objects in the closed interval [lower, upper] to be equal

Code

Quick bench, GCC 13

lower_bound_upper_bound - using lower_bound and upper_bound with the same iterator pair

lower_bound_upper_bound2 - use the result of lower_bound to reduce the search space for upper_bound

generic_equal_range - use equal_range with a custom comparator

  • 256 arrays of length 512

    lower_upper_range benchmark

  • 128 arrays of length 512

    lower_upper_range benchmark

  • 64 arrays of length 512

    lower_upper_range benchmark

Benchmarks, Skylake, GCC 13

lower_bound_upper_bound - using lower_bound and upper_bound with the same iterator pair

lower_bound_upper_bound2 - use the result of lower_bound to reduce the search space for upper_bound

generic_equal_range - use equal_range with a custom comparator

256 arrays of length 512, -ffast-math

  • -march=broadwell

    lower_upper_range benchmark

  • -march=skylake-avx512

    lower_upper_range benchmark

Algorithm for equal_range

High-level overview: Find a position in the result range with binary search, then use lower_bound and upper_bound to find the range.

  1. Initialize iterators: left=begin(), right=end(), mid=begin()+size()/2
  2. Evaluate value at mid
  3. If mid is not in the target subrange, update left or right.
  4. Set mid=left+(right-left)/2
  5. Repeat #2, #3 and #4 until mid is in the target subrange
  6. Then complete with lower_bound and upper_bound

l:left, m:mid, r:right, +:target subrange

Source code to implementations of equal_range: LLVM, GCC, MSVC

Summary

  • By changing the comparator function argument to equal_range, we can make it more generic.
  • It can replace all usages of a pair of lower_bound/upper_bound call where lower < upper
  • With better performance

Thank you