Background

Google Abseil implements a high-performance hash table, the Swiss table. For just how high, see the benchmarks here: in short, everything is excellent except iteration, which is unremarkable. Its design highlights are SIMD-accelerated lookup, cache-friendly linear probing, and (compared with mainstream standard-library implementations) a more compact memory layout; you can also consult the official design notes. Enough preamble, let's read the code.

NOTES:

Index

Since my code-reading habit leans depth-first (see the table of contents), this is certainly not reader-friendly as an article.

So here is an index first, collecting the spots that are awkward to locate from the table of contents:

Memory Layout

Type Relationships

To untangle the memory layout we first need to understand the relationships among the internal classes of absl::flat_hash_map.

template <class K, class V, class Hash = DefaultHashContainerHash<K>,
          class Eq = DefaultHashContainerEq<K>,
          class Allocator = std::allocator<std::pair<const K, V>>>
class ABSL_ATTRIBUTE_OWNER flat_hash_map
    : public absl::container_internal::raw_hash_map<
          absl::container_internal::FlatHashMapPolicy<K, V>, Hash, Eq,
          Allocator> {
  using Base = typename flat_hash_map::raw_hash_map;
  flat_hash_map() {}
  using Base::Base;
  using Base::begin;
  using Base::end;
  // ...
};

template <class Policy, class Hash, class Eq, class Alloc>
class raw_hash_map : public raw_hash_set<Policy, Hash, Eq, Alloc> {
  // ...
};

template <class Policy, class Hash, class Eq, class Alloc>
class raw_hash_set {
  // Bundle together CommonFields plus other objects which might be empty.
  // CompressedTuple will ensure that sizeof is not affected by any of the empty
  // fields that occur after CommonFields.
  absl::container_internal::CompressedTuple<CommonFields, hasher, key_equal,
                                            allocator_type>
      settings_{CommonFields::CreateDefault<SooEnabled()>(), hasher{},
                key_equal{}, allocator_type{}};
};

The entire hash table has a single member, settings_, inherited from the raw_hash_set base class. As the comment explains, this sole member is a CompressedTuple in order to get the empty base optimization (EBO), since the container's hasher, key_equal, and allocator_type can all be stateless types. The other thing to care about is the CommonFields type; that is where the members in the conventional sense live. Note that the CommonFields object is constructed via a factory function parameterized on a SooEnabled flag and returned into settings_, which also hints at small object optimization (SOO).

CompressedTuple

template <typename... Ts>
class ABSL_INTERNAL_COMPRESSED_TUPLE_DECLSPEC CompressedTuple
    : private internal_compressed_tuple::CompressedTupleImpl<
          CompressedTuple<Ts...>, absl::index_sequence_for<Ts...>,
          internal_compressed_tuple::ShouldAnyUseBase<Ts...>()> {
 private:
  template <int I>
  using ElemT = internal_compressed_tuple::ElemT<CompressedTuple, I>;

  template <int I>
  using StorageT = internal_compressed_tuple::Storage<ElemT<I>, I>;

  explicit constexpr CompressedTuple(const Ts&... base)
      : CompressedTuple::CompressedTupleImpl(absl::in_place, base...) {}

  template <int I>
  constexpr ElemT<I>& get() & {
    return StorageT<I>::get();
  }

  // ...
};

template <typename D, typename I, bool ShouldAnyUseBase>
struct ABSL_INTERNAL_COMPRESSED_TUPLE_DECLSPEC CompressedTupleImpl;

template <typename... Ts, size_t... I, bool ShouldAnyUseBase>
struct ABSL_INTERNAL_COMPRESSED_TUPLE_DECLSPEC CompressedTupleImpl<
    CompressedTuple<Ts...>, absl::index_sequence<I...>, ShouldAnyUseBase>
    : uses_inheritance,
      Storage<Ts, std::integral_constant<size_t, I>::value>... {
  constexpr CompressedTupleImpl() = default;
  template <typename... Vs>
  explicit constexpr CompressedTupleImpl(absl::in_place_t, Vs&&... args)
      : Storage<Ts, I>(absl::in_place, std::forward<Vs>(args))... {}
  friend CompressedTuple<Ts...>;
};

template <typename... Ts, size_t... I>
struct ABSL_INTERNAL_COMPRESSED_TUPLE_DECLSPEC CompressedTupleImpl<
    CompressedTuple<Ts...>, absl::index_sequence<I...>, false>
    : Storage<Ts, std::integral_constant<size_t, I>::value, false>... {
  constexpr CompressedTupleImpl() = default;
  template <typename... Vs>
  explicit constexpr CompressedTupleImpl(absl::in_place_t, Vs&&... args)
      : Storage<Ts, I, false>(absl::in_place, std::forward<Vs>(args))... {}
  friend CompressedTuple<Ts...>;
};

// The storage class provides two specializations:
//  - For empty classes, it stores T as a base class.
//  - For everything else, it stores T as a member.
template <typename T, size_t I, bool UseBase = ShouldUseBase<T>()>
struct Storage {
  T value;
  constexpr Storage() = default;
  template <typename V>
  explicit constexpr Storage(absl::in_place_t, V&& v)
      : value(std::forward<V>(v)) {}
  constexpr const T& get() const& { return value; }
  constexpr T& get() & { return value; }
  constexpr const T&& get() const&& { return std::move(*this).value; }
  constexpr T&& get() && { return std::move(*this).value; }
};

template <typename T, size_t I>
struct ABSL_INTERNAL_COMPRESSED_TUPLE_DECLSPEC Storage<T, I, true> : T {
  constexpr Storage() = default;

  template <typename V>
  explicit constexpr Storage(absl::in_place_t, V&& v) : T(std::forward<V>(v)) {}

  constexpr const T& get() const& { return *this; }
  constexpr T& get() & { return *this; }
  constexpr const T&& get() const&& { return std::move(*this); }
  constexpr T&& get() && { return std::move(*this); }
};

The motivation for this class was already covered above, so there is little left to analyze. In short, CompressedTuple handles the empty-class size problem through the inheritance chain CompressedTupleImpl -> Storage -> T (type deduction → storage strategy → actual type), and member access is resolved at compile time via index_sequence, so usage is much like std::tuple (get<>()).

CommonFields

// CommonFields hold the fields in raw_hash_set that do not depend
// on template parameters. This allows us to conveniently pass all
// of this state to helper functions as a single argument.
class CommonFields : public CommonFieldsGenerationInfo {
 public:
  CommonFields() : capacity_(0), size_(0), heap_or_soo_(EmptyGroup()) {}
  explicit CommonFields(soo_tag_t) : capacity_(SooCapacity()), size_(0) {}
  explicit CommonFields(full_soo_tag_t)
      : capacity_(SooCapacity()), size_(size_t{1} << HasInfozShift()) {}

  // ...

  // The number of slots in the backing array. This is always 2^N-1 for an
  // integer N. NOTE: we tried experimenting with compressing the capacity and
  // storing it together with size_: (a) using 6 bits to store the corresponding
  // power (N in 2^N-1), and (b) storing 2^N as the most significant bit of
  // size_ and storing size in the low bits. Both of these experiments were
  // regressions, presumably because we need capacity to do find operations.
  size_t capacity_;

  // The size and also has one bit that stores whether we have infoz.
  size_t size_;

  // Either the control/slots pointers or the SOO slot.
  HeapOrSoo heap_or_soo_;
};

From this we can read off the layout CommonFields = [capacity, size, ?]. The last member, heap_or_soo_, is opaquely named because it is wrapped in yet another layer of engineering packaging (disclaimer: for a foundational library this is justified). Spoiler: ? = [control, slot_array], as the comment also mentions.

Speaking of comments, Google recorded an interesting piece of failed-optimization history here. They tried compressing capacity_ and size_ into a single size_t; after all, for capacities of the form 2^N - 1, 6 bits are enough to represent N. But in practice it measured as a regression.

NOTES:

  • CommonFieldsGenerationInfo is used only for debugging; ignored here.
  • infoz is also mentioned; it too serves debugging (statistics) purposes, ignored.
  • The layout shown here is not complete; see the later InitializeSlots walkthrough.

HeapOrSoo

// Manages the backing array pointers or the SOO slot. When raw_hash_set::is_soo
// is true, the SOO slot is stored in `soo_data`. Otherwise, we use `heap`.
union HeapOrSoo {
  HeapOrSoo() = default;
  explicit HeapOrSoo(ctrl_t* c) : heap(c) {}

  ctrl_t*& control() {
    ABSL_SWISSTABLE_IGNORE_UNINITIALIZED_RETURN(heap.control);
  }
  ctrl_t* control() const {
    ABSL_SWISSTABLE_IGNORE_UNINITIALIZED_RETURN(heap.control);
  }
  MaybeInitializedPtr& slot_array() {
    ABSL_SWISSTABLE_IGNORE_UNINITIALIZED_RETURN(heap.slot_array);
  }
  MaybeInitializedPtr slot_array() const {
    ABSL_SWISSTABLE_IGNORE_UNINITIALIZED_RETURN(heap.slot_array);
  }
  void* get_soo_data() {
    ABSL_SWISSTABLE_IGNORE_UNINITIALIZED_RETURN(soo_data);
  }
  const void* get_soo_data() const {
    ABSL_SWISSTABLE_IGNORE_UNINITIALIZED_RETURN(soo_data);
  }

  HeapPtrs heap;
  unsigned char soo_data[sizeof(HeapPtrs)];
};

struct HeapPtrs {
  HeapPtrs() = default;
  explicit HeapPtrs(ctrl_t* c) : control(c) {}

  // The control bytes (and, also, a pointer near to the base of the backing
  // array).
  //
  // This contains `capacity + 1 + NumClonedBytes()` entries, even
  // when the table is empty (hence EmptyGroup).
  //
  // Note that growth_info is stored immediately before this pointer.
  // May be uninitialized for SOO tables.
  ctrl_t* control;

  // The beginning of the slots, located at `SlotOffset()` bytes after
  // `control`. May be uninitialized for empty tables.
  // Note: we can't use `slots` because Qt defines "slots" as a macro.
  MaybeInitializedPtr slot_array;
};

Note that HeapOrSoo is a union, so the SOO attempt has to fit within two pointers' worth of space (control + slot_array). We will see how it is used later.

Initialization

default

As the constructors above show, the inherited Base constructors do essentially nothing; the only real work is constructing CommonFields.

class CommonFields /* ... */ {
  // raw_hash_set::SooEnabled() is passed here
  template <bool kSooEnabled>
  static CommonFields CreateDefault() {
    return kSooEnabled ? CommonFields{soo_tag_t{}} : CommonFields{};
  }
  // ...
};

class raw_hash_set /* ... */ {
  constexpr static bool SooEnabled() {
    // The first condition is true by default
    return PolicyTraits::soo_enabled() &&
           sizeof(slot_type) <= sizeof(HeapOrSoo) &&
           alignof(slot_type) <= alignof(HeapOrSoo);
  }
  // ...
};

class CommonFields /* ... */ {
  explicit CommonFields(soo_tag_t) : capacity_(SooCapacity()), size_(0) {}
  // ...
};

// We only allow a maximum of 1 SOO element, which makes the implementation
// much simpler. Complications with multiple SOO elements include:
// - Satisfying the guarantee that erasing one element doesn't invalidate
//   iterators to other elements means we would probably need actual SOO
//   control bytes.
// - In order to prevent user code from depending on iteration order for small
//   tables, we would need to randomize the iteration order somehow.
constexpr size_t SooCapacity() { return 1; }

In short, since default construction uses SOO, and SOO (to keep the implementation simple) only ever has capacity 1, we end up with capacity_ = 1, size_ = 0.

initializer list

absl::flat_hash_map<int, int> example { {998, 244} };

What happens if we construct through an initializer list instead?

  // Instead of accepting std::initializer_list<value_type> as the first
  // argument like std::unordered_set<value_type> does, we have two overloads
  // that accept std::initializer_list<T> and std::initializer_list<init_type>.
  // This is advantageous for performance.
  //
  //   // Turns {"abc", "def"} into std::initializer_list<std::string>, then
  //   // copies the strings into the set.
  //   std::unordered_set<std::string> s = {"abc", "def"};
  //
  //   // Turns {"abc", "def"} into std::initializer_list<const char*>, then
  //   // copies the strings into the set.
  //   absl::flat_hash_set<std::string> s = {"abc", "def"};
  //
  // The same trick is used in insert().
  //
  // The enabler is necessary to prevent this constructor from triggering where
  // the copy constructor is meant to be called.
  //
  //   absl::flat_hash_set<int> a, b{a};
  //
  // RequiresNotInit<T> is a workaround for gcc prior to 7.1.
  // Not taken by default
  template <class T, RequiresNotInit<T> = 0, RequiresInsertable<T> = 0>
  raw_hash_set(std::initializer_list<T> init, size_t bucket_count = 0,
               const hasher& hash = hasher(), const key_equal& eq = key_equal(),
               const allocator_type& alloc = allocator_type())
      : raw_hash_set(init.begin(), init.end(), bucket_count, hash, eq, alloc) {}

  // This one is taken; init_type = std::pair</*non-const*/ key_type, mapped_type>
  raw_hash_set(std::initializer_list<init_type> init, size_t bucket_count = 0,
               const hasher& hash = hasher(), const key_equal& eq = key_equal(),
               const allocator_type& alloc = allocator_type())
      : raw_hash_set(init.begin(), init.end(), bucket_count, hash, eq, alloc) {}

  // Delegates to the iterator-range constructor
  template <class InputIter>
  raw_hash_set(InputIter first, InputIter last, size_t bucket_count = 0,
               const hasher& hash = hasher(), const key_equal& eq = key_equal(),
               const allocator_type& alloc = allocator_type())
      : raw_hash_set(SelectBucketCountForIterRange(first, last, bucket_count),
                     hash, eq, alloc) {
    insert(first, last);
  }

template <class InputIter>
size_t SelectBucketCountForIterRange(InputIter first, InputIter last,
                                     size_t bucket_count) {
  if (bucket_count != 0) {
    return bucket_count;
  }
  using InputIterCategory =
      typename std::iterator_traits<InputIter>::iterator_category;
  if (std::is_base_of<std::random_access_iterator_tag,
                      InputIterCategory>::value) {
    return GrowthToLowerboundCapacity(
        static_cast<size_t>(std::distance(first, last)));
  }
  return 0;
}

// Given `growth`, "unapplies" the load factor to find how large the capacity
// should be to stay within the load factor.
//
// This might not be a valid capacity and `NormalizeCapacity()` should be
// called on this.
inline size_t GrowthToLowerboundCapacity(size_t growth) {
  // `growth * 8/7`
  // kWidth is the number of slots per group
  // With SSE2, kWidth == 16, so this branch does not apply
  if (Group::kWidth == 8 && growth == 7) {
    // x+(x-1)/7 does not work when x==7.
    return 8;
  }
  // Allow slightly more than growth: the load factor (7/8) applied in reverse
  return growth + static_cast<size_t>((static_cast<int64_t>(growth) - 1) / 7);
}

  ABSL_ATTRIBUTE_NOINLINE explicit raw_hash_set(
      size_t bucket_count, const hasher& hash = hasher(),
      const key_equal& eq = key_equal(),
      const allocator_type& alloc = allocator_type())
      : settings_(CommonFields::CreateDefault<SooEnabled()>(), hash, eq,
                  alloc) {
    if (bucket_count > (SooEnabled() ? SooCapacity() : 0)) {
      resize(NormalizeCapacity(bucket_count));
    }
  }

In the SOO case, our example yields bucket_count == 1, so construction needs no resize(); only insert() remains.

NOTE: The std::initializer_list<init_type> overload mentioned in the comment is the lazy construction described in the design notes. The concrete init_type is defined here and is roughly std::pair<Key, Value>; note the absence of the const Key the standard library would require.

Insertion

insert

  template <class InputIt>
  void insert(InputIt first, InputIt last) {
    for (; first != last; ++first) emplace(*first);
  }

  // This overload kicks in if we can deduce the key from args. This enables us
  // to avoid constructing value_type if an entry with the same key already
  // exists.
  //
  // For example:
  //
  //   flat_hash_map<std::string, std::string> m = { {"abc", "def"} };
  //   // Creates no std::string copies and makes no heap allocations.
  //   m.emplace("abc", "xyz");
  template <class... Args, typename std::enable_if<
                               IsDecomposable<Args...>::value, int>::type = 0>
  std::pair<iterator, bool> emplace(Args&&... args)
      ABSL_ATTRIBUTE_LIFETIME_BOUND {
    // EmplaceDecomposable is just a functor type; constructing it has no side effects
    return PolicyTraits::apply(EmplaceDecomposable{*this},
                               std::forward<Args>(args)...);
  }

  // This overload kicks in if we cannot deduce the key from args. It constructs
  // value_type unconditionally and then either moves it into the table or
  // destroys.
  template <class... Args, typename std::enable_if<
                               !IsDecomposable<Args...>::value, int>::type = 0>
  std::pair<iterator, bool> emplace(Args&&... args)
      ABSL_ATTRIBUTE_LIFETIME_BOUND {
    alignas(slot_type) unsigned char raw[sizeof(slot_type)];
    slot_type* slot = to_slot(&raw);

    construct(slot, std::forward<Args>(args)...);
    const auto& elem = PolicyTraits::element(slot);
    return PolicyTraits::apply(InsertSlot<true>{*this, std::move(*slot)}, elem);
  }

This requires PolicyTraits::apply and EmplaceDecomposable.

Policy

// Policy: a policy defines how to perform different operations on
// the slots of the hashtable (see hash_policy_traits.h for the full interface
// of policy).
template <class Policy, class Hash, class Eq, class Alloc>
class raw_hash_set {
  using PolicyTraits = hash_policy_traits<Policy>;
  // ...
};

template <class Policy, class Hash, class Eq, class Alloc>
class raw_hash_map : public raw_hash_set<Policy, Hash, Eq, Alloc> {
  // ...
};

class ABSL_ATTRIBUTE_OWNER flat_hash_map
    : public absl::container_internal::raw_hash_map<
          // The concrete Policy goes here
          absl::container_internal::FlatHashMapPolicy<K, V>, Hash, Eq,
          Allocator> { /*...*/ };

Since Abseil ships several map/set implementations (for example the node variants, introduced to satisfy the standard library's pointer-stability requirements), a Policy is used to factor out the differing mechanisms. It is also the reason map can inherit from set.

// Defines how slots are initialized/destroyed/moved.
template <class Policy, class = void>
struct hash_policy_traits : common_policy_traits<Policy> {
  // Provides generalized access to the key for elements, both for elements in
  // the table and for elements that have not yet been inserted (or even
  // constructed).  We would like an API that allows us to say: `key(args...)`
  // but we cannot do that for all cases, so we use this more general API that
  // can be used for many things, including the following:
  //
  //   - Given an element in a table, get its key.
  //   - Given an element initializer, get its key.
  //   - Given `emplace()` arguments, get the element key.
  //
  // Implementations of this must adhere to a very strict technical
  // specification around aliasing and consuming arguments:
  //
  // Let `value_type` be the result type of `element()` without ref- and
  // cv-qualifiers. The first argument is a functor, the rest are constructor
  // arguments for `value_type`. Returns `std::forward<F>(f)(k, xs...)`, where
  // `k` is the element key, and `xs...` are the new constructor arguments for
  // `value_type`. It's allowed for `k` to alias `xs...`, and for both to alias
  // `ts...`. The key won't be touched once `xs...` are used to construct an
  // element; `ts...` won't be touched at all, which allows `apply()` to consume
  // any rvalues among them.
  //
  // If `value_type` is constructible from `Ts&&...`, `Policy::apply()` must not
  // trigger a hard compile error unless it originates from `f`. In other words,
  // `Policy::apply()` must be SFINAE-friendly. If `value_type` is not
  // constructible from `Ts&&...`, either SFINAE or a hard compile error is OK.
  //
  // If `Ts...` is `[cv] value_type[&]` or `[cv] init_type[&]`,
  // `Policy::apply()` must work. A compile error is not allowed, SFINAE or not.
  template <class F, class... Ts, class P = Policy>
  static auto apply(F&& f, Ts&&... ts)
      -> decltype(P::apply(std::forward<F>(f), std::forward<Ts>(ts)...)) {
    return P::apply(std::forward<F>(f), std::forward<Ts>(ts)...);
  }
  // ...
};

template <class K, class V>
struct FlatHashMapPolicy {
  template <class F, class... Args>
  static decltype(absl::container_internal::DecomposePair(
      std::declval<F>(), std::declval<Args>()...))
  apply(F&& f, Args&&... args) {
    return absl::container_internal::DecomposePair(std::forward<F>(f),
                                                   std::forward<Args>(args)...);
  }
  // ...
};

// A helper function for implementing apply() in map policies.
template <class F, class... Args>
auto DecomposePair(F&& f, Args&&... args)
    -> decltype(memory_internal::DecomposePairImpl(
        std::forward<F>(f), PairArgs(std::forward<Args>(args)...))) {
  return memory_internal::DecomposePairImpl(
      std::forward<F>(f), PairArgs(std::forward<Args>(args)...));
}

// Given arguments of an std::pair's constructor, PairArgs() returns a pair of
// tuples with references to the passed arguments. The tuples contain
// constructor arguments for the first and the second elements of the pair.
//
// The following two snippets are equivalent.
//
// 1. std::pair<F, S> p(args...);
//
// 2. auto a = PairArgs(args...);
//    std::pair<F, S> p(std::piecewise_construct,
//                      std::move(a.first), std::move(a.second));
inline std::pair<std::tuple<>, std::tuple<>> PairArgs() { return {}; }
template <class F, class S>
std::pair<std::tuple<F&&>, std::tuple<S&&>> PairArgs(F&& f, S&& s) {
  return {std::piecewise_construct, std::forward_as_tuple(std::forward<F>(f)),
          std::forward_as_tuple(std::forward<S>(s))};
}
template <class F, class S>
std::pair<std::tuple<const F&>, std::tuple<const S&>> PairArgs(
    const std::pair<F, S>& p) {
  return PairArgs(p.first, p.second);
}
template <class F, class S>
std::pair<std::tuple<F&&>, std::tuple<S&&>> PairArgs(std::pair<F, S>&& p) {
  return PairArgs(std::forward<F>(p.first), std::forward<S>(p.second));
}
template <class F, class S>
auto PairArgs(std::piecewise_construct_t, F&& f, S&& s)
    -> decltype(std::make_pair(memory_internal::TupleRef(std::forward<F>(f)),
                               memory_internal::TupleRef(std::forward<S>(s)))) {
  return std::make_pair(memory_internal::TupleRef(std::forward<F>(f)),
                        memory_internal::TupleRef(std::forward<S>(s)));
}

template <class F, class K, class V>
decltype(std::declval<F>()(std::declval<const K&>(), std::piecewise_construct,
                           std::declval<std::tuple<K>>(), std::declval<V>()))
DecomposePairImpl(F&& f, std::pair<std::tuple<K>, V> p) {
  const auto& key = std::get<0>(p.first);
  return std::forward<F>(f)(key, std::piecewise_construct, std::move(p.first),
                            std::move(p.second));
}

The policy machinery looks verbose, but ignoring the forwarding, apply(f, args...) is just f(args...). As a map policy it must additionally separate the key from the remaining arguments in args... (recall the definition of init_type; the split has to be done by hand), hence the pile of pair and tuple grunt work. The upshot: the map specialization of apply achieves the effect of f(key, args...).

NOTE: I didn't read this part closely; halfway through I gave up and used a debugger to work backwards from the results.

Decomposable

  struct EmplaceDecomposable {
    template <class K, class... Args>
    std::pair<iterator, bool> operator()(const K& key, Args&&... args) const {
      auto res = s.find_or_prepare_insert(key);
      if (res.second) {
        s.emplace_at(res.first, std::forward<Args>(args)...);
      }
      return res;
    }
    raw_hash_set& s;
  };

In the current context, the concrete operation passed to apply is EmplaceDecomposable: as the name says, it looks up the key and then calls emplace_at.

slot

  // Constructs the value in the space pointed by the iterator. This only works
  // after an unsuccessful find_or_prepare_insert() and before any other
  // modifications happen in the raw_hash_set.
  //
  // PRECONDITION: iter was returned from find_or_prepare_insert(k), where k is
  // the key decomposed from `forward<Args>(args)...`, and the bool returned by
  // find_or_prepare_insert(k) was true.
  // POSTCONDITION: *m.iterator_at(i) == value_type(forward<Args>(args)...).
  template <class... Args>
  void emplace_at(iterator iter, Args&&... args) {
    construct(iter.slot(), std::forward<Args>(args)...);

    assert(PolicyTraits::apply(FindElement{*this}, *iter) == iter &&
           "constructed value does not match the lookup key");
  }

The lookup itself is fairly involved, so skip it for now. Suppose we have located a position (obtained an iterator); what follows is just construct(). The process is very simple, but note the subtleties of the iterator and its slot() member function.

  // Simplified iterator
  class iterator : private /* ... */ {
    ctrl_t* control() const { return ctrl_; }
    slot_type* slot() const { return slot_; }

    // We use EmptyGroup() for default-constructed iterators so that they can
    // be distinguished from end iterators, which have nullptr ctrl_.
    ctrl_t* ctrl_ = EmptyGroup();
    // To avoid uninitialized member warnings, put slot_ in an anonymous union.
    // The member is not initialized on singleton and end iterators.
    union {
      slot_type* slot_;
    };
  };

The iterator uses the union trick to provide a legitimately uninitialized slot_ member; note its type alias, slot_type*.

using slot_type = map_slot_type<K, V>;

// The internal storage type for key-value containers like flat_hash_map.
//
// It is convenient for the value_type of a flat_hash_map<K, V> to be
// pair<const K, V>; the "const K" prevents accidental modification of the key
// when dealing with the reference returned from find() and similar methods.
// However, this creates other problems; we want to be able to emplace(K, V)
// efficiently with move operations, and similarly be able to move a
// pair<K, V> in insert().
//
// The solution is this union, which aliases the const and non-const versions
// of the pair. This also allows flat_hash_map<const K, V> to work, even though
// that has the same efficiency issues with move in emplace() and insert() -
// but people do it anyway.
//
// If kMutableKeys is false, only the value member can be accessed.
//
// If kMutableKeys is true, key can be accessed through all slots while value
// and mutable_value must be accessed only via INITIALIZED slots. Slots are
// created and destroyed via mutable_value so that the key can be moved later.
//
// Accessing one of the union fields while the other is active is safe as
// long as they are layout-compatible, which is guaranteed by the definition of
// kMutableKeys. For C++11, the relevant section of the standard is
// https://timsong-cpp.github.io/cppwp/n3337/class.mem#19 (9.2.19)
template <class K, class V>
union map_slot_type {
  map_slot_type() {}
  ~map_slot_type() = delete;
  using value_type = std::pair<const K, V>;
  using mutable_value_type =
      std::pair<absl::remove_const_t<K>, absl::remove_const_t<V>>;

  value_type value;
  mutable_value_type mutable_value;
  absl::remove_const_t<K> key;
};

slot_type is the optimization mentioned in the design notes: internally a mutable Key, presented externally as const Key. The comment claims that accessing a union field other than the active one is safe here and does not violate strict aliasing, because the two pair types are layout-compatible. I have long since forgotten these rules, so let's take its word for it.

construct

  template <typename... Args>
  inline void construct(slot_type* slot, Args&&... args) {
    PolicyTraits::construct(&alloc_ref(), slot, std::forward<Args>(args)...);
  }

  // PRECONDITION: `slot` is UNINITIALIZED
  // POSTCONDITION: `slot` is INITIALIZED
  template <class Alloc, class... Args>
  static void construct(Alloc* alloc, slot_type* slot, Args&&... args) {
    Policy::construct(alloc, slot, std::forward<Args>(args)...);
  }

  template <class Allocator, class... Args>
  static void construct(Allocator* alloc, slot_type* slot, Args&&... args) {
    slot_policy::construct(alloc, slot, std::forward<Args>(args)...);
  }

  // If pair<const K, V> and pair<K, V> are layout-compatible, we can accept one
  // or the other via slot_type. We are also free to access the key via
  // slot_type::key in this case.
  using kMutableKeys = memory_internal::IsLayoutCompatible<K, V>;

  template <class Allocator, class... Args>
  static void construct(Allocator* alloc, slot_type* slot, Args&&... args) {
    emplace(slot);
    if (kMutableKeys::value) {
      // absl implements this directly as `using std::...` (the name is kept for historical reasons)
      absl::allocator_traits<Allocator>::construct(*alloc, &slot->mutable_value,
                                                   std::forward<Args>(args)...);
    } else {
      absl::allocator_traits<Allocator>::construct(*alloc, &slot->value,
                                                   std::forward<Args>(args)...);
    }
  }

  // To cope with strict aliasing
  static void emplace(slot_type* slot) {
    // The construction of union doesn't do anything at runtime but it allows us
    // to access its members without violating aliasing rules.
    new (slot) slot_type;
  }

In short, once the insert() flow has found iterator.slot(), the subsequent construct() is just a round of placement new.

prepare insert

  // Attempts to find `key` in the table; if it isn't found, returns an iterator
  // where the value can be inserted into, with the control byte already set to
  // `key`'s H2. Returns a bool indicating whether an insertion can take place.
  template <class K>
  std::pair<iterator, bool> find_or_prepare_insert(const K& key) {
    AssertOnFind(key);
    // Taken in our case:
    // is_soo() requires the current capacity() to be within SOO range
    if (is_soo()) return find_or_prepare_insert_soo(key);
    // Explained later
    return find_or_prepare_insert_non_soo(key);
  }

Back to the lookup we skipped earlier: before the actual insertion we must look up the key to construct the corresponding iterator.

Two cases are distinguished here, SOO and non-SOO. Note that this lookup is tailored to insertion; the regular lookup interface is find(). The next two sections explain this specialized lookup, which I'll call the prepare-insert operation to avoid ambiguity.

Prepare-insert (SOO)

  template <class K>
  std::pair<iterator, bool> find_or_prepare_insert_soo(const K& key) {
    // First call on an empty container
    if (empty()) {
      const HashtablezInfoHandle infoz = try_sample_soo();
      // Always false
      if (infoz.IsSampled()) {
        resize_with_soo_infoz(infoz);
      } else {
        // Marks SOO as filled, size() = 1, so empty() no longer holds
        // Note that the size_ member and the size() function differ; the latter is the public interface
        common().set_full_soo();
        // Returns here
        return {soo_iterator(), true};
      }
    // Second call, after one insertion, size() == 1
    } else if (PolicyTraits::apply(EqualElement<K>{key, eq_ref()},
                                   PolicyTraits::element(soo_slot()))) {
      // A matching (already present) key
      return {soo_iterator(), false};
    // No match in the SOO slot
    } else {
      // This is the prepare-insert phase, so with no room we must grow
      // (after which SOO no longer applies). NextCapacity(n) grows as
      // n = n * 2 + 1, giving 3
      resize(NextCapacity(SooCapacity()));
    }
    // A placement step specific to this context.
    // The resize() above guarantees index == 1 is already occupied,
    // so only indices 0 and 2 are candidates
    const size_t index =
        PrepareInsertAfterSoo(hash_ref()(key), sizeof(slot_type), common());
    return {iterator_at(index), true};
  }

  iterator soo_iterator() {
    // generation_ptr is unused, i.e. nullptr
    return {SooControl(), soo_slot(), common().generation_ptr()};
  }

  CommonFields& common() { return settings_.template get<0>(); }

想不明白,谷歌怎么老挂念它那小得可怜的 SOO 容量,毫无意义。

The first call is the simple SOO case: the container is necessarily empty(), so we construct directly. soo_iterator just grabs pointers from common and returns them; knowing the slot address, we insert right into it.

Likewise, if the second insertion matches the SOO slot, it also constructs directly: in the SOO map case capacity never exceeds 1, and pair.second merely signals that the key already exists. If this insertion has no match, we must grow.

Resizing

  // Resizes table to the new capacity and move all elements to the new
  // positions accordingly.
  //
  // Note that for better performance instead of
  // find_first_non_full(common(), hash),
  // HashSetResizeHelper::FindFirstNonFullAfterResize(
  //    common(), old_capacity, hash)
  // can be called right after `resize`.
  void resize(size_t new_capacity) {
    raw_hash_set::resize_impl(common(), new_capacity, HashtablezInfoHandle{});
  }

resize_impl

  // Resizes set to the new capacity.
  // It is a static function in order to use its pointer in GetPolicyFunctions.
  ABSL_ATTRIBUTE_NOINLINE static void resize_impl(
      CommonFields& common, size_t new_capacity,
      HashtablezInfoHandle forced_infoz) {
    raw_hash_set* set = reinterpret_cast<raw_hash_set*>(&common);
    assert(IsValidCapacity(new_capacity));
    assert(!set->fits_in_soo(new_capacity));
    // SOO is enabled and capacity() meets the SOO requirement
    const bool was_soo = set->is_soo();
    // The slot has had an element inserted
    const bool had_soo_slot = was_soo && !set->empty();
    // Compute the 7-bit H2 value; see the official design notes
    // hash_of(slot)
    // => PolicyTraits::apply(HashElement{hash_ref()}, PolicyTraits::element(slot))
    // => HashElement{hash_ref()}(key, static_cast<std::pair<const K, V>&>(slot->value))
    // => hash_ref()(key)
    //
    // H2(size_t hash): uint8_t
    // => hash & 0x7F
    // If there was no prior element, returns the empty marker kEmpty (0b10000000)
    const ctrl_t soo_slot_h2 =
        had_soo_slot ? static_cast<ctrl_t>(H2(set->hash_of(set->soo_slot())))
                     : ctrl_t::kEmpty;
    HashSetResizeHelper resize_helper(common, was_soo, had_soo_slot,
                                      forced_infoz);
    // Initialize HashSetResizeHelper::old_heap_or_soo_. We can't do this in
    // HashSetResizeHelper constructor because it can't transfer slots when
    // transfer_uses_memcpy is false.
    if (PolicyTraits::transfer_uses_memcpy() || !had_soo_slot) {
      // Initialize old_heap_or_soo by copying the data verbatim, so common can grow and then migrate
      resize_helper.old_heap_or_soo() = common.heap_or_soo();
    } else {
      // to_slot() converts uchar[] to map_slot_type*.
      // Moves the slot, which requires a destroy step
      set->transfer(set->to_slot(resize_helper.old_soo_data()),
                    set->soo_slot());
    }
    // Prepare for the resize below
    common.set_capacity(new_capacity);
    // Note that `InitializeSlots` does different number initialization steps
    // depending on the values of `transfer_uses_memcpy` and capacities.
    // Refer to the comment in `InitializeSlots` for more details.
    const bool grow_single_group =
        // Performs the allocation and construction steps.
        // No destroy or deallocate happens here (see the last lines of this function), since common may be in SOO form
        resize_helper.InitializeSlots<CharAlloc, sizeof(slot_type),
                                      PolicyTraits::transfer_uses_memcpy(),
                                      SooEnabled(), alignof(slot_type)>(
            common, CharAlloc(set->alloc_ref()), soo_slot_h2, sizeof(key_type),
            sizeof(value_type));

    // In the SooEnabled() case, capacity is never 0 so we don't check.
    if (!SooEnabled() && resize_helper.old_capacity() == 0) {
      // InitializeSlots did all the work including infoz().RecordRehash().
      return;
    }
    assert(resize_helper.old_capacity() > 0);
    // Nothing more to do in this case.
    if (was_soo && !had_soo_slot) return;

    slot_type* new_slots = set->slot_array();
    if (grow_single_group) {
      if (PolicyTraits::transfer_uses_memcpy()) {
        // InitializeSlots did all the work.
        return;
      }
      if (was_soo) {
        // Unlike the earlier transfer, this one moves from old_soo_data to new_slots
        // only the SOO slot needs handling
        set->transfer(new_slots + resize_helper.SooSlotIndex(),
                      to_slot(resize_helper.old_soo_data()));
        return;
      } else {
        // We want GrowSizeIntoSingleGroup to be called here in order to make
        // InitializeSlots not depend on PolicyTraits.
        resize_helper.GrowSizeIntoSingleGroup<PolicyTraits>(common,
                                                            set->alloc_ref());
      }
    // !grow_single_group
    } else {
      // InitializeSlots prepares control bytes to correspond to empty table.
      const auto insert_slot = [&](slot_type* slot) {
        size_t hash = PolicyTraits::apply(HashElement{set->hash_ref()},
                                          PolicyTraits::element(slot));
        // This function and the target variable are discussed later
        // As the name suggests, it is a lookup returning position information
        auto target = find_first_non_full(common, hash);
        // As before: once the slot position (offset) is found, update the matching ctrl
        SetCtrl(common, target.offset, H2(hash), sizeof(slot_type));
        // and migrate the corresponding slot element
        set->transfer(new_slots + target.offset, slot);
        return target.probe_length;
      };
      if (was_soo) {
        insert_slot(to_slot(resize_helper.old_soo_data()));
        return;
      } else {
        auto* old_slots = static_cast<slot_type*>(resize_helper.old_slots());
        size_t total_probe_length = 0;
        for (size_t i = 0; i != resize_helper.old_capacity(); ++i) {
          if (IsFull(resize_helper.old_ctrl()[i])) {
            total_probe_length += insert_slot(old_slots + i);
          }
        }
        common.infoz().RecordRehash(total_probe_length);
      }
    }
    // The final deallocation work
    resize_helper.DeallocateOld<alignof(slot_type)>(CharAlloc(set->alloc_ref()),
                                                    sizeof(slot_type));
  }

The function above is dense with details; the inline comments give an overview, and the subsections below fill in the rest.

transfer

    // Whether the transfer can use memcpy depends on whether Policy::transfer() returns std::true_type
    // which is determined by:
    // FlatHashMapPolicy::slot_policy::transfer()
    // => container_internal::map_slot_policy<K, V>::transfer()
    // => absl::is_trivially_relocatable<std::pair<const K, V>>
    // In Abseil's implementation this falls back to std::is_trivially_copyable
    if (PolicyTraits::transfer_uses_memcpy() || !had_soo_slot) {
      // Initialization of old_heap_or_soo
      resize_helper.old_heap_or_soo() = common.heap_or_soo();
    } else {
      // https://github.com/abseil/abseil-cpp/issues/755#issuecomment-670136413
      // explains why transfer uses std::launder
      // to_slot converts uchar[] to map_slot_type*
      set->transfer(set->to_slot(resize_helper.old_soo_data()),
                    set->soo_slot());
    }
Trivial relocation                        Relocation using move constructor
----------------------------------------  ----------------------------------------
// allocate destination memory            // allocate destination memory
dest = ::operator new(sizeof(Type));      dest = ::operator new(sizeof(Type));

// copy bytes                             // move construct
memcpy(dest, source, sizeof(Type));       ::new (dest) Type(std::move(*source));

// deallocate source                      // destruct source
::operator delete(source);                source->~Type();

                                          // deallocate source
                                          ::operator delete(source);

In the resize_impl() code above, the transfer comes in two flavors. The condition transfer_uses_memcpy() ultimately depends on absl::is_trivially_relocatable<std::pair<const K, V>>, i.e. whether move-plus-destroy can be replaced by memcpy(). This is an optimization; see proposal P1144 for details, with more concrete motivating examples in the linked talk slides and the table above. Here the condition gates copying common.heap_or_soo(), i.e. the backing array.
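To make the table concrete, here is a minimal generic relocate sketch (my own code, not Abseil's) that picks between the two strategies; std::is_trivially_copyable stands in for the trivially-relocatable trait, mirroring the fallback mentioned above:

```cpp
#include <cstring>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Relocate *src into raw storage at dest, ending the lifetime of *src.
// Trivially copyable types take the memcpy fast path (the bytes carry the
// value, so no destructor call is needed); everything else is
// move-constructed and then destroyed, as in the table's right column.
template <class T>
T* relocate(void* dest, T* src) {
  if constexpr (std::is_trivially_copyable_v<T>) {
    std::memcpy(dest, static_cast<const void*>(src), sizeof(T));
    // Under C++20 implicit-lifetime rules the memcpy creates the object;
    // launder makes the resulting pointer usable, as the article discusses.
    return std::launder(static_cast<T*>(dest));
  } else {
    T* p = ::new (dest) T(std::move(*src));
    src->~T();  // explicit destroy of the moved-from source
    return p;
  }
}
```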

  inline void transfer(slot_type* to, slot_type* from) {
    // Does some bookkeeping; ignoring that, it just invokes the lambda
    common().RunWithReentrancyGuard(
        [&] { PolicyTraits::transfer(&alloc_ref(), to, from); });
  }
  template <class Allocator>
  static auto transfer(Allocator* alloc, slot_type* new_slot,
                       slot_type* old_slot) {
    return slot_policy::transfer(alloc, new_slot, old_slot);
  }

  template <class Allocator>
  static auto transfer(Allocator* alloc, slot_type* new_slot,
                       slot_type* old_slot) {
    // As noted earlier, transfer depends on relocatability
    auto is_relocatable =
        typename absl::is_trivially_relocatable<value_type>::type();

    // Handles union member activation; effectively a no-op
    emplace(new_slot);
#if defined(__cpp_lib_launder) && __cpp_lib_launder >= 201606
    if (is_relocatable) {
      // value uses type punning; std::launder() is applied to wash away problematic optimizations
      std::memcpy(static_cast<void*>(std::launder(&new_slot->value)),
                  static_cast<const void*>(&old_slot->value),
                  sizeof(value_type));
      return is_relocatable;
    }
#endif

    // If mutable keys can be used
    if (kMutableKeys::value) {
      absl::allocator_traits<Allocator>::construct(
          *alloc, &new_slot->mutable_value, std::move(old_slot->mutable_value));
    } else {
      absl::allocator_traits<Allocator>::construct(*alloc, &new_slot->value,
                                                   std::move(old_slot->value));
    }
    destroy(alloc, old_slot);
    return is_relocatable;
  }

With the trivially relocatable property, a single std::memcpy() suffices.

Without it, the slot must be migrated explicitly, i.e. through the allocator policy's construct() and destroy().

InitializeSlots

  // Allocates a backing array for the hashtable.
  // Reads `capacity` and updates all other fields based on the result of
  // the allocation.
  //
  // It also may do the following actions:
  // 1. initialize control bytes
  // 2. initialize slots
  // 3. deallocate old slots.
  //
  // We are bundling a lot of functionality
  // in one ABSL_ATTRIBUTE_NOINLINE function in order to minimize binary code
  // duplication in raw_hash_set<>::resize.
  //
  // `c.capacity()` must be nonzero.
  // POSTCONDITIONS:
  //  1. CommonFields is initialized.
  //
  //  if IsGrowingIntoSingleGroupApplicable && TransferUsesMemcpy
  //    Both control bytes and slots are fully initialized.
  //    old_slots are deallocated.
  //    infoz.RecordRehash is called.
  //
  //  if IsGrowingIntoSingleGroupApplicable && !TransferUsesMemcpy
  //    Control bytes are fully initialized.
  //    infoz.RecordRehash is called.
  //    GrowSizeIntoSingleGroup must be called to finish slots initialization.
  //
  //  if !IsGrowingIntoSingleGroupApplicable
  //    Control bytes are initialized to empty table via ResetCtrl.
  //    raw_hash_set<>::resize must insert elements regularly.
  //    infoz.RecordRehash is called if old_capacity == 0.
  //
  //  Returns IsGrowingIntoSingleGroupApplicable result to avoid recomputation.
  // The comment above already outlines the idea: allocate and construct the new ctrl + slots, then destroy the old copy
  // It also mentions some practical details, e.g. marking the function noinline to shrink the binary
  // The remaining complexity is per-context micro-optimization
  template <typename Alloc, size_t SizeOfSlot, bool TransferUsesMemcpy,
            bool SooEnabled, size_t AlignOfSlot>
  ABSL_ATTRIBUTE_NOINLINE bool InitializeSlots(CommonFields& c, Alloc alloc,
                                               ctrl_t soo_slot_h2,
                                               size_t key_size,
                                               size_t value_size) {
    assert(c.capacity());
    // A no-op in practice
    HashtablezInfoHandle infoz =
        ShouldSampleHashtablezInfo<Alloc>()
            ? SampleHashtablezInfo<SooEnabled>(SizeOfSlot, key_size, value_size,
                                               old_capacity_, was_soo_,
                                               forced_infoz_, c)
            : HashtablezInfoHandle{};

    const bool has_infoz = infoz.IsSampled();
    RawHashSetLayout layout(c.capacity(), AlignOfSlot, has_infoz);
    // Close to alloc::allocate(); the wrapper only deals with alignment
    char* mem = static_cast<char*>(Allocate<BackingArrayAlignment(AlignOfSlot)>(
        &alloc, layout.alloc_size(SizeOfSlot)));
    const GenerationType old_generation = c.generation();
    c.set_generation_ptr(
        reinterpret_cast<GenerationType*>(mem + layout.generation_offset()));
    c.set_generation(NextGeneration(old_generation));
    // Note that control and slots share the same allocated mem address
    c.set_control(reinterpret_cast<ctrl_t*>(mem + layout.control_offset()));
    // From RawHashSetLayout below we can derive
    // mem = [GrowInfo, ctrl, [padding,] slots]
    // where:
    // * ctrl takes slightly more bytes than capacity (capacity + Group::kWidth)
    //     because it must include the cloned bytes
    // * GrowInfo takes one size_t and is not initialized yet at this point
    c.set_slots(mem + layout.slot_offset());
    // Set up GrowInfo, recording the value capacity()*7/8 - size()
    // where 7/8 is the Swiss table's maximum load factor
    // NOTE: one more trick: the topmost bit is a special deleted marker
    ResetGrowthLeft(c);

    const bool grow_single_group =
        // Two conditions:
        // 1. capacity fits in a single group (layout.capacity <= Group::kWidth)
        // 2. old capacity is smaller than capacity; the upstream comment says: for faster 8-byte copies
        // Presumably condition 2 means handling the cloned ctrl bytes no longer needs a modulo
        IsGrowingIntoSingleGroupApplicable(old_capacity_, layout.capacity());
    if (SooEnabled && was_soo_ && grow_single_group) {
      InitControlBytesAfterSoo(c.control(), soo_slot_h2, layout.capacity());
      if (TransferUsesMemcpy && had_soo_slot_) {
        TransferSlotAfterSoo(c, SizeOfSlot);
      }
      // SooEnabled implies that old_capacity_ != 0.
    } else if ((SooEnabled || old_capacity_ != 0) && grow_single_group) {
      if (TransferUsesMemcpy) {
        // TODO
        GrowSizeIntoSingleGroupTransferable(c, SizeOfSlot);
        DeallocateOld<AlignOfSlot>(alloc, SizeOfSlot);
      } else {
        GrowIntoSingleGroupShuffleControlBytes(c.control(), layout.capacity());
      }
    // When resizing from an empty table, just initialize ctrl to kEmpty plus kSentinel
    } else {
      ResetCtrl(c, SizeOfSlot);
    }

    c.set_has_infoz(has_infoz);
    if (has_infoz) {
      infoz.RecordStorageChanged(c.size(), layout.capacity());
      if ((SooEnabled && was_soo_) || grow_single_group || old_capacity_ == 0) {
        infoz.RecordRehash(0);
      }
      c.set_infoz(infoz);
    }
    return grow_single_group;
  }

///////////////////////////////////////////////////////////////// helper implementations below

// Helper class for computing offsets and allocation size of hash set fields.
// Computes the starting offset of each member of the backing array
class RawHashSetLayout {
 public:
  explicit RawHashSetLayout(size_t capacity, size_t slot_align, bool has_infoz)
      : capacity_(capacity),
        // ControlOffset() returns sizeof(HashtablezInfoHandle)? + sizeof(GrowthInfo)
        // only the latter, sizeof(GrowthInfo), matters here
        control_offset_(ControlOffset(has_infoz)),
        // NumControlBytes(capacity) returns capacity + 1 + NumClonedBytes()
        // NumClonedBytes() equals Group::kWidth - 1
        generation_offset_(control_offset_ + NumControlBytes(capacity)),
        slot_offset_(
            // NumGenerationBytes() returns 0
            // padding may be needed in front of the slots
            (generation_offset_ + NumGenerationBytes() + slot_align - 1) &
            (~slot_align + 1)) {
    assert(IsValidCapacity(capacity));
  }

  // Returns the capacity of a table.
  size_t capacity() const { return capacity_; }

  // Returns precomputed offset from the start of the backing allocation of
  // control.
  size_t control_offset() const { return control_offset_; }

  // Given the capacity of a table, computes the offset (from the start of the
  // backing allocation) of the generation counter (if it exists).
  size_t generation_offset() const { return generation_offset_; }

  // Given the capacity of a table, computes the offset (from the start of the
  // backing allocation) at which the slots begin.
  size_t slot_offset() const { return slot_offset_; }

  // Given the capacity of a table, computes the total size of the backing
  // array.
  size_t alloc_size(size_t slot_size) const {
    return slot_offset_ + capacity_ * slot_size;
  }

 private:
  size_t capacity_;
  size_t control_offset_;
  size_t generation_offset_;
  size_t slot_offset_;
};

// Returns the number of "cloned control bytes".
//
// This is the number of control bytes that are present both at the beginning
// of the control byte array and at the end, such that we can create a
// `Group::kWidth`-width probe window starting from any control byte.
constexpr size_t NumClonedBytes() { return Group::kWidth - 1; }

// Allocates at least n bytes aligned to the specified alignment.
// Alignment must be a power of 2. It must be positive.
//
// Note that many allocators don't honor alignment requirements above certain
// threshold (usually either alignof(std::max_align_t) or alignof(void*)).
// Allocate() doesn't apply alignment corrections. If the underlying allocator
// returns insufficiently alignment pointer, that's what you are going to get.
template <size_t Alignment, class Alloc>
void* Allocate(Alloc* alloc, size_t n) {
  static_assert(Alignment > 0, "");
  assert(n && "n must be positive");
  using M = AlignedType<Alignment>;
  using A = typename absl::allocator_traits<Alloc>::template rebind_alloc<M>;
  using AT = typename absl::allocator_traits<Alloc>::template rebind_traits<M>;
  // On macOS, "mem_alloc" is a #define with one argument defined in
  // rpc/types.h, so we can't name the variable "mem_alloc" and initialize it
  // with the "foo(bar)" syntax.
  A my_mem_alloc(*alloc);
  void* p = AT::allocate(my_mem_alloc, (n + sizeof(M) - 1) / sizeof(M));
  assert(reinterpret_cast<uintptr_t>(p) % Alignment == 0 &&
         "allocator does not respect alignment");
  return p;
}

inline void ResetGrowthLeft(CommonFields& common) {
  // Initializes grow_info's only member to the value passed in
  common.growth_info().InitGrowthLeftNoDeleted(
      CapacityToGrowth(common.capacity()) - common.size());
}

// General notes on capacity/growth methods below:
// - We use 7/8th as maximum load factor. For 16-wide groups, that gives an
//   average of two empty slots per group.
// - For (capacity+1) >= Group::kWidth, growth is 7/8*capacity.
// - For (capacity+1) < Group::kWidth, growth == capacity. In this case, we
//   never need to probe (the whole table fits in one group) so we don't need a
//   load factor less than 1.

// Given `capacity`, applies the load factor; i.e., it returns the maximum
// number of values we should put into the table before a resizing rehash.
// Magic tuning: returns 7/8 of capacity
inline size_t CapacityToGrowth(size_t capacity) {
  assert(IsValidCapacity(capacity));
  // `capacity*7/8`
  if (Group::kWidth == 8 && capacity == 7) {
    // x-x/8 does not work when x==7.
    return 6;
  }
  return capacity - capacity / 8;
}
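The arithmetic above is easy to sanity-check with a standalone mirror of the function (kGroupWidth is pinned to 8 here to exercise the special case; names are mine, not Abseil's):

```cpp
#include <cstddef>

// Mirror of CapacityToGrowth: apply the 7/8 maximum load factor.
// With integer division, capacity - capacity / 8 yields 7 when capacity == 7
// (since 7 / 8 == 0), so that case is pinned to 6 by hand.
constexpr std::size_t kGroupWidth = 8;  // portable (non-SSE2) group width

constexpr std::size_t capacity_to_growth(std::size_t capacity) {
  if (kGroupWidth == 8 && capacity == 7) return 6;
  return capacity - capacity / 8;
}
```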

///////////////////////////////////////////////////////////////// different contexts

// If the table was SOO, initializes new control bytes. `h2` is the control
// byte corresponding to the full slot. Must be called only if
// IsGrowingIntoSingleGroupApplicable returned true.
// Requires: `had_soo_slot_ || h2 == ctrl_t::kEmpty`.
void HashSetResizeHelper::InitControlBytesAfterSoo(ctrl_t* new_ctrl, ctrl_t h2,
                                                   size_t new_capacity) {
  assert(is_single_group(new_capacity));
  // Initialize every ctrl byte to kEmpty
  std::memset(new_ctrl, static_cast<int8_t>(ctrl_t::kEmpty),
              NumControlBytes(new_capacity));
  assert(HashSetResizeHelper::SooSlotIndex() == 1);
  // This allows us to avoid branching on had_soo_slot_.
  assert(had_soo_slot_ || h2 == ctrl_t::kEmpty);
  // The fixed index 1 is set to the original H2
  // the second write is its cloned byte
  new_ctrl[1] = new_ctrl[new_capacity + 2] = h2;
  // the sentinel byte
  new_ctrl[new_capacity] = ctrl_t::kSentinel;
}

void HashSetResizeHelper::TransferSlotAfterSoo(CommonFields& c,
                                               size_t slot_size) {
  assert(was_soo_);
  assert(had_soo_slot_);
  assert(is_single_group(c.capacity()));
  // Initialize the slot at the corresponding index; note that SooSlotIndex() is fixed at 1
  std::memcpy(SlotAddress(c.slot_array(), SooSlotIndex(), slot_size),
              old_soo_data(), slot_size);
  // Poison non-full slots to cooperate with ASAN; details omitted
  PoisonSingleGroupEmptySlots(c, slot_size);
}

Initialization allocates a new backing array via the allocator, with layout [GrowInfo, ctrl, [padding,] slots]; these four "members" share a single allocate call. Specifically:

  • ctrl occupies slightly more bytes than capacity (namely capacity + Group::kWidth), because it also holds the cloned bytes. The SSE2 build sets kWidth to 16.
  • GrowInfo occupies one size_t. In short, it precomputes how many more elements fit before the next resize, with the load factor already accounted for. Note that its topmost bit is a special marker: it signals whenever there are deleted slots.
  • slots occupies capacity * sizeof(slot_type) bytes, possibly preceded by a padding region for alignment.
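The layout can be checked with a simplified, standalone version of the offset arithmetic (assuming kWidth == 16, no generation bytes and no sampling handle; all names here are mine, not Abseil's):

```cpp
#include <cstddef>

constexpr std::size_t kWidth = 16;  // SSE2 group width

struct Layout {
  std::size_t control_offset;  // ctrl starts right after the size_t GrowthInfo
  std::size_t slot_offset;     // slots start after ctrl, rounded up to alignment
  std::size_t alloc_size;      // total bytes of one backing-array allocation
};

constexpr Layout make_layout(std::size_t capacity, std::size_t slot_size,
                             std::size_t slot_align) {
  const std::size_t control_offset = sizeof(std::size_t);    // GrowthInfo
  const std::size_t num_ctrl = capacity + 1 + (kWidth - 1);  // + sentinel + clones
  const std::size_t unaligned = control_offset + num_ctrl;
  const std::size_t slot_offset =
      (unaligned + slot_align - 1) & ~(slot_align - 1);      // align up
  return {control_offset, slot_offset, slot_offset + capacity * slot_size};
}
```

For capacity 15 with 8-byte, 8-aligned slots this gives 31 control bytes and slots starting at offset 40.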

After allocation comes "construction": ctrl[capacity] is set to kSentinel, ctrl[1] and its cloned byte are both set to the migrated element's H2 hash, and everything else is kEmpty. Index 1 is fixed here, so the original (single) element always migrates to slot 1.

NOTE: H2 is the low 7 bits of the full hash and is what ctrl stores. The remaining high 57 bits form H1, which shows up later.
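The split can be stated in two one-liners (a sketch with the salt omitted for clarity; the salted form appears later):

```cpp
#include <cstddef>
#include <cstdint>

// H2: low 7 bits, stored in a "full" control byte (whose sign bit is 0).
// H1: the remaining high 57 bits, used to pick the starting probe group.
inline std::uint8_t h2(std::size_t hash) { return hash & 0x7F; }
inline std::size_t h1(std::size_t hash) { return hash >> 7; }
```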

prepare insert after SOO

size_t PrepareInsertAfterSoo(size_t hash, size_t slot_size,
                             CommonFields& common) {
  assert(common.capacity() == NextCapacity(SooCapacity()));
  // After resize from capacity 1 to 3, we always have exactly the slot with
  // index 1 occupied, so we need to insert either at index 0 or index 2.
  assert(HashSetResizeHelper::SooSlotIndex() == 1);
  // Increase size() by one unit
  PrepareInsertCommon(common);
  // H1(hash, ctrl)
  // => (hash >> 7) ^ (reinterpret_cast<uintptr_t>(ctrl) >> 12)
  // Small optimization: H1 salts the hash with ctrl (so a resize reshuffles the hash distribution); 2^12 happens to be the page size
  // Note this is H1() & 2, so the index is either 0 or 2
  const size_t offset = H1(hash, common.control()) & 2;
  // Decrease growth_info by one unit
  common.growth_info().OverwriteEmptyAsFull();
  // Equivalent to ctrl[offset] = ctrl[cloned] = H2(hash)
  SetCtrlInSingleGroupTable(common, offset, H2(hash), slot_size);
  common.infoz().RecordInsert(hash, /*distance_from_desired=*/0);
  // Return the index, for the iterator's use
  return offset;
}

///////////////////////////////////////////////////////////////// assorted helper functions

// The whole function is equivalent to size() += 1
inline void PrepareInsertCommon(CommonFields& common) {
  common.increment_size();
  common.maybe_increment_generation_on_insert();
}

// Extracts the H1 portion of a hash: 57 bits mixed with a per-table salt.
inline size_t H1(size_t hash, const ctrl_t* ctrl) {
  return (hash >> 7) ^ PerTableSalt(ctrl);
}

// Returns a per-table, hash salt, which changes on resize. This gets mixed into
// H1 to randomize iteration order per-table.
//
// The seed consists of the ctrl_ pointer, which adds enough entropy to ensure
// non-determinism of iteration order in most cases.
inline size_t PerTableSalt(const ctrl_t* ctrl) {
  // The low bits of the pointer have little or no entropy because of
  // alignment. We shift the pointer to try to use higher entropy bits. A
  // good number seems to be 12 bits, because that aligns with page size.
  return reinterpret_cast<uintptr_t>(ctrl) >> 12;
}

// Like SetCtrl, but in a single group table, we can save some operations when
// setting the cloned control byte.
inline void SetCtrlInSingleGroupTable(const CommonFields& c, size_t i, ctrl_t h,
                                      size_t slot_size) {
  assert(is_single_group(c.capacity()));
  DoSanitizeOnSetCtrl(c, i, h, slot_size);
  ctrl_t* ctrl = c.control();
  ctrl[i] = h;
  ctrl[i + c.capacity() + 1] = h;
}

After the resize, the insert still has to produce a concrete index for the iterator. The small trick here is a context-specific optimization: index 1 is known to be occupied after the resize, so H1 only has to choose between 0 and 2.

There is another small trick: H1(hash = H(key)) = (hash >> 7) ^ salt(ctrl). That is, H1 mixes in a salt derived from ctrl, so a resize (which changes the ctrl pointer) reshuffles the hash distribution. The salt is uintptr_t(ctrl) >> 12; \(2^{12}\) is exactly one page, which avoids interference from the allocator's low-bit alignment.
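Putting the two tricks together, a scalar sketch (made-up pointer values, my own names, not Abseil code):

```cpp
#include <cstddef>
#include <cstdint>

// H1 with the per-table salt: the ctrl pointer, shifted by the page-size
// exponent (12), is XORed into the high hash bits.
inline std::size_t per_table_salt(const void* ctrl) {
  return reinterpret_cast<std::uintptr_t>(ctrl) >> 12;
}
inline std::size_t h1_salted(std::size_t hash, const void* ctrl) {
  return (hash >> 7) ^ per_table_salt(ctrl);
}

// After the SOO -> capacity-3 resize, slot 1 is occupied, so the insertion
// index is picked from {0, 2} by masking H1 with 2.
inline std::size_t soo_insert_offset(std::size_t hash, const void* ctrl) {
  return h1_salted(hash, ctrl) & 2;
}
```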

find or prepare insert (non-SOO)

  template <class K>
  std::pair<iterator, bool> find_or_prepare_insert_non_soo(const K& key) {
    assert(!is_soo());
    // The hash is computed next, so prefetch the heap-resident ctrl into a cache line asynchronously
    prefetch_heap_block();
    auto hash = hash_ref()(key);
    // Build a probe seq for the linear probing that follows
    auto seq = probe(common(), hash);
    const ctrl_t* ctrl = control();
    while (true) {
      // Use SIMD to load a chunk of ctrl into a Group
      Group g{ctrl + seq.offset()};
      // On a batch hit, search within the group
      for (uint32_t i : g.Match(H2(hash))) {
        if (ABSL_PREDICT_TRUE(PolicyTraits::apply(
                EqualElement<K>{key, eq_ref()},
                PolicyTraits::element(slot_array() + seq.offset(i)))))
          // Found it; we are done
          return {iterator_at(seq.offset(i)), false};
      }
      // If the group has an empty slot
      auto mask_empty = g.MaskEmpty();
      if (ABSL_PREDICT_TRUE(mask_empty)) {
        size_t target = seq.offset(
            // Equivalent to mask_empty.LowestBitSet() as uint32_t; the remaining arguments are for debugging
            GetInsertionOffset(mask_empty, capacity(), hash, control()));
        return {iterator_at(PrepareInsertNonSoo(common(), hash,
                                                // FindInfo is equivalent to {offset, probe_length}
                                                FindInfo{target, seq.index()},
                                                // Supplies a bundle of policy function interfaces
                                                GetPolicyFunctions())),
                true};
      }
      seq.next();
      // There must be room for it somewhere!
      assert(seq.index() <= capacity() && "full table!");
    }
  }

prefetch

  // Prefetch the heap-allocated memory region to resolve potential TLB and
  // cache misses. This is intended to overlap with execution of calculating the
  // hash for a key.
  // Asynchronously prefetch the non-SOO ctrl bytes
  void prefetch_heap_block() const {
    assert(!is_soo());
#if ABSL_HAVE_BUILTIN(__builtin_prefetch) || defined(__GNUC__)
    __builtin_prefetch(control(), 0, 1);
#endif
  }

A trick that seems genuinely useful: since the next step is computing the hash, we might as well issue the instruction that asynchronously loads ctrl into a cache line. The ctrl array may be long, so a single line cannot cover the exact target position, but it is a best effort.

probe

// Begins a probing operation on `common.control`, using `hash`.
inline probe_seq<Group::kWidth> probe(const ctrl_t* ctrl, const size_t capacity,
                                      size_t hash) {
  return probe_seq<Group::kWidth>(H1(hash, ctrl), capacity);
}
inline probe_seq<Group::kWidth> probe(const CommonFields& common, size_t hash) {
  return probe(common.control(), common.capacity(), hash);
}

// The state for a probe sequence.
//
// Currently, the sequence is a triangular progression of the form
//
//   p(i) := Width * (i^2 + i)/2 + hash (mod mask + 1)
//
// The use of `Width` ensures that each probe step does not overlap groups;
// the sequence effectively outputs the addresses of *groups* (although not
// necessarily aligned to any boundary). The `Group` machinery allows us
// to check an entire group with minimal branching.
//
// Wrapping around at `mask + 1` is important, but not for the obvious reason.
// As described above, the first few entries of the control byte array
// are mirrored at the end of the array, which `Group` will find and use
// for selecting candidates. However, when those candidates' slots are
// actually inspected, there are no corresponding slots for the cloned bytes,
// so we need to make sure we've treated those offsets as "wrapping around".
//
// It turns out that this probe sequence visits every group exactly once if the
// number of groups is a power of two, since (i^2+i)/2 is a bijection in
// Z/(2^m). See https://en.wikipedia.org/wiki/Quadratic_probing
// Two key points:
// * wrapping around near the end still lets SIMD find the right bytes, thanks to the cloned bytes
// * the probe sequence neither revisits nor skips any group
template <size_t Width>
class probe_seq {
 public:
  // Creates a new probe sequence using `hash` as the initial value of the
  // sequence and `mask` (usually the capacity of the table) as the mask to
  // apply to each value in the progression.
  probe_seq(size_t hash, size_t mask) {
    assert(((mask + 1) & mask) == 0 && "not a mask");
    mask_ = mask;
    // Starts at hash
    offset_ = hash & mask_;
  }

  // The offset within the table, i.e., the value `p(i)` above.
  size_t offset() const { return offset_; }
  size_t offset(size_t i) const { return (offset_ + i) & mask_; }

  void next() {
    // index goes  0+{w, 2w, 3w, 4w, 5w, ...}
    // offset goes H+{w, 3w, 6w, 10w, 15w, ...}
    // i.e. hash plus the running prefix sum of index (mod mask + 1),
    // which is the triangular progression the comment above describes
    index_ += Width;
    offset_ += index_;
    offset_ &= mask_;
  }
  // 0-based probe index, a multiple of `Width`.
  size_t index() const { return index_; }

 private:
  size_t mask_;
  size_t offset_;
  size_t index_ = 0;
};

probe() is lazily evaluated: it merely returns an expression-like object.
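The "visits every group exactly once" claim is easy to verify with a scalar re-implementation of the sequence (kWidth pinned to 16; my own sketch, not Abseil code):

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t kWidth = 16;

// Emit one offset per probe step, i.e. p(i) = hash + kWidth*(i^2+i)/2 mod
// (mask + 1), computed incrementally the same way probe_seq::next() does.
std::vector<std::size_t> probe_offsets(std::size_t hash, std::size_t mask) {
  std::vector<std::size_t> out;
  std::size_t offset = hash & mask;
  std::size_t index = 0;
  const std::size_t groups = (mask + 1) / kWidth;  // one step per group
  for (std::size_t g = 0; g < groups; ++g) {
    out.push_back(offset);
    index += kWidth;
    offset = (offset + index) & mask;
  }
  return out;
}
```

For mask == 63 (64 slots, 4 groups) and hash == 5 the offsets are {5, 21, 53, 37}: all distinct and all congruent to 5 mod 16, so each of the 4 group-offset classes is hit exactly once.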

Group

#ifdef ABSL_INTERNAL_HAVE_SSE2
using Group = GroupSse2Impl;
using GroupFullEmptyOrDeleted = GroupSse2Impl;
#elif /* ... */
#endif

// The Group implementation class
struct GroupSse2Impl {
  static constexpr size_t kWidth = 16;  // the number of slots per group

  explicit GroupSse2Impl(const ctrl_t* pos) {
    // Load 16 ctrl bytes
    ctrl = _mm_loadu_si128(reinterpret_cast<const __m128i*>(pos));
  }

  // Returns a bitmask representing the positions of slots that match hash.
  BitMask<uint16_t, kWidth> Match(h2_t hash) const {
    auto match = _mm_set1_epi8(static_cast<char>(hash));
    return BitMask<uint16_t, kWidth>(
        // _mm_cmpeq_epi8 sets a byte of match to 0xff when equal, 0x00 otherwise
        // _mm_movemask_epi8 packs each byte's top (boolean) bit into the low 16 bits of a scalar
        static_cast<uint16_t>(_mm_movemask_epi8(_mm_cmpeq_epi8(match, ctrl))));
  }

  // Returns a bitmask representing the positions of empty slots.
  NonIterableBitMask<uint16_t, kWidth> MaskEmpty() const {
// SSSE3 requires an explicit -march; compilers don't enable it by default
#ifdef ABSL_INTERNAL_HAVE_SSSE3
    // This only works because ctrl_t::kEmpty is -128.
    // The value is carefully chosen:
    // given r = _mm_sign_epi8(a, b), then r[i] = b[i] < 0 ? -a[i] : a[i];
    // int8 cannot represent 128, but the negative of x can be read as ~x + 1,
    // so -128 (0b 1000 0000) wraps around to -128 itself
    // hence only kEmpty keeps its top bit at 1, which _mm_movemask_epi8 then extracts
    return NonIterableBitMask<uint16_t, kWidth>(
        static_cast<uint16_t>(_mm_movemask_epi8(_mm_sign_epi8(ctrl, ctrl))));
#else
    // This version is more straightforward, but costs three vector instructions
    auto match = _mm_set1_epi8(static_cast<char>(ctrl_t::kEmpty));
    return NonIterableBitMask<uint16_t, kWidth>(
        static_cast<uint16_t>(_mm_movemask_epi8(_mm_cmpeq_epi8(match, ctrl))));
#endif
  }

  // ...

  __m128i ctrl;
};

// An abstract bitmask, such as that emitted by a SIMD instruction.
//
// Specifically, this type implements a simple bitset whose representation is
// controlled by `SignificantBits` and `Shift`. `SignificantBits` is the number
// of abstract bits in the bitset, while `Shift` is the log-base-two of the
// width of an abstract bit in the representation.
// This mask provides operations for any number of real bits set in an abstract
// bit. To add iteration on top of that, implementation must guarantee no more
// than the most significant real bit is set in a set abstract bit.
template <class T, int SignificantBits, int Shift = 0>
class NonIterableBitMask {
 public:
  explicit NonIterableBitMask(T mask) : mask_(mask) {}

  explicit operator bool() const { return this->mask_ != 0; }

  // ...

  T mask_;
};

Group is designed around the SIMD width. The SIMD operations used so far are Match() and MaskEmpty(), both annotated above. This also explains why ctrl_t::kEmpty must be -128.
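A portable scalar model of the two operations may help here: it computes exactly what the intrinsics compute, minus the parallelism (my own sketch, not Abseil code):

```cpp
#include <cstdint>

// Match: compare each of the 16 control bytes against h2 and pack the results
// into a 16-bit mask -- what _mm_cmpeq_epi8 + _mm_movemask_epi8 do together.
inline std::uint16_t match_scalar(const std::int8_t ctrl[16], std::int8_t h2) {
  std::uint16_t mask = 0;
  for (int i = 0; i < 16; ++i)
    if (ctrl[i] == h2) mask = static_cast<std::uint16_t>(mask | (1u << i));
  return mask;
}

// MaskEmpty: model of the _mm_sign_epi8 trick. Negating a byte by its own
// sign leaves only -128 (kEmpty) negative, so collecting sign bits selects
// exactly the empty slots. (int -> int8_t narrowing of 128 is two's
// complement, i.e. -128, on every mainstream platform.)
inline std::uint16_t mask_empty_scalar(const std::int8_t ctrl[16]) {
  std::uint16_t mask = 0;
  for (int i = 0; i < 16; ++i) {
    const std::int8_t r =
        static_cast<std::int8_t>(ctrl[i] < 0 ? -ctrl[i] : ctrl[i]);
    if (r < 0) mask = static_cast<std::uint16_t>(mask | (1u << i));
  }
  return mask;
}
```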

ctrl_t

// The values here are selected for maximum performance. See the static asserts
// below for details.

// A `ctrl_t` is a single control byte, which can have one of four
// states: empty, deleted, full (which has an associated seven-bit h2_t value)
// and the sentinel. They have the following bit patterns:
//
//      empty: 1 0 0 0 0 0 0 0
//    deleted: 1 1 1 1 1 1 1 0
//       full: 0 h h h h h h h  // h represents the hash bits.
//   sentinel: 1 1 1 1 1 1 1 1
//
// These values are specifically tuned for SSE-flavored SIMD.
// The static_asserts below detail the source of these choices.
//
// We use an enum class so that when strict aliasing is enabled, the compiler
// knows ctrl_t doesn't alias other types.
enum class ctrl_t : int8_t {
  kEmpty = -128,   // 0b10000000
  kDeleted = -2,   // 0b11111110
  kSentinel = -1,  // 0b11111111
};
static_assert(
    (static_cast<int8_t>(ctrl_t::kEmpty) &
     static_cast<int8_t>(ctrl_t::kDeleted) &
     static_cast<int8_t>(ctrl_t::kSentinel) & 0x80) != 0,
     // Every non-hash (non-H2) value pins the sign bit to 1. This is also why H2 takes 7 bits: the sign bit is dropped
     // Note: Google mentioned somewhere (source forgotten, orz) that 7 bits is just a fixed choice of the current implementation and may change
    "Special markers need to have the MSB to make checking for them efficient");
static_assert(
    ctrl_t::kEmpty < ctrl_t::kSentinel && ctrl_t::kDeleted < ctrl_t::kSentinel,
    // Guarantees that kEmpty and kDeleted are both smaller than kSentinel
    // Some call sites need not distinguish empty from deleted: one comparison against kSentinel settles it
    // which is exactly how IsEmptyOrDeleted() is implemented
    "ctrl_t::kEmpty and ctrl_t::kDeleted must be smaller than "
    "ctrl_t::kSentinel to make the SIMD test of IsEmptyOrDeleted() efficient");
static_assert(
    ctrl_t::kSentinel == static_cast<ctrl_t>(-1),
    // I don't fully get the intent, but pcmpeqd writes all-ones to the register on equality, which still equals -1
    "ctrl_t::kSentinel must be -1 to elide loading it from memory into SIMD "
    "registers (pcmpeqd xmm, xmm)");
static_assert(ctrl_t::kEmpty == static_cast<ctrl_t>(-128),
              // Group above already explained this design
              "ctrl_t::kEmpty must be -128 to make the SIMD check for its "
              "existence efficient (psignb xmm, xmm)");
static_assert(
    (~static_cast<int8_t>(ctrl_t::kEmpty) &
     ~static_cast<int8_t>(ctrl_t::kDeleted) &
     // kEmpty and kDeleted share an unset lowest bit
     static_cast<int8_t>(ctrl_t::kSentinel) & 0x7F) != 0,
    // Used for _mm_cmpgt_epi8(_mm_set1_epi8(kSentinel), ctrl) comparisons that produce a mask
    // Judging from the implementation, this shared bit doesn't seem strictly necessary?
    "ctrl_t::kEmpty and ctrl_t::kDeleted must share an unset bit that is not "
    "shared by ctrl_t::kSentinel to make the scalar test for "
    "MaskEmptyOrDeleted() efficient");
static_assert(ctrl_t::kDeleted == static_cast<ctrl_t>(-2),
              // -2 is reverse-engineered from the algorithm; see ConvertSpecialToEmptyAndFullToDeleted()
              // which achieves:
              // * kDeleted -> kEmpty
              // * kEmpty -> kEmpty
              // * _ -> kDeleted
              //
              // Implementation:
              //   _mm_or_si128(_mm_shuffle_epi8(_mm_set1_epi8(126), ctrl), _mm_set1_epi8(-128))
              //   where 126 = 0b 0111 1110
              //   and since the shuffle honors the sign bit:
              //     for non-hash values, 0 | -128 == kEmpty
              //     for hash values, 126 | -128 == kDeleted
              "ctrl_t::kDeleted must be -2 to make the implementation of "
              "ConvertSpecialToEmptyAndFullToDeleted efficient");

The static_assert()s spell out the rationale behind each ctrl_t value, even if the explanations are terse.
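The kDeleted == -2 requirement, in particular, can be checked with a per-byte scalar model of the shuffle-and-or trick described in the comment above (my own sketch of what the SIMD expression computes in each lane):

```cpp
#include <cstdint>

// Per-byte model of
//   _mm_or_si128(_mm_shuffle_epi8(_mm_set1_epi8(126), ctrl), _mm_set1_epi8(-128)):
// _mm_shuffle_epi8 yields 0 for lanes whose index byte has the sign bit set
// and 126 otherwise, so special bytes map to kEmpty and full bytes to kDeleted.
inline std::int8_t convert_special_to_empty_and_full_to_deleted(std::int8_t c) {
  const std::int8_t shuffled = (c < 0) ? std::int8_t{0} : std::int8_t{126};
  return static_cast<std::int8_t>(shuffled | std::int8_t{-128});
}
```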

// Helpers for checking the state of a control byte.
inline bool IsEmpty(ctrl_t c) { return c == ctrl_t::kEmpty; }
inline bool IsFull(ctrl_t c) {
  // Cast `c` to the underlying type instead of casting `0` to `ctrl_t` as `0`
  // is not a value in the enum. Both ways are equivalent, but this way makes
  // linters happier.
  return static_cast<std::underlying_type_t<ctrl_t>>(c) >= 0;
}
inline bool IsDeleted(ctrl_t c) { return c == ctrl_t::kDeleted; }
inline bool IsEmptyOrDeleted(ctrl_t c) { return c < ctrl_t::kSentinel; }

A quick look at the implementations of IsFull() and IsEmptyOrDeleted():

  • full: the sign bit must be 0, so testing >= 0 suffices
  • empty|deleted: both values compare less than kSentinel, so a single comparison suffices.

As for why ctrl_t::kDeleted must be exactly -2, the comments above explain it.

prepare insert non-SOO

// Given the hash of a value not currently in the table and the first empty
// slot in the probe sequence, finds a viable slot index to insert it at.
//
// In case there's no space left, the table can be resized or rehashed
// (for tables with deleted slots, see FindInsertPositionWithGrowthOrRehash).
//
// In the case of absence of deleted slots and positive growth_left, the element
// can be inserted in the provided `target` position.
//
// When the table has deleted slots (according to GrowthInfo), the target
// position will be searched one more time using `find_first_non_full`.
//
// REQUIRES: Table is not SOO.
// REQUIRES: At least one non-full slot available.
// REQUIRES: `target` is a valid empty position to insert.
size_t PrepareInsertNonSoo(CommonFields& common, size_t hash, FindInfo target,
                           const PolicyFunctions& policy) {
  // When there are no deleted slots in the table
  // and growth_left is positive, we can insert at the first
  // empty slot in the probe sequence (target).
  // The hot path (case 0): nothing ever deleted and growth left, insert directly into the first empty slot
  const bool use_target_hint =
      // Optimization is disabled when generations are enabled.
      // We have to rehash even sparse tables randomly in such mode.
      !SwisstableGenerationsEnabled() &&
      // growth_info has no deleted marker in the high bit and its value is greater than 0
      common.growth_info().HasNoDeletedAndGrowthLeft();
  // Elements have been deleted, or there is no room left; marked unlikely
  // All of these cold paths must recompute target
  if (ABSL_PREDICT_FALSE(!use_target_hint)) {
    // Notes about optimized mode when generations are disabled:
    // We do not enter this branch if table has no deleted slots
    // and growth_left is positive.
    // We enter this branch in the following cases listed in decreasing
    // frequency:
    // 1. Table without deleted slots (>95% cases) that needs to be resized.
    // 2. Table with deleted slots that has space for the inserting element.
    // 3. Table with deleted slots that needs to be rehashed or resized.
    // It looks like these cold-path if-else branches are ordered by scenario
    // frequency, though I don't know how those probability figures were measured.
    //
    // case 1: no deletions, no room left (likely)
    if (ABSL_PREDICT_TRUE(common.growth_info().HasNoGrowthLeftAndNoDeleted())) {
      const size_t old_capacity = common.capacity();
      // See the resize operation covered earlier
      policy.resize(common, NextCapacity(old_capacity), HashtablezInfoHandle{});
      // A context-specific lookup; contrast with case 2, which uses the
      // generic find_first_non_full()
      target = HashSetResizeHelper::FindFirstNonFullAfterResize(
          common, old_capacity, hash);
    } else {
      // Note: the table may have no deleted slots here when generations
      // are enabled.
      const bool rehash_for_bug_detection =
          common.should_rehash_for_bug_detection_on_insert();
      // Skip this; it defaults to false
      if (rehash_for_bug_detection) {
        // Move to a different heap allocation in order to detect bugs.
        const size_t cap = common.capacity();
        policy.resize(common,
                      common.growth_left() > 0 ? cap : NextCapacity(cap),
                      HashtablezInfoHandle{});
      }
      // case 2: deletions occurred, room left
      if (ABSL_PREDICT_TRUE(common.growth_left() > 0)) {
        target = find_first_non_full(common, hash);
      // case 3: deletions occurred, no room left
      } else {
        target = FindInsertPositionWithGrowthOrRehash(common, hash, policy);
      }
    }
  }
  // case 0, the hot path
  //
  // Equivalent to size() += 1
  PrepareInsertCommon(common);
  // Equivalent to growth_info -= IsEmpty(common.control()[target.offset])
  common.growth_info().OverwriteControlAsFull(common.control()[target.offset]);
  SetCtrl(common, target.offset, H2(hash), policy.slot_size);
  common.infoz().RecordInsert(hash, target.probe_length);
  return target.offset;
}
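The single-branch hot-path check works because of how `GrowthInfo` packs its state into one word: the most significant bit of a `size_t` records "the table has deleted slots", and the remaining bits hold `growth_left`. Below is an illustrative re-implementation of that encoding; the class name `GrowthInfoSketch` is mine, the member names mirror Abseil's, and this is a sketch rather than the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Sketch of the GrowthInfo encoding: MSB flags "has deleted slots",
// the low bits store growth_left. GrowthInfoSketch is a made-up name.
class GrowthInfoSketch {
 public:
  explicit GrowthInfoSketch(size_t growth_left) : info_(growth_left) {}

  // Called when a full slot is turned into a tombstone.
  void OverwriteFullAsDeleted() { info_ |= kDeletedBit; }

  // The hot-path test in PrepareInsertNonSoo: one signed comparison checks
  // both "no deleted slots" (MSB clear) and "growth_left > 0" at once.
  bool HasNoDeletedAndGrowthLeft() const {
    return static_cast<std::make_signed_t<size_t>>(info_) > 0;
  }

  // The case 1 test: never deleted and no room left means info_ is exactly 0.
  bool HasNoGrowthLeftAndNoDeleted() const { return info_ == 0; }

 private:
  static constexpr size_t kDeletedBit = ~(~size_t{} >> 1);  // MSB only
  size_t info_;
};
```

This is why the hot path costs a single comparison: the deleted flag and the growth counter are read together, and a positive signed value can only mean "no tombstones and room left".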


  // Optimized for small groups version of `find_first_non_full`.
  // Beneficial only right after calling `raw_hash_set::resize`.
  // It is safe to call in case capacity is big or was not changed, but there
  // will be no performance benefit.
  // It has implicit assumption that `resize` will call
  // `GrowSizeIntoSingleGroup*` in case `IsGrowingIntoSingleGroupApplicable`.
  // Falls back to `find_first_non_full` in case of big groups.
  static FindInfo FindFirstNonFullAfterResize(const CommonFields& c,
                                              size_t old_capacity,
                                              size_t hash) {
    if (!IsGrowingIntoSingleGroupApplicable(old_capacity, c.capacity())) {
      return find_first_non_full(c, hash);
    }
    // Find a location for the new element non-deterministically.
    // Note that any position is correct.
    // It will be located at `half_old_capacity` or at one of the other
    // empty slots, with approximately 50% probability each.
    size_t offset = probe(c, hash).offset();

    // Note that we intentionally use unsigned int underflow.
    if (offset - (old_capacity + 1) >= old_capacity) {
      // Offset falls on kSentinel or into the mostly occupied first half.
      offset = old_capacity / 2;
    }
    assert(IsEmpty(c.control()[offset]));
    return FindInfo{offset, 0};
  }
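The "intentional unsigned underflow" deserves a worked example. After growing into a single group, the new capacity is `2 * old_capacity + 1`, and the freshly appended half spans offsets `[old_capacity + 1, 2 * old_capacity]`. A single unsigned comparison filters for that window, because any offset below the lower bound wraps around to a huge value. A minimal sketch; the helper name `InNewEmptyHalf` is mine, and it returns the opposite of the rejection test in the code above.

```cpp
#include <cassert>
#include <cstddef>

// Sketch of the underflow trick in FindFirstNonFullAfterResize.
// Returns true iff offset lies in [old_capacity + 1, 2 * old_capacity],
// the half appended by the resize. Offsets below the window underflow
// to huge unsigned values and fail the comparison.
bool InNewEmptyHalf(size_t offset, size_t old_capacity) {
  return offset - (old_capacity + 1) < old_capacity;
}
```

With `old_capacity = 7` the new table has capacity 15: offsets 8 through 14 pass, while offsets 0 through 7 (the mostly occupied first half) and offset 15 (the new kSentinel position) fail, so the code falls back to `old_capacity / 2 = 3`.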

// Probes an array of control bits using a probe sequence derived from `hash`,
// and returns the offset corresponding to the first deleted or empty slot.
//
// Behavior when the entire table is full is undefined.
//
// NOTE: this function must work with tables having both empty and deleted
// slots in the same group. Such tables appear during `erase()`.
template <typename = void>
inline FindInfo find_first_non_full(const CommonFields& common, size_t hash) {
  auto seq = probe(common, hash);
  const ctrl_t* ctrl = common.control();
  if (IsEmptyOrDeleted(ctrl[seq.offset()]) &&
      !ShouldInsertBackwards(common.capacity(), hash, ctrl)) {
    return {seq.offset(), /*probe_length=*/0};
  }
  while (true) {
    GroupFullEmptyOrDeleted g{ctrl + seq.offset()};
    auto mask = g.MaskEmptyOrDeleted();
    if (mask) {
      return {
          seq.offset(GetInsertionOffset(mask, common.capacity(), hash, ctrl)),
          seq.index()};
    }
    seq.next();
    assert(seq.index() <= common.capacity() && "full table!");
  }
}
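Stripped of SIMD, cloned control bytes, and the `ShouldInsertBackwards` heuristic, the probing loop above reduces to: scan one group per step, and grow the stride by the group width each time (triangular probing, which visits every group exactly once when the table size is a power of two). Here is a scalar sketch under those simplifications; names with a `Sketch` suffix are made up, and unlike the real layout the table size here is a full power of two with every position wrapped by a mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Control byte convention borrowed from the design notes: kEmpty = -128,
// kDeleted = -2, and any value in [0, 127] is a full slot's H2.
constexpr int8_t kEmptySketch = -128;
constexpr int8_t kDeletedSketch = -2;

// Scalar sketch of find_first_non_full. Ignores the sentinel and cloned
// control bytes of the real layout; ctrl.size() must be a power of two.
size_t FindFirstNonFullSketch(const std::vector<int8_t>& ctrl, size_t hash,
                              size_t group_width) {
  const size_t mask = ctrl.size() - 1;
  size_t offset = hash & mask;  // H1 picks the starting group
  size_t stride = 0;
  // Like the real code, assumes at least one empty or deleted slot exists.
  while (true) {
    // Scan one group; SIMD does this in a single MaskEmptyOrDeleted() call.
    for (size_t i = 0; i < group_width; ++i) {
      const size_t pos = (offset + i) & mask;
      if (ctrl[pos] == kEmptySketch || ctrl[pos] == kDeletedSketch) return pos;
    }
    stride += group_width;  // strides of G, 2G, 3G, ...: triangular probing
    offset = (offset + stride) & mask;
  }
}
```

For example, in a 16-byte control array that is full except for an empty byte at index 5, a probe starting at offset 0 with group width 4 scans group 0..3, steps to offset 4, and returns 5.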

TODO. Still playing through 《界轨》; the small amount of leftover code will sit here for now.

Digging pits without ever filling them: no one is more shameless than me.