add article

abbycin 2023-10-04 10:38:36 +08:00
parent 2c8f06a961
commit 22f12c27d1

## swisstable -- a joke
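`swisstable` is the hash table implementation behind `flat_hash_set`/`flat_hash_map` in `abseil-cpp`. It uses quadratic probing and is claimed to outperform the STL containers. Here is a look at how this `swisstable` actually works.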
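To make "quadratic probing" concrete, here is a minimal sketch of a probe sequence that steps over fixed-size blocks of the table (the groups of this design). The name `probe_seq` and its members are hypothetical, and the sketch assumes the number of groups is a power of two; it is not taken from abseil or from my implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical probe sequence over group indices (illustrative names).
// `m` must be (number of groups - 1), with the group count a power of two.
struct probe_seq {
    std::size_t mask;
    std::size_t pos;
    std::size_t stride = 0;

    probe_seq(std::uint64_t h1, std::size_t m)
        : mask(m), pos(static_cast<std::size_t>(h1) & m) {
        assert((m & (m + 1)) == 0);  // m has the form 2^k - 1
    }

    // Index of the group to inspect at the current step.
    std::size_t group() const { return pos; }

    // Advance by 1, 2, 3, ... groups. The distances from the start are the
    // triangular numbers, which is one form of quadratic probing.
    void next() {
        ++stride;
        pos = (pos + stride) & mask;
    }
};

// Returns true if the first `groups` steps visit `groups` distinct groups.
// Uses a 64-bit bitmap, so this helper supports at most 64 groups.
inline bool visits_every_group(std::uint64_t h1, std::size_t groups) {
    std::uint64_t seen = 0;
    probe_seq seq{h1, groups - 1};
    for (std::size_t i = 0; i < groups; ++i) {
        seen |= std::uint64_t{1} << seq.group();
        seq.next();
    }
    const std::uint64_t all =
        groups == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << groups) - 1;
    return seen == all;
}
```

With a power-of-two group count, this triangular stepping reaches every group exactly once before repeating, so a probe cannot cycle over a subset of the table.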
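The table is built around the notion of a `group`: a contiguous block of memory is divided into a number of `group`s, each holding 16 elements. Every element has two parts, a `meta` byte and a `slot`. The `meta` byte stores 7 bits of the 64-bit `hash` value (called h2), and the `slot` stores the Key. The Key is located through the other 57 bits of the `hash` (called h1). SSE2 compares all 128 bits of ctrl (`meta`) values in a group at once, which quickly yields the Key's offset within the `group`; this is also why 16 `meta` bytes form a group: 128 / 8 = 16. In addition, `meta` has three special values, `-1`, `-2` and `-128`, which stand for sentinel, invalid (deleted) and empty respectively. Initially every `meta` byte is set to `empty`.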
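The hash split and the special `meta` values can be sketched as follows. Which 7 bits become h2 is an assumption here (the low 7 bits); the enumerator names and the `is_full` helper are illustrative as well.

```cpp
#include <cstdint>

// Special meta values as listed above. All three are negative, so they can
// never collide with a 7-bit h2, which always lies in 0..127.
enum ctrl : std::int8_t {
    kEmpty    = -128,
    kDeleted  = -2,
    kSentinel = -1,
};

// Assumed split: the low 7 bits form h2, the remaining 57 bits form h1.
inline std::uint64_t H1(std::uint64_t hash) { return hash >> 7; }

inline std::int8_t H2(std::uint64_t hash) {
    return static_cast<std::int8_t>(hash & 0x7F);
}

// A meta byte describes an occupied slot exactly when it is non-negative.
inline bool is_full(std::int8_t c) { return c >= 0; }
```

Because every special value has its sign bit set, a single signed comparison is enough to tell an occupied slot from an empty, deleted or sentinel one.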
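Lookup procedure:
1. Compute the hash and derive h1 and h2 from it
2. Use h1 to find the starting `group`
3. Match h2 against the `meta` bytes of that `group` (using SSE2)
4. On a match, compute the slot offset and compare the stored Key; if it is equal, the Key is found
5. If the `group` contains an `empty` entry, report that the Key does not exist; otherwise move on to the next `group`
The code is as follows: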
```c++
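auto pattern = _mm_set1_epi8(H2(hash));   // 1: broadcast h2 into all 16 byte lanes
prober seq { H1(hash), groups_ - 1 };     // 1: probe sequence seeded with h1, groups_ - 1 is the mask
matcher m { 0 };
while (true) {
    group g { ctrl_ + seq.offset() };     // 2: the 16 meta bytes of the current group
    m = g.match(pattern);                 // 3: bitmask of the meta bytes equal to h2
    while (m) {
        offset = seq.offset(m.index());   // slot index of the next candidate
        assert(offset < cap_);
        if (likely(slot_[offset] == key)) // 4: h2 matched, confirm with a full key comparison
            return true;
        ++m;                              // move on to the next matching bit
    }
    if (likely(g.match_empty()))          // 5: an empty entry ends the probe chain
        return false;
    seq.next();                           // otherwise probe the next group
}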
```
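The listing above relies on `g.match(pattern)` without showing it. A minimal sketch of how such a match can be done with SSE2 follows; `match_h2` and `first_match` are hypothetical names, and a scalar fallback is included so the sketch also builds where SSE2 is unavailable.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SWISS_SKETCH_SSE2 1
#endif

// Returns a 16-bit mask in which bit i is set when meta[i] == h2.
// `meta` must point at 16 readable bytes.
inline std::uint32_t match_h2(const std::int8_t* meta, std::int8_t h2) {
#ifdef SWISS_SKETCH_SSE2
    const __m128i group   = _mm_loadu_si128(reinterpret_cast<const __m128i*>(meta));
    const __m128i pattern = _mm_set1_epi8(h2);
    // cmpeq yields 0xFF in every equal byte lane; movemask packs the top bit
    // of each lane into the low 16 bits of an int.
    return static_cast<std::uint32_t>(
        _mm_movemask_epi8(_mm_cmpeq_epi8(group, pattern)));
#else
    std::uint32_t mask = 0;
    for (int i = 0; i < 16; ++i)
        if (meta[i] == h2) mask |= 1u << i;
    return mask;
#endif
}

// Index of the lowest set bit. Precondition: mask != 0.
inline int first_match(std::uint32_t mask) {
    assert(mask != 0);
    int i = 0;
    while (!(mask & 1u)) {
        mask >>= 1;
        ++i;
    }
    return i;
}
```

Iterating over candidates then amounts to reading `first_match(mask)` and clearing the lowest set bit with `mask &= mask - 1`, which is presumably what `m.index()` and `++m` do in the listing. Matching `empty` works the same way with `-128` as the pattern.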
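`swisstable` is advertised as being faster than the STL for lookups, but in my tests neither the official `absl::flat_hash_set` nor my own implementation delivers on that: lookups are generally no faster than `std::unordered_set`, and only occasionally slightly faster. In the output below, `bench2` compares `unordered_set` with `flat_hash_set`, and `bench` compares it with my own implementation (`swiss`).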
```shell
abby@Serenity ~/s/build> ./bench2
----------- insert ------------
unordered_set => 183.605811ms
flat_hash_set => 41.932002ms
----------- search ------------
unordered_set => 4.378100ms
flat_hash_set => 22.928702ms
unordered_set cap 1056323 size 787268 load_factor 0.745291
flat_hash_set cap 1048575 size 787268 load_factor 0.750798
abby@Serenity ~/s/build> ./bench2
----------- insert ------------
unordered_set => 182.890869ms
flat_hash_set => 39.940193ms
----------- search ------------
unordered_set => 4.542800ms
flat_hash_set => 21.498796ms
unordered_set cap 1056323 size 786505 load_factor 0.744569
flat_hash_set cap 1048575 size 786505 load_factor 0.750070
abby@Serenity ~/s/build> ./bench2
----------- insert ------------
unordered_set => 183.804469ms
flat_hash_set => 38.301994ms
----------- search ------------
unordered_set => 5.001399ms
flat_hash_set => 22.470496ms
unordered_set cap 1056323 size 787110 load_factor 0.745141
flat_hash_set cap 1048575 size 787110 load_factor 0.750647
abby@Serenity ~/s/build> ./bench
----------- insert ------------
unordered_set => 187.271974ms
swiss => 22.850596ms
----------- search ------------
unordered_set => 4.307000ms
swiss => 4.016499ms
unordered_set cap 1056323 size 786815 load_factor 0.744862
swiss cap 1048576 size 786815 load_factor 0.750365
abby@Serenity ~/s/build> ./bench
----------- insert ------------
unordered_set => 197.831773ms
swiss => 20.787797ms
----------- search ------------
unordered_set => 4.811299ms
swiss => 4.107399ms
unordered_set cap 1056323 size 787381 load_factor 0.745398
swiss cap 1048576 size 787381 load_factor 0.750905
```
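The benchmark source is not reproduced in this article. As a rough sketch only, a timing harness for this kind of measurement could look like the following; `time_ms` and `count_hits` are hypothetical helpers, not the code that produced the numbers above.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Runs `f` once and returns the elapsed wall-clock time in milliseconds.
template <class F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Counts how many of `keys` are present in `set`. Using the returned count
// afterwards keeps the compiler from optimizing the lookups away.
template <class Set>
std::size_t count_hits(const Set& set, const std::vector<std::uint64_t>& keys) {
    std::size_t hits = 0;
    for (auto k : keys) hits += set.count(k);
    return hits;
}
```

Any set type with a `count` member can be passed to `count_hits`, so the same harness can time `std::unordered_set` and a swisstable-style set on identical keys.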
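Rust reportedly also has a swisstable-based implementation (the `hashbrown` crate that appears in the build log below). The measured results: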
```shell
abby@Serenity ~/hash_test> cargo run --release ~/numbers.txt
Compiling libc v0.2.148
Compiling gcc v0.3.55
Compiling autocfg v1.1.0
Compiling version_check v0.9.4
Compiling ahash v0.8.3
Compiling num-traits v0.2.16
Compiling cfg-if v1.0.0
Compiling once_cell v1.18.0
Compiling fasthash-sys v0.3.2
Compiling cfg-if v0.1.10
Compiling rand v0.4.6
Compiling seahash v3.0.7
Compiling allocator-api2 v0.2.16
Compiling xoroshiro128 v0.3.0
Compiling hashbrown v0.14.1
Compiling fasthash v0.4.0
Compiling hash_test v0.1.0 (/home/abby/hash_test)
Finished release [optimized] target(s) in 12.99s
Running `target/release/hash_test /home/abby/numbers.txt`
swiss insert => 73ms
swiss search => 57ms
std insert => 76ms
std search => 69ms
abby@Serenity ~/hash_test> cargo run --release ~/numbers.txt
Finished release [optimized] target(s) in 0.01s
Running `target/release/hash_test /home/abby/numbers.txt`
swiss insert => 71ms
swiss search => 56ms
std insert => 71ms
std search => 55ms
abby@Serenity ~/hash_test> cargo run --release ~/numbers.txt
Finished release [optimized] target(s) in 0.02s
Running `target/release/hash_test /home/abby/numbers.txt`
swiss insert => 73ms
swiss search => 78ms
std insert => 72ms
std search => 61ms
abby@Serenity ~/hash_test>
```
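Conclusion:
Why would a swisstable implementation be slower than the STL? Possibly because in these implementations `meta` and `slot` are stored separately, roughly like this: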
```
| meta | slot |
+-------------------+----------------------------------------+
```
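As a result, a lookup has to jump back and forth between `meta` and `slot`, which is unfriendly to the cache.
If `meta` and `slot` were laid out so that they line up with L1 cache lines, the results would probably be better.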
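One way to picture such a layout is to keep each group's `meta` bytes next to its own slots and start every group on a cache-line boundary. The sketch below is untested as an optimization; `group_block` is a hypothetical type and the 64-byte line size is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine  = 64;  // assumed cache-line size
constexpr std::size_t kGroupWidth = 16;  // meta bytes (and slots) per group

// Hypothetical interleaved layout: the 16 meta bytes of a group sit directly
// in front of its 16 slots, and every group starts on a cache-line boundary.
template <class Key>
struct alignas(kCacheLine) group_block {
    std::int8_t meta[kGroupWidth];
    Key         slot[kGroupWidth];
};

// Number of slots that share the first cache line with the meta bytes,
// assuming no padding between `meta` and `slot` (alignof(Key) <= 16).
template <class Key>
constexpr std::size_t slots_sharing_meta_line() {
    return (kCacheLine - kGroupWidth) / sizeof(Key);
}

static_assert(alignof(group_block<std::uint32_t>) == kCacheLine,
              "each group starts on a cache-line boundary");
```

With 4-byte keys, the `meta` bytes and the first 12 slots of a group land in the same cache line; with 8-byte keys only the first 6 do. A hit in a later slot still touches a second line, so whether this beats the separated layout would have to be measured.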