dcrd/gcs/fastreduce.go
Dave Collins 2c3a4e3054
gcs: Implement version 2 filters.
This implements new version 2 filters which have 4 changes as compared
to version 1 filters:

- Support for independently specifying the false positive rate and
  Golomb coding bin size which allows minimizing the filter size
- A faster (incompatible with version 1) reduction function
- A more compact serialization for the number of members in the set
- Deduplication of all hash collisions prior to reducing and serializing
  the deltas

In addition, it adds a full set of tests and updates the benchmarks to
use the new version 2 filters.

The primary motivating factor for these changes is the ability to
minimize the size of the filters, however, the following is a before and
after comparison of version 1 and 2 filters in terms of performance and
allocations.

It is interesting to note the results for attempting to match a single
item is not very representative due to the fact the actual hash value
itself dominates to the point it can significantly vary due to the very
low ns timings involved.  Those differences average out when matching
multiple items, which is the much more realistic scenario, and the
performance increase is in line with the expected values.  It is also
worth nothing that filter construction now takes a bit longer due to the
additional deduplication step.  While the performance numbers for filter
construction are about 25% larger in relative terms, it is only a few ms
difference in practice and therefore is an acceptable trade off for the
size savings provided.

benchmark                      old ns/op    new ns/op    delta
-----------------------------------------------------------------
BenchmarkFilterBuild50000      16194920     20279043     +25.22%
BenchmarkFilterBuild100000     32609930     41629998     +27.66%
BenchmarkFilterMatch           620          593          -4.35%
BenchmarkFilterMatchAny        2687         2302         -14.33%

benchmark                      old allocs   new allocs   delta
-----------------------------------------------------------------
BenchmarkFilterBuild50000      6            17           +183.33%
BenchmarkFilterBuild100000     6            18           +200.00%
BenchmarkFilterMatch           0            0            +0.00%
BenchmarkFilterMatchAny        0            0            +0.00%

benchmark                      old bytes    new bytes    delta
-----------------------------------------------------------------
BenchmarkFilterBuild50000      688366       2074653      +201.39%
BenchmarkFilterBuild100000     1360064      4132627      +203.86%
BenchmarkFilterMatch           0            0            +0.00%
BenchmarkFilterMatchAny        0            0            +0.00%
2019-09-03 10:30:31 -05:00

40 lines
1.5 KiB
Go

// Copyright (c) 2019 The Decred developers
// Use of this source code is governed by an ISC
// license that can be found in the LICENSE file.
//+build go1.12
package gcs
import (
"math/bits"
)
// fastReduce calculates a mapping that is more or less equivalent to x mod N.
// However, instead of using a mod operation that can lead to slowness on many
// processors when not using a power of two due to unnecessary division, this
// uses a "multiply-and-shift" trick that eliminates all divisions as described
// in a blog post by Daniel Lemire, located at the following site at the time
// of this writing:
// https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
//
// Since that link might disappear, the general idea is to multiply by N and
// shift right by log2(N). Since N is a 64-bit integer in this case, it
// becomes:
//
// (x * N) / 2^64 == (x * N) >> 64
//
// This is a fair map since it maps integers in the range [0,2^64) to multiples
// of N in [0, N*2^64) and then divides by 2^64 to map all multiples of N in
// [0,2^64) to 0, all multiples of N in [2^64, 2*2^64) to 1, etc. This results
// in either ceil(2^64/N) or floor(2^64/N) multiples of N.
func fastReduce(x, N uint64) uint64 {
// This uses math/bits to perform the 128-bit multiplication as the compiler
// will replace it with the relevant intrinsic on most architectures.
//
// The high 64 bits in a 128-bit product is the same as shifting the entire
// product right by 64 bits.
hi, _ := bits.Mul64(x, N)
return hi
}