This ensures any empty/nil data elements in the input array passed to
version 2 filter construction are not added to the filter, since empty
elements do not make sense given how the filters are used. It also
updates the tests to ensure proper behavior.
The single match function already failed attempts to match an empty
element as intended; however, prior to this change, it was possible to
match an empty item in the multi-item matching path. This is not
desirable since the matching behavior must be consistent in both the
single and multi-match cases.
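For illustration, the up-front rejection of empty members might look
roughly like the following sketch (the function name is hypothetical and
not taken from the actual code):

    // filterNonEmpty returns only the non-empty members of the input so
    // that empty/nil elements can never be added to the filter or matched
    // later.
    func filterNonEmpty(data [][]byte) [][]byte {
        result := make([][]byte, 0, len(data))
        for _, d := range data {
            if len(d) == 0 {
                continue
            }
            result = append(result, d)
        }
        return result
    }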
This implements new version 2 filters which have 4 changes as compared
to version 1 filters:
- Support for independently specifying the false positive rate and
Golomb coding bin size which allows minimizing the filter size
- A faster (incompatible with version 1) reduction function
- A more compact serialization for the number of members in the set
- Deduplication of all hash collisions prior to reducing and serializing
the deltas
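As a rough sketch of the second and fourth items (names and details are
illustrative, not the actual implementation), the faster reduction maps a
64-bit hash onto the target range with a widening multiply instead of a
modulo, and deduplication drops repeated reduced values before the deltas
are coded:

    import "math/bits"

    // reduce maps a uniform 64-bit hash onto [0, modulus) by taking the
    // high 64 bits of the full 128-bit product, avoiding a division.
    func reduce(v, modulus uint64) uint64 {
        hi, _ := bits.Mul64(v, modulus)
        return hi
    }

    // dedupSorted removes adjacent duplicates from an already-sorted slice
    // of reduced hashes so each collision is only encoded once.
    func dedupSorted(values []uint64) []uint64 {
        out := values[:0]
        for i, v := range values {
            if i > 0 && v == out[len(out)-1] {
                continue
            }
            out = append(out, v)
        }
        return out
    }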
In addition, it adds a full set of tests and updates the benchmarks to
use the new version 2 filters.
The primary motivating factor for these changes is the ability to
minimize the size of the filters. Nevertheless, the following is a
before and after comparison of version 1 and 2 filters in terms of
performance and allocations.
It is interesting to note that the results for attempting to match a
single item are not very representative because the hash computation
itself dominates to the point that the very low ns timings involved can
vary significantly. Those differences average out when matching
multiple items, which is the much more realistic scenario, and the
performance increase there is in line with the expected values. It is
also worth noting that filter construction now takes a bit longer due to
the additional deduplication step. While the performance numbers for
filter construction are about 25% larger in relative terms, the
difference is only a few ms in practice and is therefore an acceptable
trade-off for the size savings provided.
benchmark old ns/op new ns/op delta
-----------------------------------------------------------------
BenchmarkFilterBuild50000 16194920 20279043 +25.22%
BenchmarkFilterBuild100000 32609930 41629998 +27.66%
BenchmarkFilterMatch 620 593 -4.35%
BenchmarkFilterMatchAny 2687 2302 -14.33%
benchmark old allocs new allocs delta
-----------------------------------------------------------------
BenchmarkFilterBuild50000 6 17 +183.33%
BenchmarkFilterBuild100000 6 18 +200.00%
BenchmarkFilterMatch 0 0 +0.00%
BenchmarkFilterMatchAny 0 0 +0.00%
benchmark old bytes new bytes delta
-----------------------------------------------------------------
BenchmarkFilterBuild50000 688366 2074653 +201.39%
BenchmarkFilterBuild100000 1360064 4132627 +203.86%
BenchmarkFilterMatch 0 0 +0.00%
BenchmarkFilterMatchAny 0 0 +0.00%
This modifies the code to support an independent false positive rate and
Golomb coding bin size. Among other things, this permits more optimal
parameters for minimizing the filter size to be specified.
This capability will be used in the upcoming version 2 filters that will
ultimately be included in header commitments.
For a concrete example, the current version 1 filter for block 89341 on
mainnet contains 2470 items resulting in a full serialized size of 6,669
bytes. In contrast, if the optimal parameters were specified as
described by the comments in this commit, with no other changes to the
items included in the filter, that same filter would be 6,505 bytes,
which is a size reduction of about 2.46%. This might not seem like a
significant amount, but consider that there is a filter for every block,
so it really adds up.
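As a sketch of the kind of parameter selection involved, the inverse
false positive rate that roughly minimizes the serialized size for a
given bin size can be computed as follows (the 1.497137 scaling constant
is the commonly cited optimum for Golomb-Rice coded sets and is an
assumption here, not taken from this text):

    import "math"

    // optimalM returns the inverse false positive rate (M) that roughly
    // minimizes the serialized filter size for a given Golomb coding bin
    // size (B).
    func optimalM(b uint8) uint64 {
        return uint64(math.Round(1.497137 * math.Exp2(float64(b))))
    }

Under this formula, a bin size of 19 yields M = 784931, i.e. a false
positive rate of roughly 1 in 784,931.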
Since the internal filter no longer directly has a P parameter, this
moves the method to obtain it to the FilterV1 type and adds a new test
to ensure it is returned properly.
Additionally, all of the tests are converted to use the new parameters
while retaining the same effective values to help prove the correctness
of the new code.
Finally, it also significantly reduces the number of allocations
required to construct a filter, resulting in faster filter construction
and reduced pressure on the GC, and performs some other minor
consistency cleanup while here.
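As one illustrative way to reduce such allocations (not necessarily what
the change does internally), the working slice can be sized once up
front rather than grown repeatedly:

    // buildValues allocates the working slice with its final capacity
    // before hashing the members.  hashMember is a hypothetical stand-in
    // for the real hashing and reduction step.
    func buildValues(data [][]byte, hashMember func([]byte) uint64) []uint64 {
        values := make([]uint64, 0, len(data))
        for _, d := range data {
            values = append(values, hashMember(d))
        }
        return values
    }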
In terms of the reduction in allocations, the following is a before and
after comparison of building filters with 50k and 100k elements:
benchmark old ns/op new ns/op delta
--------------------------------------------------------------
BenchmarkFilterBuild50000 18095111 15680001 -13.35%
BenchmarkFilterBuild100000 31980156 31389892 -1.85%
benchmark old allocs new allocs delta
--------------------------------------------------------------
BenchmarkFilterBuild50000 31 6 -80.65%
BenchmarkFilterBuild100000 34 6 -82.35%
benchmark old bytes new bytes delta
--------------------------------------------------------------
BenchmarkFilterBuild50000 1202343 688271 -42.76%
BenchmarkFilterBuild100000 2488472 1360000 -45.35%
This simply rearranges the funcs so they are more logically grouped in
order to provide cleaner diffs for upcoming changes. There are no
functional changes.
This optimizes the Hash method of gcs filters by making use of the new
zero-alloc hashing funcs available in crypto/blake256.
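A minimal sketch of the zero-alloc approach (the actual method internals
may differ):

    import (
        "github.com/decred/dcrd/chaincfg/chainhash"
        "github.com/decred/dcrd/crypto/blake256"
    )

    // filterHash hashes the serialized filter with Sum256, which returns a
    // fixed-size array directly instead of allocating through the
    // hash.Hash interface.
    func filterHash(serialized []byte) chainhash.Hash {
        return chainhash.Hash(blake256.Sum256(serialized))
    }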
The following is a before and after comparison:
benchmark old ns/op new ns/op delta
-------------------------------------------------
BenchmarkHash 1786 1315 -26.37%
benchmark old allocs new allocs delta
-------------------------------------------------
BenchmarkHash 2 0 -100.00%
benchmark old bytes new bytes delta
-------------------------------------------------
BenchmarkHash 176 0 -100.00%
Currently, the filters provide two different serialization formats per
version. The first is the raw filter bytes without the number of items
in its data set and is implemented by the Bytes and FromBytesV1
functions. The second includes that information and is implemented by
the NBytes and FromNBytesV1 functions.
In practice, the ability to serialize the filter independently from the
number of items in its data set is not very useful since that
information is required to be able to query the filter and, unlike the
other parameters which are fixed (e.g. false positive rate and key), the
number of items varies per filter. For this reason, all usage in
practice calls NBytes and FromNBytesV1.
Consequently, this simplifies the API for working with filters by
standardizing on a single serialization format per filter version which
includes the number of items in its data set.
In order to accomplish this, the current Bytes and FromBytesV1 functions
are removed and the NBytes and FromNBytesV1 functions are renamed to
take their place.
This also updates all tests and callers in the repo accordingly.
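A hypothetical round trip with the consolidated API might look as
follows (the exact FromBytesV1 parameter list is an assumption for
illustration):

    import "github.com/decred/dcrd/gcs/v2"

    // roundTrip serializes a filter and deserializes it again using the
    // single remaining format, which always embeds the number of members.
    func roundTrip(f *gcs.FilterV1, p uint8) (*gcs.FilterV1, error) {
        serialized := f.Bytes()
        return gcs.FromBytesV1(p, serialized)
    }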
This ensures filters properly match search items which happen to hash to
zero and adds a test for the condition. While here, it also rewrites
the MatchAny function to make it easier to reason about.
This was discovered by the new tests, which intentionally exercise a
high false positive rate and random keys.
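The rewritten multi-match logic is essentially a merge over two sorted
lists; a self-contained sketch of the idea (not the actual code)
follows:

    import "sort"

    // matchAnySorted reports whether any hashed query value, including
    // zero, appears in the sorted values decoded from the filter.
    func matchAnySorted(filterValues, queryValues []uint64) bool {
        sort.Slice(queryValues, func(i, j int) bool { return queryValues[i] < queryValues[j] })
        i, j := 0, 0
        for i < len(filterValues) && j < len(queryValues) {
            switch {
            case filterValues[i] == queryValues[j]:
                return true
            case filterValues[i] < queryValues[j]:
                i++
            default:
                j++
            }
        }
        return false
    }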
This refactors the primary gcs filter logic into an internal struct with
a version parameter in order to pave the way for supporting v2 filters
which will have a different serialization that makes them incompatible
with v1 filters while still retaining the ability to work with v1
filters in the interim.
The exported type is renamed to FilterV1 and the new internal struct is
embedded so its methods are externally available.
The tests and all callers in the repo have been updated accordingly.
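The shape of the refactor is roughly the following (field names are
illustrative):

    // filter carries the logic shared between filter versions along with
    // the version it was created with.
    type filter struct {
        version uint16
        // shared hashing, matching, and serialization state lives here
    }

    // FilterV1 embeds the internal struct so the shared methods remain
    // part of the exported API.
    type FilterV1 struct {
        filter
    }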
This updates the error handling in the gcs package to be consistent with
the rest of the code base to provide a proper error type and error codes
that can be programmatically detected.
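A generic sketch of that error style follows (the specific codes defined
by the package are not reproduced here):

    // ErrorCode identifies a kind of error that can be detected
    // programmatically.
    type ErrorCode int

    // ErrMalformedFilter is an assumed example code, not necessarily one
    // the package actually defines.
    const ErrMalformedFilter ErrorCode = iota

    // Error pairs a programmatically detectable code with a human-readable
    // description.
    type Error struct {
        ErrorCode   ErrorCode
        Description string
    }

    // Error satisfies the error interface.
    func (e Error) Error() string {
        return e.Description
    }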
This is part of the ongoing process to clean up and improve the gcs
module to the quality level required by consensus code for ultimate
inclusion in header commitments.
This rewrites the tests to make them more consistent with the rest of
the code base and significantly increases their coverage of the code.
It also reworks the benchmarks to actually benchmark what their names
claim, renames them for consistency, and makes them more stable by
ensuring the same prng seed is used each run to eliminate variance
introduced by different values.
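The fixed-seed idea amounts to generating the benchmark inputs
deterministically, along the lines of this sketch (seed and sizes are
illustrative):

    import (
        "encoding/binary"
        "math/rand"
    )

    // genBenchData produces the same pseudo-random members on every run so
    // benchmark timings are not skewed by differing inputs.
    func genBenchData(n int) [][]byte {
        prng := rand.New(rand.NewSource(1)) // fixed seed for reproducibility
        data := make([][]byte, n)
        for i := range data {
            buf := make([]byte, 8)
            binary.LittleEndian.PutUint64(buf, prng.Uint64())
            data[i] = buf
        }
        return data
    }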
Finally, it removes an impossible to hit condition from the bit reader
and adds a couple of additional checks to harden the filters against
potential misuse.
This is part of the ongoing process to clean up and improve the gcs
module to the quality level required by consensus code for ultimate
inclusion in header commitments.
This adds support for empty filters, rather than treating them as an
error, along with a full set of tests to ensure empty filters work as
intended.
It is part of the ongoing process to clean up and improve the gcs module
to the quality level required by consensus code for ultimate inclusion
in header commitments.
This removes the unused and undesired FromPBytes and FromNPBytes
functions and associated tests from the gcs module in preparation for
upcoming changes aimed at supporting new filter versions for use in
header commitments.
Since these changes, and several planned upcoming ones, constitute
breaking public API changes, this bumps the major version of the gcs
module, adds a replacement for gcs/v2 to the main module and updates all
other modules to make use of it.
It also bumps the rpcclient module to v5 since it makes use of the
gcs.Filter type in its API, adds a replacement for rpcclient/v5 to the
main module and updates all other modules to make use of it.
Note that this also marks the start of a new approach towards handling
module versioning between release cycles to reduce the maintenance
burden.
The new approach is as follows.
Whenever a new breaking change to a module's API is introduced, the
following will happen:
- Bump the major version in the go.mod of the affected module if not
already done since the last release tag
- Add a replacement to the go.mod in the main module if not already
  done since the last release tag (an illustrative example appears after
  this list)
- Update all imports in the repo to use the new major version as
necessary
- Make necessary modifications to allow all other modules to use the
new version in the same commit
- Repeat the process for any other modules that require a new major
  version as a result of consuming the new major(s)
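For illustration, a main-module replacement of the sort described above
might look like the following go.mod fragment (paths and versions are
examples only):

    module github.com/decred/dcrd

    require github.com/decred/dcrd/gcs/v2 v2.0.0

    replace github.com/decred/dcrd/gcs/v2 => ./gcs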
Finally, once the repo is frozen for software release, all modules will
be tagged in dependency order to stabilize them and all module
replacements will be removed in order to ensure releases are only using
fully tagged and released code.
This updates the main module to use the latest available minor and patch
versions of all modules and reverts the recent change that incorrectly
removed all of the blake256 references from the various go.sum files.
The following required direct dependencies are bumped as follows:
- github.com/decred/dcrd/blockchain/stake@v1.2.1
- github.com/decred/dcrd/blockchain/stake/v2@v2.0.1
- github.com/decred/dcrd/certgen@v1.1.0
- github.com/decred/dcrd/chaincfg@v1.5.2
- github.com/decred/dcrd/chaincfg/chainhash@v1.0.2
- github.com/decred/dcrd/chaincfg/v2@v2.2.0
- github.com/decred/dcrd/dcrutil/v2@v2.0.0
- github.com/decred/dcrd/gcs@v1.1.0
- github.com/decred/dcrd/hdkeychain/v2@v2.0.1
- github.com/decred/dcrd/txscript/v2@v2.0.0
- github.com/decred/dcrwallet/rpc/jsonrpc/types@v1.2.0
This updates the code to make use of the new crypto/blake256 module instead of
github.com/dchest/blake256.
* change the references in the chaincfg/chainhash and gcs modules
* update chaincfg/chainhash to use the new Sum256 API
* remove references to dchest/blake256 from all go.sum files
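A minimal sketch of the new pattern in chainhash (simplified; the
surrounding code is not reproduced):

    import "github.com/decred/dcrd/crypto/blake256"

    // hashB returns the BLAKE-256 digest of b as a byte slice using the
    // new single-call Sum256 API rather than constructing a hasher.
    func hashB(b []byte) []byte {
        digest := blake256.Sum256(b)
        return digest[:]
    }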
This updates the gcs dependencies and serves as a base for gcs/v1.1.0.
The updated direct dependencies in this commit are as follows:
- github.com/decred/dcrd/blockchain/stake/v2@v2.0.0
The full list of updated direct dependencies since the previous
gcs/v1.0.2 release are as follows:
- github.com/decred/dcrd/blockchain/stake/v2@v2.0.0
- github.com/dchest/siphash@v1.2.1
- github.com/decred/dcrd/txscript/v2@v2.0.0
- github.com/decred/dcrd/wire@v1.2.0
This optimizes the AddSigScript method on the blockcf.Entries type by
making use of the zero-alloc tokenizer now available in the txscript/v2
module.
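A rough sketch of driving the tokenizer over a signature script (the
actual AddSigScript logic does more filtering than shown):

    import "github.com/decred/dcrd/txscript/v2"

    // collectPushes walks a version 0 signature script with the zero-alloc
    // tokenizer and gathers its data pushes.
    func collectPushes(sigScript []byte) ([][]byte, error) {
        const scriptVersion = 0
        var pushes [][]byte
        tokenizer := txscript.MakeScriptTokenizer(scriptVersion, sigScript)
        for tokenizer.Next() {
            if data := tokenizer.Data(); len(data) > 0 {
                pushes = append(pushes, data)
            }
        }
        return pushes, tokenizer.Err()
    }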
The following is a before and after comparison of analyzing a typical
standard signature script to redeem a pay-to-pubkey-hash output:
benchmark old ns/op new ns/op delta
---------------------------------------------------------
BenchmarkAddSigScript 494 374 -24.29%
benchmark old allocs new allocs delta
---------------------------------------------------------
BenchmarkAddSigScript 3 2 -33.33%
benchmark old bytes new bytes delta
---------------------------------------------------------
BenchmarkAddSigScript 128 80 -37.50%
This updates the gcs module to use v2 of the txscript module.
The updated direct dependencies are as follows:
- github.com/decred/dcrd/txscript/v2@v2.0.0
This updates all of the modules that still rely on v1 of the chaincfg
module to use v1.5.1 which builds correctly with the major API bump
introduced in the edwards v1.0.0 module.
While here, it also updates all references to the v0 edwards and dcrec
modules to their tagged v1 counterparts, tidies all of the modules via
go mod tidy, and removes all unnecessary indirect entries from the mod
files to keep the modules using the versions the dependencies are tested
with.
The primary motivation for this change is that the chaincfg/v2 module
requires edwards/v1 which is not API compatible with edwards/v0.
Consequently, in order for each module to be incrementally updated to
use chaincfg/v2, all of its dependencies must also be able to build with
edwards/v1 which is the case as of chaincfg v1.5.1.
This adds the go 1.11 directive to all of the modules in order to
clearly mark that they build and work with that version. Go 1.12
modified the tools such that tidy will automatically add the new version
to modules that do not already have a directive, which would prevent
builds on Go 1.11 through Go 1.11.3 and is not desirable.
This modifies all of the modules, with the exception of the root module,
to remove all replacement directives from their go.mod files and update
the requirements and module sums accordingly.
While it is nice to be able to build and test directly from each module
directory and have it pull in the latest untagged changes when
developing, having all of the overrides in each module makes it
infeasible to use the module tools to help maintain the modules and thus
makes it quite difficult to ensure they are all independently accurate
for external consumers.
By maintaining all of the overrides in the root module and invoking all
builds and tests from it, the overrides will apply to ensure the latest
code is being built and tested.
This also modifies the test script used in CI to run all of the tests
from the root module accordingly.
This bumps the various module versions as follows:
- github.com/decred/dcrd/addrmgr@v1.0.2
- github.com/decred/dcrd/wire@v1.1.0
- github.com/decred/dcrd/chaincfg@v1.1.1
- github.com/decred/dcrd/connmgr@v1.0.1
- github.com/decred/dcrd/dcrutil@v1.1.1
- github.com/decred/dcrd/database@v1.0.1
- github.com/decred/dcrd/hdkeychain@v1.1.0
- github.com/decred/dcrd/txscript@v1.0.1
- github.com/decred/dcrd/blockchain/stake@v1.0.1
- github.com/decred/dcrd/gcs@v1.0.1
- github.com/decred/dcrd/blockchain@v1.0.1
- github.com/decred/dcrd/mining@v1.0.1
- github.com/decred/dcrd/mempool@v1.0.1
- github.com/decred/dcrd/peer@v1.0.1
- github.com/decred/dcrd/rpcclient@v1.0.1
It also updates all of the dependencies for every module accordingly and
adds a few missing overrides for transitive dependencies.
This further refines the modules to add the following new modules
instead of depending on the entire dcrd module:
- github.com/decred/dcrd/dcrjson@v1.0.0
- github.com/decred/dcrd/blockchain@v1.0.0
- github.com/decred/dcrd/blockchain/stake@v1.0.0
- github.com/decred/dcrd/gcs@v1.0.0
- github.com/decred/dcrd/mining@v1.0.0
- github.com/decred/dcrd/mempool@v1.0.0
- github.com/decred/dcrd/peer@v1.0.0
- github.com/decred/dcrd/rpcclient@v1.0.0
Also, it ensures modules that rely on other modules within the repo are
provided replacements to the latest repo code so that builds against
master and continuous integration use the latest code. The modules
involved are as follows:
- github.com/decred/dcrd/addrmgr
- github.com/decred/dcrd/blockchain
- github.com/decred/dcrd/blockchain/stake
- github.com/decred/dcrd/chaincfg
- github.com/decred/dcrd/connmgr
- github.com/decred/dcrd/database
- github.com/decred/dcrd/dcrec/secp256k1
- github.com/decred/dcrd/dcrjson
- github.com/decred/dcrd/dcrutil
- github.com/decred/dcrd/gcs
- github.com/decred/dcrd/hdkeychain
- github.com/decred/dcrd/mempool
- github.com/decred/dcrd/mining
- github.com/decred/dcrd/peer
- github.com/decred/dcrd/rpcclient
- github.com/decred/dcrd/txscript
- github.com/decred/dcrd/wire
This removes a bunch of build modules that rely on a version of dcrd
that doesn't exist and updates the root module so dcrd can be built with
the upcoming go1.11 release.
This adds module support for the versioned go toolchain. In particular,
the following packages are defined as modules:
* addrmgr
* blockchain
* certgen
* chaincfg
* connmgr
* database
* dcrjson
* dcrutil
* gcs
* hdkeychain
* mempool
* mining
* peer
* rpcclient
* txscript
* wire
It does not update the Travis build environment or README since module
support is experimental at this point.
This change begins the work of bringing committed filters to the
network consensus daemon. Committed filters are designed to enable
light wallets without many of the privacy issues associated with
server-side bloom filtering.
The new gcs package provides the primitives for creating and matching
against Golomb-coded sets (GCS) filters while the blockcf package
provides creation of filters and filter entries for data structures
found in blocks.
The wire package has been updated to define a new protocol version and
service flag for advertising CF support and includes types for the
following new messages: cfheaders, cfilter, cftypes, getcfheaders,
getcfilter, getcftypes. The peer package and server implementation
have been updated to include support for the new protocol version and
messages.
Filters are created using a collision probability of 2^-20 and are
saved to a new optional database index when running with committed
filter support enabled (the default). At first startup, if support is
not disabled, the index will be created and populated with filters and
filter headers for all preexisting blocks, and new filters will be
recorded for processed blocks.
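As a sketch of how that collision probability maps to the filter
parameter (constructor signature assumed for illustration), a value of
P = 20 corresponds to a false positive rate of 2^-20:

    import "github.com/decred/dcrd/gcs"

    // buildRegularFilter creates a filter over the given entries using the
    // 2^-20 collision probability described above.
    func buildRegularFilter(key [gcs.KeySize]byte, entries [][]byte) (*gcs.Filter, error) {
        const p = 20 // false positive probability of 1/2^20
        return gcs.NewFilter(p, key, entries)
    }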
Multiple filter types are supported. The regular filter commits to
output scripts and previous outpoints that any non-voting wallet will
require access to. Scripts and previous outpoints that can only be
spent by votes and revocations are not committed to the filter. The
extended filter is a supplementary filter which commits to all
transaction hashes and script data pushes from the input scripts of
non-coinbase regular and ticket purchase transactions. Creating these
filters is based on the algorithm defined by BIP0158 but is modified
to only commit "regular" data in stake transactions to prevent
committed filters from being used to create SPV voting wallets.