This refactors the script engine to store and step through raw scripts
by making using of the new zero-allocation script tokenizer as opposed
to the less efficient method of storing and stepping through parsed
opcodes. It also improves several aspects while refactoring such as
optimizing the disassembly trace, showing all scripts in the trace in
the case of execution failure, and providing additional comments
describing the purpose of each field in the engine.
It should be noted that this is a step towards removing the parsed
opcode struct and associated supporting code altogether, however, in
order to ease the review process, this retains the struct and all
function signatures for opcode execution which make use of an individual
parsed opcode. Those will be updated in future commits.
The following is an overview of the changes:
- Modify internal engine scripts slice to use raw scripts instead of
parsed opcodes
- Introduce a tokenizer to the engine to track the current script
- Remove no longer needed script offset parameter from the engine since
that is tracked by the tokenizer
- Add an opcode index counter for disassembly purposes to the engine
- Update check for valid program counter to only consider the script
index
- Update tests for bad program counter accordingly
- Rework the NewEngine function
- Store the raw scripts
- Setup the initial tokenizer
- Explicitly check against version 0 instead of DefaultScriptVersion
which would break consensus if changed
- Check the scripts parse according to version 0 semantics to retain
current consensus rules
- Improve comments throughout
- Rework the Step function
- Use the tokenizer and raw scripts
- Create a parsed opcode on the fly for now to retain existing
opcode execution function signatures
- Improve comments throughout
- Update the Execute function
- Explicitly check against version 0 instead of DefaultScriptVersion
which would break consensus if changed
- Improve the disassembly tracing in the case of error
- Update the CheckErrorCondition function
- Modify clean stack error message to make sense in all cases
- Improve the comments
- Update the DisasmPC and DisasmScript functions on the engine
- Use the tokenizer
- Optimize construction via the use of strings.Builder
- Modify the subScript function to return the raw script bytes since the
parsed opcodes are no longer stored
- Update the various signature checking opcodes to use the raw opcode
data removal and signature hash calculation functions since the
subscript is now a raw script
- opcodeCheckSig
- opcodeCheckMultiSig
- opcodeCheckSigAlt
This removes the ScriptVerifyDERSignatures flag from the txscript
package, changes the default semantics to always enforce its behavior
and updates all callers in the repository accordingly.
This change is being made to simplify the script engine code since the
flag has always been active and required by consensus in Decred, so
there is no need to require a flag to conditionally toggle it.
It should be noted that the tests removed from script_tests.json
specifically dealt with ensuring non-DER-compliant signatures were
handled properly when the ScriptVerifyDERSignatures flag was not set.
Therefore, they are no longer necessary.
Finally, the DERSIG indicator to enable the flag in the test data has
been retained for now in order to keep the logic changes separate.
This removes the ScriptBip16 flag from the txscript package, changes the
default semantics to always enforce its behavior, and updates all
callers in the repository accordingly.
This change is being made to simplify the script engine code since the
flag has always been active and required by consensus in Decred, so there is
no need to require a flag to conditionally toggle it.
Also, since it is no longer possible to invoke the script engine without
the flag with the clean stack flag, it removes the now unused
ErrInvalidFlags error and associated tests.
It should be noted that the test removed from script_tests.json
specifically dealt with ensuring a signature script that contained
non-data-pushing opcodes was successful when neither the ScriptBip16 or
ScriptVerifySigPushOnly flags were set. Therefore, it is no longer
necessary.
Finally, the P2SH indicator to enable the flag in the test data has been
retained for now in order to keep the logic changes separate.
This converts the majority of script errors from generic errors created
via errors.New and fmt.Errorf to use a concrete type that implements the
error interface with an error code and description.
This allows callers to programmatically detect the type of error via
type assertions and an error code while still allowing the errors to
provide more context.
For example, instead of just having an error the reads "disabled opcode"
as would happen prior to these changes when a disabled opcode is
encountered, the error will now read "attempt to execute disabled opcode
OP_FOO".
While it was previously possible to programmatically detect many errors
due to them being exported, they provided no additional context and
there were also various instances that were just returning errors
created on the spot which callers could not reliably detect without
resorting to looking at the actual error message, which is nearly always
bad practice.
Also, while here, export the MaxStackSize and MaxScriptSize constants
since they can be useful for consumers of the package and perform some
minor cleanup of some of the tests.
This removes the ScriptVerifyStrictEncoding flag from the txscript
package, changes the default semantics to always enforce its behavior
and updates all callers in the repository accordingly.
This change is being made to simplify the script engine code since the
flag has always been active and required by consensus in Decred, so
there is no need to require a flag to conditionally toggle it.
It should be noted that the tests removed from script_valid.json
specifically dealt with ensuring signatures not compliant with DER
encoding did not cause execution to halt early on invalid signatures
when neither of the ScriptVerifyStrictEncoding or
ScriptVerifyDERSignatures flags were set. Therefore, they are no longer
necessary.
For nearly the same reason, the tx test related to the empty pubkey
tx_valid.json was moved to tx_invalid.json. In particular, an empty
pubkey without ScriptVerifyStrictEncoding simply failed the signature
check and continued execution, while the same condition with the flag
halts execution. Thus, without the flag the final NOT in the script
would allow the script to succeed, while it does not under the strict
encoding rules.
Finally, the STRICTENC indicator to enable the flag in the test data has
been retained for now in order to keep the logic changes separate.
Putting the test code in the same package makes it easier for forks
since they don't have to change the import paths as much and it also
gets rid of the need for internal_test.go to bridge.
Also, do some light cleanup on a few tests while here.
Decred's serialized format for transactions split the 32-bit version
field into two 16-bit components such that the upper bits are used to
encode a serialization type and the lower 16 bits are the actual
transaction version.
Unfortunately, when this was done, the in-memory transaction struct was
not also updated to hide this complexity, which means that callers
currently have to understand and take special care when dealing with the
version field of the transaction.
Since the main purpose of the wire package is precisely to hide these
details, this remedies the situation by introducing a new field on the
in-memory transaction struct named SerType which houses the
serialization type and changes the Version field back to having the
desired semantics of actually being the real transaction version. Also,
since the maximum version can only be a 16-bit value, the Version field
has been changed to a uint16 to properly reflect this.
The serialization and deserialization functions now deal with properly
converting to and from these fields to the actual serialized format as
intended.
Finally, these changes also include a fairly significant amount of
related code cleanup and optimization along with some bug fixes in order
to allow the transaction version to be bumped as intended.
The following is an overview of all changes:
- Introduce new SerType field to MsgTx to specify the serialization type
- Change MsgTx.Version to a uint16 to properly reflect its maximum
allowed value
- Change the semantics of MsgTx.Version to be the actual transaction
version as intended
- Update all callers that had special code to deal with the previous
Version field semantics to use the new semantics
- Switch all of the code that deals with encoding and decoding the
serialized version field to use more efficient masks and shifts
instead of binary writes into buffers which cause allocations
- Correct several issues that would prevent producing expected
serializations for transactions with actual transaction versions that
are not 1
- Simplify the various serialize functions to use a single func which
accepts the serialization type to reduce code duplication
- Make serialization type switch usage more consistent with the rest of
the code base
- Update the utxoview and related code to use uint16s for the
transaction version as well since it should not care about the
serialization type due to using its own
- Make code more consistent in how it uses bytes.Buffer
- Clean up several of the comments regarding hashes and add some new
comments to better describe the serialization types
Contains the following commits:
- 711f33450c
- b6b1e55d1e
- Reverted because Travis is already at a more recent version
- bd4e64d1d4
Also, the merge commit contains the necessary decred-specific
alterations, converts all other references to sha to hash to keep with
the spirit of the merged commits, and various other cleanup intended to
bring the code bases more in line with one another.
This commit is the first stage of several that are planned to convert
the blockchain package into a concurrent safe package that will
ultimately allow support for multi-peer download and concurrent chain
processing. The goal is to update btcd proper after each step so it can
take advantage of the enhancements as they are developed.
In addition to the aforementioned benefit, this staged approach has been
chosen since it is absolutely critical to maintain consensus.
Separating the changes into several stages makes it easier for reviewers
to logically follow what is happening and therefore helps prevent
consensus bugs. Naturally there are significant automated tests to help
prevent consensus issues as well.
The main focus of this stage is to convert the blockchain package to use
the new database interface and implement the chain-related functionality
which it no longer handles. It also aims to improve efficiency in
various areas by making use of the new database and chain capabilities.
The following is an overview of the chain changes:
- Update to use the new database interface
- Add chain-related functionality that the old database used to handle
- Main chain structure and state
- Transaction spend tracking
- Implement a new pruned unspent transaction output (utxo) set
- Provides efficient direct access to the unspent transaction outputs
- Uses a domain specific compression algorithm that understands the
standard transaction scripts in order to significantly compress them
- Removes reliance on the transaction index and paves the way toward
eventually enabling block pruning
- Modify the New function to accept a Config struct instead of
inidividual parameters
- Replace the old TxStore type with a new UtxoViewpoint type that makes
use of the new pruned utxo set
- Convert code to treat the new UtxoViewpoint as a rolling view that is
used between connects and disconnects to improve efficiency
- Make best chain state always set when the chain instance is created
- Remove now unnecessary logic for dealing with unset best state
- Make all exported functions concurrent safe
- Currently using a single chain state lock as it provides a straight
forward and easy to review path forward however this can be improved
with more fine grained locking
- Optimize various cases where full blocks were being loaded when only
the header is needed to help reduce the I/O load
- Add the ability for callers to get a snapshot of the current best
chain stats in a concurrent safe fashion
- Does not block callers while new blocks are being processed
- Make error messages that reference transaction outputs consistently
use <transaction hash>:<output index>
- Introduce a new AssertError type an convert internal consistency
checks to use it
- Update tests and examples to reflect the changes
- Add a full suite of tests to ensure correct functionality of the new
code
The following is an overview of the btcd changes:
- Update to use the new database and chain interfaces
- Temporarily remove all code related to the transaction index
- Temporarily remove all code related to the address index
- Convert all code that uses transaction stores to use the new utxo
view
- Rework several calls that required the block manager for safe
concurrency to use the chain package directly now that it is
concurrent safe
- Change all calls to obtain the best hash to use the new best state
snapshot capability from the chain package
- Remove workaround for limits on fetching height ranges since the new
database interface no longer imposes them
- Correct the gettxout RPC handler to return the best chain hash as
opposed the hash the txout was found in
- Optimize various RPC handlers:
- Change several of the RPC handlers to use the new chain snapshot
capability to avoid needlessly loading data
- Update several handlers to use new functionality to avoid accessing
the block manager so they are able to return the data without
blocking when the server is busy processing blocks
- Update non-verbose getblock to avoid deserialization and
serialization overhead
- Update getblockheader to request the block height directly from
chain and only load the header
- Update getdifficulty to use the new cached data from chain
- Update getmininginfo to use the new cached data from chain
- Update non-verbose getrawtransaction to avoid deserialization and
serialization overhead
- Update gettxout to use the new utxo store versus loading
full transactions using the transaction index
The following is an overview of the utility changes:
- Update addblock to use the new database and chain interfaces
- Update findcheckpoint to use the new database and chain interfaces
- Remove the dropafter utility which is no longer supported
NOTE: The transaction index and address index will be reimplemented in
another commit.
This is mostly a backport of some of the same modifications made in
Decred along with a few additional things cleaned up. In particular,
this updates the code to make use of the new chainhash package.
Also, since this required API changes anyways and the hash algorithm is
no longer tied specifically to SHA, all other functions throughout the
code base which had "Sha" in their name have been changed to Hash so
they are not incorrectly implying the hash algorithm.
The following is an overview of the changes:
- Remove the wire.ShaHash type
- Update all references to wire.ShaHash to the new chainhash.Hash type
- Rename the following functions and update all references:
- wire.BlockHeader.BlockSha -> BlockHash
- wire.MsgBlock.BlockSha -> BlockHash
- wire.MsgBlock.TxShas -> TxHashes
- wire.MsgTx.TxSha -> TxHash
- blockchain.ShaHashToBig -> HashToBig
- peer.ShaFunc -> peer.HashFunc
- Rename all variables that included sha in their name to include hash
instead
- Update for function name changes in other dependent packages such as
btcutil
- Update copyright dates on all modified files
- Update glide.lock file to use the required version of btcutil
Introduce an ECDSA signature verification into btcd in order to
mitigate a certain DoS attack and as a performance optimization.
The benefits of SigCache are two fold. Firstly, usage of SigCache
mitigates a DoS attack wherein an attacker causes a victim's client to
hang due to worst-case behavior triggered while processing attacker
crafted invalid transactions. A detailed description of the mitigated
DoS attack can be found here: https://bitslog.wordpress.com/2013/01/23/fixed-bitcoin-vulnerability-explanation-why-the-signature-cache-is-a-dos-protection/
Secondly, usage of the SigCache introduces a signature verification
optimization which speeds up the validation of transactions within a
block, if they've already been seen and verified within the mempool.
The server itself manages the sigCache instance. The blockManager and
txMempool respectively now receive pointers to the created sigCache
instance. All read (sig triplet existence) operations on the sigCache
will not block unless a separate goroutine is adding an entry (writing)
to the sigCache. GetBlockTemplate generation now also utilizes the
sigCache in order to avoid unnecessarily double checking signatures
when generating a template after previously accepting a txn to the
mempool. Consequently, the CPU miner now also employs the same
optimization.
The maximum number of entries for the sigCache has been introduced as a
config parameter in order to allow users to configure the amount of
memory consumed by this new additional caching.
- Move reference tests to test package since they are intended to
exercise the engine as callers would
- Improve the short form script parsing to allow additional opcodes:
DATA_#, OP_#, FALSE, TRUE
- Make use of a function to decode hex strings rather than manually
defining byte slices
- Update the tests to make use of the short form script parsing logic
rather than manually defining byte slices
- Consistently replace all []byte{} and [][]byte{} with nil
- Define tests only used in a specific function inside that func
- Move invalid flag combination test to engine_test since that is what
it is testing
- Remove all redundant script tests in favor of the JSON-based tests in
the data directory.
- Move several functions from internal_test.go to the test files
associated with what the tests are checking
This commit renames the Script type to Engine to better reflect its
purpose. It also renames the NewScript function to NewEngine to match.
This is being done because name Script for the engine is confusing since
it implies it is an actual script rather than the execution environment
for the script. It also paves the way for eventually supplying a
ParsedScript type which will be less likely to be confused with the
execution environment.
While moving the code, some additional variable names and comments have
been updated to better match the style used throughout the rest of the
code base. In addition, an attempt has been made to use consistent naming
of the engine as 'vm' instead of using different variables names as it was
previously.
Finally, the relevant engine code has been moved into a new file named
engine.go and related tests moved to engine_test.go.