This resolves https://github.com/ledgerwatch/erigon/issues/10135
All enums are constrained by their owning type, which forces package
inclusion and hence type registration.
Added tests for each type to check the construction cycle.
Running `go generate ./...` fails with:
```
codecgen error: error running 'go run codecgen-main-2.generated.go': exit status 1, console: panic: encoding alphabet includes duplicate symbols
goroutine 1 [running]:
encoding/base64.NewEncoding(...)
/usr/local/go/src/encoding/base64/base64.go:82
github.com/ugorji/go/codec.init()
/Users/milen/go/pkg/mod/github.com/ugorji/go/codec@v1.1.13/gen.go:168 +0xf1c
exit status 2
```
This is a problem when using go1.22 and it has been fixed here:
- 8286c2dc98
- issue: https://github.com/ugorji/go/issues/407
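The panic in the trace comes from `encoding/base64.NewEncoding`, which rejects alphabets containing repeated symbols. The following is a minimal sketch of that kind of check, not code from `ugorji/go`: `hasDuplicateSymbols` and `newEncodingChecked` are hypothetical helpers that validate an alphabet up front and return an error instead of panicking.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// hasDuplicateSymbols is a hypothetical helper: it reports whether any byte
// occurs more than once in a candidate base64 alphabet.
func hasDuplicateSymbols(alphabet string) bool {
	var seen [256]bool
	for i := 0; i < len(alphabet); i++ {
		if seen[alphabet[i]] {
			return true
		}
		seen[alphabet[i]] = true
	}
	return false
}

// newEncodingChecked (also hypothetical) validates the alphabet before
// handing it to base64.NewEncoding, so a bad alphabet yields an error
// rather than a panic during package initialization.
func newEncodingChecked(alphabet string) (*base64.Encoding, error) {
	switch {
	case len(alphabet) != 64:
		return nil, fmt.Errorf("alphabet is %d bytes long, want 64", len(alphabet))
	case strings.ContainsAny(alphabet, "\r\n"):
		return nil, errors.New("alphabet contains a newline character")
	case hasDuplicateSymbols(alphabet):
		return nil, errors.New("alphabet includes duplicate symbols")
	}
	return base64.NewEncoding(alphabet), nil
}

func main() {
	const std = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
	_, err := newEncodingChecked(std)
	fmt.Println("standard alphabet:", err)
	// 64 bytes long, but 'A' appears twice.
	_, err = newEncodingChecked("A" + std[:63])
	fmt.Println("alphabet with repeated 'A':", err)
}
```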
When adding bor waypoint types I removed `snaptype.AllTypes` because
it causes package cross-dependencies.
This fixes the places where all types were used after the merge
changes.
Implementation of db and snapshot storage for additional synced heimdall
waypoint types:
* Checkpoint
* Milestones
This is targeted at the Astrid downloader which uses waypoints to verify
headers during syncing and fork choice selection.
Since the introduction of milestones in heimdall, these types are
downloaded by erigon but not persisted locally. This change adds
persistence for these types.
In addition to the pure persistence changes, this PR also contains a
refactor step which is part of the process of extracting polygon related
types from erigon core into a separate package, which may eventually be
extracted to a separate module and possibly repo.
The aim is that, rather than the core `turbo/snapshotsync/freezeblocks`
having to know about the types it manages and how to extract and index
their contents, it can concern itself with a set of macro shard
management actions.
This process is partially completed by this PR. A final step will be to
remove BorSnapshots and to simplify the places in the code which have to
remember to deal with them. This requires further testing, so it has been
left out of this PR to avoid delays in delivering the base types.
# Status
* Waypoint types and storage are complete and integrated into the
BorHeimdall stage. The code has been tested to check that types are
inserted into mdbx, extracted and merged correctly.
* I have verified that, when produced from block 0, the new snapshots
correctly follow the merging strategy of existing snapshots.
* The functionality is enabled by **--bor.waypoints=true**; this is
false by default.
# Testing
This has been tested as follows:
* Run a Mumbai instance to the tip and check current processing for
milestones and checkpoints
# Post merge steps
* Produce and release snapshots for mumbai and bor mainnet
* Check existing node upgrades
* Remove --bor.waypoints flags
Enabled diagnostics by default to collect data. This makes it possible
to connect to the node and retrieve the stored data. It includes three
new flags:
- "diagnostics.disabled" - it's set to "false" by default. Set to "true"
if you want to disable diagnostics.
- "diagnostics.endpoint.addr" - address of HTTP endpoint to get
diagnostics data
- "diagnostics.endpoint.port" - port of HTTP endpoint to get diagnostics
data
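As a rough illustration of how the three flags fit together, here is a sketch using only the standard `flag` package. Erigon's real flag wiring is not shown in this description, so `diagnosticsConfig`, `parseDiagnosticsFlags`, and the default address and port values below are placeholders chosen for the example, not the project's actual defaults.

```go
package main

import (
	"flag"
	"fmt"
)

// diagnosticsConfig is a hypothetical container for the three flags.
type diagnosticsConfig struct {
	Disabled bool
	Addr     string
	Port     uint
}

// parseDiagnosticsFlags parses the flag names listed above.
// The addr/port defaults are placeholders for this sketch only.
func parseDiagnosticsFlags(args []string) (diagnosticsConfig, error) {
	var cfg diagnosticsConfig
	fs := flag.NewFlagSet("diagnostics", flag.ContinueOnError)
	fs.BoolVar(&cfg.Disabled, "diagnostics.disabled", false, "set to true to disable diagnostics")
	fs.StringVar(&cfg.Addr, "diagnostics.endpoint.addr", "127.0.0.1", "address of the HTTP endpoint")
	fs.UintVar(&cfg.Port, "diagnostics.endpoint.port", 6060, "port of the HTTP endpoint")
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	cfg, err := parseDiagnosticsFlags([]string{"--diagnostics.endpoint.port=7000"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("disabled=%v endpoint=%s:%d\n", cfg.Disabled, cfg.Addr, cfg.Port)
}
```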
[DO NOT MERGE] as it depends on:
- https://github.com/ledgerwatch/erigon/pull/10069
- update support command
- update diagnostics UI
- use sonar badge for code coverage
- remove unnecessary "Coverage" GitHub action and unnecessary duplicate
test run on "devel" CI for it
- the existing coverage job + badge didn't seem to be accurate (wasn't
taking into account `erigon-lib` sub-module)
<img width="982" alt="Screenshot 2024-04-29 at 12 06 46"
src="https://github.com/ledgerwatch/erigon/assets/94537774/e47367ed-340d-42b5-ad00-2f59edce100c">
for https://github.com/ledgerwatch/erigon/issues/10099
for things like `eth_getTransactionReceipt`,
`ots_searchTransactionsAfter`, etc...
Also moved:
- moved `api.chainConfig()` inside `api.getReceipts()`
- switched `ots` to use blocks/receipts lru
- switched price oracle to use blocks/receipts
Problem: if the `--pageSize` parameter is not set, we use the
`default pagesize` instead of the `real pagesize of db`. This causes a
different `dirtySpace` size (because it is accounted in "pages").
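To make the arithmetic concrete, the sketch below assumes a byte budget that is converted into a page count. All names and numbers are hypothetical (`effectivePageSize`, `dirtySpacePages`, the 4 KiB default, the 16 KiB database page size); the point is only that the same budget yields a different page count when the wrong page size is used.

```go
package main

import "fmt"

// effectivePageSize is a hypothetical helper: when no page size was passed
// on the command line (0), fall back to the page size the db really uses.
func effectivePageSize(flagPageSize, dbPageSize uint64) uint64 {
	if flagPageSize != 0 {
		return flagPageSize
	}
	return dbPageSize
}

// dirtySpacePages converts a byte budget into a number of pages.
func dirtySpacePages(budgetBytes, pageSize uint64) uint64 {
	return budgetBytes / pageSize
}

func main() {
	const budget = 512 << 20       // 512 MiB, illustrative only
	const defaultPageSize = 4096   // assumed default for this sketch
	const realDBPageSize = 16384   // assumed page size of an existing db

	fmt.Println("pages with default page size:", dirtySpacePages(budget, defaultPageSize))
	fmt.Println("pages with real page size:   ", dirtySpacePages(budget, effectivePageSize(0, realDBPageSize)))
}
```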
Tweaks I did:
1) Decreased attestation expiry down to 30 minutes
2) Removed slot check in committeeSubAggregation
3) More reliable algorithm for the dependent root
Results:
* Better aggregates
* Less strain on the node
* No blocks/attestations missed
Added a command to print basic info about database tables. There are two
options:
- print all info: ./build/bin/diag dbs all
- print only populated tables and dbs: ./build/bin/diag dbs pop
Here is example output:

@taratorio if you want, I can add a flag which will print a specific DB.
- replaces usages of `moq` in `erigon-lib` with `mockgen` (gomock)
- adds `make mocks` and `mocks-clean` commands for `erigon`
- updates the existing `make mocks` command and adds a `mocks-clean`
command for `erigon-lib`
`HasNext` returns true even when there is an existing error, so the
application expects a next entry. The `Next` function can hit an internal
error (such as a `panic()`) while fetching the next cursor item and thus
fail to return the error.
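The contract described above can be sketched as follows. This is one possible way to honor it, not necessarily what this change does: `cursorIter` is a hypothetical slice-backed iterator whose `Next` recovers from a panic in the fetch step and surfaces it as an error, while `HasNext` stays true as long as an error is pending.

```go
package main

import "fmt"

// cursorIter is a hypothetical iterator over a slice-backed "cursor".
type cursorIter struct {
	items []string
	pos   int
	err   error // pending error, returned by every following Next call
}

// HasNext is true while items remain or an error is pending, so the caller
// always reaches the Next call that reports the error.
func (it *cursorIter) HasNext() bool {
	return it.err != nil || it.pos < len(it.items)
}

// Next returns the next item. A panic raised while fetching is recovered,
// remembered, and returned as an error instead of being lost.
func (it *cursorIter) Next() (item string, err error) {
	if it.err != nil {
		return "", it.err
	}
	if it.pos >= len(it.items) {
		return "", fmt.Errorf("iterator exhausted")
	}
	defer func() {
		if r := recover(); r != nil {
			it.err = fmt.Errorf("cursor panic: %v", r)
			item, err = "", it.err
		}
	}()
	item = it.fetch()
	it.pos++
	return item, nil
}

// fetch simulates an internal failure: an empty entry is treated as corrupted.
func (it *cursorIter) fetch() string {
	if it.items[it.pos] == "" {
		panic("corrupted entry")
	}
	return it.items[it.pos]
}

func main() {
	it := &cursorIter{items: []string{"a", "", "c"}}
	for it.HasNext() {
		item, err := it.Next()
		if err != nil {
			fmt.Println("stopped:", err)
			break
		}
		fmt.Println("item:", item)
	}
}
```

The caller is expected to stop iterating on the first error, since `HasNext` keeps returning true while the error is pending.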
---------
Co-authored-by: alex.sharov <AskAlexSharov@gmail.com>
This PR adds operations inclusion.
## Normal operations
* BlsExecutionChange
* VoluntaryExit
* Slashings
Each of these operations blacklists the index it works on, so we do not
have repeating indices for the same operation twice. We assume all
signatures are pre-validated and just check whether it is a good time to
produce a block with them (by looking at their slot).
## Aggregated Attestations
There are a lot of trash attestations on the network, so we separate our
algorithm into 3 steps:
### Eligibility
We iterate over the entire pool of accumulated attestations, filter
out all attestations that cannot be included at the current slot, and
compute the expected reward of the rest (filtering out those with a
reward of 0).
### Ranking
We rank the `Attestation`s by their expected reward: we just sort the
array of candidates by expected reward in ascending order.
### Filtering by superset
We may have some supersets left over, so we filter out attestations which
end up being supersets of others. This process is done from highest
reward down to lowest reward.
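The three steps above can be sketched as a small pipeline. This is a simplified illustration under stated assumptions, not the PR's implementation: `candidate` is a hypothetical stand-in for an aggregated attestation with at most 64 aggregation bits, the eligibility rule (older than the current slot, within `maxAge` slots, non-zero reward) is invented for the example, and the superset step is interpreted as "drop a candidate whose bits are already fully covered by a higher-reward candidate that was kept".

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical, simplified aggregated attestation.
type candidate struct {
	slot   uint64 // slot the attestation was made for
	bits   uint64 // aggregation bits, one per committee member (max 64 here)
	reward uint64 // expected reward if included
}

// covers reports whether a has every bit that is set in b.
func covers(a, b uint64) bool { return a&b == b }

func selectAttestations(pool []candidate, currentSlot, maxAge uint64) []candidate {
	// 1. Eligibility (assumed rule): drop attestations that are not yet
	// includable, too old, or worth nothing.
	var eligible []candidate
	for _, c := range pool {
		if c.slot >= currentSlot || currentSlot-c.slot > maxAge || c.reward == 0 {
			continue
		}
		eligible = append(eligible, c)
	}
	// 2. Ranking: sort by expected reward in ascending order.
	sort.SliceStable(eligible, func(i, j int) bool {
		return eligible[i].reward < eligible[j].reward
	})
	// 3. Superset filtering: walk from highest to lowest reward and skip
	// candidates already covered by one that was kept.
	var kept []candidate
	for i := len(eligible) - 1; i >= 0; i-- {
		redundant := false
		for _, k := range kept {
			if covers(k.bits, eligible[i].bits) {
				redundant = true
				break
			}
		}
		if !redundant {
			kept = append(kept, eligible[i])
		}
	}
	return kept
}

func main() {
	pool := []candidate{
		{slot: 9, bits: 0b001111, reward: 4},
		{slot: 9, bits: 0b000011, reward: 2}, // covered by the first one
		{slot: 9, bits: 0b110000, reward: 2},
		{slot: 9, bits: 0b000001, reward: 0},  // zero reward
		{slot: 10, bits: 0b111111, reward: 6}, // not includable at slot 10
	}
	for _, c := range selectAttestations(pool, 10, 32) {
		fmt.Printf("bits=%06b reward=%d\n", c.bits, c.reward)
	}
}
```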
For periods where there are not many sync events (mostly testnets), sync
event fetching can be slow because sync events are fetched at the end of
every sprint.
Fetching the next sync event and looking at its block number optimizes
this, because fetches can be skipped until the next known block with
sync events.
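A small simulation of that idea, assuming sprint-end block numbers and a sorted list of blocks that carry sync events. `countFetches` is a hypothetical function written for this note; it only counts how many fetches remain once the block number of the next known event is used to skip sprint ends.

```go
package main

import "fmt"

// countFetches simulates fetching at sprint ends while skipping every sprint
// end that lies before the next known block with sync events.
// eventBlocks must be sorted in ascending order.
func countFetches(sprintEnds, eventBlocks []uint64) int {
	fetches, idx := 0, 0
	var nextKnown uint64 // 0 means "unknown, a fetch is required"
	for _, b := range sprintEnds {
		if nextKnown != 0 && b < nextKnown {
			continue // no events can be due yet, skip the fetch
		}
		fetches++
		// Consume every event that is due at this sprint end.
		for idx < len(eventBlocks) && eventBlocks[idx] <= b {
			idx++
		}
		// Remember the block number of the next event, if there is one.
		if idx < len(eventBlocks) {
			nextKnown = eventBlocks[idx]
		} else {
			nextKnown = 0
		}
	}
	return fetches
}

func main() {
	// Ten sprint ends with an assumed sprint length of 16 blocks.
	var sprintEnds []uint64
	for b := uint64(16); b <= 160; b += 16 {
		sprintEnds = append(sprintEnds, b)
	}
	events := []uint64{40, 150}
	fmt.Printf("fetches: %d of %d sprint ends\n", countFetches(sprintEnds, events), len(sprintEnds))
}
```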
Pros:
- it allows us to avoid pre-allocating files:
https://github.com/ledgerwatch/erigon/issues/8688
- it allows us to avoid a "sig-bus" when no space is left on disk (and
return a user-friendly error instead). See:
https://github.com/ledgerwatch/erigon/issues/8500 - but the DB will be
MMAP anyway and may still get a "sig-bus"
FYI:
- seems no perf difference (but i tested only on cloud drives)
- erigon will anyway open it as mmap
Cons:
- I implemented `fsync` for mmap (
https://github.com/anacrolix/torrent/pull/755 ) - we will probably need
to implement it for bufio: https://github.com/anacrolix/torrent/pull/937
- no zero-copy: more `alloc` memory will be held by the app (PageCache
starvation). I see 2x mem usage (at `--torrent.download.slots=500`: 20gb
vs 40gb)
- I see the "10K threads exhausted" error earlier (on
`--torrent.download.slots=500`).
- what else?
TL;DR: on a reorg, the common ancestor block is not being published to
subscribers of newHeads
#### Expected behavior
If the reorg's common ancestor is 2, I expect 2 to be republished:
1, 2, **2**, **3**, **4**
#### Actual behavior
2 is not republished, and 3's parentHash points to a 2 header that was
never received
1, 2, **3**, **4**
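The expected sequence can be written down as a tiny function. This only illustrates the ordering a subscriber should see; it says nothing about how `notifyFrom` is actually computed, and `headsToPublish` is a hypothetical name.

```go
package main

import "fmt"

// headsToPublish returns the full sequence of header numbers a newHeads
// subscriber should have seen after a reorg: everything already published,
// followed by the common ancestor itself and every block up to the new tip.
func headsToPublish(alreadyPublished []uint64, commonAncestor, newTip uint64) []uint64 {
	out := append([]uint64(nil), alreadyPublished...)
	for n := commonAncestor; n <= newTip; n++ {
		out = append(out, n)
	}
	return out
}

func main() {
	// Blocks 1 and 2 were published, then a reorg with common ancestor 2
	// extends the chain to block 4.
	fmt.Println(headsToPublish([]uint64{1, 2}, 2, 4))
}
```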
This PR is the same thing as
https://github.com/ledgerwatch/erigon/pull/9738 except with a test.
Note... the test passes, but **this does not actually work in
production** (for Ethereum mainnet with prysm as external CL).
Why? Because in production, `h.sync.PrevUnwindPoint()` is always nil:
a5270bccf5/turbo/stages/stageloop.go (L291)
which means the initial "if block" is never entered, and thus we have
**no control** of increment/decrement `notifyFrom` during reorgs
a5270bccf5/eth/stagedsync/stage_finish.go (L137-L146)
I don't know why `h.sync.PrevUnwindPoint()` is seemingly always nil, or
how the test can pass if it fails in prod. I'm hoping to pass the baton
to someone who might. Thank you @indanielo for original fix.
If we can figure this bug out, it closes #8848 and closes #9568 and
closes #10056
---------
Co-authored-by: Daniel Gimenez <25278291+indanielo@users.noreply.github.com>
In go 1.21, the version currently used in the project, the `slices`
package is no longer experimental and has entered the standard library.
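For illustration, the change amounts to importing `slices` from the standard library instead of `golang.org/x/exp/slices`. The helper `sortedUnique` below is a made-up example that uses only functions available in the go 1.21 standard `slices` package.

```go
package main

import (
	"fmt"
	"slices" // standard library since go 1.21; previously golang.org/x/exp/slices
)

// sortedUnique returns a sorted copy of in with duplicates removed.
func sortedUnique(in []uint64) []uint64 {
	out := slices.Clone(in)
	slices.Sort(out)
	return slices.Compact(out)
}

func main() {
	blocks := sortedUnique([]uint64{3, 1, 3, 2})
	fmt.Println(blocks, slices.Contains(blocks, 2))
}
```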
Co-authored-by: alex.sharov <AskAlexSharov@gmail.com>
- Added method `tx.Context()`, because a Tx is already bound to a
context by `db.BeginRo(ctx)`
- Removed the ctx parameter from the `BlockWithSenders` method in
interfaces
- Added `dbg.ToContext()` and `dbg.Enabled(ctx)` methods to set/get a
debugging tag on `ctx`.
Added a way to debug a single http request:
To print more detailed logs for 1 request, add the
`--http.dbg.single=true` flag. You can then send the HTTP header
`"dbg: true"`:
```
curl -X POST -H "dbg: true" -H "Content-Type: application/json" --data '{"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id":1}' localhost:8545
```
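The mechanism can be sketched with a context value. The real signatures of `dbg.ToContext()` and `dbg.Enabled(ctx)` are not shown here, so the lowercase `toContext`, `enabled`, `debugRequested`, and `contextFromRequest` below are hypothetical stand-ins written for this illustration.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

// dbgKey is an unexported key type, so the tag cannot collide with other
// context values.
type dbgKey struct{}

// toContext returns a copy of ctx carrying the debugging tag.
func toContext(ctx context.Context, on bool) context.Context {
	return context.WithValue(ctx, dbgKey{}, on)
}

// enabled reports whether the debugging tag is set on ctx.
func enabled(ctx context.Context) bool {
	on, _ := ctx.Value(dbgKey{}).(bool)
	return on
}

// debugRequested combines the node-level switch (the --http.dbg.single flag)
// with the value of the request's "dbg" header.
func debugRequested(allowSingle bool, headerValue string) bool {
	return allowSingle && headerValue == "true"
}

// contextFromRequest tags the request context when debugging was requested.
func contextFromRequest(r *http.Request, allowSingle bool) context.Context {
	return toContext(r.Context(), debugRequested(allowSingle, r.Header.Get("dbg")))
}

func main() {
	req, err := http.NewRequest(http.MethodPost, "http://localhost:8545", nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("dbg", "true")
	fmt.Println("flag on: ", enabled(contextFromRequest(req, true)))
	fmt.Println("flag off:", enabled(contextFromRequest(req, false)))
}
```

Because the tag travels with the context, and a Tx is bound to that same context, code deeper in the call chain can check it without extra parameters.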
---------
Co-authored-by: battlmonstr <battlmonstr@users.noreply.github.com>
This PR makes the channel that is used to send logs to subscriptions
configurable so logs are not dropped when the channel gets filled. See
issue 9699.
This is just an initial version since I wanted to gather some feedback
and was unsure if this is the correct approach to solve this.
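A minimal sketch of the idea, assuming that logs get dropped because the sender uses a non-blocking send into a fixed-size buffered channel. `logFeed` and its methods are hypothetical names for this illustration; the only point is that the buffer size becomes a parameter.

```go
package main

import "fmt"

// logFeed is a hypothetical subscription feed with a configurable buffer.
type logFeed struct {
	ch      chan string
	dropped int
}

// newLogFeed creates a feed whose channel can buffer size pending logs.
func newLogFeed(size int) *logFeed {
	return &logFeed{ch: make(chan string, size)}
}

// publish never blocks: when the buffer is full the log is dropped and
// counted, which is the situation a larger buffer is meant to avoid.
func (f *logFeed) publish(msg string) bool {
	select {
	case f.ch <- msg:
		return true
	default:
		f.dropped++
		return false
	}
}

func main() {
	for _, size := range []int{2, 8} {
		feed := newLogFeed(size)
		for i := 0; i < 5; i++ {
			feed.publish(fmt.Sprintf("log %d", i))
		}
		fmt.Printf("buffer=%d dropped=%d\n", size, feed.dropped)
	}
}
```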
In PR:
- the new .lock format introduced by
https://github.com/ledgerwatch/erigon/pull/9766 is not backward
compatible. In the past, an “empty .lock” meant “all prohibited”; this
was changed to “all allowed”.
- commit
Not in PR: I have an idea to make .lock forward compatible as well, by
making it a whitelist instead of a blacklist: after adding a new snap
type, it will not be downloaded by accident. I will do it in the next PR.
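The difference between the two semantics can be shown in a few lines. The functions and the snap type names below are hypothetical and do not reflect the actual .lock file format; they only show why a whitelist is forward compatible.

```go
package main

import (
	"fmt"
	"slices"
)

// allowedByBlacklist: the lock lists prohibited types, everything else
// (including types added later) may be downloaded.
func allowedByBlacklist(prohibited []string, snapType string) bool {
	return !slices.Contains(prohibited, snapType)
}

// allowedByWhitelist: the lock lists allowed types, everything else
// (including types added later) is prohibited.
func allowedByWhitelist(allowed []string, snapType string) bool {
	return slices.Contains(allowed, snapType)
}

func main() {
	lock := []string{"headers", "bodies"} // illustrative type names
	const added = "newtype"               // a snap type introduced after the lock was written

	fmt.Println("blacklist lets the new type through:", allowedByBlacklist(lock, added))
	fmt.Println("whitelist lets the new type through:", allowedByWhitelist(lock, added))
}
```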
But I need one more confirmation: why do we need exceptions from .lock?
Why are we breaking the "download once" invariant for some types of
files? Can we avoid it?