This resolves https://github.com/ledgerwatch/erigon/issues/10135
All enums are constrained by their owning type, which forces package
inclusion and hence type registration.
Added tests for each type to check the construction cycle.
Implementation of db and snapshot storage for additional synced heimdall
waypoint types:
* Checkpoint
* Milestones
This is targeted at the Astrid downloader which uses waypoints to verify
headers during syncing and fork choice selection.
Since the introduction of milestones in Heimdall, these types are
downloaded by erigon but not persisted locally. This change adds
persistence for them.
In addition to the pure persistence changes, this PR also contains a
refactor step which is part of the process of extracting polygon-related
types from erigon core into a separate package, which may eventually be
extracted into a separate module and possibly its own repo.
The aim is that, rather than the core `turbo/snapshotsync/freezeblocks`
having to know about the types it manages and how to extract and index
their contents, it can concern itself with a set of macro shard-management
actions.
This process is partially completed by this PR; a final step will be to
remove BorSnapshots and to simplify the places in the code that have to
remember to deal with them. That requires further testing, so it has been
left out of this PR to avoid delaying delivery of the base types.
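As a rough illustration of that direction, the sketch below shows how each snapshot type could register its own extract and index steps so the core code only drives generic shard management; all names here are hypothetical and are not the actual erigon interfaces.

```go
package snaptypes

// Hypothetical registration model: each snapshot type (headers, bodies, bor
// checkpoints, milestones, ...) describes how to extract and index its own
// data, and the core freezeblocks code only iterates over the registry to
// run the generic dump/index/merge/prune steps.
type ExtractFunc func(blockFrom, blockTo uint64, collect func(k, v []byte) error) error

type IndexFunc func(segmentPath string) error

type Type struct {
	Name    string
	Extract ExtractFunc
	Index   IndexFunc
}

var registry []Type

// Register is called from the owning package, e.g. a polygon/heimdall
// package registering its waypoint types.
func Register(t Type) { registry = append(registry, t) }

// All returns the registered types, so the core code can drive shard
// management without knowing anything type-specific.
func All() []Type { return registry }
```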
# Status
* Waypoint types and storage are complete and integrated into the
BorHeimdall stage. The code has been tested to check that types are
inserted into mdbx, extracted and merged correctly
* I have verified that when produced from block 0 the new snapshots
correctly follow the merging strategy of existing snapshots
* The functionality is enabled by the **--bor.waypoints=true** flag; it is
false by default.
# Testing
This has been tested as follows:
* Run a Mumbai instance to the tip and check that milestones and
checkpoints are processed correctly
# Post merge steps
* Produce and release snapshots for mumbai and bor mainnet
* Check existing node upgrades
* Remove --bor.waypoints flags
In Go 1.21, the version currently used in the project, the `slices`
package is no longer an experimental feature and has entered the standard library.
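For example, code that previously imported `golang.org/x/exp/slices` can now use the standard library package directly:

```go
package main

import (
	"fmt"
	"slices" // standard library as of Go 1.21
)

func main() {
	ids := []int{3, 1, 2}
	slices.Sort(ids)
	fmt.Println(slices.Contains(ids, 2), ids) // true [1 2 3]
}
```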
Co-authored-by: alex.sharov <AskAlexSharov@gmail.com>
### Change ###
Adds a `disableBlockDownload` boolean flag to the current implementation of
the sentry multi client to disable its built-in header and body download
functionality.
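A minimal sketch of the idea; only the `disableBlockDownload` name comes from this change, while the surrounding type and handler are illustrative stand-ins, not the real multi client code:

```go
package main

import "fmt"

// multiClient is a simplified stand-in for the sentry multi client.
type multiClient struct {
	disableBlockDownload bool
}

// onBlockHeaders sketches how an inbound headers message might be handled.
func (cs *multiClient) onBlockHeaders(payload []byte) {
	if cs.disableBlockDownload {
		// Astrid runs its own header/body download, so the built-in
		// HeaderDownload/BodyDownload machinery is bypassed entirely.
		return
	}
	fmt.Printf("feeding %d bytes to the built-in header download\n", len(payload))
}

func main() {
	cs := &multiClient{disableBlockDownload: true}
	cs.onBlockHeaders([]byte{0x01, 0x02, 0x03})
}
```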
### Long Term ###
Long term we are planning to refactor sentry multi client and de-couple
it from custom header and body download logic.
### Context ###
Astrid uses its own body download logic which is de-coupled from sentry
multi client.
When both are used at the same time (using `--polygon.sync=true`) there
are 2 problematic scenarios:
- restarting Astrid takes a very long time due to the init logic of the
sentry multi client. It calls `HeaderDownload.RecoverFromDb`, which is
coupled to the Headers stage in the stage loop. So if Astrid has fetched
1 million headers but hasn't committed execution yet, start-up is very
slow since all 1 million headers have to be read from the DB. Example
logs:
```
[INFO] [04-16|12:55:42.254] [downloader] recover headers from db left=65536
...
[INFO] [04-16|13:03:42.254] [downloader] recover headers from db left=65536
```
- debug log messages warning that sentry consuming is slow, since
Astrid does not use `HeaderDownload` and `BodyDownload` so there is
nothing consuming the headers and bodies from these data structures.
This has no logical impact, however it wastes resources. Example logs:
```
[DBUG] [04-16|14:03:15.311] [sentry] consuming is slow, drop 50% of old messages msgID=BLOCK_HEADERS_66
[DBUG] [04-16|14:03:15.311] [sentry] consuming is slow, drop 50% of old messages msgID=BLOCK_HEADERS_66
```
**Problematic situation:** `runPeer` blocks on `rw.ReadMsg()`, however in
the meantime the peer gets penalised.
**Expected behaviour:** the peer gets disconnected and sentry generates a
Disconnect event.
**Actual behaviour:** no disconnect event gets generated; the peer is
stuck in `rw.ReadMsg()`.
**Fix:** call `pi.peer.Disconnect(reason)` as part of
`peerInfo.Remove(reason)` during `Penalize`.
1. `Disconnect` sends a disc reason to the `p.disc` channel
2. the `p.disc` channel is read in `Peer.run` -
https://github.com/ledgerwatch/erigon/blob/devel/p2p/peer.go#L279
3. this causes the function to exit and, in its defer, close the
`p.closed` channel
4. the `p.closed` channel is used as the closing channel
(`protoRW.closed`) in both `ReadMsg` and `WriteMsg`, so once it is closed
those functions exit
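A self-contained sketch of this shutdown chain; the names loosely mirror the p2p code, but the implementation below is illustrative only:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type peer struct {
	disc   chan string   // step 1: Disconnect pushes a reason here
	closed chan struct{} // step 3: closed when run exits
	in     chan []byte
}

func (p *peer) Disconnect(reason string) { p.disc <- reason }

func (p *peer) run() {
	defer close(p.closed) // unblocks pending ReadMsg/WriteMsg (step 4)
	reason := <-p.disc    // step 2: wait for a disconnect reason
	fmt.Println("disconnecting:", reason)
}

// ReadMsg blocks until a message arrives or the peer is closed.
func (p *peer) ReadMsg() ([]byte, error) {
	select {
	case msg := <-p.in:
		return msg, nil
	case <-p.closed:
		return nil, errors.New("peer closed")
	}
}

func main() {
	p := &peer{disc: make(chan string, 1), closed: make(chan struct{}), in: make(chan []byte)}
	go p.run()
	go func() {
		time.Sleep(10 * time.Millisecond)
		p.Disconnect("penalized") // e.g. called from Penalize via peerInfo.Remove
	}()
	_, err := p.ReadMsg() // no longer stuck: returns once p.closed is closed
	fmt.Println(err)
}
```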
P2P fails on restart because `rawdb.ReadCurrentHeader` returns a nil
header. It looks like `ReadHeadHeaderHash` fails to find the current
header hash. However the correct hash is returned by `ReadHeadBlockHash`.
Let's use `ReadHeadBlockHash`, because the status needs to report a header for which we have a full block body.
This salt is used for creating RecSplit indices. All nodes will have a
different salt, but one node will use the same salt for all files. This
allows the value of `murmur3(key)` to be computed once and used to read
from all files (important for non-existing keys, which require checking
all indices).
- add `snapshots/salt-blocks.txt`
- this PR doesn't require re-indexing
- it's step 1; in future releases we will add a data_migration script which
will "check if all indices use the same salt or re-index" (but at that time
most users will not be affected)
- new indices will use `salt-blocks.txt`
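A hedged sketch of the per-node salt idea; the actual format of `snapshots/salt-blocks.txt` and the way erigon wires the value into RecSplit are not shown here, and the 4-byte layout below is an assumption:

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
	"os"
	"path/filepath"
)

// readOrCreateSalt generates a random per-node salt once and reuses it for
// every index built on this node, so a single seeded murmur3(key) value can
// be checked against all files.
func readOrCreateSalt(snapDir string) (uint32, error) {
	path := filepath.Join(snapDir, "salt-blocks.txt")
	if data, err := os.ReadFile(path); err == nil && len(data) >= 4 {
		return binary.BigEndian.Uint32(data), nil
	}
	var buf [4]byte
	if _, err := rand.Read(buf[:]); err != nil {
		return 0, err
	}
	if err := os.WriteFile(path, buf[:], 0o644); err != nil {
		return 0, err
	}
	return binary.BigEndian.Uint32(buf[:]), nil
}

func main() {
	salt, err := readOrCreateSalt(os.TempDir())
	if err != nil {
		panic(err)
	}
	fmt.Printf("node salt: %d (reused for every index on this node)\n", salt)
}
```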
The responsibility to maintain the status data is moved from the
stageloop Hook and MultiClient to the new StatusDataProvider. It reads
the latest data from a RoDB when asked. That happens at the end of each
stage loop iteration, and sometimes when any sentry stream loop
reconnects a sentry client.
`sync.Service` and `MultiClient` now require an instance of the
StatusDataProvider. The MessageListener is updated to depend on an
external statusDataFactory.
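A rough sketch of the resulting shape; all type and method names below are illustrative stand-ins for the real StatusDataProvider and statusDataFactory:

```go
package main

import (
	"context"
	"fmt"
)

// StatusData is a placeholder for the head/fork data sentry needs to report.
type StatusData struct {
	HeadHash   [32]byte
	HeadHeight uint64
}

// StatusDataProvider reads the latest status data from a read-only DB when
// asked, e.g. at the end of a stage loop iteration or when a sentry stream
// loop reconnects a sentry client.
type StatusDataProvider interface {
	GetStatusData(ctx context.Context) (StatusData, error)
}

// statusDataFactory is the function shape a message listener could depend on.
type statusDataFactory func(ctx context.Context) (StatusData, error)

type roDBProvider struct{}

func (roDBProvider) GetStatusData(context.Context) (StatusData, error) {
	// The real implementation would open a read-only transaction and read
	// the current head; this just returns a placeholder.
	return StatusData{HeadHeight: 42}, nil
}

func main() {
	var provider StatusDataProvider = roDBProvider{}
	factory := statusDataFactory(provider.GetStatusData)
	s, _ := factory(context.Background())
	fmt.Println("head height:", s.HeadHeight)
}
```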
This PR contains a couple of changes related to bor snapshots.
It's bigger than intended as I used it to produce patch bor snapshots,
and the changes are difficult to untangle, so I want to merge them as
a set.
1. It has some downloader changes which add the following features:
- Added snapshot-lock.json, which contains a list of the files/hashes
downloaded and can be used to manage local state (see the sketch after
this list)
- Removed the version flag and added it to the snapshot type - it has been
used for testing v2 downloads but is set to v1 for this PR (see below for
details)
- Manage the state of downloads in the download db - this optimises
metadata look-ups on restart during/after download. For mumbai, retrieving
torrent info can take up to 15 minutes even after the download is completed.
2. It has a rationalization of the snapshot processing code to remove
duplicate code between snapshot types and standardize the interfaces to
extract blocks (Dump...) and Index blocks.
- This enables the removal of a separate BorSnapshot and probably
CaplinSnapshot type as the base snapshot code can handle the addition of
new snapshot types.
- Simplifies the addition of new snapshot types (I want to add bor
checkpoints and bor milestones)
- Removes the double iteration from retire blocks
- Aids the insertion of bor validation code on indexing, as the common
insertion point is now well defined.
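A hypothetical illustration of the snapshot-lock.json idea from point 1; the real file produced by the downloader may be laid out differently:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// snapshotLock sketches a "chain plus downloaded file/hash list" layout that
// could be used to manage local state across restarts.
type snapshotLock struct {
	Chain     string            `json:"chain"`
	Downloads map[string]string `json:"downloads"` // file name -> downloaded hash
}

func main() {
	lock := snapshotLock{
		Chain: "mumbai",
		Downloads: map[string]string{
			"v1-000000-000500-headers.seg": "<hash>",
			"v1-000000-000500-bodies.seg":  "<hash>",
		},
	}
	out, _ := json.MarshalIndent(lock, "", "  ")
	fmt.Println(string(out))
}
```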
I have tested these changes by syncing mumbai from scratch and by using
it to produce a bor-mainnet patch - which starts sync in the middle
of the chain by downloading a previously existing snapshot segment.
I have identified the following issues that I think need to be resolved
before we can use v2 .segs for polygon:
1. For both mumbai and mainnet, downloads are very slow. This looks
like it's because the lack of peers means that we're hitting the web server
with many small requests for pieces, which I think the server interprets
as some form of DoS and stops giving us data.
2. Because of the lack of torrents we can't get metadata, and thus don't
start downloading, even if a web peer is available.
I'll look to resolve these in the next week or so at which point I can
update the .toml files to include v2 and retest a sync from scratch.
If any DB method is called while Close() is waiting for db.kv.Close()
(it waits for ongoing method calls/transactions to finish)
a panic: "WaitGroup is reused before previous Wait has returned" might
happen.
Use context cancellation to ensure that new method calls immediately
return during db.kv.Close().
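A minimal sketch of the pattern (not the actual mdbx wrapper code): cancel a context before waiting, and take the WaitGroup counter only while the context is still live, so late callers return immediately instead of racing Close.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

type db struct {
	ctx    context.Context
	cancel context.CancelFunc
	mu     sync.Mutex
	wg     sync.WaitGroup
}

func newDB() *db {
	ctx, cancel := context.WithCancel(context.Background())
	return &db{ctx: ctx, cancel: cancel}
}

// begin registers an in-flight call, or fails fast once Close has started.
func (d *db) begin() error {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.ctx.Err() != nil {
		return errors.New("db closed")
	}
	d.wg.Add(1) // safe: Close cancels under the same mutex before it Waits
	return nil
}

func (d *db) View(fn func(ctx context.Context) error) error {
	if err := d.begin(); err != nil {
		return err
	}
	defer d.wg.Done()
	return fn(d.ctx)
}

func (d *db) Close() {
	d.mu.Lock()
	d.cancel() // new method calls now return immediately
	d.mu.Unlock()
	d.wg.Wait() // wait only for calls that were already in flight
}

func main() {
	d := newDB()
	_ = d.View(func(context.Context) error { return nil })
	d.Close()
	fmt.Println(d.View(func(context.Context) error { return nil })) // db closed
}
```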
Mdbx now takes a logger, but this has not been pushed to all callers,
meaning mdbx could end up with an invalid logger.
This fixes the log propagation.
It also fixes a start-up issue for `http.enabled` and `txpool.disable`
created by a previous merge.
This change introduces additional processes to manage snapshot uploading
for E2 snapshots:
## erigon snapshots upload
The `snapshots uploader` command starts a version of erigon customized
for uploading snapshot files to
a remote location.
It breaks the stage execution process after the senders stage and then
uses the snapshot stage to send headers, bodies and (in the case of
polygon) bor spans and events to snapshot files. Because this process
avoids execution, it runs significantly faster than a standard erigon
configuration.
The uploader uses rclone to send seedable files (100K or 500K blocks) to a
remote storage location specified
in the rclone config file.
The **uploader** is configured to minimize disk usage by doing the
following:
* It removes snapshots once they are loaded
* It aggressively prunes the database once entities are transferred to
snapshots
In addition to this, it has the following performance-related features:
* Maximizes the workers allocated to snapshot processing to improve
throughput
* Can be started from scratch by downloading the latest snapshots from
the remote location to seed processing
## snapshots command
This is a standalone command for managing remote snapshots. It has the
following sub-commands:
* **cmp** - compare snapshots
* **copy** - copy snapshots
* **verify** - verify snapshots
* **manifest** - manage the manifest file in the root of remote snapshot
locations
* **torrent** - manage snapshot torrent files
This adds a simulator object which implements the SentryServer API but
takes objects from a pre-existing snapshot file.
If the snapshot is not available locally it will download and index the
.seg file for the header range being asked for.
It is created as follows:
```go
sim, err := simulator.NewSentry(ctx, "mumbai", dataDir, 1, logger)
```
Where the arguments are:
* ctx - a cancellable context; cancel will close the simulator's torrent
and file connections (it also has a Close method)
* chain - the name of the chain to take the snapshots from
* datadir - a directory potentially containing snapshot .seg files. If
no files exist in this directory they will be downloaded
* num peers - the number of peers the simulator should create
* logger - the logger to log actions to
It can be attached to a client as follows:
```go
simClient := direct.NewSentryClientDirect(66, sim)
```
At the moment only very basic functionality is implemented:
* get headers will return headers by range or hash (hash assumes a
pre-downloaded .seg as it needs an index)
* the header replay semantics need to be confirmed
* eth 65 and 66(+) messaging is supported
* For details see: `simulator_test.go`
More advanced peer behavior (e.g. header rewriting) can be added
Bodies/Transactions handling can be added
The current value of 16 was added by me a year ago and didn't mean
anything. I've never seen this field holding much data, so it can
probably be increased.
Currently I see logs like this (and 10x like it):
[DBUG] [11-24|06:59:38.353] slow peer or too many requests, dropping its old requests name=erigon/v2.54.0-aeec5...
# Background
Erigon currently uses a combination of Victoria Metrics and Prometheus
client for providing metrics.
We want to rationalize this and use only the Prometheus client library,
but we want to maintain the simplified Victoria Metrics methods for
constructing metrics.
This task is currently partly complete and needs to be finished to a
stage where we can remove the Victoria Metrics module from the Erigon
code base.
# Summary of changes
- Adds missing `NewCounter`, `NewSummary`, `NewHistogram`,
`GetOrCreateHistogram` functions to `erigon-lib/metrics` similar to the
interface VictoriaMetrics lib provides
- Minor tidy up for consistency inside `erigon-lib/metrics/set.go`
around return types (panic vs err consistency for funcs inside the
file), error messages, comments
- Replace all remaining usages of `github.com/VictoriaMetrics/metrics`
with `github.com/ledgerwatch/erigon-lib/metrics` - seamless (only import
changes) since interfaces match
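A hedged usage sketch of the added constructors; the calling convention is assumed to mirror the VictoriaMetrics style described above, the metric names are made up, and the exact method set on the returned types may differ in `erigon-lib/metrics`:

```go
package main

import "github.com/ledgerwatch/erigon-lib/metrics"

// Metric names here are purely illustrative.
var (
	headersProcessed = metrics.NewCounter(`headers_processed`)
	stageDuration    = metrics.NewHistogram(`stage_duration_seconds`)
)

func main() {
	// Counters follow the usual Inc/Add style.
	headersProcessed.Inc()
	_ = stageDuration // histogram updates omitted; see the package for its observe/update API
}
```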
Making the addReplyMatcher channel unbuffered sometimes makes the loop
too slow to serve parallel requests.
This is an alternative fix that keeps the channel buffered.
Problem:
Some goroutines are blocked on shutdown:
1. table close <-tab.closed // because table loop pending
1. table loop <-refreshDone // because lookup shutdown blocks doRefresh
1. lookup shutdown <-it.replyCh // because it.queryfunc (findnode -
ensureBond) is blocked, and not returning errClosed (if it returns and
pushes to it.replyCh, then shutdown() will unblock)
1. findnode - ensureBond <-rm.errc // because the related replyMatcher
was added after loop() exited, so there's nothing to push errClosed and
unlock it
If the addReplyMatcher channel is buffered, it is possible that
UDPv4.pending() adds a new reply matcher after closeCtx.Done().
Such a reply matcher's errc result channel will never be updated, because
UDPv4.loop() has exited at this point. Subsequent discovery
operations will deadlock.
Solution:
Revert to an unbuffered channel.
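A self-contained sketch of why the unbuffered send avoids the stuck matcher; the names loosely follow the discovery code, and the real loop does much more:

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

type replyMatcher struct{ errc chan error }

type udp struct {
	closeCtx        context.Context
	addReplyMatcher chan *replyMatcher // unbuffered on purpose
}

// pending either hands the matcher to the loop or, if shutdown has begun,
// answers it with errClosed itself - it can never park a matcher that the
// exited loop will no longer see.
func (t *udp) pending() *replyMatcher {
	m := &replyMatcher{errc: make(chan error, 1)}
	select {
	case t.addReplyMatcher <- m:
		// the loop goroutine owns m now and will push a result to m.errc
	case <-t.closeCtx.Done():
		m.errc <- errors.New("errClosed")
	}
	return m
}

func (t *udp) loop(cancel context.CancelFunc) {
	defer cancel() // after this, no new matcher can be parked unanswered
	m := <-t.addReplyMatcher
	m.errc <- nil // normally set when a reply arrives or a timeout fires
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	t := &udp{closeCtx: ctx, addReplyMatcher: make(chan *replyMatcher)}
	go t.loop(cancel)
	fmt.Println(<-t.pending().errc) // answered by the loop: <nil>
	fmt.Println(<-t.pending().errc) // loop exited: errClosed instead of a deadlock
}
```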
This fixes an issue where mumbai testnet nodes struggle to find
peers. Before this fix, test peer numbers are typically around
20 in total between eth66, eth67 and eth68, and some new nodes can
struggle to find even a single peer after days of operation.
These are the numbers after 12 hours of running on a node which
previously could not find any peers: eth66=13, eth67=76, eth68=91.
The root cause of this issue is the following:
- A significant number of mumbai peers around the boot node return
network ids which are different from those currently available in the
DHT
- The available nodes are all consequently busy and return 'too many
peers' for long periods
These issues cause a significant number of discovery timeouts; some of
the queries will never receive a response.
This causes the discovery read loop to enter a channel deadlock, which
means that no responses are processed and no timeouts are fired. This
causes the discovery process in the node to stop. From then on it just
re-requests handshakes from a relatively small number of peers.
This check-in fixes the situation with the following changes:
- Remove the deadlock by running the timer in a separate go-routine so
it can run independently of the main request processing (see the sketch
after this list).
- Allow the discovery process matcher to match on port if no id match
can be established on the initial ping. This allows subsequent node
validation to proceed, and if the node proves to be valid via the
remainder of the look-up and handshake process it is used as a valid
peer.
- Completely unsolicited responses, i.e. those which come from a
completely unknown ip:port combination continue to be ignored.
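A hedged sketch of the first change, with per-request timers firing from their own goroutines so timeouts no longer depend on the main processing loop being free; all names are illustrative:

```go
package main

import (
	"fmt"
	"time"
)

type timeoutEvent struct{ id uint64 }

// runTimeouts arms an independent timer per pending request; firing never
// depends on the reply-processing loop being able to rearm a shared timer.
func runTimeouts(pending <-chan uint64, timeouts chan<- timeoutEvent, d time.Duration) {
	for id := range pending {
		id := id
		time.AfterFunc(d, func() { timeouts <- timeoutEvent{id: id} })
	}
}

func main() {
	pending := make(chan uint64, 8)
	timeouts := make(chan timeoutEvent, 8)
	go runTimeouts(pending, timeouts, 50*time.Millisecond)

	pending <- 1
	pending <- 2
	for i := 0; i < 2; i++ {
		ev := <-timeouts
		fmt.Println("request timed out:", ev.id)
	}
}
```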
* call getEnode before NodeStarted to make sure it is ready for RPC
calls
* fix connection error detection on macOS
* use a non-default p2p port to avoid conflicts
* disable bor milestones on local heimdall
* generate node keys for static peers config
Problem:
"Started P2P networking" log message contains port zero on startup,
e.g.: 127.0.0.1:0 because of the outdated localnodeAddrCache.
Solution:
Call updateLocalNodeStaticAddrCache after updating the port.