The format is based on Keep a Changelog, and this project adheres to Semantic Versioning. See MAINTAINERS.md for instructions on keeping it up to date.
- Fix a panic (nil pointer) when skipping blocks via indexes on stores on tier2
- This new endpoint removes the need for complex "mangling" of the package on the client side.
- Instead of expecting `sf.substreams.v1.Modules` (with the client having to apply parameters, network, etc.), the `sf.substreams.rpc.v3.Request` now expects:
  - a `sf.substreams.v1.Package`
  - a `map<string, string>` of `params`
  - the `network` string

  which will all be applied to the package server-side.
- It returns the same object as the v2 endpoint, i.e. a stream of `sf.substreams.rpc.v2.Response`.
- It is added on top of the existing 'v2' endpoint, both being active at the same time.
- To enable it, operators simply need to ensure that their routing allows the `/sf.substreams.rpc.v3.Stream/*` path.
- Cached spkg files on the server will now contain protobuf definitions, simplifying debugging of user requests.
- Emitted metrics for requests can now be `sf.substreams.rpc.v3/Blocks` instead of always `sf.substreams.rpc.v2/Blocks`; make sure that your metering endpoint can support it.
Note: recent substreams clients will support both endpoints, first trying the v3 and automatically falling back to v2 if they hit a "404 Not Found" or "Not Implemented" error.
- Fixed a bug with BlockFilter: a skipped module would send BlockScopedData (in dev mode or near HEAD, to follow progress) with an empty module name, breaking some sinks. The module name was present when requesting a module dependent on that skipped module; now the module name is always included.
- Bumped to firehose-core v1.11.3
- Improved the panic message when the reader node encounters a block whose finality is higher than the block itself, to include `lib_num`, `block_num`, `distance`, and `max_distance` for easier debugging.
- Updated the `firehose-networks` dependency to `v0.2.2` (latest).
- Fixed the `common-one-block-store-url` flag not expanding environment variables in all apps.
- Updated Wasmtime runtime from v30.0.0 to v36.0.0, bringing performance improvements, inlining support, Component Model async implementation, and enhanced security features.
- Added WASM bindgen shims support for the Wasmtime runtime to handle WASM modules with wasm-bindgen imports (when the Substreams module binary is defined as type `wasm/rust-v1+wasm-bindgen-shims`).
- Added support for foundational-store (in wasmtime and wazero).
- Added foundational-store grpc client to substreams engine.
- Fixed module caching to properly handle modules with different runtime extensions.
- The 'paymentgateway' metering plugin was renamed to `tgm`; it now supports the `indexer-api-key` parameter.
- Concurrent streams and worker limits are now handled by the new session plugin, available under the `common-session-plugin` argument.
- The following flags were removed, now handled by that session plugin:
  - `substreams-tier1-global-worker-pool-address`
  - `substreams-tier1-global-request-pool-address`
  - `substreams-tier1-global-worker-pool-keep-alive-delay`
  - `substreams-tier1-global-request-pool-keep-alive-delay`
  - `substreams-tier1-default-max-request-per-user`
  - `substreams-tier1-default-minimal-request-life-time-second`
- To use thegraph.market as a session plugin, use `--common-session-plugin=tgm://session.thegraph.market:443?indexer-api-key={your-api-key}` (requires a specific indexer API key). See https://github.com/streamingfast/tgm-gateway/tree/develop/session for details on the various flags.
- To use simple local session management, use `--common-session-plugin=local://?max_sessions=30&max_sessions_per_user=3&max_workers_per_user=10&max_workers_per_session=10`. See https://github.com/streamingfast/dsession/tree/main/local for details on those flags.
- Note: the `max_sessions` parameter from the `common-session-plugin` is now also used to limit the number of firehose streams.
- If you were using a custom gRPC implementation for `--substreams-tier1-global-worker-pool-address` and `--substreams-tier1-global-request-pool-address` (ex: `localhost:9010`), simply use this value for the session plugin: `--common-session-plugin=tgm://localhost:9010?plaintext=true`; it is compatible.
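As a concrete sketch, the tgm session-plugin DSN can be assembled from its parts before being passed to the server. The API key value below is a placeholder; only the `--common-session-plugin` flag and the `tgm://` URL shape come from these notes:

```shell
# Assemble the session-plugin DSN (API key is a placeholder, not a real credential)
API_KEY="your-api-key"
SESSION_PLUGIN="tgm://session.thegraph.market:443?indexer-api-key=${API_KEY}"

# The flag as it would be passed to the tier1 server:
echo "--common-session-plugin=${SESSION_PLUGIN}"
```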
- Fix a slow memory leak around metering plugin on tier2
- Add a maximum execution time for a full tier2 segment. By default, this is 60 minutes. It will fail with `rpc error: code = DeadlineExceeded desc = request active for too long`. It can be configured with the `--substreams-tier2-segment-execution-timeout` flag.
- Fix `subscription channel at max capacity` error: when the LIVE channel is full (ex: slow module execution or slow client reader), the request will be continued from merged files instead of failing, and will gracefully recover if performance is restored.
- Improve the log message for 'request active for a long time', adding stats.
- Fix thread leak on filereader (tier1)
People using their own authentication layer will need to consider these changes before upgrading!
- Renamed config headers that come from the authentication layer:
  - `x-sf-user-id` renamed to `x-user-id` (from dauth module)
  - `x-sf-api-key-id` renamed to `x-api-key-id` (from dauth module)
  - `x-sf-meta` renamed to `x-meta` (from dauth module)
  - `x-sf-substreams-parallel-jobs` renamed to `x-substreams-parallel-workers`
- Allow decreasing `x-substreams-parallel-workers` through an HTTP header (the auth layer determines the upper bound).
- Detect the value for the 'stage layer parallel executor max count' based on the `x-plan-tier` header (removed `x-sf-substreams-stage-layer-parallel-executor-max-count` handling).
- Added the `tgm://auth.thegraph.market?indexer-api-key=<API_KEY>&reissue-jwt-max-age-secs=600` plugin, which allows an indexer to use The Graph Market as the authentication source. An API key with the special "indexer" feature is needed to allow repeated calls to the API without rate limiting (for key-based authentication and reissuance of "untrusted long-lived JWTs").
- Added mechanism to immediately cancel pending requests that are doing an 'external call' (ex: eth_call) on a given block when it gets forked out (UNDO because of a reorg).
- Fixed handling of invalid module kind: prevent heavy logging from recovered panic
- Errors considered deterministic (which are cached forever) are now suffixed: `<original message> (deterministic error)`.
(removed release with wrong substreams version)
- fix: eth_calls returning rpc error code -32003 (InvalidFEOpcode) will not retry forever
- [OPERATORS] Tier2 servers must be upgraded BEFORE tier1 servers
- tier2 servers will now stream outputs for the 'first segment', to speed up time to first block
- Return 'processed blocks' counter to client at the end of the request
- Progress notifications will only be sent every 500ms for the first minute, then reduce rate up to every 5 seconds (can be overridden per request)
- Added `dev_output_modules` to the protobuf request (if present, in dev mode, only send the output of the modules listed)
- Added `progress_messages_interval_ms` to the protobuf request (if present, overrides the rate of progress messages to that many milliseconds)
[Broken release, do not use]
- This release is a hotfix for a thread leak in substreams leading to a slow memory leak.
Rework the execout File read/write to improve memory efficiency:
- This reduces the RAM usage necessary to read and stream data to the user on tier1, as well as to read the existing execouts on tier2 jobs (in multi-stage scenarios)
- The cached execouts need to be rewritten to take advantage of this, since their data is currently not ordered: the system will automatically load and rewrite existing execouts when they are used.
- Code changes include:
  - New FileReader / FileWriter that "read as you go" or "write as you go"
  - No more 'KV' map attached to the File
  - Split the IndexWriter away from its dependencies on execoutMappers
  - The clock distributor now also reads "as you go", using a small "one-block-cache"
- Removed the `SUBSTREAMS_OUTPUT_SIZE_LIMIT_PER_SEGMENT` env var (since this is not a RAM issue anymore)
- Added the `uncompressed_egress_bytes` field to the `substreams request stats` log message
- add `--headers` flag to `fireeth tools poller` to allow authenticated calls to ETH_RPC providers
- add `--allow-empty-receipts-on-block-0` bool flag to work with tron-evm-mainnet
- add `--parallel-workers` int flag to allow increasing from the default (which is now 20 instead of 10)
- (dstore) Add storageClass query parameter for s3:// urls on stores (@fschoell)
- Update the firehose-beacon proto to include the new Electra spec in the 'well-known' protobuf definitions (@fschoell)
- Use The Graph's Network Registry to recognize chains by genesis blocks and fill the 'advertise' server on substreams/firehose
- Tier2 jobs now write mapper outputs "as they progress", preventing memory usage spikes when saving them to disk.
- Tier2 jobs now limit writing and loading mapper output files to a maximum size of 8GiB by default.
- Added the `SUBSTREAMS_OUTPUT_SIZE_LIMIT_PER_SEGMENT` (int) environment variable to control this new limit.
- Added the `SUBSTREAMS_STORE_SIZE_LIMIT` (uint64) env var to allow overriding the default 1GiB value.
- Added the `SUBSTREAMS_PRINT_STACK` (bool) env var to enable printing full stack traces when a caught panic occurs.
- Added the `SUBSTREAMS_DEBUG_API_ADDR` (string) environment variable to expose a "debug API" HTTP interface that allows blocking connections, running GC, and listing or canceling active requests.
- Prevent a deterministic failure on a module definition (mode, valueType, updatePolicy) from persisting when the issue is fixed in the substreams.yaml (streamingfast/substreams#621)
- Metering events on tier2 now bundled at the end of the job (prevents sending metering events for failing jobs)
- Added metering for: "processed_blocks" (block * number of stages where execution happened) and "egress_bytes"
- Substreams: properly classify eth_call errors as deterministic on erigon (`return data out of bounds` and `Reverted 0x.....`)
- Substreams: speed up DeleteByPrefix operations (5x perf improvement on some heavy substreams)
- Substreams: release `existingExecOuts` memory as blocks progress on a tier2 job
- Added missing `address` in the `SetCodeAuthorization` structure for proper recording of the EIP-7702 feature. This arrives in time for Mainnet, but Holesky, Sepolia, BSC Chapel, BSC Mainnet and Arbitrum Sepolia will need to be backfilled to fix the issue at a later time.
- (RAM+CPU) dedupe execution of modules with same hash but different name when computing dependency graph. (#619)
- (RAM) prevent memory usage burst on tier2 when writing mapper by streaming protobuf items to writer
- Tier1 requests will no longer error out with "service currently overloaded" because tier2 servers are ramping up
- Add `reader-node-firehose` app, which creates one-blocks by consuming blocks from an already existing Firehose endpoint. This can be used to set up an indexer stack without having to run an instrumented blockchain node, or to get redundancy from another Firehose provider.
- Bumped grpc-go lib to 1.72.0
- Now building `amd64` and `arm64` Docker images on push & release.
- Added support for Balance Change `REASON_REVERT`, needed by Optimism.
- Better documentation on versions of `Block` and known issues on version 3.
- Bump substreams to v1.15.2
- fix the 'quicksave' feature on substreams (incorrect block hash on quicksave)
- Save deterministic WASM failures in the module cache (under a file named `errors.0123456789.zst` at the failed block number), so further requests depending on this module at the same block can return the error immediately without re-executing the module.
- Fix `module_wasm_ext_duration` value in the 'substreams request stats' log (it was always 0 since switching to wasmtime).
- Fix a panic when a module times out on tier2 while being executed from cached outputs
- eth_call timeout logs now properly show 0x-prefixed values
- Add environment variables to control retry behavior: `SUBSTREAMS_WORKER_MAX_RETRIES` (default 10) and `SUBSTREAMS_WORKER_MAX_TIMEOUT_RETRIES` (default 2), changing from the previous defaults (720 and 3). The timeout-retries value is the number of retries specifically applied when block execution times out (ex: because of external calls).
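For reference, a minimal sketch of setting these variables before starting the server process; the values shown are simply the new defaults stated above:

```shell
# New defaults, set explicitly; export before launching the tier1/tier2 process
export SUBSTREAMS_WORKER_MAX_RETRIES=10          # total retries per worker job
export SUBSTREAMS_WORKER_MAX_TIMEOUT_RETRIES=2   # retries specific to block-execution timeouts
echo "${SUBSTREAMS_WORKER_MAX_RETRIES} ${SUBSTREAMS_WORKER_MAX_TIMEOUT_RETRIES}"
```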
- The mechanism to slow down processing segments "ahead of blocks being sent to user" has been disabled on "noop-mode" requests, since these requests are used to pre-cache data and should not be slowed down.
- The "number of segments ahead" in this mechanism has been increased from `number of parallel workers` to `number of parallel workers * 1.5`.
- Tier2 now returns gRPC error codes: `DeadlineExceeded` when it times out, and `ResourceExhausted` when a request is rejected due to overload.
- Tier1 now correctly reports tier2 job outcomes in the `substreams request stats`.
- Added jitter in "retry" logic to prevent all workers from retrying at the same time when tier2 servers are overloaded.
- Added RPC code `-32600` as a deterministic error; it happens when the JSON-RPC request itself is malformed.
- Fixed `runtime error: slice bounds out of range` error on heavy memory usage with the wasmtime engine.
- Added a validation on a module for the existence of 'triggering' inputs: the server will now fail with a clear error message when the only available inputs are stores used with mode 'get' (not 'deltas'), instead of silently skipping the module on every block.
- Fixed a bug where the tier1 would not catch the tier2 'module execution timeout' error; improved error messages related to timeouts during eth_call.
- Added a mechanism for 'production-mode' requests where the tier1 will not schedule tier2 jobs over { max_parallel_subrequests } segments above the current block being streamed to the user. This will ensure that a user slowly reading blocks 1, 2, 3... will not trigger a flood of tier2 jobs for higher blocks, let's say 300_000_000, that might never get read.
- Improved connection draining on shutdown: Now waits for the end of the 'shutdown-delay' before draining and refusing new connections, then waits for 'quicksaves' and successful signaling of clients, up to a max of 30 sec.
- Added information about the number of blocks that need to be processed for a given request in the `sf.substreams.rpc.v2.SessionInit` message.
- Added an optional field `limit_processed_blocks` to the `sf.substreams.rpc.v2.Request`. When set to a non-zero value, the server will reject a request that would process more blocks than the given value with the `FailedPrecondition` gRPC error code.
- Improved error messages when a module execution times out on a block (ex: due to a slow external call); now returns a `DeadlineExceeded` Connect/gRPC error code instead of an `Internal`. Removed 'panic' from the wording.
- In the 'substreams request stats' log, added fields: `remote_jobs_completed`, `remote_blocks_processed` and `total_uncompressed_read_bytes`.
- Fix another `cannot resolve 'old cursor' from files in passthrough mode -- not implemented` bug when receiving a request in production-mode with a cursor that is below the "linear handoff" block.
- Implement "QuickSave" feature to save the state of "live running" substreams stores when shutting down, and then resume processing from that point if the cursor matches.
- Added flag `substreams-tier1-quicksave-store` to enable quicksave when non-empty (requires `--common-system-shutdown-signal-delay` to be set to a long enough value to save the in-flight stores).
- Rust modules will now be executed with `wasmtime` by default instead of `wazero`.
  - Prevents the whole server from stalling in certain memory-intensive operations in wazero.
  - Speed improvement: cuts the execution time in half in some circumstances.
  - Wazero is still used for modules with `wbindgen` and modules compiled with `tinygo`.
  - Set env var `SUBSTREAMS_WASM_RUNTIME=wazero` to revert to the previous behavior.
The Ethereum block model has been updated to account for the upcoming Prague fork. Namely, we added support for the new SetCode transaction type, added the extracted SetCodeAuthorization elements from the transaction, and added new gas changes that were introduced in the hard fork.
Also, the totalDifficulty field is now deprecated; it has been removed entirely from the geth codebase, which means future reprocessing of data won't be able to populate that field anymore. If you used that field somehow, you should stop using it. At some point we will remove the field entirely.
Also, from Prague hard-fork and onward, the Block model will now switch to version 4 of the block model (a.k.a Firehose Ethereum Block 3.0).
This means that for a given network, block.number < Prague, block will be using version 3 (a.k.a Firehose Ethereum Block 2.3) and when
block.number >= Prague, it will be version 4. This is deterministic per network as the Prague block is deterministic.
This does not change the structure of the various elements; everything stays the same in that aspect, so the version 4 model is backward compatible. What the new version changes:
- Does not populate the `accountCreations` field anymore; this was bogus from day 1 and should never be used.
- Fixed the `executedCode` field to be more accurate: it is now set as soon as one opcode is executed, and not otherwise.
- The root call's `BeginOrdinal` is now fixed and not always 0.
- Ordinals in the presence of system calls are now correctly ordered.
- The `returnData` is now properly populated.
- The `keccakPreimage` data being "." is now fixed.
- The call's `input` field is now properly populated on contract creation; it was omitted before.
- There are new gas changes being recorded now, mainly for a full view of how gas is allocated, consumed and returned.
For the upcoming Prague hard forks (BNB, Holesky, Sepolia, Mainnet), you will start using the geth Firehose 3.0 version, so use our Firehose-enabled releases suffixed with `-fh3.0`.
This new Firehose 3.0 geth tracer is built on the new geth Core Tracing API introduced in Geth 1.14. This new version changes how
one must start the geth binary.
So for the Holesky hard fork, you will need to use https://github.com/streamingfast/go-ethereum/releases/tag/geth-v1.15.2-fh3.0. Here is what you need to do when you update your reader-node's `reader-node-arguments` field:
- Remove `--firehose-enabled` and any flag starting with `--firehose-...`.
- Add the `--vmtrace=firehose` flag, which activates Firehose output (Important: do not miss this change, otherwise you will not process new blocks; we will make it the default soon).
- Add the `--syncmode=full` flag, which is not set automatically anymore.
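Putting those changes together, a `reader-node-arguments` value might look like the following sketch; the network selector and data directory are placeholders, not part of these release notes — only `--syncmode=full` and `--vmtrace=firehose` come from the instructions above:

```
--holesky --datadir=/data/geth --syncmode=full --vmtrace=firehose
```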
- Integrated the `GlobalRequestPool` service in the `Tier1App` to manage global request pooling.
- Integrated the `GlobalWorkerPool` service in the `Tier1App` to manage global worker pooling.
- Added flag `substreams-tier1-global-worker-pool-address`: the address of the global worker pool to use for the substreams tier1 (disabled if empty).
- Added flag `substreams-tier1-global-worker-pool-keep-alive-delay`: delay between two keep-alive calls to the global worker pool (default is 25s).
- Added flag `substreams-tier1-global-request-pool-keep-alive-delay`: delay between two keep-alive calls to the global worker pool for requests (default is 25s).
- Added flag `substreams-tier1-default-max-request-per-user`: default max requests per user, used if the global worker pool is not reachable (default is 5).
- Added flag `substreams-tier1-default-minimal-request-life-time-second`: default minimal request lifetime, used if the global worker pool is not reachable (default is 180).
- Limit parallel execution of a stage's layer: previously, the engine executed all modules in a stage's layer in parallel. We now change that behavior: development mode will from now on execute everything sequentially, and production mode will limit parallelism to 2 (hard-coded) for now. The auth plugin can control that value dynamically by providing a trusted header `X-Sf-Substreams-Stage-Layer-Parallel-Executor-Max-Count`.
- Fixed a regression since "v1.7.3" where the SkipEmptyOutput instruction was ignored in substreams mappers
- Add shared cache for tier1 execution near HEAD, to prevent multiple tier1 instances from reprocessing the same module on the same block when it comes in (ex: foundational modules)
- Improved fetching of state caches on tier1 requests to speed up "time to first data"
- make 'compare-blocks' command support one-blocks stores as well as merged-blocks
- Bump `substreams` lib to `v1.12.3`:
  - Improved logging of requests beginning/end
  - Improved `noop` mode (now sends less data)
- Fixed `fireeth tools geth enforce-peers --once` shorthand flag registration colliding with `fireeth tools -o` (for `--output`). This means the `fireeth tools geth enforce-peers` command does not accept `-o` anymore for "once"; if you were using it, replace it with `--once`.
- Fixed `substreams-tier2` not setting itself ready correctly on startup since `v2.9.0`.
- Added support for `--output=bytes` mode, which prints the chain-specific Protobuf block as bytes; the encoding of the printed bytes string is determined by `--bytes-encoding` (`hex` by default).
- Added back `-o` as shorthand for `--output` in `firecore tools ...` sub-commands.
- Add back `grpc.health.v1.Health` service to `firehose` and `substreams-tier1` services (regression in 2.9.0)
- Give precedence to the tracing header `X-Cloud-Trace-Context` over `Traceparent` to prevent user systems' trace IDs from leaking past a GCP load-balancer
- Reader Node Manager HTTP API now accepts `POST http://localhost:10011/v1/restart<?sync=true>` to restart the underlying reader node binary sub-process. This is an alias for `/v1/reload`.
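A hypothetical invocation sketch; the `curl` call itself is illustrative, while the route, port and `sync` parameter come from the note above:

```shell
# Build the restart URL; sync=true makes the call wait for the restart to complete
MANAGER="http://localhost:10011"
URL="${MANAGER}/v1/restart?sync=true"
echo "$URL"
# would then be invoked as: curl -X POST "$URL"
```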
- Enhanced `fireeth tools print merged-blocks` with various small quality-of-life improvements:
  - Now accepts a block range instead of a single start block.
  - Passing a single block as the block range will print this single block alone.
  - The block range is now optional, defaulting to run until there are no more files to read.
  - It's possible to pass a merged-blocks file directly, with or without an optional range.
Important
This release will reject firehose connections from clients that don't support GZIP or ZSTD compression. Use `--firehose-enforce-compression=false` to keep the previous behavior, then check the logs for incoming Substreams Blocks request logs with the value `compressed: false` to track users who are not using compressed HTTP connections.
Important
This release removes the old `sf.firehose.v1` protocol (replaced by `sf.firehose.v2` in 2022; this should not affect any reasonably recent client).
- Add support for ConnectWeb firehose requests.
- Always use gzip compression on firehose requests for clients that support it (instead of always answering with the same compression as the request).
- The `substreams-tier1` app now has two new configuration flags, named respectively `substreams-tier1-active-requests-soft-limit` and `substreams-tier1-active-requests-hard-limit`, helping better load balance active requests across a pool of `tier1` instances.

  The `substreams-tier1-active-requests-soft-limit` limits the number of active client requests that a tier1 accepts before starting to report itself as 'unready' within the health check endpoint. A limit of 0 or less means no limit. This is useful to load balance active requests more easily across a pool of tier1 instances: when an instance reaches the soft limit, it becomes unready from the load balancer's standpoint. The load balancer in return removes it from the list of available instances, and new connections are routed to the remaining instances, spreading the load.

  The `substreams-tier1-active-requests-hard-limit` limits the number of active client requests that a tier1 accepts before rejecting incoming gRPC requests with the 'Unavailable' code and setting itself as unready. A limit of 0 or less means no limit. This is useful to prevent the tier1 from being overwhelmed by too many requests; most clients auto-reconnect on the 'Unavailable' code, so they should end up on another tier1 instance, assuming you have proper auto-scaling of the number of instances available.
- The `substreams-tier1` app now exposes a new Prometheus metric `substreams_tier1_rejected_request_counter` that tracks rejected requests. The counter is labelled by the gRPC/ConnectRPC returned code (`ok` and `canceled` are not considered rejected requests).
- The `substreams-tier2` app now exposes a new Prometheus metric `substreams_tier2_rejected_request_counter` that tracks rejected requests. The counter is labelled by the gRPC/ConnectRPC returned code (`ok` and `canceled` are not considered rejected requests).
- Properly accept and compress responses with `gzip` for browser HTTP clients using ConnectWeb with the `Accept-Encoding` header.
- Allow setting the subscription channel max capacity via the `SOURCE_CHAN_SIZE` env var (default: 100).
- Fix an issue preventing proper detection of gzip compression when multiple headers are set (ex: python grpc client)
- Fix an issue preventing some tier2 requests on last-stage from correctly generating stores. This could lead to some missing "backfilling" jobs and slower time to first block on reconnection.
- Fix a thread leak on cursor resolution resulting in bad counter for active connections
- Add support for zstd encoding on server
Note
This release will reject connections from clients that don't support GZIP compression. Use `--substreams-tier1-enforce-compression=false` to keep the previous behavior, then check the logs for incoming Substreams Blocks request logs with the value `compressed: false` to track users who are not using compressed HTTP connections.
- Fix broken `tools poller` command in v2.8.2
Warning
Do NOT use this version with `tools poller`: a flag issue prevents the poller from starting up. It is recommended that you upgrade to v2.8.3 ASAP.
Note
This release will reject connections from clients that don't support GZIP compression. Use `--substreams-tier1-enforce-compression=false` to keep the previous behavior, then check the logs for incoming Substreams Blocks request logs with the value `compressed: false` to track users who are not using compressed HTTP connections.
- Bump firehose-core to v1.6.8
- Substreams: add `--substreams-tier1-enforce-compression` to reject connections from clients that do not support GZIP compression
- Substreams performance: reduced the number of mallocs (patching some third-party libraries)
- Substreams performance: removed heavy tracing (that wasn't exposed to the client)
- Fixed `--reader-node-line-buffer-size` flag that was not being respected in the reader-node-stdin app
- poller: add `--max-block-fetch-duration`
- `firehose-grpc-listen-addr` and `substreams-tier1-grpc-listen-addr` flags now accept comma-separated addresses (allows listening as plaintext and snakeoil-ssl at the same time, or on specific IP addresses)
- rpc-poller: fix fetching the first block on an endpoint (was not following the cursor, failing unnecessarily on non-archive nodes)
- Adding `requests_hash`, which was added by EIP-7685
- Adding nil safety check on the `CombinedFilter` and when looping over the transaction_trace receipts
- Bump `substreams` and `dmetering` to latest version, adding the `outputModuleHash` to the metering sender.
Note: All caches for stores using the updatePolicy `set_sum` (added in substreams v1.7.0), and modules that depend on them, will need to be deleted, since they may contain bad data.
- Fix bad data in stores using the `set_sum` policy: squashing of store segments incorrectly "summed" some values that should have been "set" when the last event for a key on this segment was a "sum"
- Fix small bug making some requests in development-mode slow to start (when starting close to the module initialBlock with a store that doesn't start on a boundary)
- Fixed an(other) issue where multiple stores running on the same stage with different initialBlocks would fail to progress (and hang)
- Fix bug where some invalid cursors could be sent (with 'LIB' being above the block being sent) and add safeguard/logging if the bug appears again
- Fix panic in the whole tier2 process when stores go above the size limit while being read from "kvops" cached changes
- Fix "cannot resolve 'old cursor' from files in passthrough mode" error on some requests with an old cursor
- Fix handling of 'special case' substreams module with only "params" as its input: should not skip this execution (used in graph-node for head tracking)
-> Empty files in the module cache with hash `d3b1920483180cbcd2fd10abcabbee431146f4c8` should be deleted for consistency.
- [Operator] The flag `--advertise-block-id-encoding` now accepts shorter forms: `hex`, `base64`, etc. The older longer form `BLOCK_ID_ENCODING_HEX` is still supported, but we suggest using the shorter form from now on.
Note: Since a bug that affected substreams with "skipping blocks" was corrected in this release, any previously produced substreams cache should be considered possibly corrupted and eventually replaced.
- Substreams: fix bad handling of modules with multiple inputs when only one of them is filtered, resulting in bad outputs in production-mode.
- Substreams: fix stalling on some substreams with stores and mappers with different start block numbers on the same stage
- Substreams: fix 'development mode' and LIVE mode executing some modules that should be skipped
- Bump substreams to v1.10.0
- Bump firehose-core to v1.6.1
- Add `sf.firehose.v2.EndpointInfo/Info` service on Firehose and `sf.substreams.rpc.v2.EndpointInfo/Info` to Substreams endpoints. This involves the following new flags:
  - `advertise-chain-name`: canonical name of the chain according to https://thegraph.com/docs/en/developing/supported-networks/ (required, unless it is in the "well-known" list)
  - `advertise-chain-aliases`: alternate names for that chain (optional)
  - `advertise-block-features`: list of features describing the blocks (optional)
  - `ignore-advertise-validation`: runtime checks of chain name/features/encoding against the genesis block will no longer cause the server to wait or fail
- Add a well-known list of chains (hard-coded in `wellknown/chains.go`) to help automatically determine the 'advertise' flag values. Users are encouraged to propose Pull Requests to add more chains to the list.
- The new info endpoint adds a mandatory fetch of the first streamable block on startup, with a failure if no block can be fetched after 3 minutes when you are running the `firehose` or `substreams-tier1` service. It validates the following on a well-known chain:
  - If the first-streamable-block Num/ID matches the genesis block of a known chain, e.g. `matic`, it will refuse a value for `advertise-chain-name` other than `matic` or one of its aliases (`polygon`)
  - If the first-streamable-block does not match any known chain, it will require `advertise-chain-name` to be non-empty
- Substreams: add `--common-tmp-dir` flag to activate local caching of pre-compiled WASM modules through a wazero v1.8.0 feature (performance improvement on WASM compilation)
- Substreams: revert module hash calculation from `v2.6.5` when using a non-zero firstStreamableBlock. Hashes will now be the same even if the chain's first streamable block affects the initialBlock of a module.
- Substreams: add `--substreams-block-execution-timeout` flag (default 3 minutes) to prevent requests from stalling. Timeout errors are returned to the client, who can decide to retry.
- Bump substreams to v1.9.3: fix high CPU usage on tier1 caused by a bad error handling
- Bump substreams to v1.9.2: Prevent Noop handler from sending outputs with 'Stalled' step in cursor (which breaks substreams-sink-kv)
- Bump firehose-core to v1.5.6: add `--reader-node-line-buffer-size` flag and bump the default value from 100M to 200M to get over the huge block 278208000 on Solana
- Fixed a bug in the blockfetcher which could cause transaction receipts to be nil
- Fixed a bug in substreams where chains with non-zero first-streamable-block would cause some substreams to hang. Solution changes the 'cached' hashes for those substreams.
- Fix a bug introduced in v1.6.0 that could result in a corrupted store "state" file if all the "outputs" were already cached for a module in a given segment (rare occurrence)
- We recommend clearing your substreams cache after this upgrade and re-processing or validating your data if you use stores.
- Expose a new intrinsic to modules: `skip_empty_output`, which causes the module output to be skipped if it has zero bytes. (Watch out: a protobuf object with all its default values serializes to zero bytes.)
- Improve scheduling order (faster time to first block) for substreams with multiple stages when starting mid-chain
- fix "hub" not recovering on certain disconnections in relayer/firehose/substreams (scenarios that previously required a full restart)
- Bumped firehose-core to v1.5.2 and substreams v1.8.0
- Added substreams back-filler to populate cache for live requests when the blocks become final
- Fixed: truncate very long details on error messages to prevent them from disappearing when behind a (misbehaving) load-balancer
- Bumped firehose-core to v1.5.1 and substreams to v1.7.3
- Improved bootstrapping from live blocks for chains with very slow or very fast blocks (affects relayer, firehose and substreams tier1)
- Substreams: fixed slow responses close to HEAD in production-mode
- Substreams engine is now able to run Rust code that depends on `solana_program` in Solana land (to decode) and `alloy`/`ether-rs` in Ethereum land.

  When used in a `wasm32-unknown-unknown` context, those libraries pull in a bunch of `wasm-bindgen` imports in the resulting Substreams Rust code, imports that led to runtime errors because the Substreams engine didn't know about those special imports until today.

  The Substreams engine is now able to "shim" those `wasm-bindgen` imports, enabling you to run code that depends on libraries like `solana_program` and `alloy`/`ether-rs` which are known to pull in those imports. This works as long as you do not actually call those special imports; normal usage of those libraries does not call them. If they are called, the WASM module will fail at runtime and stop the Substreams module from going forward.
To enable this feature, you need to explicitly opt in by appending `+wasm-bindgen-shims` at the end of the binary's type in your Substreams manifest:
```yaml
binaries:
  default:
    type: wasm/rust-v1
    file: <some_file>
```

to become:

```yaml
binaries:
  default:
    type: wasm/rust-v1+wasm-bindgen-shims
    file: <some_file>
```

- Substreams clients now enable gzip compression over the network (already supported by servers).
- Substreams binary type can now optionally be composed of runtime extensions by appending `+<extension>,[<extensions...>]` at the end of the binary type. Extensions are `key[=value]` pairs that are runtime specific.

  > [!NOTE]
  > If you are a library author parsing generic Substreams manifest(s), you will now need to handle that possibility in the binary type. If you were reading the field without any processing, you don't have to change anything.
- bump firehose-core to v1.4.2
- execout: preload only one file instead of two; log if undeleted caches are found
- execout: add environment variable `SUBSTREAMS_DISABLE_PRELOAD_EXEC_FILES` to disable file preloading
- Revert sanity check to support the special case of a substreams with only 'params' as input. This allows a chain-agnostic event to be sent, along with the clock.
- Fix error handling when resolved start-block == stop-block and stop-block is defined as non-zero
> [!NOTE]
> Upgrading will require changing the tier1 and tier2 versions concurrently, as the internal protocol has changed.
- Index Modules and Block Filter now supported. See https://github.com/streamingfast/substreams-foundational-modules for an example implementation
- Various scheduling and performance improvements
- env variable `SUBSTREAMS_WORKERS_RAMPUP_TIME` changed from `4s` to `0`. Set it to `4s` to keep the previous behavior.
- The `otelcol://` tracing protocol is no longer supported.
- Fixed a crash when an `eth_call` batch is of length 0 and a retry is attempted.
- Allow stores to write to stores with out-of-order ordinals (they will be reordered at the end of the module execution for each block)
- Fix issue in substreams-tier2 causing some files to be written to the wrong place sometimes under load, resulting in some hanging requests
- The `fireeth tools download-from-firehose` command now respects its documentation when doing `--help`; the correct invocation is now `fireeth tools download-from-firehose <endpoint> <start>:<end> <output_folder>`.
- The `fireeth tools download-from-firehose` command has been improved to work with the new Firehose `sf.firehose.v2.BlockMetadata` field: if the server sends this new field, the tool works on any chain. If the server you are reaching is not recent enough, the tool falls back to the previous logic. All StreamingFast endpoints should be compatible.
- Firehose responses (both single block and stream) now include the `sf.firehose.v2.BlockMetadata` field. This new field contains the chain-agnostic fields we hold about any block of any chain.
- bump substreams to v1.5.5 with fix in wazero to prevent process freezing on certain substreams
- Added support for Firehose reader format 2.5, which will be required for BSC 1.4.5+.
- Updated block model to add `BalanceChange#Reason.REWARD_BLOB_FEE` for the BSC Tycho hard-fork.
- fix a possible panic() when a request is interrupted during the file loading phase of a squashing operation.
- fix a rare possibility of stalling if only some full-KV store caches were deleted, but further segments were still present.
- fix stats counters for store operations time
- fix memory leak on substreams execution (by bumping wazero dependency)
- remove the need for substreams-tier1 blocktype auto-detection
- fix missing error handling when writing output data to files. This could result in tier1 request just "hanging" waiting for the file never produced by tier2.
- fix handling of dstore error in tier1 'execout walker' causing stalling issues on S3 or on unexpected storage errors
- increase number of retries on storage when writing states or execouts (5 -> 10)
- prevent slow squashing when loading each segment from full KV store (can happen when a stage contains multiple stores)
- Fix a context leak causing tier1 responses to slow down progressively
- fix thread leak in metering affecting substreams
- revert a substreams scheduler optimisation that causes slow restarts when close to head
- add `substreams_tier2_active_requests` and `substreams_tier2_request_counter` prometheus metrics
- Substreams bumped to v1.5.0: see https://github.com/streamingfast/substreams/releases/tag/v1.5.0 for details.
- A single substreams-tier2 instance can now serve requests for multiple chains or networks. All network-specific parameters are now passed from Tier1 to Tier2 in the internal ProcessRange request.
- This allows you to better use your computing resources by pooling all the networks together.
> [!IMPORTANT]
> Since the tier2 services will now get the network information from the tier1 request, you must make sure that the file paths and network addresses are the same for both tiers.
>
> For example: if `--common-merged-blocks-store-url=/data/merged` is set on tier1, make sure the merged blocks are also available from tier2 under the path `/data/merged`.
>
> The flags `--substreams-state-store-url`, `--substreams-state-store-default-tag`, `--common-merged-blocks-store-url`, `--substreams-rpc-endpoints` and `--substreams-rpc-gas-limit` are now ignored on tier2.
>
> The flag `--common-first-streamable-block` should be set to `0` to accommodate every chain.
>
> Non-Ethereum chains can query a firehose-ethereum tier2, but the opposite is not true, since only firehose-ethereum implements the `eth_call` WASM extension.
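To illustrate the path-alignment requirement, here is a hypothetical pair of configs (store URLs and the RPC endpoint are placeholders): tier1 carries the network-specific flags, while tier2 only needs to resolve the same storage paths:

```yaml
# tier1: network-specific flags live here and are forwarded to tier2
start:
  args:
    - substreams-tier1
  flags:
    common-merged-blocks-store-url: /data/merged
    substreams-state-store-url: /data/states
    substreams-rpc-endpoints: http://eth-node:8545
---
# tier2: the flags above are ignored here, but /data/merged and
# /data/states must point to the same underlying storage
start:
  args:
    - substreams-tier2
  flags:
    common-first-streamable-block: 0
```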
> [!TIP]
> The cached 'partial' files no longer contain the "trace ID" in their filename, preventing accumulation of "unsquashed" partial store files. The system will delete files under `{modulehash}/state` named in the format `{blocknumber}-{blocknumber}.{hexadecimal}.partial.zst` when it runs into them.
- All module outputs are now cached. (previously, only the last module was cached, along with the "store snapshots", to allow parallel processing).
- Tier2 will now read back mapper outputs (if they exist) to prevent running them again. Additionally, it will not read back the full blocks if its inputs can be satisfied from existing cached mapper outputs.
- Tier2 will skip processing completely if it's processing the last stage and the `output_module` is a mapper that has already been processed (e.g. when multiple requests are indexing the same data at the same time)
- Tier2 will skip processing completely if it's processing a stage where all the stores and outputs have been processed and cached
- Scheduler modification: a stage now waits for the previous stage to have completed the same segment before running, to take advantage of the cached intermediate layers.
- Improved file listing performance for Google Storage backends by 25%!
> [!TIP]
> Concurrent requests on the same module hashes may benefit from the other requests' work to a certain extent (up to 75%!) -- the very first request does most of the work for the other ones.

> [!TIP]
> More caches will increase disk usage and there is no automatic removal of old module caches. The operator is responsible for deleting old module caches.
- Readiness metric for the Substreams tier1 app is now named `substreams_tier1` (was mistakenly called `firehose` before).
- Added back the readiness metric for the Substreams tier2 app (named `substreams_tier2`).
- Added metric `substreams_tier1_active_worker_requests`, which gives the number of active Substreams worker requests a tier1 app is currently making against tier2 nodes.
- Added metric `substreams_tier1_worker_request_counter`, which gives the total number of Substreams worker requests a tier1 app made against tier2 nodes.
- Added `--merger-delete-threads` to customize the number of threads the merger will use to delete files. It's recommended to increase this to 25 or higher when using Ceph as the S3 storage provider (due to performance issues with deletes, the merger might otherwise not be able to delete one-block files fast enough).
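As a config-file sketch (the value 25 follows the Ceph recommendation above; tune it for your own storage backend):

```yaml
start:
  args:
    - merger
  flags:
    # more delete threads help slow-deleting backends such as Ceph S3
    merger-delete-threads: 25
```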
- Fixed `tools check merged-blocks` default range when `-r <range>` is not provided to now be `[0, +∞]` (was previously `[HEAD, +∞]`).
- Fixed `tools check merged-blocks` to be able to run without a block range provided.
- Added API-key-based authentication to `tools firehose-client` and `tools firehose-single-block-client`; specify the value through environment variable `FIREHOSE_API_KEY` (you can use flag `--api-key-env-var` to change the variable's name to something other than `FIREHOSE_API_KEY`).
- Fixed `tools check merged-blocks` examples using a block range (the range should be specified as `[<start>]?:[<end>]`).
- Added `--substreams-tier2-max-concurrent-requests` to limit the number of concurrent requests to the tier2 Substreams service.
- Added traceID for RPCCalls
- BlockFetcher: added support for WithdrawalsRoot, BlobGasUsed, BlobExcessGas and ParentBeaconRoot fields when fetching blocks from RPC.
- Substreams: added support for `substreams-tier2-max-concurrent-requests` flag to limit the number of concurrent requests to tier2
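A hypothetical config fragment for the new limit (the value 50 is an arbitrary example; size it to your tier2 capacity):

```yaml
start:
  args:
    - substreams-tier2
  flags:
    # refuse new work beyond 50 concurrent requests
    substreams-tier2-max-concurrent-requests: 50
```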
> [!WARNING]
> This release deprecates the "RPC Cache (for `eth_calls`)" feature of substreams: it has been turned off by default and will not be supported in future releases.
> The RPC cache was a little-known feature that cached all `eth_call` responses by default and loaded them on each request.
> It is being deprecated because it has a negative impact on global performance.
> If you want to cache your `eth_call` responses, you should do it in a specialized proxy instead of having substreams manage this.
> Until the feature is completely removed, you can keep the previous behavior by setting the `--substreams-rpc-cache-store-url` flag to a non-empty value (its previous default value was `{data-dir}/rpc-cache`).
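If you do want to keep the deprecated cache until it is removed, the previous default can be restored explicitly (a sketch; the URL is the former default stated above):

```yaml
start:
  flags:
    # re-enable the deprecated eth_call RPC cache at its old default location
    substreams-rpc-cache-store-url: "{data-dir}/rpc-cache"
```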
- Performance: prevent reprocessing jobs when there is only a mapper in production mode and everything is already cached
- Performance: prevent "UpdateStats" from running too often and stalling other operations when running with a high parallel jobs count
- Performance: fixed bug in scheduler ramp-up function sometimes waiting before raising the number of workers
- Added the output module's hash to the "incoming request" log
- Substreams RPC: add `--substreams-rpc-gas-limit` flag to allow overriding the default of 50M. Arbitrum chains behave better with a value of `0`, avoiding `intrinsic gas too low (supplied gas 50000000)` errors.
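For an Arbitrum chain, the override could look like this (hypothetical fragment; `0` lifts the gas cap as described above):

```yaml
start:
  args:
    - substreams-tier1
  flags:
    # avoid 'intrinsic gas too low (supplied gas 50000000)' on Arbitrum
    substreams-rpc-gas-limit: "0"
```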
- The `reader-node-bootstrap-url` flag gained the ability to be bootstrapped from a `bash` script.

  If the bootstrap URL is of the form `bash:///<path/to/script>?<parameters>`, the bash script at `<path/to/script>` will be executed. The script receives the resolved reader node variables as environment variables of the form `READER_NODE_<VARIABLE_NAME>`. The fully resolved node arguments (from `reader-node-arguments`) are passed as args to the bash script. The accepted query parameters are:
  - `arg=<value>`: pass as extra argument to the script, prepended to the list of resolved node arguments
  - `env=<key>%3d<value>`: pass as extra environment variable as `<key>=<value>` with key being upper-cased (multiple(s) allowed)
  - `env_<key>=<value>`: pass as extra environment variable as `<key>=<value>` with key being upper-cased (multiple(s) allowed)
  - `cwd=<path>`: change the working directory to `<path>` before running the script
  - `interpreter=<path>`: use `<path>` as the interpreter to run the script
  - `interpreter_arg=<arg>`: pass `<arg>` as an argument to the interpreter before the script path (multiple(s) allowed)
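Putting the query parameters together, a bootstrap entry could look like this (hypothetical script path, argument and variable name):

```yaml
# runs /opt/bootstrap.sh with one extra leading argument and one
# extra environment variable SNAPSHOT_URL=s3://bucket/latest
reader-node-bootstrap-url: "bash:///opt/bootstrap.sh?arg=--restore&env_SNAPSHOT_URL=s3://bucket/latest"
```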
> [!NOTE]
> The `bash:///` script support is currently experimental and might change in upcoming releases; any behavior changes will be clearly documented here.
- Fix JSON decoding in the client tools (firehose-client, print merged-blocks, etc.).
- The block decoding to JSON is broken in the client tools (firehose-client, print merged-blocks, etc.). Use version v2.3.1
- Fix block poller panic on v2.3.2
- This release has a broken RPC poller component. Upgrade to v2.3.3.
- The block decoding to JSON is broken in the client tools (firehose-client, print merged-blocks, etc.). Use version v2.3.1
- Add missing metering events for `sf.firehose.v2.Fetch/Block` responses.
- Changed default polling interval in 'continuous authentication' from 10s to 60s; added 'interval' query param to the URL.
- Fixed bug in scheduler ramp-up function sometimes waiting before raising the number of workers
- Fixed load-balancing from tier1 to tier2 when using dns:/// (round-robin policy was not set correctly)
- Added `trace_id` in grpc authentication calls
- Bumped connect-go library to its new "connectrpc.com/connect" location
- Firehose blocks that were produced using the RPC Poller will have to be extracted again to fix the Transaction Status and the potential missing receipt (ex: arb-one pre-nitro, Avalanche, Optimism ...)
- Fix race condition in RPC Poller which would cause some missing transaction receipts
- Fix conversion of transaction status from RPC Poller: failed transactions would show up as "status unknown" in firehose blocks.
- Added support for the `FORCE_FINALITY_AFTER_BLOCKS` environment variable: setting it to a value like `200` will make the 'reader' mark blocks as final after a maximum of 200 block confirmations, even if the chain implements finality via a beacon that lags behind.
- Reduced logging and logging "payload".
- Tools printing the Firehose `Block` model to JSON now give `--proto-paths` higher precedence over well-known types and even the chain itself; the lookup order is `--proto-paths` > `chain` > `well-known` (so `well-known` is looked up last).
- The `tools print one-block` command now works correctly on blocks generated by the omni-chain `firecore` binary.
- The various health endpoints now set the `Content-Type: application/json` header prior to sending back their response to the client.
- The `firehose`, `substreams-tier1` and `substreams-tier2` health endpoints now respect the `common-system-shutdown-signal-delay` configuration value, meaning that the health endpoint will now return `false` if `SIGINT` has been received but we are still in the shutdown unready period defined by the config value. If you use some sort of load balancer, you should make sure it is configured to use the health endpoint, and you should set `common-system-shutdown-signal-delay` to something like `15s`.
- Changed the `reader` logger back to `reader-node` to fit with the app's name, which is `reader-node`.
- Fix `tools compare-blocks` that would fail on the new format.
- Fix `substreams` to correctly delete `.partial` files when serving a request that is not on a boundary.
The Cancun hard fork happened on Goerli and after further review, we decided to change the Protobuf definition for the new BlockHeader, Transaction and TransactionReceipt fields that are related to blob transaction.
We made it explicit that those fields are optional in the Protobuf definition, which will render them in your language of choice using the appropriate "null" mechanism. For example, in Go those fields are generated as `BlobGasUsed *uint64` and `ExcessBlobGas *uint64`, which makes it clear when those fields are not populated at all.
The affected fields are:
- `BlockHeader.blob_gas_used`, now `optional uint64`.
- `BlockHeader.excess_blob_gas`, now `optional uint64`.
- `TransactionTrace.blob_gas`, now `optional uint64`.
- `TransactionTrace.blob_gas_fee_cap`, now `optional BigInt`.
- `TransactionReceipt.blob_gas_used`, now `optional uint64`.
- `TransactionReceipt.blob_gas_price`, now `optional BigInt`.
This is technically a breaking change for those who could have consumed those fields already, but we think the impact is so minimal that it's better to make the change right now.
You will need to reprocess a small Goerli range. You should update to the new version to produce the newer version of the blocks, then reprocess from block 10377700 up to the point where you upgraded to v2.2.2.
The block 10377700 was chosen since it is the block at the time of the first release we did supporting Cancun, where we introduced those new fields. If you know when you deployed either v2.2.0 or v2.2.1, you should reprocess from that point.
An alternative to reprocessing is updating your blocks by having a StreamingFast API token and using `fireeth tools download-from-firehose goerli.eth.streamingfast.io:443 -a SUBSTREAMS_API_TOKEN 10377700:<recent block rounded to 100s> <destination>`.
> [!NOTE]
> You should download the blocks to a temporary destination and copy them over to your production destination once you have them all.
You can reach out to us on Discord if you need help with something.
- Updated the documentation for some of the upcoming new Cancun hard-fork fields:
- Added support for EIP-4844 (upcoming with activation of the Cancun fork), through instrumented go-ethereum nodes with version `fh2.4`. This adds new fields to the Ethereum Block model, fields that will be non-empty when the Ethereum network you're pulling from has EIP-4844 activated. The fields in question are:
  - `Block.system_calls`
  - `BlockHeader.blob_gas_used`
  - `BlockHeader.excess_blob_gas`
  - `BlockHeader.parent_beacon_root`
  - `TransactionTrace.blob_gas`
  - `TransactionTrace.blob_gas_fee_cap`
  - `TransactionTrace.blob_hashes`
  - `TransactionReceipt.blob_gas_used`
  - `TransactionReceipt.blob_gas_price`
  - A new `TransactionTrace.Type` value: `TRX_TYPE_BLOB`
> [!IMPORTANT]
> Operators running the Goerli chain will need to upgrade to this version, with this geth node release: https://github.com/streamingfast/go-ethereum/releases/tag/geth-v1.13.10-fh2.4
- Fixed error-passing between tier2 and tier1 (tier1 will not retry sending requests that fail deterministically on tier2)
- Tier1 will now schedule a single job on tier2, quickly ramping up to the requested number of workers after 4 seconds of delay, to catch early exceptions
- "store became too big" is now considered a deterministic error and returns code "InvalidArgument"
- Added `tools poller generic-evm` subcommand. It is identical to optimism/arb-one in features at the moment and should work for most EVM chains.
- Bump to major release firehose-core v1.0.0
> [!IMPORTANT]
> When upgrading your stack to this release, be sure to upgrade all components simultaneously because the block encapsulation format has changed. Blocks that are merged using the new merger will not be readable by previous versions. There is no simple way to revert, except by deleting all the one-blocks and merged-blocks that were produced with this version.
- Block files (one-blocks and merged) are now stored in a new format using `google.protobuf.any`. Previous blocks can still be read and processed.
- Added RPC pollers for Optimism and Arb-one: these can be used by running the reader-node with `--reader-node-path=/path/to/fireeth` and `--reader-node-arguments="tools poller {optimism|arb-one} [more flags...]"`
- Added `tools fix-any-type` to rewrite the previous merged-blocks (OPTIONAL)
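A hypothetical reader-node config using the new poller (the binary path is illustrative; extra poller flags are elided as in the text above):

```yaml
start:
  args:
    - reader-node
  flags:
    reader-node-path: /usr/local/bin/fireeth
    reader-node-arguments: "tools poller optimism [more flags...]"
```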
- Fixed grpc error code when shutting down: changed from Canceled to Unavailable
- Fixed SF_TRACING feature (regression broke the ability to specify a tracing endpoint)
- Fixed substreams GRPC/Connect error codes not propagating correctly
- Firehose connections rate-limiting will now force an (increased) delay of between 1 and 4 seconds (random value) before refusing a connection when under heavy load
- Fixed the `fix-polygon-index` tool (a parsing error made it unusable in v2.0.0-rc.1)
- Fixed some false positives in `compare-blocks-rpc`
This release refactors the firehose-ethereum repository to use the shared Firehose Core library (https://github.com/streamingfast/firehose-core) that every Firehose-supported chain should use and follow.

Both at the data level and the gRPC level, there are no changes in behavior for the core components, which are reader-node, merger, relayer, firehose, substreams-tier1 and substreams-tier2.

A lot changed at the operator level however, and some superfluous modes have been removed, especially around the reader-node application. The full set of changes is listed below; operators should review the changelog thoroughly.
> [!IMPORTANT]
> It's important to emphasize that at the data level nothing changed, so reverting to 1.4.22 in case of a problem is quite easy and no special data migration is required, outside of changing back to the old set of flags that was used before.
You will find below the detailed upgrade procedure for the configuration file operators usually use. If you are using the flags based approach, simply update the corresponding flags.
> [!IMPORTANT]
> We have had reports of older versions of this software creating corrupted merged-blocks files (with duplicate or out-of-bound blocks). This release adds additional validation of merged-blocks to prevent serving duplicate blocks from the firehose or substreams service. This may cause a service outage if you have produced those blocks or downloaded them from another party who was affected by this bug. See the *Finding and fixing corrupted merged-blocks files* section to learn how you can prevent a service outage.
Here is a bullet list for upgrading your instance; we still recommend fully reading each section below, but the list can serve as a checklist. The list is written so that you get back the same "instance" as before. The listening address changes can be omitted as long as you update other tools, like your load balancer, to account for the port changes.
- Add config `config-file: ./sf.yaml` if not present already
- Add config `data-dir: ./sf-data` if not present already
- Rename config `verbose` to `log-verbosity` if present
- Add config `common-blocks-cache-dir: ./sf-data/blocks-cache` if not present already
- Remove config `common-chain-id` if present
- Remove config `common-deployment-id` if present
- Remove config `common-network-id` if present
- Add config `common-live-blocks-addr: :13011` if not present already
- Add config `relayer-grpc-listen-addr: :13011` if `common-live-blocks-addr` has been added in the previous step
- Add config `reader-node-grpc-listen-addr: :13010` if not present already
- Add config `relayer-source: :13010` if `reader-node-grpc-listen-addr` has been added in the previous step
- Remove config `reader-node-enforce-peers` if present
- Remove config `reader-node-log-to-zap` if present
- Remove config `reader-node-ipc-path` if present
- Remove config `reader-node-type` if present
- Replace config `reader-node-arguments: +--<flag1> --<flag2> ...` by `reader-node-arguments: --networkid=<network-id> --datadir={node-data-dir} --port=30305 --http --http.api=eth,net,web3 --http.port=8547 --http.addr=0.0.0.0 --http.vhosts=* --firehose-enabled --<flag1> --<flag2> ...`

  > [!NOTE]
  > The `<network-id>` is dynamic and should be replaced with a literal value, like `1` for Ethereum Mainnet. The `{node-data-dir}` value is a templating value that is going to be resolved for you (it resolves to the value of config `reader-node-data-dir`).

  > [!IMPORTANT]
  > Ensure that `--firehose-enabled` is part of the flags! Moreover, tweak the flags to avoid repetition if you were overriding some of them.

- Remove `node` under the `start: args:` list
- Add config `merger-grpc-listen-addr: :13012` if not present already
- Add config `firehose-grpc-listen-addr: :13042` if not present already
- Add config `substreams-tier1-grpc-listen-addr: :13044` if not present already
- Add config `substreams-tier2-grpc-listen-addr: :13045` if not present already
- Add config `substreams-tier1-subrequests-endpoint: :13045` if `substreams-tier1-grpc-listen-addr` has been added in the previous step
- Replace `combined-index-builder` by `index-builder` under the `start: args:` list
- Rename config `common-block-index-sizes` to `common-index-block-sizes` if present
- Rename config `combined-index-builder-grpc-listen-addr` to `index-builder-grpc-listen-addr` if present
- Add config `index-builder-grpc-listen-addr: :13043` if you didn't have `combined-index-builder-grpc-listen-addr` previously
- Rename config `combined-index-builder-index-size` to `index-builder-index-size` if present
- Rename config `combined-index-builder-start-block` to `index-builder-start-block` if present
- Rename config `combined-index-builder-stop-block` to `index-builder-stop-block` if present
- Replace any occurrences of `{sf-data-dir}` by `{data-dir}` in any of your configuration values if present
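Collected into one place, the "Add config" items from the checklist form a flags block like the following sketch (keep only the entries relevant to your deployment; this is not a complete config):

```yaml
config-file: ./sf.yaml
data-dir: ./sf-data
common-blocks-cache-dir: ./sf-data/blocks-cache
common-live-blocks-addr: :13011
reader-node-grpc-listen-addr: :13010
relayer-grpc-listen-addr: :13011
relayer-source: :13010
merger-grpc-listen-addr: :13012
firehose-grpc-listen-addr: :13042
substreams-tier1-grpc-listen-addr: :13044
substreams-tier1-subrequests-endpoint: :13045
substreams-tier2-grpc-listen-addr: :13045
index-builder-grpc-listen-addr: :13043
```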
- The default value for `config-file` changed from `sf.yaml` to `firehose.yaml`. If you didn't have this flag defined and wish to keep the old default, define `config-file: sf.yaml`.
- The default value for `data-dir` changed from `sf-data` to `firehose-data`. If you didn't have this flag defined before, you should either move `sf-data` to `firehose-data` or define `data-dir: sf-data`.

  > [!NOTE]
  > This is an important change: forgetting to change it will change the expected locations of data, leading to errors or wrong data.

- Deprecated: the `{sf-data-dir}` templating argument used in various flags to resolve to the `--data-dir=<location>` value has been deprecated and should now be simply `{data-dir}`. The older replacement is still going to work, but you should replace any occurrences of `{sf-data-dir}` in your flag definitions by `{data-dir}`.
- The default value for `common-blocks-cache-dir` changed from `{sf-data-dir}/blocks-cache` to `file://{data-dir}/storage/blocks-cache`. If you didn't have this flag defined and you had `common-blocks-cache-enabled: true`, you should define `common-blocks-cache-dir: file://{data-dir}/blocks-cache`.
- The default value for `common-live-blocks-addr` changed from `:13011` to `:10014`. If you didn't have this flag defined and wish to keep the old default, define `common-live-blocks-addr: :13011` and ensure you also modify `relayer-grpc-listen-addr: :13011` (see next entry for details).
- The Go module `github.com/streamingfast/firehose-ethereum/types` has been removed; if you were depending on it in your project before, depend directly on `github.com/streamingfast/firehose-ethereum` instead.

  > [!NOTE]
  > This will pull in many more dependencies than before; if you're reluctant about such additions, talk to us on Discord and we can offer alternatives depending on what you were using.

- The config value `verbose` has been renamed to `log-verbosity`, keeping the same semantics and default value as before.

  > [!NOTE]
  > The short flag version is still `-v` and can still be provided multiple times, like `-vvvv`.
This change will impact all operators currently running Firehose on Ethereum, so it's important to pay attention to the upgrade procedure below; if you are unsure of something, reach out to us on Discord.

Before this release, the reader-node app managed a portion of the `reader-node-arguments` configuration value for you, prepending some arguments that would be passed to `geth` when invoking it. The list of arguments that were automatically provided before:
- `--networkid=<value of config value 'common-network-id'>`
- `--datadir=<value of config value 'reader-node-data-dir'>`
- `--ipcpath=<value of config value 'reader-node-ipc-path'>`
- `--port=30305`
- `--http`
- `--http.api=eth,net,web3`
- `--http.port=8547`
- `--http.addr=0.0.0.0`
- `--http.vhosts=*`
- `--firehose-enabled`
We have now removed those magical additions, and operators are now responsible for providing the flags required to properly run a Firehose-enabled native `geth` node. The `+` sign that was used to append/override the flags has been removed as well; since no default additions are performed, the `+` is now useless. To make some flags easier to define and avoid repetition, a few templating variables can be used within the `reader-node-arguments` value:
- `{data-dir}`: The current data-dir path defined by the config value `data-dir`
- `{node-data-dir}`: The node data dir path defined by the flag `reader-node-data-dir`
- `{hostname}`: The machine's hostname
- `{start-block-num}`: The resolved start block number defined by the flag `reader-node-start-block-num` (can be overwritten)
- `{stop-block-num}`: The stop block number defined by the flag `reader-node-stop-block-num`
For example, if you provide the config value `reader-node-data-dir=/var/geth`, then you could use `reader-node-arguments: --datadir={node-data-dir}` and that would resolve to `reader-node-arguments: --datadir=/var/geth` for you.
> [!NOTE]
> The `reader-node-arguments` value is a string that is parsed using shell word-splitting rules, which means for example that double quotes are supported, like `--datadir="/var/with space/path"`, and the argument will be correctly accepted. We use https://github.com/kballard/go-shellquote as our parsing library.
We also removed the following reader-node configuration values:

- `reader-node-type` (no replacement needed, just remove it)
- `reader-node-ipc-path` (if you were using it, define it manually using the `geth` flag `--ipcpath=...`)
- `reader-node-enforce-peers` (if you were using it, use a `geth` config file to add static peers to your node; read about static peers for `geth` on the Web)
Default listening addresses also changed to be the same across all firehose-<...> projects, meaning consistent ports across all chains for operators. The `reader-node-grpc-listen-addr` default listen address went from `:13010` to `:10010`, and `reader-node-manager-api-addr` from `:13009` to `:10011`. If there are no occurrences of `13010` or `13009` in your config file or your scripts, there is nothing to do. Otherwise, feel free to adjust the default ports to fit your needs; if you do change `reader-node-grpc-listen-addr`, ensure `--relayer-source` is also updated, as by default it points to `:10010`.
Here an example of the required changes.
Change:

```yaml
start:
  args:
    - ...
    - reader-node
    - ...
  flags:
    ...
    reader-node-bootstrap-data-url: ./reader/genesis.json
    reader-node-enforce-peers: localhost:13041
    reader-node-arguments: +--firehose-genesis-file=./reader/genesis.json --authrpc.port=8552
    reader-node-log-to-zap: false
    ...
```

To:
```yaml
start:
  args:
    - ...
    - reader-node
    - ...
  flags:
    ...
    reader-node-bootstrap-data-url: ./reader/genesis.json
    reader-node-arguments:
      --networkid=1515
      --datadir={node-data-dir}
      --ipcpath={data-dir}/reader/ipc
      --port=30305
      --http
      --http.api=eth,net,web3
      --http.port=8547
      --http.addr=0.0.0.0
      --http.vhosts=*
      --firehose-enabled
      --firehose-genesis-file=./reader/genesis.json
      --authrpc.port=8552
    ...
```

> [!NOTE]
> Adjust the `--networkid=1515` value to fit your targeted chain; see https://chainlist.org/ for a list of Ethereum chains and their `network-id` values.
In previous versions of firehose-ethereum, it was possible to use the `node` app to launch managed "peering/backup/whatever" Ethereum nodes; this is no longer possible. If you were using the `node` app previously, like in this config:
```yaml
start:
  args:
    - ...
    - node
    - ...
  flags:
    ...
    node-...
```

You must now remove the `node` app from `args` and any flags starting with `node-`. The migration path is to run those nodes on your own, without the use of `fireeth`, using whatever tools fit your desired needs.
We have completely drop support to concentrate on the core mission of Firehose which is to run reader nodes to extract Firehose blocks from it.
Note This is about the
nodeapp and not thereader-node, we think usage of this app is minimal/inexistent.
The app has been renamed to simply `index-builder`, and its flags have been renamed by removing the `combined-` prefix.
Change:

```yaml
start:
  args:
  - ...
  - combined-index-builder
  - ...
  flags:
    ...
    combined-index-builder-grpc-listen-addr: ":9999"
    combined-index-builder-index-size: 10000
    combined-index-builder-start-block: 0
    combined-index-builder-stop-block: 0
    ...
```

To:
```yaml
start:
  args:
  - ...
  - index-builder
  - ...
  flags:
    ...
    index-builder-grpc-listen-addr: ":9999"
    index-builder-index-size: 10000
    index-builder-start-block: 0
    index-builder-stop-block: 0
    ...
```

- Flag `common-block-index-sizes` has been renamed to `common-index-block-sizes`.
> [!NOTE]
> Rename only the configuration items you had previously defined; do not copy-paste the example above verbatim.
- The default value for `relayer-grpc-listen-addr` changed from `:13011` to `:10014`. If you didn't have this flag defined and wish to keep the old default, define `relayer-grpc-listen-addr: :13011` and ensure you also modify `common-live-blocks-addr: :13011` (see previous entry for details).

- The default value for `relayer-source` changed from `:13010` to `:10010`. If you didn't have this flag defined and wish to keep the old default, define `relayer-source: :13010` and ensure you also modify `reader-node-grpc-listen-addr: :13010`.

  > [!NOTE]
  > Must align with `reader-node-grpc-listen-addr`!
- The default value for `firehose-grpc-listen-addr` changed from `:13042` to `:10015`. If you didn't have this flag defined and wish to keep the old default, define `firehose-grpc-listen-addr: :13042`.
- Firehose logs now include auth information (userID, keyID, realIP) along with blocks and egress bytes sent.
- The default value for `merger-grpc-listen-addr` changed from `:13012` to `:10012`. If you didn't have this flag defined and wish to keep the old default, define `merger-grpc-listen-addr: :13012`.
- The default value for `substreams-tier1-grpc-listen-addr` changed from `:13044` to `:10016`. If you didn't have this flag defined and wish to keep the old default, define `substreams-tier1-grpc-listen-addr: :13044`.

- The default value for `substreams-tier1-subrequests-endpoint` changed from `:13045` to `:10017`. If you didn't have this flag defined and wish to keep the old default, define `substreams-tier1-subrequests-endpoint: :13044`.

  > [!NOTE]
  > Must align with `substreams-tier1-grpc-listen-addr`!

- The default value for `substreams-tier2-grpc-listen-addr` changed from `:13045` to `:10017`. If you didn't have this flag defined and wish to keep the old default, define `substreams-tier2-grpc-listen-addr: :13045`.
- Added field `DetailLevel` (`Base`, `Extended` (default)) to `sf.ethereum.type.v2.Block` to distinguish the new blocks produced from polling RPC (base) from the blocks normally produced with Firehose instrumentation (extended).
- Added command `tools fix-bloated-merged-blocks` to go through a range of possibly corrupted merged-blocks (with duplicates and out-of-range blocks) and try to fix them, writing the fixed merged-blocks files to another destination.
- Transform `sf.ethereum.transform.v1.LightBlock` is no longer supported; it has been deprecated for a long time and should not be used anywhere.
You may have certain merged-blocks files (most likely OLD blocks) that contain more than 100 blocks (with duplicate or extra out-of-bound blocks).

- Find the affected files by running the following command (it can be run multiple times in parallel, over smaller ranges): `tools check merged-blocks-batch <merged-blocks-store> <start> <stop>`
- If you see any affected range, produce fixed merged-blocks files with the following command, on each range: `tools fix-bloated-merged-blocks <merged-blocks-store> <output-store> <start>:<stop>`
- Copy the merged-blocks files created in `output-store` over to your merged-blocks store, replacing the corrupted files.
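The repair procedure above can be sketched as a shell script. The store URLs, block range, and the `gsutil` copy step are illustrative placeholders, not part of the release; adapt them to your own deployment:

```shell
#!/usr/bin/env bash
# Hypothetical stores and range -- replace with your own values.
MERGED=gs://my-merged-blocks
FIXED=gs://my-fixed-merged-blocks

# 1. Scan for bloated bundles (multiple runs can be parallelized over smaller ranges).
fireeth tools check merged-blocks-batch "$MERGED" 0 1000000

# 2. For each reported range, rewrite fixed bundles to a separate store.
fireeth tools fix-bloated-merged-blocks "$MERGED" "$FIXED" 0:1000000

# 3. Copy the fixed files back over the corrupted ones (gsutil shown as one example).
gsutil -m cp "$FIXED/*" "$MERGED/"
```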
- Fixed a regression where `reader-node-role` was changed to `dev` by default, putting back the default `geth` value.
- Bumped Substreams to `v1.1.20` with fixes for some minor issues related to start block processing.
- Added `tools poll-rpc-blocks` command to launch an RPC-based poller that acts as a Firehose extractor node, printing base64-encoded protobuf blocks to stdout (used by the 'dev' node-type). It creates "light" blocks, without traces and ordinals.
- Added `--dev` flag to the `start` command to simplify running a local firehose+substreams stack from a development node (ex: Hardhat).
  - This flag overrides the `--reader-node-path`, instead pointing to the `fireeth` binary itself.
  - This flag overrides the `--reader-node-type`, setting it to `dev` instead of `geth`. This node type has the following default `reader-node-arguments`: `tools poll-rpc-blocks http://localhost:8545 0`
  - It also removes `node` from the list of default apps.
- Substreams: fixed metrics calculations (per-module processing-time and external calls were wrong)
- Substreams: fixed immediate EOF when streaming from block 0 to (unbounded) in dev mode
- Bumped substreams to `v1.1.18` with a regression fix for when a substreams has a start block in the reversible segment.
- Bumped substreams to `v1.1.17` with a fix for a missing decrement on the `substreams_active_requests` metric.
The `--common-auth-plugin` got back the ability to use `secret://<expected_secret>?[user_id=<user_id>]&[api_key_id=<api_key_id>]`, in which case requests are authenticated based on the `Authorization: Bearer <actual_secret>` header and continue only if `<actual_secret> == <expected_secret>`.
- Bumped substreams to `v1.1.16` with support for the metrics `substreams_active_requests` and `substreams_counter`.
- If you started reprocessing the blockchain blocks using release v1.4.14 or v1.4.15, you will need to run the following command to fix the blocks affected by another bug: `fireeth tools fix-polygon-index /your/merged/blocks /temporary/destination 0 48200000` (note that you can run multiple instances of this command in parallel to cover the range of blocks from 0 to current HEAD in smaller chunks).
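The parallel chunked runs mentioned above could be sketched like this (the chunk size and job layout are illustrative; the stores are the same placeholders as in the command above):

```shell
#!/usr/bin/env bash
# Split the 0..48,200,000 range into 10M-block chunks and fix each chunk in parallel.
for start in 0 10000000 20000000 30000000 40000000; do
  stop=$((start + 10000000))
  fireeth tools fix-polygon-index /your/merged/blocks /temporary/destination "$start" "$stop" &
done
wait  # Block until all chunks are done before swapping the fixed files in.
```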
- Fix another data issue found in polygon blocks: blocks that contain a single "system" transaction have "Index=1" for that transaction instead of "Index=0"
- (Substreams) fixed regressions for relative start-blocks for substreams (see https://github.com/streamingfast/substreams/releases/tag/v1.1.14)
If you are indexing Polygon or Mumbai chains, you will need to reprocess the chain from genesis, as your existing Firehose blocks are missing some system transactions.
As always, this can be done with multiple client nodes working in parallel on different segments of the chain if you have snapshots at various block heights.
Golang 1.21+ is now also required to build the project.
- Fixed post-processing of polygon blocks: some system transactions were not "bundled" correctly.
- (Substreams) fixed validations for invalid start-blocks (see https://github.com/streamingfast/substreams/releases/tag/v1.1.13)
- Added `tools compare-oneblock-rpc` command to perform a validation between a Firehose one-block file and blocks+trx+logs fetched from an RPC endpoint.
- The `tools print` subcommands now use hex to encode values instead of base64, making them easier to use.
> [!IMPORTANT]
> The Substreams service exposed from this version will send progress messages that cannot be decoded by Substreams clients prior to v1.1.12. Streaming of the actual data will not be affected. Clients will need to be upgraded to properly decode the new progress messages.
- Bumped substreams to `v1.1.12` to support the new progress message format. Progression now relates to stages instead of modules. You can get stage information using the `substreams info` command starting at version `v1.1.12`.
- Added `tools compare-blocks-rpc` command to perform a validation between Firehose blocks and blocks+trx+logs fetched from an RPC endpoint.
- More tolerant retry/timeouts on filesource (prevent "Context Deadline Exceeded")
This release mainly brings reader-node Firehose Protocol 2.3 support for all networks, not just Polygon. This is important for the upcoming Firehose-enabled geth versions 1.2.11 and 1.2.12 that are going to be released shortly.
Golang 1.20+ is now also required to build the project.
- Support reader node Firehose Protocol 2.3 on all networks now (and not just Polygon).
- Removed `--substreams-tier1-request-stats` and `--substreams-tier2-request-stats` (substreams request-stats are now always sent to clients).
- `tools check merged-blocks` now correctly prints missing block gaps even without `print-full` or `print-stats`.
- Now requires Go 1.20+ to compile the project.
- Substreams bumped: better "Progress" messages
- Bumped `firehose` and `substreams` libraries to fix a bug where live blocks were not metered correctly.
- Fixed: jobs would hang when flags `--substreams-state-bundle-size` and `--substreams-tier1-subrequests-size` had different values. The latter flag has been completely removed; subrequests will be bound to the state bundle size.
- Added support for continuous authentication via the grpc auth plugin (allowing cutoff triggered by the auth system).
The substreams server now accepts the `X-Sf-Substreams-Cache-Tag` header to select which Substreams state store URL should be used by the request. When performing a Substreams request, the servers will pick the state store based on the header. This enables consumers to stay on the same cache version when the operators need to bump the data version (reasons for this could be a bug in the Substreams software that caused some cached data to be corrupted or invalid).
To benefit from this, operators that have a version currently in their state store URL should move the version part from `--substreams-state-store-url` to the new flag `--substreams-state-store-default-tag`. For example, if today you have in your config:
```yaml
start:
  ...
  flags:
    substreams-state-store-url: /<some>/<path>/v3
```

You should convert to:
```yaml
start:
  ...
  flags:
    substreams-state-store-url: /<some>/<path>
    substreams-state-store-default-tag: v3
```

The substreams scheduler has been improved to reduce the number of required jobs for parallel processing. This affects backprocessing (preparing the states of modules up to a "start-block") and forward processing (preparing the states and the outputs to speed up streaming in production-mode).
Jobs on tier2 workers are now divided into "stages", each stage generating the partial states for all the modules that have the same dependencies. A substreams with a single store won't be affected, but one with 3 top-level stores, which used to run 3 jobs for every segment, now runs only a single job per segment to get all the states ready.
The `substreams-tier1` and `substreams-tier2` apps should be upgraded concurrently. Some calls will fail while versions are misaligned.
- Substreams bumped to version v1.1.9
- Authentication plugin `trust` can now specify an exclusive list of `allowed` headers (all lowercase), ex: `trust://?allowed=x-sf-user-id,x-sf-api-key-id,x-real-ip,x-sf-substreams-cache-tag`
- The `tier2` app no longer uses the `common-auth-plugin`; `trust` will always be used, so that `tier1` can pass down its headers (ex: `X-Sf-Substreams-Cache-Tag`).
- Fixed a bug in `substreams-tier1` and `substreams-tier2` which caused "live" blocks to be sent while the blocks previously received by the stream were still historic.
- Added a check for readiness of the `dauth` provider when answering `/healthz` on firehose and substreams.
- Changed `--substreams-tier1-debug-request-stats` to `--substreams-tier1-request-stats`, which enables request stats logging on Substreams tier1.
- Changed `--substreams-tier2-debug-request-stats` to `--substreams-tier2-request-stats`, which enables request stats logging on Substreams tier2.
- Fixed an occasional panic in `substreams-tier1` caused by a race condition
- Fixed the gRPC error codes for substreams tier1: `Unauthenticated` on bad auth, `Canceled` (endpoint is shutting down, please reconnect) on shutdown
- Fixed the gRPC healthcheck method on `substreams-tier1` (regression)
- Fixed the default value for flag `common-auth-plugin`: now set to `trusted://` instead of panicking on the removed `null://`.
- Substreams (@v1.1.6) is now out of the `firehose` app, and must be started using the `substreams-tier1` and `substreams-tier2` apps!
- Most substreams-related flags have been changed:
  - common: `--substreams-rpc-cache-chunk-size`, `--substreams-rpc-cache-store-url`, `--substreams-rpc-endpoints`, `--substreams-state-bundle-size`, `--substreams-state-store-url`
  - tier1: `--substreams-tier1-debug-request-stats`, `--substreams-tier1-discovery-service-url`, `--substreams-tier1-grpc-listen-addr`, `--substreams-tier1-max-subrequests`, `--substreams-tier1-subrequests-endpoint`, `--substreams-tier1-subrequests-insecure`, `--substreams-tier1-subrequests-plaintext`, `--substreams-tier1-subrequests-size`
  - tier2: `--substreams-tier2-discovery-service-url`, `--substreams-tier2-grpc-listen-addr`
- Some auth plugins have been removed; the available plugins for `--common-auth-plugin` are now `trust://` and `grpc://`. See https://github.com/streamingfast/dauth for details.
- Metering features have been added; the available plugins for `--common-metering-plugin` are `null://`, `logger://`, `grpc://`. See https://github.com/streamingfast/dmetering for details.
- Support for reader node Firehose Protocol 2.3 (for parallel processing of transactions, added to polygon 'bor' v0.4.0)
- Removed the `tools upgrade-merged-blocks` command. Normalization is now part of the console reader within 'codec', not the 'types' package, and cannot be done a posteriori.
- Updated metering to fix dependencies.
- Updated metering (bumped versions of `dmetering`, `dauth`, and `firehose` libraries).
- Fixed firehose service healthcheck on shutdown.
- Fixed panic on download-blocks-from-firehose tool
- When upgrading a substreams server to this version, you should delete all existing module caches to benefit from deterministic output
- Switch default engine from `wasmtime` to `wazero`
- Prevent reusing memory between blocks in the wasm engine to fix determinism
- Switch our store operations from bigdecimal to fixed-point decimal to fix determinism
- Sort the store deltas from `DeletePrefixes()` to fix determinism
- Implement staged module execution within a single block
- "Fail fast" on repeating requests with deterministic failures for a "blacklist period", preventing waste of resources
- The `SessionInit` protobuf message now includes `resolvedStartBlock` and `MaxWorkers`, sent back to the client
- This release brings an update of `substreams` to `v1.1.4`, which includes the following:
  - Changes the module hash computation implementation to allow reusing caches across substreams that 'import' other substreams as a dependency.
  - Faster shutdown of requests that fail deterministically
  - Fixed memory leak in RPC calls
> [!NOTE]
> This upgrade procedure applies if your Substreams deployment topology includes both `tier1` and `tier2` processes. If you have defined the config value `substreams-tier2: true` somewhere, then this applies to you; otherwise, you can ignore the upgrade procedure.

The components should be deployed to `tier1` and `tier2` simultaneously, or users will end up with backend errors saying that some partial files are not found. These errors will be resolved once both tiers are upgraded.
- Added Substreams scheduler tracing support. Enable tracing by setting the ENV variable `SF_TRACING` to one of the following:
  - `stdout://`
  - `cloudtrace://[host:port]?project_id=<project_id>&ratio=<0.25>`
  - `jaeger://[host:port]?scheme=<http|https>`
  - `zipkin://[host:port]?scheme=<http|https>`
  - `otelcol://[host:port]`
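For example, assuming a local Jaeger collector (the host and port are placeholders for your own collector), tracing could be enabled like this:

```shell
# Illustrative only: point the scheduler traces at a local Jaeger instance.
export SF_TRACING="jaeger://localhost:14268?scheme=http"
fireeth start ...
```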
- This release brings an update of `substreams` to `v1.1.3`, which includes the following:
  - Fixes an important bug that could have generated corrupted store state files. This is important for developers and operators.
  - Fixes race conditions that would return a failure when multiple identical requests are backprocessing.
  - Fixes and speed/scaling improvements around the engine.
> [!NOTE]
> This upgrade procedure applies if your Substreams deployment topology includes both `tier1` and `tier2` processes. If you have defined the config value `substreams-tier2: true` somewhere, then this applies to you; otherwise, you can ignore the upgrade procedure.
This release includes a small change in the internal RPC layer between tier1 processes and tier2 processes. This change requires an ordered upgrade of the processes to avoid errors.
The components should be deployed in this order:
- Deploy and roll out `tier1` processes first
- Deploy and roll out `tier2` processes second
If you upgrade in the wrong order, or if somehow `tier2` processes start using the new protocol without `tier1` being aware, users will end up with backend errors saying that some partial files are not found. Those will be resolved only when `tier1` processes have been upgraded successfully.
- Substreams running without a specific tier2 `substreams-client-endpoint` will now expose the tier2 service `sf.substreams.internal.v2.Substreams` so it can be used internally.
> [!WARNING]
> If you don't use dedicated tier2 nodes, make sure that you don't expose `sf.substreams.internal.v2.Substreams` to the public (from your load-balancer or using a firewall).
- Flag `substreams-partial-mode-enabled` renamed to `substreams-tier2`
- Flag `substreams-client-endpoint` now defaults to the empty string, which means the server is its own client-endpoint (as it was before the change to protocol V2)
The Substreams protocol changed from `sf.substreams.v1.Stream/Blocks` to `sf.substreams.rpc.v2.Stream/Blocks` for the client-facing service. This changes the way that Substreams clients are notified of chain reorgs.
All substreams clients need to be upgraded to support this new protocol.
See https://github.com/streamingfast/substreams/releases/tag/v1.1.1 for details.
- The `firehose-client` tool now accepts a `--limit` flag to only send that number of blocks. Get the latest block like this: `fireeth tools firehose-client <endpoint> --limit=1 -- -1 0`
This is a bug fix release for node operators that are about to upgrade to the Shanghai release. The Firehose-instrumented geth compatible with the Shanghai release introduced a new message, `CANCEL_BLOCK`. It turns out that in some circumstances, a bug in the console reader caused a panic when that message was received while no block was actively being assembled.
This release fixes this bogus behavior by simply ignoring the `CANCEL_BLOCK` message when there is no active block, which is harmless. Every node operator that upgrades to https://github.com/streamingfast/go-ethereum/releases/tag/geth-v1.11.5-fh2.2 should upgrade to this version.
> [!NOTE]
> There is no need to update the Firehose-instrumented `geth` binary; only `fireeth` needs to be bumped if you are already at the latest `geth` version.
- Fixed a bug in the console reader when seeing `CANCEL_BLOCK` in certain circumstances.
- Now using Golang 1.20 for building releases.
- Changed the default value of flag `substreams-sub-request-block-range-size` from `1000` to `10000`.
- Fixed a bug in data normalization for Polygon chain which would cause panics on certain blocks.
- Support for GCP `archive` types of snapshots.
- This release implements the new `CANCEL_BLOCK` instruction from Firehose protocol 2.2 (fh2.2), to reject blocks that failed post-validation.
- This release fixes Polygon "StateSync" transactions by grouping the calls inside an artificial transaction.
If you had previous blocks from a Polygon chain (bor), you will need to reprocess all your blocks from the node because some StateSync transactions may be missing on some blocks.
This release now supports the new Firehose node exchange format 2.2, which introduced a new exchanged message, `CANCEL_BLOCK`. This has an implication on the Firehose-instrumented Geth binary you can use with this release.
- If you use a Firehose-instrumented `Geth` binary tagged `fh2.2` (like `geth-v1.11.4-fh2.2-1`), you must use `firehose-ethereum` version `>= 1.3.6`
- If you use a Firehose-instrumented `Geth` binary tagged `fh2.1` (like `geth-v1.11.3-fh2.1`), you can use `firehose-ethereum` version `>= 1.0.0`

New releases of the Firehose-instrumented Geth binary for all chains will soon be tagged `fh2.2`, so upgrading to `>= 1.3.6` of `firehose-ethereum` will be required.
This release is required if you run on Goerli and is mostly about supporting the Shanghai fork that was activated on Goerli on March 14th.
- Added support for the `withdrawal` balance change reason in the block model; this is required for running on the most recent Goerli Shanghai hard fork.
- Added support for `withdrawals_root` on `Header` in the block model; this will be populated only if the chain has activated the Shanghai hard fork.
- `--substreams-max-fuel-per-block-module` will limit the number of wasmtime instructions for a single module in a single block.
Blocks that were migrated from v2 to v3 using `upgrade-merged-blocks` should now be considered invalid. The upgrade mechanism did not correctly fix the "caller" on DELEGATECALLs when these calls were nested under another DELEGATECALL.

You should run `upgrade-merged-blocks` again if you previously used 'v2' blocks that were upgraded to 'v3'.
This mechanism uses a leaky bucket, allowing an initial burst of X connections, then allowing a new connection every Y seconds or whenever an existing connection closes.

Use `--firehose-rate-limit-bucket-size=50` and `--firehose-rate-limit-bucket-fill-rate=1s` to allow 50 connections instantly, and another connection every second.

Note that when the server is above the limit, it waits 500ms before it returns `codes.Unavailable` to the client, forcing a minimal back-off.
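Assuming these flags are set through the regular config file rather than the command line, the example above would translate to something like this sketch:

```yaml
start:
  flags:
    # Allow an initial burst of 50 connections, then refill one token per second.
    firehose-rate-limit-bucket-size: 50
    firehose-rate-limit-bucket-fill-rate: 1s
```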
- Substreams `RpcCall` objects are now validated before being performed, to ensure they are correct.
- Substreams `RpcCall` JSON-RPC error code `-32602` is now treated as a deterministic error (invalid request).
- `tools compare-blocks` now correctly handles segment health reporting and properly prints all differences with `-diff`.
- `tools compare-blocks` now ignores 'unknown fields' in the protobuf message, unless `--include-unknown-fields=true`
- `tools compare-blocks` now ignores when a block bundle contains the 'last block of previous bundle' (a now-deprecated feature)
- Support for "requester pays" buckets on Google Storage in URLs, ex: `gs://my-bucket/path?project=my-project-id`
- Substreams were also bumped to the current March 1st develop HEAD
- Increased gRPC max received message size accepted by Firehose and Substreams gRPC endpoints to 25 MiB.
- Command `fireeth init` has been removed; this was a leftover from another time and the command was not working anyway.
- Flag `common-auto-max-procs` to optimize Go thread management using github.com/uber-go/automaxprocs
- Flag `common-auto-mem-limit-percent` to specify `GOMEMLIMIT` based on a percentage of available memory
- Updated to Substreams version `v0.2.0`; please refer to the release page for further info about Substreams changes.
- **Breaking**: Config values `substreams-stores-save-interval` and `substreams-output-cache-save-interval` have been merged together into a single value to avoid potential bugs that would arise when the two differ. The new configuration value is called `substreams-cache-save-interval`.
  - To migrate, remove usage of `substreams-stores-save-interval: <number>` and `substreams-output-cache-save-interval: <number>` if defined in your config file and replace with `substreams-cache-save-interval: <number>`. If you had two different values before, pick the bigger of the two as the new value. We are currently setting it to `1000` for Ethereum Mainnet.
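As a sketch, the migration described above would look like this (the interval values are illustrative):

```yaml
# Before:
start:
  flags:
    substreams-stores-save-interval: 500
    substreams-output-cache-save-interval: 1000

# After: keep the bigger of the two previous values.
start:
  flags:
    substreams-cache-save-interval: 1000
```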
- Fixed various issues with `fireeth tools check merged-blocks`:
  - The `stopWalk` error is no longer reported as a real `error`.
  - `Incomplete range` should now be printed more accurately.
- Release made to fix our building workflows, nothing different than v1.3.0.
- Updated to Substreams `v0.1.0`; please refer to the release page for further info about Substreams changes.

> [!WARNING]
> The state output format for `map` and `store` modules has changed internally to be more compact in Protobuf format. When deploying this new version and using the Substreams feature, previous existing state files should be deleted, or the deployment updated to point to a new store location. The state output store is defined by the `--substreams-state-store-url` flag.
- New Prometheus metric `console_reader_trx_read_count` can be used to obtain the rate of transactions read from the node over a period of time.
- New Prometheus metric `console_reader_block_read_count` can be used to obtain the rate of blocks read from the node over a period of time.
- Added `--header-only` support on `fireeth tools firehose-client`.
- Added `HeaderOnly` transform that can be used to return only the Block's header along with a few top-level fields: `Ver`, `Hash`, `Number` and `Size`.
- Added `fireeth tools firehose-prometheus-exporter` to use as a client-side monitoring tool of a Firehose endpoint.
- **Deprecated**: `LightBlock` is deprecated and will be removed in the next major version; its goal is now much better handled by the `CombinedFilter` transform, or the `HeaderOnly` transform if you require only the Block's header.
- Hotfix 'nil pointer' panic when saving uninitialized cache.
- Changed cache file format for stores and outputs (faster with vtproto) -- requires removing the existing state files.
- Various improvements to scheduling.
- Fixed `eth_call` handler not flagging `out of gas` errors as deterministic.
- Fixed memory leak in wasmtime.
- Removed the unused 'previous' one-block in merged-blocks (99 inside bundle:100).
- Fix: also prevent rare bug of bundling "very old" one-blocks in merged-blocks.
- Added `sf.firehose.v2.Fetch/Block` endpoint on firehose, allowing fetching a single block by num, num+ID or cursor.
- Added `tools firehose-single-block-client` to call that new endpoint.
- Renamed tools `normalize-merged-blocks` to `upgrade-merged-blocks`.
- Fixed `common-blocks-cache-dir` flag's description.
- Fixed `DELEGATECALL`'s `caller` (a.k.a. `from`); this requires upgrading blocks to `version: 3`.
- Fixed the `execution aborted (timeout = 5s)` hard-coded timeout value used when detecting in Substreams whether an `eth_call` error response was deterministic.
Assuming that you are running a firehose deployment v1.1.0 writing blocks to folders `/v2-oneblock`, `/v2-forked` and `/v2`, you will deploy a new setup that writes blocks to folders `/v3-oneblock`, `/v3-forked` and `/v3`.
This procedure describes an upgrade without any downtime. With proper parallelization, it should be possible to complete this upgrade within a single day.
- Launch a new reader with this code, running the instrumented geth binary: https://github.com/streamingfast/go-ethereum/releases/tag/geth-v1.10.25-fh2.1 (you can start from a backup that is close to head)
- Upgrade your merged-blocks from `version: 2` to `version: 3` using `fireeth tools upgrade-merged-blocks /path/to/v2 /path/to/v3 {start} {stop}` (you can run multiple upgrade commands in parallel to cover the whole block range)
- Create combined indexes from those new blocks with `fireeth start combined-index-builder` (you can run multiple commands in parallel to fill the block range)
- When your merged-blocks have been upgraded and the one-block files are being produced by the new reader, launch a merger
- When the reader, merger and combined-index-builder have caught up to live, you can launch the relayer(s) and firehose(s)
- When the firehoses are ready, you can switch traffic to them.
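The parallelizable upgrade step above could be sketched like this (the paths, block range, chunk size and job count are placeholders to adapt; `wait -n` requires bash 4.3+):

```shell
#!/usr/bin/env bash
# Upgrade merged-blocks from version 2 to 3 in 1M-block chunks, four jobs at a time.
for start in $(seq 0 1000000 15000000); do
  stop=$((start + 1000000))
  fireeth tools upgrade-merged-blocks /path/to/v2 /path/to/v3 "$start" "$stop" &
  # Throttle to at most 4 concurrent upgrade jobs.
  while [ "$(jobs -r | wc -l)" -ge 4 ]; do wait -n; done
done
wait
```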
- Added `SendAllBlockHeaders` param to the `CombinedFilter` transform, for when we want to prevent skipping blocks but still want to filter out trxs.
- Reduced how often `reader read statistics` is displayed, down to every 30s (previously every 5s) (and renamed the log to `reader node statistics`).
- Fix `fireeth tools download-blocks-from-firehose` tool that was not working anymore.
- Simplify `forkable` hub startup performance cases.
- Fix relayer detection of a hole in stream blocks (restart on unrecoverable issue).
- Fix possible panic in hub when calls to the one-block store are timing out.
- Fix merger slow one-block-file deletions when there are more than 10000 of them.
- The binary name has changed from `sfeth` to `fireeth` (aligned with https://firehose.streamingfast.io/references/naming-conventions)
- The repo name has changed from `sf-ethereum` to `firehose-ethereum`
- This will require reprocessing the chain to produce new blocks
- The Protobuf Block model is now tagged `sf.ethereum.type.v2` and contains the following improvements:
  - Fixed Gas Price on dynamic transactions (post-London-fork on Ethereum Mainnet, EIP-1559)
  - Added "Total Ordering" concept: an `Ordinal` field on all events within a block (trx begin/end, call, log, balance change, etc.)
  - Added `TotalDifficulty` field to Ethereum blocks
  - Fixed wrong transaction status for contract deployments that fail due to out of gas on pre-Homestead transactions (aligned with the status reported by the chain: SUCCESS, even if no contract code is set)
  - Added more instrumentation around AccessList and DynamicFee transactions, removed some elements that were useless or could not be derived from other elements in the structure, ex: gasEvents
  - Added support for finalized block numbers (moved outside the proto-ethereum block, to the firehose bstream v2 block)
- There are no more "forked blocks" in the merged-blocks bundles:
  - The merged-blocks are therefore produced only after finality has passed (before The Merge, this means after 200 confirmations).
  - One-block files close to HEAD stay in the one-blocks store for longer.
  - The blocks that do not make it into the merged-blocks (forked out because of a re-org) are uploaded to another store (`common-forked-blocks-store-url`) and kept there for a while (to allow resolving cursors).
- This will require changes in most firehose clients
- A compatibility layer has been added to still support `sf.firehose.v1.Stream/Blocks`, but only for specific values of `ForkSteps` in the request: 'irreversible' or 'new+undo'
- The Firehose Blocks protocol is now under `sf.firehose.v2` (bumped from `sf.firehose.v1`):
  - Step type `IRREVERSIBLE` renamed to `FINAL`
  - `Blocks` request now only allows 2 modes regarding steps: `NEW,UNDO` and `FINAL` (gated by the `final_blocks_only` boolean flag)
  - Blocks that are sent out can have the combined step `NEW+FINAL` to prevent sending the same blocks over and over if they are already final
- Removed the Irreversible indices completely (because the merged-blocks only contain final blocks now)
- Deprecated the "Call" and "log" indices (`xxxxxxxxxx.yyy.calladdrsig.idx` and `xxxxxxxxxx.yyy.logaddrsig.idx`), now replaced by the "combined" index
xxxxxxxxxx.yyy.calladdrsig.idxandxxxxxxxxxx.yyy.logaddrsig.idx), now replaced by "combined" index - Moved out the
sfeth tools generate-...command to a new app that can be launched withsfeth start generate-combined-index[,...]
- All config via environment variables that started with `SFETH_` now starts with `FIREETH_`
- All logs now output on stderr instead of stdout like previously
- Changed `config-file` default from `./sf.yaml` to `""`, preventing failure without this flag
- Renamed `common-blocks-store-url` to `common-merged-blocks-store-url`
- Renamed `common-oneblock-store-url` to `common-one-block-store-url`, now used by firehose and relayer apps
- Renamed `common-blockstream-addr` to `common-live-blocks-addr`
- Renamed the `mindreader` application to `reader`
- Renamed all the `mindreader-node-*` flags to `reader-node-*`
- Added `common-forked-blocks-store-url` flag, used by merger and firehose
- Changed `--log-to-file` default from `true` to `false`
- Changed default verbosity level: now all loggers are `INFO` (instead of having most of them at `WARN`); `-v` will now activate all `DEBUG` logs
- Removed `common-block-index-sizes`, `common-index-store-url`
- Removed `merger-state-file`, `merger-next-exclusive-highest-block-limit`, `merger-max-one-block-operations-batch-size`, `merger-one-block-deletion-threads`, `merger-writers-leeway`
- Added `merger-stop-block`, `merger-prune-forked-blocks-after`, `merger-time-between-store-pruning`
- Removed `mindreader-node-start-block-num`, `mindreader-node-wait-upload-complete-on-shutdown`, `mindreader-node-merge-and-store-directly`, `mindreader-node-merge-threshold-block-age`
- Removed `firehose-block-index-sizes`, `firehose-irreversible-blocks-index-bundle-sizes`, `firehose-irreversible-blocks-index-url`, `firehose-realtime-tolerance`
- Removed `relayer-buffer-size`, `relayer-merger-addr`, `relayer-min-start-offset`
- If you depend on the proto file, update `import "sf/ethereum/type/v1/type.proto"` to `import "sf/ethereum/type/v2/type.proto"`
- If you depend on the proto file, update all occurrences of `sf.ethereum.type.v1.<Something>` to `sf.ethereum.type.v2.<Something>`
- If you depend on `sf-ethereum/types` as a library, update all occurrences of `github.com/streamingfast/firehose-ethereum/types/pb/sf/ethereum/type/v1` to `github.com/streamingfast/firehose-ethereum/types/pb/sf/ethereum/type/v2`
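Since these renames are mechanical, a scripted pass can handle most of a codebase. A sketch, demonstrated on a temporary file so it is self-contained; point the `grep` at your own repository and review the diff before committing (GNU `sed -i` assumed):

```shell
# Sketch: rewrite sf.ethereum.type.v1 / sf/ethereum/type/v1 references
# to their v2 equivalents in place (demo runs in a temp directory).
workdir=$(mktemp -d)
printf 'import "sf/ethereum/type/v1/type.proto";\n' > "$workdir/example.proto"
grep -rl 'sf[/.]ethereum[/.]type[/.]v1' "$workdir" \
  | xargs -r sed -i -e 's|sf/ethereum/type/v1|sf/ethereum/type/v2|g' \
                    -e 's|sf\.ethereum\.type\.v1|sf.ethereum.type.v2|g'
cat "$workdir/example.proto"
```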
- The `reader` requires a Firehose-instrumented Geth binary with instrumentation version 2.x (tagged `fh2`)
- Because of the changes in the Ethereum block protocol, an existing deployment cannot be migrated in place:
  - You must deploy firehose-ethereum v1.0.0 in a new environment (without any prior block or index data)
  - You can put this new deployment behind a gRPC load-balancer that routes `/sf.firehose.v2.Stream/*` and `/sf.firehose.v1.Stream/*` to your different versions
- Go through the list of changed "Flags and environment variables" and adjust your deployment accordingly:
  - Determine a (shared) location for your `forked-blocks`
  - Make sure that you set the `one-block-store` and `forked-blocks-store` correctly on all the apps that now require them
  - Add the `generate-combined-index` app to your new deployment instead of the `tools` command for call/log indices
- If you want to reprocess blocks in batches while you set up a "live" deployment:
  - Run your reader node from prior data (e.g. from a snapshot)
  - Set the `--common-first-streamable-block` flag to a 100-block-aligned boundary right after where this snapshot starts (use this flag on all apps)
  - Perform batch merged-blocks reprocessing jobs
  - When all the blocks are present, set the `common-first-streamable-block` flag back to 0 on your deployment to serve the whole range
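Picking the 100-block-aligned boundary is plain integer arithmetic; a sketch (the snapshot height `1234567` is a placeholder):

```shell
# Sketch: round a placeholder snapshot height up to the next 100-block
# boundary, suitable as the --common-first-streamable-block value.
snapshot_first_block=1234567
boundary=$(( (snapshot_first_block + 99) / 100 * 100 ))
echo "$boundary"   # 1234600
```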
- The `reader` requires a Firehose-instrumented Geth binary with instrumentation version 2.x (tagged `fh2`)
- The `reader` does NOT merge block files directly anymore: you need to run it alongside a `merger`:
  - Determine a `start` and `stop` block for your reprocessing job, aligned on a 100-block boundary right after your Geth data snapshot
  - Set `--common-first-streamable-block` to your start block
  - Set `--merger-stop-block` to your stop block
  - Set `--common-one-block-store-url` to a local folder accessible to both the `merger` and `reader` apps
  - Set `--common-merged-blocks-store-url` to the final (e.g. remote) folder where you will store your merged blocks
  - Run both apps like this: `fireeth start reader,merger --...`
- You can run as many batch jobs like this in parallel as you like to produce the merged blocks, as long as you have Geth data snapshots that start at those points
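Putting the flags above together, a single batch job could look like the following sketch; every path, bucket, and block number is a placeholder to adapt to your own snapshot and storage layout:

```shell
# Sketch only: placeholder values throughout; do not copy verbatim.
fireeth start reader,merger \
  --common-first-streamable-block=1000000 \
  --merger-stop-block=1100000 \
  --common-one-block-store-url=/data/one-blocks \
  --common-merged-blocks-store-url=gs://my-bucket/merged-blocks
```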
- Run batch jobs like this:

  ```shell
  fireeth start generate-combined-index \
    --common-blocks-store-url=/path/to/blocks \
    --common-index-store-url=/path/to/index \
    --combined-index-builder-index-size=10000 \
    --combined-index-builder-start-block=0 \
    [--combined-index-builder-stop-block=10000] \
    --combined-index-builder-grpc-listen-addr=:9000
  ```
- Added `tools firehose-client` command with filter/index options
- Added `tools normalize-merged-blocks` command to remove forked blocks from merged-blocks files (it cannot transform Ethereum blocks V1 into V2 because some fields are missing in V1)
- Added substreams server support in the firehose app (alpha) through the `--substreams-enabled` flag
- The firehose gRPC endpoint now supports requests compressed using `gzip` or `zstd`
- The merger no longer exposes the `PreMergedBlocks` endpoint over gRPC, only `HealthCheck` (the relayer does not need to talk to it)
- The `--firehose-genesis-file` flag is now set automatically on `reader` nodes if their `reader-node-bootstrap-data-url` config value is set to a `genesis.json` file
- Note to other Firehose implementors: we changed all command-line flags to fit the required/optional format described here: https://en.wikipedia.org/wiki/Usage_message
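For example (a sketch; the path is a placeholder), bootstrapping a `reader` from a genesis file no longer requires passing the genesis flag explicitly:

```shell
# Sketch: because the bootstrap URL points to a genesis.json file,
# the --firehose-genesis-file flag is filled in automatically.
fireeth start reader --reader-node-bootstrap-data-url=./genesis.json
```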
- Added a Prometheus boolean metric named `ready` with label `app` to all apps (firehose, merger, mindreader-node, node, relayer, combined-index-builder)
- Removed the `firehose-blocks-store-urls` flag (the feature of using multiple stores is now deprecated, as it caused confusion and issues with block caching); use `common-blocks-store-url` instead
- Fixed a problem with the S3 provider where the S3 API can return an empty filename (we now ignore empty-filename results at consume time)
- Fixed an issue where the merger could panic on a new deployment
- Fixed an issue where the `merger` would get stuck when too many (more than 2000) one-block files were lying around with block numbers below the current bundle's high boundary
- Renamed the 4 common `atm` flags to `blocks-cache`: `--common-blocks-cache-{enabled|dir|max-recent-entry-bytes|max-entry-by-age-bytes}`
- Fixed `tools check merged-blocks` block-hole detection behavior on missing ranges (bumped `sf-tools`)
- Fixed a deadlock issue related to S3 storage error handling (bumped `dstore`)
- Added `tools download-from-firehose` command to fetch blocks and save them locally as merged-blocks files
- Added `cloud-gcp://` auth module (bumped `dauth`)
- substreams-alpha client
- gke-pvc-snapshot backup module
- Fixed a potential panic in the `merger` on a new chain
- Fixed an issue where the `merger` would get stuck when too many (more than 2000) one-block files were lying around with block numbers below the current bundle's high boundary
- Renamed the 4 common `atm` flags to `blocks-cache`: `--common-blocks-cache-{enabled|dir|max-recent-entry-bytes|max-entry-by-age-bytes}`
- Fixed `tools check merged-blocks` block-hole detection behavior on missing ranges (bumped `sf-tools`)
- Added `tools download-from-firehose` command to fetch blocks and save them locally as merged-blocks files
- Added `cloud-gcp://` auth module (bumped `dauth`)
- The default text `encoder` used to encode log entries now emits the level when coloring is disabled.
- The default value for flag `--mindreader-node-enforce-peers` is now `""`; the previous default was useful only in development, when running a local `node-manager` as either the miner or a peering node.
- Added block data file caching (called `ATM`) to reduce the memory usage of components that keep block objects in memory.
- Added transforms `LogFilter`, `MultiLogFilter`, `CallToFilter`, and `MultiCallToFilter` to only return transaction traces that match the given logs or called addresses.
- Added support for irreversibility indexes in firehose to prevent replaying reorgs when streaming old blocks.
- Added support for log and call indexes to skip old blocks that do not match any transform filter.
- Updated all Firehose stack direct dependencies.
- Updated the confusing behavior of `--common-system-shutdown-signal-delay` and its interaction with gRPC connection draining in the `firehose` component, which sometimes prevented it from shutting down.
- An error is now reported if the flag `merge-threshold-block-age` is set way too low (< 30s).
- Removed some old components that are not required by the Firehose stack directly; the repository is now as lean as it can be.
- Fixed Firehose gRPC listening address over plain text.
- Fixed automatic merging of files within the `mindreader`; it is now much more robust than before.