The format is based on Keep a Changelog, and this project adheres to Semantic Versioning. See MAINTAINERS.md for instructions on keeping it up to date.
- Fix a panic (nil pointer) when skipping blocks via indexes on stores on tier2
- This new endpoint removes the need for complex "mangling" of the package on the client side.
- Instead of expecting `sf.substreams.v1.Modules` (with the client having to apply parameters, network, etc.), the `sf.substreams.rpc.v3.Request` now expects:
  - a `sf.substreams.v1.Package`
  - a `map<string, string>` of `params`
  - the `network` string

  which will all be applied to the package server-side.
- It returns the same object as the v2 endpoint, i.e. a stream of `sf.substreams.rpc.v2.Response`.
- It is added on top of the existing 'v2' endpoint, both being active at the same time.
- To enable it, operators simply need to ensure that their routing allows the `/sf.substreams.rpc.v3.Stream/*` path.
- Cached spkg files on the server will now contain protobuf definitions, simplifying debugging of user requests.
- Emitted metrics for requests can now be `sf.substreams.rpc.v3/Blocks` instead of always `sf.substreams.rpc.v2/Blocks`; make sure that your metering endpoint can support it.
Note: recent substreams clients will support both endpoints, first trying the v3 and automatically falling back to v2 if they hit a "404 Not Found" or "Not Implemented" error.
- Fixed a bug with BlockFilter: a skipped module would send BlockScopedData (in dev mode or near HEAD, to follow progress) with an empty module name, breaking some sinks. The module name was present when requesting a module dependent on that skipped module; now the module name is always included.
- Bumped to firehose-core v1.11.3
- Improved the panic message when the reader node encounters a block whose finality is higher than the block itself, to include `lib_num`, `block_num`, `distance`, and `max_distance` for easier debugging.
- Updated the `firehose-networks` dependency to `v0.2.2` (latest).
- Fixed the `common-one-block-store-url` flag not expanding environment variables in all apps.
- Updated Wasmtime runtime from v30.0.0 to v36.0.0, bringing performance improvements, inlining support, Component Model async implementation, and enhanced security features.
- Added WASM bindgen shims support for the Wasmtime runtime to handle WASM modules with wasm-bindgen imports (when the Substreams module binary is defined as type `wasm/rust-v1+wasm-bindgen-shims`).
- Added support for foundational-store (in wasmtime and wazero).
- Added foundational-store grpc client to substreams engine.
- Fixed module caching to properly handle modules with different runtime extensions.
- The 'paymentgateway' metering plugin was renamed to `tgm`; it now supports the `indexer-api-key` parameter.
- Concurrent streams and worker limits are now handled by the new session plugin, available under the `common-session-plugin` argument.
- The following flags were removed, now handled by that session plugin:
  - `substreams-tier1-global-worker-pool-address`
  - `substreams-tier1-global-request-pool-address`
  - `substreams-tier1-global-worker-pool-keep-alive-delay`
  - `substreams-tier1-global-request-pool-keep-alive-delay`
  - `substreams-tier1-default-max-request-per-user`
  - `substreams-tier1-default-minimal-request-life-time-second`
- To use thegraph.market as a session plugin, use `--common-session-plugin=tgm://session.thegraph.market:443?indexer-api-key={your-api-key}` (requires a specific indexer API key). See https://github.com/streamingfast/tgm-gateway/tree/develop/session for details on the various flags.
- To use simple local session management, use `--common-session-plugin=local://?max_sessions=30&max_sessions_per_user=3&max_workers_per_user=10&max_workers_per_session=10`. See https://github.com/streamingfast/dsession/tree/main/local for details on those flags.
- Note: the `max_sessions` parameter from the `common-session-plugin` is now also used to limit the number of firehose streams.
- If you were using a custom gRPC implementation for `--substreams-tier1-global-worker-pool-address` and `--substreams-tier1-global-request-pool-address` (ex: `localhost:9010`), simply use this value for the session plugin: `--common-session-plugin=tgm://localhost:9010?plaintext=true`; it is compatible.
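As a concrete sketch, the tgm session-plugin DSN can be assembled from its parts before being passed to the server. The API key value below is a placeholder; only the `--common-session-plugin` flag and the `tgm://` URL shape come from these notes:

```shell
# Assemble the session-plugin DSN (API key is a placeholder, not a real credential)
API_KEY="your-api-key"
SESSION_PLUGIN="tgm://session.thegraph.market:443?indexer-api-key=${API_KEY}"

# The flag as it would be passed to the tier1 server:
echo "--common-session-plugin=${SESSION_PLUGIN}"
```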
- Fix a slow memory leak around metering plugin on tier2
- Add a maximum execution time for a full tier2 segment. By default, this is 60 minutes. It will fail with `rpc error: code = DeadlineExceeded desc = request active for too long`. It can be configured with the `--substreams-tier2-segment-execution-timeout` flag.
- Fix `subscription channel at max capacity` error: when the LIVE channel is full (ex: slow module execution or slow client reader), the request will be continued from merged files instead of failing, and will gracefully recover if performance is restored.
- Improve the log message for 'request active for a long time', adding stats.
- Fix thread leak on filereader (tier1)
People using their own authentication layer will need to consider these changes before upgrading!
- Renamed config headers that come from the authentication layer:
  - `x-sf-user-id` renamed to `x-user-id` (from dauth module)
  - `x-sf-api-key-id` renamed to `x-api-key-id` (from dauth module)
  - `x-sf-meta` renamed to `x-meta` (from dauth module)
  - `x-sf-substreams-parallel-jobs` renamed to `x-substreams-parallel-workers`
- Allow decreasing `x-substreams-parallel-workers` through an HTTP header (the auth layer determines the upper bound).
- Detect the value for the 'stage layer parallel executor max count' based on the `x-plan-tier` header (removed `x-sf-substreams-stage-layer-parallel-executor-max-count` handling).
- Added the `tgm://auth.thegraph.market?indexer-api-key=<API_KEY>&reissue-jwt-max-age-secs=600` plugin, which allows an indexer to use The Graph Market as the authentication source. An API key with the special "indexer" feature is needed to allow repeated calls to the API without rate limiting (for key-based authentication and reissuance of "untrusted long-lived JWTs").
- Added mechanism to immediately cancel pending requests that are doing an 'external call' (ex: eth_call) on a given block when it gets forked out (UNDO because of a reorg).
- Fixed handling of invalid module kind: prevent heavy logging from recovered panic
- Errors considered deterministic (which are cached forever) are now suffixed: `<original message> (deterministic error)`.
(removed release with wrong substreams version)
- fix: eth_calls returning rpc error code -32003 (InvalidFEOpcode) will not retry forever
- [OPERATORS] Tier2 servers must be upgraded BEFORE tier1 servers
- tier2 servers will now stream outputs for the 'first segment', to speed up time to first block
- Return 'processed blocks' counter to client at the end of the request
- Progress notifications will only be sent every 500ms for the first minute, then reduce rate up to every 5 seconds (can be overridden per request)
- Added `dev_output_modules` to the protobuf request (if present, in dev mode, only send the output of the modules listed)
- Added `progress_messages_interval_ms` to the protobuf request (if present, overrides the rate of progress messages to that many milliseconds)
[Broken release, do not use]
- This release is a hotfix for a thread leak in substreams leading to a slow memory leak.
Rework the execout File read/write to improve memory efficiency:
- This reduces the RAM usage necessary to read and stream data to the user on tier1, as well as to read the existing execouts on tier2 jobs (in multi-stage scenarios)
- The cached execouts need to be rewritten to take advantage of this, since their data is currently not ordered: the system will automatically load and rewrite existing execouts when they are used.
- Code changes include:
  - New FileReader / FileWriter that "read as you go" or "write as you go"
  - No more 'KV' map attached to the File
  - Split the IndexWriter away from its dependencies on execoutMappers
  - The clock distributor now also reads "as you go", using a small "one-block-cache"
- Removed the `SUBSTREAMS_OUTPUT_SIZE_LIMIT_PER_SEGMENT` env var (since this is not a RAM issue anymore)
- Added the `uncompressed_egress_bytes` field to the `substreams request stats` log message
- add `--headers` flag to `fireeth tools poller` to allow authenticated calls to ETH_RPC providers
- add `--allow-empty-receipts-on-block-0` bool flag to work with tron-evm-mainnet
- add `--parallel-workers` int flag to allow increasing from the default (which is now 20 instead of 10)
- (dstore) Add storageClass query parameter for s3:// urls on stores (@fschoell)
- Update the firehose-beacon proto to include the new Electra spec in the 'well-known' protobuf definitions (@fschoell)
- Use The Graph's Network Registry to recognize chains by genesis blocks and fill the 'advertise' server on substreams/firehose
- Tier2 jobs now write mapper outputs "as they progress", preventing memory usage spikes when saving them to disk.
- Tier2 jobs now limit writing and loading mapper output files to a maximum size of 8GiB by default.
- Added the `SUBSTREAMS_OUTPUT_SIZE_LIMIT_PER_SEGMENT` (int) environment variable to control this new limit.
- Added the `SUBSTREAMS_STORE_SIZE_LIMIT` (uint64) env var to allow overriding the default 1GiB value.
- Added the `SUBSTREAMS_PRINT_STACK` (bool) env var to enable printing full stack traces when a caught panic occurs.
- Added the `SUBSTREAMS_DEBUG_API_ADDR` (string) environment variable to expose a "debug API" HTTP interface that allows blocking connections, running GC, and listing or canceling active requests.
- Prevent a deterministic failure on a module definition (mode, valueType, updatePolicy) from persisting when the issue is fixed in the substreams.yaml (streamingfast/substreams#621)
- Metering events on tier2 now bundled at the end of the job (prevents sending metering events for failing jobs)
- Added metering for: "processed_blocks" (block * number of stages where execution happened) and "egress_bytes"
- Substreams: properly classify eth_call errors as deterministic on erigon (`return data out of bounds` and `Reverted 0x.....`)
- Substreams: speed up DeleteByPrefix operations (5x perf improvement on some heavy substreams)
- Substreams: release `existingExecOuts` memory as blocks progress on a tier2 job
- Added missing `address` in the `SetCodeAuthorization` structure for proper recording of the EIP-7702 feature. This arrives in time for Mainnet, but Holesky, Sepolia, BSC Chapel, BSC Mainnet and Arbitrum Sepolia will need to be backfilled to fix the issue at a later time.
- (RAM+CPU) dedupe execution of modules with same hash but different name when computing dependency graph. (#619)
- (RAM) prevent memory usage burst on tier2 when writing mapper by streaming protobuf items to writer
- Tier1 requests will no longer error out with "service currently overloaded" because tier2 servers are ramping up
- Add `reader-node-firehose` app, which creates one-blocks by consuming blocks from an already existing Firehose endpoint. This can be used to set up an indexer stack without having to run an instrumented blockchain node, or to get redundancy from another Firehose provider.
- Bumped grpc-go lib to 1.72.0
- Now building `amd64` and `arm64` Docker images on push & release.
- Added support for Balance Change `REASON_REVERT`, needed by Optimism.
- Better documentation on versions of `Block` and known issues on version 3.
- Bump substreams to v1.15.2
- fix the 'quicksave' feature on substreams (incorrect block hash on quicksave)
- Save deterministic WASM failures in the module cache (under a file named `errors.0123456789.zst` at the failed block number), so further requests depending on this module at the same block can return the error immediately without re-executing the module.
- Fix `module_wasm_ext_duration` value in the 'substreams request stats' log (it was always 0 since switching to wasmtime).
- Fix a panic when a module times out on tier2 while being executed from cached outputs
- eth_call timeout logs now properly show 0x-prefixed values
- Add environment variables to control retry behavior: `SUBSTREAMS_WORKER_MAX_RETRIES` (default 10) and `SUBSTREAMS_WORKER_MAX_TIMEOUT_RETRIES` (default 2), changing from the previous defaults (720 and 3). The timeout-retries value is the number of retries specifically applied when block execution times out (ex: because of external calls).
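For reference, a minimal sketch of setting these variables before starting the server process; the values shown are simply the new defaults stated above:

```shell
# New defaults, set explicitly; export before launching the tier1/tier2 process
export SUBSTREAMS_WORKER_MAX_RETRIES=10          # total retries per worker job
export SUBSTREAMS_WORKER_MAX_TIMEOUT_RETRIES=2   # retries specific to block-execution timeouts
echo "${SUBSTREAMS_WORKER_MAX_RETRIES} ${SUBSTREAMS_WORKER_MAX_TIMEOUT_RETRIES}"
```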
- The mechanism to slow down processing segments "ahead of blocks being sent to user" has been disabled on "noop-mode" requests, since these requests are used to pre-cache data and should not be slowed down.
- The "number of segments ahead" in this mechanism has been increased from `number of parallel workers` to `number of parallel workers * 1.5`.
- Tier2 now returns gRPC error codes: `DeadlineExceeded` when it times out, and `ResourceExhausted` when a request is rejected due to overload.
- Tier1 now correctly reports tier2 job outcomes in the `substreams request stats`.
- Added jitter in "retry" logic to prevent all workers from retrying at the same time when tier2 servers are overloaded.
- Added RPC code `-32600` as a deterministic error; it happens when the JSON-RPC request itself is malformed.
- Fixed `runtime error: slice bounds out of range` error on heavy memory usage with the wasmtime engine.
- Added a validation on a module for the existence of 'triggering' inputs: the server will now fail with a clear error message when the only available inputs are stores used with mode 'get' (not 'deltas'), instead of silently skipping the module on every block.
- Fixed a bug where the tier1 would not catch the tier2 'module execution timeout' error; improved error messages related to timeouts during eth_call.
- Added a mechanism for 'production-mode' requests where the tier1 will not schedule tier2 jobs over { max_parallel_subrequests } segments above the current block being streamed to the user. This will ensure that a user slowly reading blocks 1, 2, 3... will not trigger a flood of tier2 jobs for higher blocks, let's say 300_000_000, that might never get read.
- Improved connection draining on shutdown: Now waits for the end of the 'shutdown-delay' before draining and refusing new connections, then waits for 'quicksaves' and successful signaling of clients, up to a max of 30 sec.
- Added information about the number of blocks that need to be processed for a given request in the `sf.substreams.rpc.v2.SessionInit` message.
- Added an optional field `limit_processed_blocks` to the `sf.substreams.rpc.v2.Request`. When set to a non-zero value, the server will reject a request that would process more blocks than the given value with the `FailedPrecondition` gRPC error code.
- Improved error messages when a module execution times out on a block (ex: due to a slow external call); now returns a `DeadlineExceeded` Connect/gRPC error code instead of an `Internal`. Removed 'panic' from the wording.
- In the 'substreams request stats' log, added fields: `remote_jobs_completed`, `remote_blocks_processed` and `total_uncompressed_read_bytes`.
- Fix another `cannot resolve 'old cursor' from files in passthrough mode -- not implemented` bug when receiving a request in production-mode with a cursor that is below the "linear handoff" block.
- Implement "QuickSave" feature to save the state of "live running" substreams stores when shutting down, and then resume processing from that point if the cursor matches.
- Added flag `substreams-tier1-quicksave-store` to enable quicksave when non-empty (requires `--common-system-shutdown-signal-delay` to be set to a long enough value to save the in-flight stores).
- Rust modules will now be executed with `wasmtime` by default instead of `wazero`.
  - Prevents the whole server from stalling in certain memory-intensive operations in wazero.
  - Speed improvement: cuts the execution time in half in some circumstances.
  - Wazero is still used for modules with `wbindgen` and modules compiled with `tinygo`.
  - Set env var `SUBSTREAMS_WASM_RUNTIME=wazero` to revert to the previous behavior.
The Ethereum block model has been updated to account for the upcoming Prague fork. Namely, we added support for the new SetCode transaction type, added the extracted SetCodeAuthorization elements from the transaction, and added new gas changes that were introduced in the hard fork.
Also, the totalDifficulty field is now deprecated; it has been removed entirely from the geth codebase, which means future reprocessing of data won't be able to populate that field anymore. If you used that field somehow, you should stop using it. At some point we will remove the field entirely.
Also, from Prague hard-fork and onward, the Block model will now switch to version 4 of the block model (a.k.a Firehose Ethereum Block 3.0).
This means that for a given network, block.number < Prague, block will be using version 3 (a.k.a Firehose Ethereum Block 2.3) and when
block.number >= Prague, it will be version 4. This is deterministic per network as the Prague block is deterministic.
This does not change the structure of the various elements; everything stays the same in that aspect, so the version 4 model is backward compatible. What the new version changes:
- Does not populate the `accountCreations` field anymore; this was bogus from day 1 and should never be used.
- Fixed the `executedCode` field to be more accurate: it is now set as soon as one opcode is executed, and not otherwise.
- The root call's `BeginOrdinal` is now fixed and not always 0.
- Ordinals in the presence of system calls are now correctly ordered.
- The `returnData` is now properly populated.
- The `keccakPreimage` data being "." is now fixed.
- The call's `input` field is now properly populated on contract creation; it was omitted before.
- There are new gas changes being recorded now, mainly for a full view of how gas is allocated, consumed and returned.
For the upcoming Prague hard forks (BNB, Holesky, Sepolia, Mainnet), you will start using the geth Firehose 3.0 version, so use our Firehose-enabled releases suffixed with `-fh3.0`.
This new Firehose 3.0 geth tracer is built on the new geth Core Tracing API introduced in Geth 1.14. This new version changes how
one must start the geth binary.
So for the Holesky hard fork, you will need to use https://github.com/streamingfast/go-ethereum/releases/tag/geth-v1.15.2-fh3.0. Here is what you need to do when you update your reader-node's `reader-node-arguments` field:
- Remove `--firehose-enabled` and any flag starting with `--firehose-...`.
- Add the `--vmtrace=firehose` flag, which activates Firehose output (Important: do not miss this change, otherwise you will not process new blocks; we will make it the default soon).
- Add the `--syncmode=full` flag, which is not set automatically anymore.
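Putting those changes together, a `reader-node-arguments` value might look like the following sketch; the network selector and data directory are placeholders, not part of these release notes — only `--syncmode=full` and `--vmtrace=firehose` come from the instructions above:

```
--holesky --datadir=/data/geth --syncmode=full --vmtrace=firehose
```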
- Integrated the `GlobalRequestPool` service in the `Tier1App` to manage global request pooling.
- Integrated the `GlobalWorkerPool` service in the `Tier1App` to manage global worker pooling.
- Added flag `substreams-tier1-global-worker-pool-address`: the address of the global worker pool to use for the substreams tier1 (disabled if empty).
- Added flag `substreams-tier1-global-worker-pool-keep-alive-delay`: delay between two keep-alive calls to the global worker pool (default is 25s).
- Added flag `substreams-tier1-global-request-pool-keep-alive-delay`: delay between two keep-alive calls to the global worker pool for requests (default is 25s).
- Added flag `substreams-tier1-default-max-request-per-user`: default max requests per user, used if the global worker pool is not reachable (default is 5).
- Added flag `substreams-tier1-default-minimal-request-life-time-second`: default minimal request lifetime, used if the global worker pool is not reachable (default is 180).
- Limit parallel execution of a stage's layer: previously, the engine executed all modules in a stage's layer in parallel. We now change that behavior: development mode will from now on execute everything sequentially, and production mode will limit parallelism to 2 (hard-coded) for now. The auth plugin can control that value dynamically by providing a trusted header `X-Sf-Substreams-Stage-Layer-Parallel-Executor-Max-Count`.
- Fixed a regression since "v1.7.3" where the SkipEmptyOutput instruction was ignored in substreams mappers
- Add shared cache for tier1 execution near HEAD, to prevent multiple tier1 instances from reprocessing the same module on the same block when it comes in (ex: foundational modules)
- Improved fetching of state caches on tier1 requests to speed up "time to first data"
- make 'compare-blocks' command support one-blocks stores as well as merged-blocks
- Bump `substreams` lib to `v1.12.3`:
  - Improved logging of requests beginning/end
  - Improved `noop` mode (now sends less data)
- Fixed `fireeth tools geth enforce-peers --once` shorthand flag registration colliding with `fireeth tools -o` (for `--output`). This means the `fireeth tools geth enforce-peers` command does not accept `-o` anymore for "once"; if you were using it, replace it with `--once`.
- Fixed `substreams-tier2` not setting itself ready correctly on startup since `v2.9.0`.
- Added support for `--output=bytes` mode, which prints the chain-specific Protobuf block as bytes; the encoding of the printed bytes string is determined by `--bytes-encoding` (`hex` by default).
- Added back `-o` as shorthand for `--output` in `firecore tools ...` sub-commands.
- Add back `grpc.health.v1.Health` service to `firehose` and `substreams-tier1` services (regression in 2.9.0)
- Give precedence to the tracing header `X-Cloud-Trace-Context` over `Traceparent` to prevent user systems' trace IDs from leaking past a GCP load-balancer
- Reader Node Manager HTTP API now accepts `POST http://localhost:10011/v1/restart<?sync=true>` to restart the underlying reader node binary sub-process. This is an alias for `/v1/reload`.
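A hypothetical invocation sketch; the `curl` call itself is illustrative, while the route, port and `sync` parameter come from the note above:

```shell
# Build the restart URL; sync=true makes the call wait for the restart to complete
MANAGER="http://localhost:10011"
URL="${MANAGER}/v1/restart?sync=true"
echo "$URL"
# would then be invoked as: curl -X POST "$URL"
```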
- Enhanced `fireeth tools print merged-blocks` with various small quality-of-life improvements:
  - Now accepts a block range instead of a single start block.
  - Passing a single block as the block range will print this single block alone.
  - The block range is now optional, defaulting to run until there are no more files to read.
  - It's possible to pass a merged-blocks file directly, with or without an optional range.
Important
This release will reject firehose connections from clients that don't support GZIP or ZSTD compression. Use `--firehose-enforce-compression=false` to keep the previous behavior, then check the logs for incoming Substreams Blocks request logs with the value `compressed: false` to track users who are not using compressed HTTP connections.
Important
This release removes the old `sf.firehose.v1` protocol (replaced by `sf.firehose.v2` in 2022; this should not affect any reasonably recent client).
- Add support for ConnectWeb firehose requests.
- Always use gzip compression on firehose requests for clients that support it (instead of always answering with the same compression as the request).
- The `substreams-tier1` app now has two new configuration flags, named respectively `substreams-tier1-active-requests-soft-limit` and `substreams-tier1-active-requests-hard-limit`, helping better load balance active requests across a pool of `tier1` instances.

  The `substreams-tier1-active-requests-soft-limit` limits the number of active client requests that a tier1 accepts before starting to report itself as 'unready' within the health check endpoint. A limit of 0 or less means no limit. This is useful to load balance active requests more easily across a pool of tier1 instances: when an instance reaches the soft limit, it becomes unready from the load balancer's standpoint. The load balancer in return removes it from the list of available instances, and new connections are routed to the remaining instances, spreading the load.

  The `substreams-tier1-active-requests-hard-limit` limits the number of active client requests that a tier1 accepts before rejecting incoming gRPC requests with the 'Unavailable' code and setting itself as unready. A limit of 0 or less means no limit. This is useful to prevent the tier1 from being overwhelmed by too many requests; most clients auto-reconnect on the 'Unavailable' code, so they should end up on another tier1 instance, assuming you have proper auto-scaling of the number of instances available.
- The `substreams-tier1` app now exposes a new Prometheus metric `substreams_tier1_rejected_request_counter` that tracks rejected requests. The counter is labelled by the gRPC/ConnectRPC returned code (`ok` and `canceled` are not considered rejected requests).
- The `substreams-tier2` app now exposes a new Prometheus metric `substreams_tier2_rejected_request_counter` that tracks rejected requests. The counter is labelled by the gRPC/ConnectRPC returned code (`ok` and `canceled` are not considered rejected requests).
- Properly accept and compress responses with `gzip` for browser HTTP clients using ConnectWeb with the `Accept-Encoding` header.
- Allow setting the subscription channel max capacity via the `SOURCE_CHAN_SIZE` env var (default: 100).
- Fix an issue preventing proper detection of gzip compression when multiple headers are set (ex: python grpc client)
- Fix an issue preventing some tier2 requests on last-stage from correctly generating stores. This could lead to some missing "backfilling" jobs and slower time to first block on reconnection.
- Fix a thread leak on cursor resolution resulting in bad counter for active connections
- Add support for zstd encoding on server
Note
This release will reject connections from clients that don't support GZIP compression. Use `--substreams-tier1-enforce-compression=false` to keep the previous behavior, then check the logs for incoming Substreams Blocks request logs with the value `compressed: false` to track users who are not using compressed HTTP connections.
- Fix broken `tools poller` command in v2.8.2
Warning
Do NOT use this version with `tools poller`: a flag issue prevents the poller from starting up. It is recommended that you upgrade to v2.8.3 ASAP.
Note
This release will reject connections from clients that don't support GZIP compression. Use `--substreams-tier1-enforce-compression=false` to keep the previous behavior, then check the logs for incoming Substreams Blocks request logs with the value `compressed: false` to track users who are not using compressed HTTP connections.
- Bump firehose-core to v1.6.8
- Substreams: add `--substreams-tier1-enforce-compression` to reject connections from clients that do not support GZIP compression
- Substreams performance: reduced the number of mallocs (patching some third-party libraries)
- Substreams performance: removed heavy tracing (that wasn't exposed to the client)
- Fixed `--reader-node-line-buffer-size` flag that was not being respected in the reader-node-stdin app
- poller: add `--max-block-fetch-duration`
- `firehose-grpc-listen-addr` and `substreams-tier1-grpc-listen-addr` flags now accept comma-separated addresses (allows listening as plaintext and snakeoil-ssl at the same time, or on specific IP addresses)
- rpc-poller: fix fetching the first block on an endpoint (was not following the cursor, failing unnecessarily on non-archive nodes)
- Adding `requests_hash`, which was added by EIP-7685
- Adding nil safety check on the `CombinedFilter` and when looping over the transaction_trace receipts
- Bump `substreams` and `dmetering` to latest version, adding the `outputModuleHash` to the metering sender.
Note: All caches for stores using the updatePolicy `set_sum` (added in substreams v1.7.0), and modules that depend on them, will need to be deleted, since they may contain bad data.
- Fix bad data in stores using the `set_sum` policy: squashing of store segments incorrectly "summed" some values that should have been "set" when the last event for a key on this segment was a "sum"
- Fix small bug making some requests in development-mode slow to start (when starting close to the module initialBlock with a store that doesn't start on a boundary)
- Fixed an(other) issue where multiple stores running on the same stage with different initialBlocks would fail to progress (and hang)
- Fix bug where some invalid cursors could be sent (with 'LIB' being above the block being sent) and add safeguard/logging if the bug appears again
- Fix panic in the whole tier2 process when stores go above the size limit while being read from "kvops" cached changes
- Fix "cannot resolve 'old cursor' from files in passthrough mode" error on some requests with an old cursor
- Fix handling of 'special case' substreams module with only "params" as its input: should not skip this execution (used in graph-node for head tracking)
-> Empty files in the module cache with hash `d3b1920483180cbcd2fd10abcabbee431146f4c8` should be deleted for consistency.
- [Operator] The flag `--advertise-block-id-encoding` now accepts shorter forms: `hex`, `base64`, etc. The older longer form `BLOCK_ID_ENCODING_HEX` is still supported, but we suggest using the shorter form from now on.
Note: Since a bug that affected substreams with "skipping blocks" was corrected in this release, any previously produced substreams cache should be considered possibly corrupted and eventually replaced.
- Substreams: fix bad handling of modules with multiple inputs when only one of them is filtered, resulting in bad outputs in production-mode.
- Substreams: fix stalling on some substreams with stores and mappers with different start block numbers on the same stage
- Substreams: fix 'development mode' and LIVE mode executing some modules that should be skipped
- Bump substreams to v1.10.0
- Bump firehose-core to v1.6.1
- Add `sf.firehose.v2.EndpointInfo/Info` service on Firehose and `sf.substreams.rpc.v2.EndpointInfo/Info` to Substreams endpoints. This involves the following new flags:
  - `advertise-chain-name`: canonical name of the chain according to https://thegraph.com/docs/en/developing/supported-networks/ (required, unless it is in the "well-known" list)
  - `advertise-chain-aliases`: alternate names for that chain (optional)
  - `advertise-block-features`: list of features describing the blocks (optional)
  - `ignore-advertise-validation`: runtime checks of chain name/features/encoding against the genesis block will no longer cause the server to wait or fail
- Add a well-known list of chains (hard-coded in `wellknown/chains.go`) to help automatically determine the 'advertise' flag values. Users are encouraged to propose Pull Requests to add more chains to the list.
- The new info endpoint adds a mandatory fetch of the first streamable block on startup, with a failure if no block can be fetched after 3 minutes when you are running the `firehose` or `substreams-tier1` service. It validates the following on a well-known chain:
  - If the first-streamable-block Num/ID matches the genesis block of a known chain, e.g. `matic`, it will refuse a value for `advertise-chain-name` other than `matic` or one of its aliases (`polygon`)
  - If the first-streamable-block does not match any known chain, it will require `advertise-chain-name` to be non-empty
- Substreams: add `--common-tmp-dir` flag to activate local caching of pre-compiled WASM modules through a wazero v1.8.0 feature (performance improvement on WASM compilation)
- Substreams: revert module hash calculation from `v2.6.5` when using a non-zero firstStreamableBlock. Hashes will now be the same even if the chain's first streamable block affects the initialBlock of a module.
- Substreams: add `--substreams-block-execution-timeout` flag (default 3 minutes) to prevent requests from stalling. Timeout errors are returned to the client, who can decide to retry.
- Bump substreams to v1.9.3: fix high CPU usage on tier1 caused by a bad error handling
- Bump substreams to v1.9.2: Prevent Noop handler from sending outputs with 'Stalled' step in cursor (which breaks substreams-sink-kv)
- Bump firehose-core to v1.5.6: add `--reader-node-line-buffer-size` flag and bump the default value from 100M to 200M to get over the huge block 278208000 on Solana
- Fixed a bug in the blockfetcher which could cause transaction receipts to be nil
- Fixed a bug in substreams where chains with non-zero first-streamable-block would cause some substreams to hang. Solution changes the 'cached' hashes for those substreams.
- Fix a bug introduced in v1.6.0 that could result in a corrupted store "state" file if all the "outputs" were already cached for a module in a given segment (rare occurrence)
- We recommend clearing your substreams cache after this upgrade and re-processing or validating your data if you use stores.
- Expose a new intrinsic to modules: `skip_empty_output`, which causes the module output to be skipped if it has zero bytes. (Watch out: a protobuf object with all its default values serializes to zero bytes.)
- Improve scheduling order (faster time to first block) for substreams with multiple stages when starting mid-chain
- fix "hub" not recovering on certain disconnections in relayer/firehose/substreams (scenarios that previously required a full restart)
- Bumped firehose-core to v1.5.2 and substreams v1.8.0
- Added substreams back-filler to populate cache for live requests when the blocks become final
- Fixed: truncate very long details on error messages to prevent them from disappearing when behind a (misbehaving) load-balancer
- Bumped firehose-core to v1.5.1 and substreams to v1.7.3
- Improved bootstrapping from live blocks for chains with very slow or very fast blocks (affects relayer, firehose and substreams tier1)
- Substreams: fixed slow responses close to HEAD in production-mode
- Substreams engine is now able to run Rust code that depends on `solana_program` in Solana land (to decode) and `alloy`/`ether-rs` in Ethereum land.

  When used in a `wasm32-unknown-unknown` context, those libraries pull in a bunch of `wasm-bindgen` imports in the resulting Substreams Rust code, imports that led to runtime errors because the Substreams engine didn't know about those special imports until today.

  The Substreams engine is now able to "shim" those `wasm-bindgen` imports, enabling you to run code that depends on libraries like `solana_program` and `alloy`/`ether-rs` which are known to pull in those imports. This works as long as you do not actually call those special imports; normal usage of those libraries does not call them. If they are called, the WASM module will fail at runtime and stop the Substreams module from going forward.
To enable this feature, you need to explicitly opt in by appending `+wasm-bindgen-shims` at the end of the binary's type in your Substreams manifest:
```yaml
binaries:
  default:
    type: wasm/rust-v1
    file: <some_file>
```

to become:

```yaml
binaries:
  default:
    type: wasm/rust-v1+wasm-bindgen-shims
    file: <some_file>
```

- Substreams clients now enable gzip compression over the network (already supported by servers).
- Substreams binary type can now optionally be composed of runtime extensions by appending `+<extension>,[<extensions...>]` at the end of the binary type. Extensions are `key[=value]` pairs that are runtime specific.

  > [!NOTE]
  > If you are a library author parsing generic Substreams manifest(s), you will now need to handle that possibility in the binary type. If you were reading the field without any processing, you don't have to change anything.
- bump firehose-core to v1.4.2
- execout: preload only one file instead of two; log if undeleted caches are found
- execout: add environment variable `SUBSTREAMS_DISABLE_PRELOAD_EXEC_FILES` to disable file preloading
- Revert sanity check to support the special case of a substreams with only 'params' as input. This allows a chain-agnostic event to be sent, along with the clock.
- Fix error handling when resolved start-block == stop-block and stop-block is defined as non-zero
> [!NOTE]
> Upgrading will require changing the tier1 and tier2 versions concurrently, as the internal protocol has changed.
- Index Modules and Block Filter now supported. See https://github.com/streamingfast/substreams-foundational-modules for an example implementation
- Various scheduling and performance improvements
- env variable `SUBSTREAMS_WORKERS_RAMPUP_TIME` changed from `4s` to `0`. Set it to `4s` to keep the previous behavior.
- The `otelcol://` tracing protocol is no longer supported.
- Fixed a crash when an `eth_call` batch is of length 0 and a retry is attempted.
- Allow stores to write to stores with out-of-order ordinals (they will be reordered at the end of the module execution for each block)
- Fix issue in substreams-tier2 causing some files to be written to the wrong place sometimes under load, resulting in some hanging requests
- The `fireeth tools download-from-firehose` command now respects its documentation when doing `--help`; the correct invocation is now `fireeth tools download-from-firehose <endpoint> <start>:<end> <output_folder>`.
- The `fireeth tools download-from-firehose` command has been improved to work with the new Firehose `sf.firehose.v2.BlockMetadata` field: if the server sends this new field, the tool works on any chain. If the server you are reaching is not recent enough, the tool falls back to the previous logic. All StreamingFast endpoints should be compatible.
- Firehose responses (both single block and stream) now include the `sf.firehose.v2.BlockMetadata` field. This new field contains the chain-agnostic fields we hold about any block of any chain.
- bump substreams to v1.5.5 with fix in wazero to prevent process freezing on certain substreams
- Added support for Firehose reader format 2.5, which will be required for BSC 1.4.5+.
- Updated block model to add `BalanceChange#Reason.REWARD_BLOB_FEE` for the BSC Tycho hard-fork.
- fix a possible panic() when a request is interrupted during the file loading phase of a squashing operation.
- fix a rare possibility of stalling if only some full-KV store caches were deleted, but further segments were still present.
- fix stats counters for store operations time
- fix memory leak on substreams execution (by bumping wazero dependency)
- remove the need for substreams-tier1 blocktype auto-detection
- fix missing error handling when writing output data to files. This could result in tier1 request just "hanging" waiting for the file never produced by tier2.
- fix handling of dstore error in tier1 'execout walker' causing stalling issues on S3 or on unexpected storage errors
- increase number of retries on storage when writing states or execouts (5 -> 10)
- prevent slow squashing when loading each segment from full KV store (can happen when a stage contains multiple stores)
- Fix a context leak causing tier1 responses to slow down progressively
- fix thread leak in metering affecting substreams
- revert a substreams scheduler optimisation that causes slow restarts when close to head
- add `substreams_tier2_active_requests` and `substreams_tier2_request_counter` prometheus metrics
- Substreams bumped to v1.5.0: see https://github.com/streamingfast/substreams/releases/tag/v1.5.0 for details.
- A single substreams-tier2 instance can now serve requests for multiple chains or networks. All network-specific parameters are now passed from Tier1 to Tier2 in the internal ProcessRange request.
- This allows you to better use your computing resources by pooling all the networks together.
> [!IMPORTANT]
> Since the tier2 services will now get the network information from the tier1 request, you must make sure that the file paths and network addresses are the same for both tiers.
>
> For example: if `--common-merged-blocks-store-url=/data/merged` is set on tier1, make sure the merged blocks are also available from tier2 under the path `/data/merged`.
>
> The flags `--substreams-state-store-url`, `--substreams-state-store-default-tag`, `--common-merged-blocks-store-url`, `--substreams-rpc-endpoints` and `--substreams-rpc-gas-limit` are now ignored on tier2.
>
> The flag `--common-first-streamable-block` should be set to `0` to accommodate every chain.
>
> Non-Ethereum chains can query a firehose-ethereum tier2, but the opposite is not true, since only firehose-ethereum implements the `eth_call` WASM extension.
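To illustrate the path-alignment requirement, here is a hypothetical pair of configs (store URLs and the RPC endpoint are placeholders): tier1 carries the network-specific flags, while tier2 only needs to resolve the same storage paths:

```yaml
# tier1: network-specific flags live here and are forwarded to tier2
start:
  args:
    - substreams-tier1
  flags:
    common-merged-blocks-store-url: /data/merged
    substreams-state-store-url: /data/states
    substreams-rpc-endpoints: http://eth-node:8545
---
# tier2: the flags above are ignored here, but /data/merged and
# /data/states must point to the same underlying storage
start:
  args:
    - substreams-tier2
  flags:
    common-first-streamable-block: 0
```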
> [!TIP]
> The cached 'partial' files no longer contain the "trace ID" in their filename, preventing accumulation of "unsquashed" partial store files. The system will delete files under `{modulehash}/state` named in the format `{blocknumber}-{blocknumber}.{hexadecimal}.partial.zst` when it runs into them.
- All module outputs are now cached. (previously, only the last module was cached, along with the "store snapshots", to allow parallel processing).
- Tier2 will now read back mapper outputs (if they exist) to prevent running them again. Additionally, it will not read back the full blocks if its inputs can be satisfied from existing cached mapper outputs.
- Tier2 will skip processing completely if it's processing the last stage and the `output_module` is a mapper that has already been processed (e.g. when multiple requests are indexing the same data at the same time)
- Tier2 will skip processing completely if it's processing a stage where all the stores and outputs have been processed and cached
- Scheduler modification: a stage now waits for the previous stage to have completed the same segment before running, to take advantage of the cached intermediate layers.
- Improved file listing performance for Google Storage backends by 25%!
> [!TIP]
> Concurrent requests on the same module hashes may benefit from the other requests' work to a certain extent (up to 75%!) -- the very first request does most of the work for the other ones.

> [!TIP]
> More caches will increase disk usage and there is no automatic removal of old module caches. The operator is responsible for deleting old module caches.
- Readiness metric for the Substreams tier1 app is now named `substreams_tier1` (was mistakenly called `firehose` before).
- Added back the readiness metric for the Substreams tier2 app (named `substreams_tier2`).
- Added metric `substreams_tier1_active_worker_requests`, which gives the number of active Substreams worker requests a tier1 app is currently making against tier2 nodes.
- Added metric `substreams_tier1_worker_request_counter`, which gives the total number of Substreams worker requests a tier1 app made against tier2 nodes.
- Added `--merger-delete-threads` to customize the number of threads the merger will use to delete files. It's recommended to increase this to 25 or higher when using Ceph as the S3 storage provider (due to performance issues with deletes, the merger might otherwise not be able to delete one-block files fast enough).
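As a config-file sketch (the value 25 follows the Ceph recommendation above; tune it for your own storage backend):

```yaml
start:
  args:
    - merger
  flags:
    # more delete threads help slow-deleting backends such as Ceph S3
    merger-delete-threads: 25
```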
- Fixed `tools check merged-blocks` default range when `-r <range>` is not provided to now be `[0, +∞]` (was previously `[HEAD, +∞]`).
- Fixed `tools check merged-blocks` to be able to run without a block range provided.
- Added API-key-based authentication to `tools firehose-client` and `tools firehose-single-block-client`; specify the value through environment variable `FIREHOSE_API_KEY` (you can use flag `--api-key-env-var` to change the variable's name to something other than `FIREHOSE_API_KEY`).
- Fixed `tools check merged-blocks` examples using a block range (the range should be specified as `[<start>]?:[<end>]`).
- Added `--substreams-tier2-max-concurrent-requests` to limit the number of concurrent requests to the tier2 Substreams service.
- Added traceID for RPCCalls
- BlockFetcher: added support for WithdrawalsRoot, BlobGasUsed, BlobExcessGas and ParentBeaconRoot fields when fetching blocks from RPC.
- Substreams: added support for `substreams-tier2-max-concurrent-requests` flag to limit the number of concurrent requests to tier2
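A hypothetical config fragment for the new limit (the value 50 is an arbitrary example; size it to your tier2 capacity):

```yaml
start:
  args:
    - substreams-tier2
  flags:
    # refuse new work beyond 50 concurrent requests
    substreams-tier2-max-concurrent-requests: 50
```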
> [!WARNING]
> This release deprecates the "RPC Cache (for `eth_calls`)" feature of substreams: it has been turned off by default and will not be supported in future releases.
> The RPC cache was a little-known feature that cached all `eth_call` responses by default and loaded them on each request.
> It is being deprecated because it has a negative impact on global performance.
> If you want to cache your `eth_call` responses, you should do it in a specialized proxy instead of having substreams manage this.
> Until the feature is completely removed, you can keep the previous behavior by setting the `--substreams-rpc-cache-store-url` flag to a non-empty value (its previous default value was `{data-dir}/rpc-cache`).
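If you do want to keep the deprecated cache until it is removed, the previous default can be restored explicitly (a sketch; the URL is the former default stated above):

```yaml
start:
  flags:
    # re-enable the deprecated eth_call RPC cache at its old default location
    substreams-rpc-cache-store-url: "{data-dir}/rpc-cache"
```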
- Performance: prevent reprocessing jobs when there is only a mapper in production mode and everything is already cached
- Performance: prevent "UpdateStats" from running too often and stalling other operations when running with a high parallel jobs count
- Performance: fixed bug in scheduler ramp-up function sometimes waiting before raising the number of workers
- Added the output module's hash to the "incoming request" log
- Substreams RPC: add `--substreams-rpc-gas-limit` flag to allow overriding the default of 50M. Arbitrum chains behave better with a value of `0`, avoiding `intrinsic gas too low (supplied gas 50000000)` errors.
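For an Arbitrum chain, the override could look like this (hypothetical fragment; `0` lifts the gas cap as described above):

```yaml
start:
  args:
    - substreams-tier1
  flags:
    # avoid 'intrinsic gas too low (supplied gas 50000000)' on Arbitrum
    substreams-rpc-gas-limit: "0"
```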
- The `reader-node-bootstrap-url` flag gained the ability to be bootstrapped from a `bash` script.

  If the bootstrap URL is of the form `bash:///<path/to/script>?<parameters>`, the bash script at `<path/to/script>` will be executed. The script receives the resolved reader node variables as environment variables of the form `READER_NODE_<VARIABLE_NAME>`. The fully resolved node arguments (from `reader-node-arguments`) are passed as args to the bash script. The accepted query parameters are:
  - `arg=<value>`: pass as extra argument to the script, prepended to the list of resolved node arguments
  - `env=<key>%3d<value>`: pass as extra environment variable as `<key>=<value>` with key being upper-cased (multiple(s) allowed)
  - `env_<key>=<value>`: pass as extra environment variable as `<key>=<value>` with key being upper-cased (multiple(s) allowed)
  - `cwd=<path>`: change the working directory to `<path>` before running the script
  - `interpreter=<path>`: use `<path>` as the interpreter to run the script
  - `interpreter_arg=<arg>`: pass `<arg>` as an argument to the interpreter before the script path (multiple(s) allowed)
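Putting the query parameters together, a bootstrap entry could look like this (hypothetical script path, argument and variable name):

```yaml
# runs /opt/bootstrap.sh with one extra leading argument and one
# extra environment variable SNAPSHOT_URL=s3://bucket/latest
reader-node-bootstrap-url: "bash:///opt/bootstrap.sh?arg=--restore&env_SNAPSHOT_URL=s3://bucket/latest"
```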
> [!NOTE]
> The `bash:///` script support is currently experimental and might change in upcoming releases; any behavior changes will be clearly documented here.
- Fix JSON decoding in the client tools (firehose-client, print merged-blocks, etc.).
- The block decoding to JSON is broken in the client tools (firehose-client, print merged-blocks, etc.). Use version v2.3.1
- Fix block poller panic on v2.3.2
- This release has a broken RPC poller component. Upgrade to v2.3.3.
- The block decoding to JSON is broken in the client tools (firehose-client, print merged-blocks, etc.). Use version v2.3.1
- Add missing metering events for `sf.firehose.v2.Fetch/Block` responses.
- Changed default polling interval in 'continuous authentication' from 10s to 60s; added 'interval' query param to the URL.
- Fixed bug in scheduler ramp-up function sometimes waiting before raising the number of workers
- Fixed load-balancing from tier1 to tier2 when using dns:/// (round-robin policy was not set correctly)
- Added `trace_id` in grpc authentication calls
- Bumped connect-go library to its new "connectrpc.com/connect" location
- Firehose blocks that were produced using the RPC Poller will have to be extracted again to fix the Transaction Status and the potential missing receipt (ex: arb-one pre-nitro, Avalanche, Optimism ...)
- Fix race condition in RPC Poller which would cause some missing transaction receipts
- Fix conversion of transaction status from RPC Poller: failed transactions would show up as "status unknown" in firehose blocks.
- Added support for the `FORCE_FINALITY_AFTER_BLOCKS` environment variable: setting it to a value like `200` will make the 'reader' mark blocks as final after a maximum of 200 block confirmations, even if the chain implements finality via a beacon that lags behind.
- Reduced logging and logging "payload".
- Tools printing the Firehose `Block` model to JSON now give `--proto-paths` higher precedence over well-known types and even the chain itself; the lookup order is `--proto-paths` > `chain` > `well-known` (so `well-known` is looked up last).
- The `tools print one-block` command now works correctly on blocks generated by the omni-chain `firecore` binary.
- The various health endpoints now set the `Content-Type: application/json` header prior to sending back their response to the client.
- The `firehose`, `substreams-tier1` and `substreams-tier2` health endpoints now respect the `common-system-shutdown-signal-delay` configuration value, meaning that the health endpoint will now return `false` if `SIGINT` has been received but we are still in the shutdown unready period defined by the config value. If you use some sort of load balancer, you should make sure it is configured to use the health endpoint, and you should set `common-system-shutdown-signal-delay` to something like `15s`.
- Changed the `reader` logger back to `reader-node` to fit with the app's name, which is `reader-node`.
- Fix `tools compare-blocks` that would fail on the new format.
- Fix `substreams` to correctly delete `.partial` files when serving a request that is not on a boundary.
The Cancun hard fork happened on Goerli and after further review, we decided to change the Protobuf definition for the new BlockHeader, Transaction and TransactionReceipt fields that are related to blob transaction.
We made it explicit that those fields are optional in the Protobuf definition, which will render them in your language of choice using the appropriate "null" mechanism. For example, in Go those fields are generated as `BlobGasUsed *uint64` and `ExcessBlobGas *uint64`, which makes it clear when those fields are not populated at all.
The affected fields are:
- `BlockHeader.blob_gas_used`, now `optional uint64`.
- `BlockHeader.excess_blob_gas`, now `optional uint64`.
- `TransactionTrace.blob_gas`, now `optional uint64`.
- `TransactionTrace.blob_gas_fee_cap`, now `optional BigInt`.
- `TransactionReceipt.blob_gas_used`, now `optional uint64`.
- `TransactionReceipt.blob_gas_price`, now `optional BigInt`.
This is technically a breaking change for those who could have consumed those fields already, but we think the impact is so minimal that it's better to make the change right now.
You will need to reprocess a small Goerli range. You should update to the new version to produce the newer version of the blocks, then reprocess from block 10377700 up to the point where you upgraded to v2.2.2.
The block 10377700 was chosen since it is the block at the time of the first release we did supporting Cancun, where we introduced those new fields. If you know when you deployed either v2.2.0 or v2.2.1, you should reprocess from that point.
An alternative to reprocessing is updating your blocks by having a StreamingFast API token and using `fireeth tools download-from-firehose goerli.eth.streamingfast.io:443 -a SUBSTREAMS_API_TOKEN 10377700:<recent block rounded to 100s> <destination>`.
> [!NOTE]
> You should download the blocks to a temporary destination and copy them over to your production destination once you have them all.
You can reach out to us on Discord if you need help with something.
- Updated the documentation for some of the upcoming new Cancun hard-fork fields:
- Added support for EIP-4844 (upcoming with activation of the Cancun fork), through instrumented go-ethereum nodes with version `fh2.4`. This adds new fields to the Ethereum Block model, fields that will be non-empty when the Ethereum network you're pulling from has EIP-4844 activated. The fields in question are:
  - `Block.system_calls`
  - `BlockHeader.blob_gas_used`
  - `BlockHeader.excess_blob_gas`
  - `BlockHeader.parent_beacon_root`
  - `TransactionTrace.blob_gas`
  - `TransactionTrace.blob_gas_fee_cap`
  - `TransactionTrace.blob_hashes`
  - `TransactionReceipt.blob_gas_used`
  - `TransactionReceipt.blob_gas_price`
  - A new `TransactionTrace.Type` value: `TRX_TYPE_BLOB`
> [!IMPORTANT]
> Operators running the Goerli chain will need to upgrade to this version, with this geth node release: https://github.com/streamingfast/go-ethereum/releases/tag/geth-v1.13.10-fh2.4
- Fixed error-passing between tier2 and tier1 (tier1 will not retry sending requests that fail deterministically on tier2)
- Tier1 will now schedule a single job on tier2, quickly ramping up to the requested number of workers after 4 seconds of delay, to catch early exceptions
- "store became too big" is now considered a deterministic error and returns code "InvalidArgument"
- Added `tools poller generic-evm` subcommand. It is identical to optimism/arb-one in features at the moment and should work for most EVM chains.
- Bump to major release firehose-core v1.0.0
> [!IMPORTANT]
> When upgrading your stack to this release, be sure to upgrade all components simultaneously because the block encapsulation format has changed. Blocks that are merged using the new merger will not be readable by previous versions. There is no simple way to revert, except by deleting all the one-blocks and merged-blocks that were produced with this version.
- Block files (one-blocks and merged) are now stored in a new format using `google.protobuf.any`. Previous blocks can still be read and processed.
- Added RPC pollers for Optimism and Arb-one: these can be used by running the reader-node with `--reader-node-path=/path/to/fireeth` and `--reader-node-arguments="tools poller {optimism|arb-one} [more flags...]"`
- Added `tools fix-any-type` to rewrite the previous merged-blocks (OPTIONAL)
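A hypothetical reader-node config using the new poller (the binary path is illustrative; extra poller flags are elided as in the text above):

```yaml
start:
  args:
    - reader-node
  flags:
    reader-node-path: /usr/local/bin/fireeth
    reader-node-arguments: "tools poller optimism [more flags...]"
```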
- Fixed grpc error code when shutting down: changed from Canceled to Unavailable
- Fixed SF_TRACING feature (regression broke the ability to specify a tracing endpoint)
- Fixed substreams GRPC/Connect error codes not propagating correctly
- Firehose connections rate-limiting will now force an (increased) delay of between 1 and 4 seconds (random value) before refusing a connection when under heavy load
- Fixed the `fix-polygon-index` tool (a parsing error made it unusable in v2.0.0-rc.1)
- Fixed some false positives in `compare-blocks-rpc`
This release refactors the firehose-ethereum repository to use the shared Firehose Core library (https://github.com/streamingfast/firehose-core) that every Firehose-supported chain should use and follow.

Both at the data level and the gRPC level, there are no changes in behavior for the core components, which are reader-node, merger, relayer, firehose, substreams-tier1 and substreams-tier2.

A lot changed at the operator level however, and some superfluous modes have been removed, especially around the reader-node application. The full set of changes is listed below; operators should review the changelog thoroughly.
> [!IMPORTANT]
> It's important to emphasize that at the data level nothing changed, so reverting to 1.4.22 in case of a problem is quite easy and no special data migration is required, outside of changing back to the old set of flags that was used before.
You will find below the detailed upgrade procedure for the configuration file operators usually use. If you are using the flags based approach, simply update the corresponding flags.
> [!IMPORTANT]
> We have had reports of older versions of this software creating corrupted merged-blocks files (with duplicate or out-of-bound blocks). This release adds additional validation of merged-blocks to prevent serving duplicate blocks from the firehose or substreams service. This may cause a service outage if you have produced those blocks or downloaded them from another party who was affected by this bug. See the *Finding and fixing corrupted merged-blocks files* section to learn how you can prevent a service outage.
Here is a bullet list for upgrading your instance; we still recommend fully reading each section below, but the list can serve as a checklist. The list is written so that you get back the same "instance" as before. The listening address changes can be omitted as long as you update other tools, like your load balancer, to account for the port changes.
- Add config `config-file: ./sf.yaml` if not present already
- Add config `data-dir: ./sf-data` if not present already
- Rename config `verbose` to `log-verbosity` if present
- Add config `common-blocks-cache-dir: ./sf-data/blocks-cache` if not present already
- Remove config `common-chain-id` if present
- Remove config `common-deployment-id` if present
- Remove config `common-network-id` if present
- Add config `common-live-blocks-addr: :13011` if not present already
- Add config `relayer-grpc-listen-addr: :13011` if `common-live-blocks-addr` has been added in the previous step
- Add config `reader-node-grpc-listen-addr: :13010` if not present already
- Add config `relayer-source: :13010` if `reader-node-grpc-listen-addr` has been added in the previous step
- Remove config `reader-node-enforce-peers` if present
- Remove config `reader-node-log-to-zap` if present
- Remove config `reader-node-ipc-path` if present
- Remove config `reader-node-type` if present
- Replace config `reader-node-arguments: +--<flag1> --<flag2> ...` by `reader-node-arguments: --networkid=<network-id> --datadir={node-data-dir} --port=30305 --http --http.api=eth,net,web3 --http.port=8547 --http.addr=0.0.0.0 --http.vhosts=* --firehose-enabled --<flag1> --<flag2> ...`

  > [!NOTE]
  > The `<network-id>` is dynamic and should be replaced with a literal value, like `1` for Ethereum Mainnet. The `{node-data-dir}` value is a templating value that is going to be resolved for you (it resolves to the value of config `reader-node-data-dir`).

  > [!IMPORTANT]
  > Ensure that `--firehose-enabled` is part of the flags! Moreover, tweak the flags to avoid repetition if you were overriding some of them.

- Remove `node` under the `start: args:` list
- Add config `merger-grpc-listen-addr: :13012` if not present already
- Add config `firehose-grpc-listen-addr: :13042` if not present already
- Add config `substreams-tier1-grpc-listen-addr: :13044` if not present already
- Add config `substreams-tier2-grpc-listen-addr: :13045` if not present already
- Add config `substreams-tier1-subrequests-endpoint: :13045` if `substreams-tier1-grpc-listen-addr` has been added in the previous step
- Replace `combined-index-builder` by `index-builder` under the `start: args:` list
- Rename config `common-block-index-sizes` to `common-index-block-sizes` if present
- Rename config `combined-index-builder-grpc-listen-addr` to `index-builder-grpc-listen-addr` if present
- Add config `index-builder-grpc-listen-addr: :13043` if you didn't have `combined-index-builder-grpc-listen-addr` previously
- Rename config `combined-index-builder-index-size` to `index-builder-index-size` if present
- Rename config `combined-index-builder-start-block` to `index-builder-start-block` if present
- Rename config `combined-index-builder-stop-block` to `index-builder-stop-block` if present
- Replace any occurrences of `{sf-data-dir}` by `{data-dir}` in any of your configuration values if present
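Collected into one place, the "Add config" items from the checklist form a flags block like the following sketch (keep only the entries relevant to your deployment; this is not a complete config):

```yaml
config-file: ./sf.yaml
data-dir: ./sf-data
common-blocks-cache-dir: ./sf-data/blocks-cache
common-live-blocks-addr: :13011
reader-node-grpc-listen-addr: :13010
relayer-grpc-listen-addr: :13011
relayer-source: :13010
merger-grpc-listen-addr: :13012
firehose-grpc-listen-addr: :13042
substreams-tier1-grpc-listen-addr: :13044
substreams-tier1-subrequests-endpoint: :13045
substreams-tier2-grpc-listen-addr: :13045
index-builder-grpc-listen-addr: :13043
```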
- The default value for `config-file` changed from `sf.yaml` to `firehose.yaml`. If you didn't have this flag defined and wish to keep the old default, define `config-file: sf.yaml`.
- The default value for `data-dir` changed from `sf-data` to `firehose-data`. If you didn't have this flag defined before, you should either move `sf-data` to `firehose-data` or define `data-dir: sf-data`.

  > [!NOTE]
  > This is an important change: forgetting to change it will change the expected locations of data, leading to errors or wrong data.

- Deprecated: the `{sf-data-dir}` templating argument used in various flags to resolve to the `--data-dir=<location>` value has been deprecated and should now be simply `{data-dir}`. The older replacement is still going to work, but you should replace any occurrences of `{sf-data-dir}` in your flag definitions by `{data-dir}`.
- The default value for `common-blocks-cache-dir` changed from `{sf-data-dir}/blocks-cache` to `file://{data-dir}/storage/blocks-cache`. If you didn't have this flag defined and you had `common-blocks-cache-enabled: true`, you should define `common-blocks-cache-dir: file://{data-dir}/blocks-cache`.
- The default value for `common-live-blocks-addr` changed from `:13011` to `:10014`. If you didn't have this flag defined and wish to keep the old default, define `common-live-blocks-addr: :13011` and ensure you also modify `relayer-grpc-listen-addr: :13011` (see next entry for details).
- The Go module `github.com/streamingfast/firehose-ethereum/types` has been removed; if you were depending on it in your project before, depend directly on `github.com/streamingfast/firehose-ethereum` instead.

  > [!NOTE]
  > This will pull in many more dependencies than before; if you're reluctant about such additions, talk to us on Discord and we can offer alternatives depending on what you were using.

- The config value `verbose` has been renamed to `log-verbosity`, keeping the same semantics and default value as before.

  > [!NOTE]
  > The short flag version is still `-v` and can still be provided multiple times, like `-vvvv`.
This change will impact all operators currently running Firehose on Ethereum, so it's important to pay attention to the upgrade procedure below; if you are unsure of something, reach out to us on Discord.

Before this release, the reader-node app managed a portion of the `reader-node-arguments` configuration value for you, prepending some arguments that would be passed to `geth` when invoking it. The list of arguments that were automatically provided before:
- `--networkid=<value of config value 'common-network-id'>`
- `--datadir=<value of config value 'reader-node-data-dir'>`
- `--ipcpath=<value of config value 'reader-node-ipc-path'>`
- `--port=30305`
- `--http`
- `--http.api=eth,net,web3`
- `--http.port=8547`
- `--http.addr=0.0.0.0`
- `--http.vhosts=*`
- `--firehose-enabled`
We have now removed those magical additions, and operators are now responsible for providing the flags required to properly run a Firehose-enabled native `geth` node. The `+` sign that was used to append/override the flags has been removed as well; since no default additions are performed, the `+` is now useless. To make some flags easier to define and avoid repetition, a few templating variables can be used within the `reader-node-arguments` value:
- `{data-dir}`: The current data-dir path defined by the config value `data-dir`
- `{node-data-dir}`: The node data dir path defined by the flag `reader-node-data-dir`
- `{hostname}`: The machine's hostname
- `{start-block-num}`: The resolved start block number defined by the flag `reader-node-start-block-num` (can be overwritten)
- `{stop-block-num}`: The stop block number defined by the flag `reader-node-stop-block-num`
For example, if you provide the config value `reader-node-data-dir=/var/geth`, then you could use `reader-node-arguments: --datadir={node-data-dir}` and that would resolve to `reader-node-arguments: --datadir=/var/geth` for you.
> [!NOTE]
> The `reader-node-arguments` value is a string that is parsed using shell word-splitting rules, which means for example that double quotes are supported, like `--datadir="/var/with space/path"`, and the argument will be correctly accepted. We use https://github.com/kballard/go-shellquote as our parsing library.
We also removed the following reader-node configuration values:

- `reader-node-type` (no replacement needed, just remove it)
- `reader-node-ipc-path` (if you were using it, define it manually using the `geth` flag `--ipcpath=...`)
- `reader-node-enforce-peers` (if you were using it, use a `geth` config file to add static peers to your node; read about static peers for `geth` on the Web)
Default listening addresses also changed to be the same across all firehose-<...> projects, meaning consistent ports across all chains for operators. The `reader-node-grpc-listen-addr` default listen address went from `:13010` to `:10010`, and `reader-node-manager-api-addr` from `:13009` to `:10011`. If there are no occurrences of `13010` or `13009` in your config file or your scripts, there is nothing to do. Otherwise, feel free to adjust the default ports to fit your needs; if you do change `reader-node-grpc-listen-addr`, ensure `--relayer-source` is also updated, as by default it points to `:10010`.
Here an example of the required changes.
Change:

```yaml
start:
  args:
    - ...
    - reader-node
    - ...
  flags:
    ...
    reader-node-bootstrap-data-url: ./reader/genesis.json
    reader-node-enforce-peers: localhost:13041
    reader-node-arguments: +--firehose-genesis-file=./reader/genesis.json --authrpc.port=8552
    reader-node-log-to-zap: false
    ...
```

To:
```yaml
start:
  args:
    - ...
    - reader-node
    - ...
  flags:
    ...
    reader-node-bootstrap-data-url: ./reader/genesis.json
    reader-node-arguments:
      --networkid=1515
      --datadir={node-data-dir}
      --ipcpath={data-dir}/reader/ipc
      --port=30305
      --http
      --http.api=eth,net,web3
      --http.port=8547
      --http.addr=0.0.0.0
      --http.vhosts=*
      --firehose-enabled
      --firehose-genesis-file=./reader/genesis.json
      --authrpc.port=8552
    ...
```

> [!NOTE]
> Adjust the `--networkid=1515` value to fit your targeted chain; see https://chainlist.org/ for a list of Ethereum chains and their `network-id` values.
In previous versions of firehose-ethereum, it was possible to use the `node` app to launch managed "peering/backup/whatever" Ethereum nodes; this is no longer possible. If you were using the `node` app previously, like in this config:
```yaml
start:
  args:
    - ...
    - node
    - ...
  flags:
    ...
    node-...
```

You must now remove the `node` app from `args` and any flags starting with `node-`. The migration path is to run those nodes on your own, without the use of `fireeth`, using whatever tools fit your desired needs.
We have completely drop support to concentrate on the core mission of Firehose which is to run reader nodes to extract Firehose blocks from it.
Note This is about the
nodeapp and not thereader-node, we think usage of this app is minimal/inexistent.
The app has been renamed to simply `index-builder`, and its flags have been renamed by removing the `combined-` prefix.
Change:

```yaml
start:
  args:
  - ...
  - combined-index-builder
  - ...
  flags:
    ...
    combined-index-builder-grpc-listen-addr: ":9999"
    combined-index-builder-index-size: 10000
    combined-index-builder-start-block: 0
    combined-index-builder-stop-block: 0
    ...
```

To:
```yaml
start:
  args:
  - ...
  - index-builder
  - ...
  flags:
    ...
    index-builder-grpc-listen-addr: ":9999"
    index-builder-index-size: 10000
    index-builder-start-block: 0
    index-builder-stop-block: 0
    ...
```

- Flag `common-block-index-sizes` has been renamed to `common-index-block-sizes`.
> [!NOTE]
> Rename only the configuration items you had previously defined; do not copy-paste the example above verbatim.
- The default value for `relayer-grpc-listen-addr` changed from `:13011` to `:10014`. If you didn't have this flag defined and wish to keep the old default, define `relayer-grpc-listen-addr: :13011` and ensure you also modify `common-live-blocks-addr: :13011` (see previous entry for details).

- The default value for `relayer-source` changed from `:13010` to `:10010`. If you didn't have this flag defined and wish to keep the old default, define `relayer-source: :13010` and ensure you also modify `reader-node-grpc-listen-addr: :13010`.

  > [!NOTE]
  > Must align with `reader-node-grpc-listen-addr`!
- The default value for `firehose-grpc-listen-addr` changed from `:13042` to `:10015`. If you didn't have this flag defined and wish to keep the old default, define `firehose-grpc-listen-addr: :13042`.
- Firehose logs now include auth information (userID, keyID, realIP) along with blocks and egress bytes sent.
- The default value for `merger-grpc-listen-addr` changed from `:13012` to `:10012`. If you didn't have this flag defined and wish to keep the old default, define `merger-grpc-listen-addr: :13012`.
- The default value for `substreams-tier1-grpc-listen-addr` changed from `:13044` to `:10016`. If you didn't have this flag defined and wish to keep the old default, define `substreams-tier1-grpc-listen-addr: :13044`.

- The default value for `substreams-tier1-subrequests-endpoint` changed from `:13045` to `:10017`. If you didn't have this flag defined and wish to keep the old default, define `substreams-tier1-subrequests-endpoint: :13044`.

  > [!NOTE]
  > Must align with `substreams-tier1-grpc-listen-addr`!

- The default value for `substreams-tier2-grpc-listen-addr` changed from `:13045` to `:10017`. If you didn't have this flag defined and wish to keep the old default, define `substreams-tier2-grpc-listen-addr: :13045`.
- Added field `DetailLevel` (`Base`, `Extended` (default)) to `sf.ethereum.type.v2.Block` to distinguish the new blocks produced from polling RPC (base) from the blocks normally produced with Firehose instrumentation (extended).
- Added command `tools fix-bloated-merged-blocks` to go through a range of possibly corrupted merged-blocks (with duplicates and out-of-range blocks) and try to fix them, writing the fixed merged-blocks files to another destination.
- Transform `sf.ethereum.transform.v1.LightBlock` is no longer supported; it has been deprecated for a long time and should not be used anywhere.
You may have certain merged-blocks files (most likely OLD blocks) that contain more than 100 blocks (with duplicate or extra out-of-bound blocks).

- Find the affected files by running the following command (it can be run multiple times in parallel, over smaller ranges): `tools check merged-blocks-batch <merged-blocks-store> <start> <stop>`
- If you see any affected range, produce fixed merged-blocks files with the following command, on each range: `tools fix-bloated-merged-blocks <merged-blocks-store> <output-store> <start>:<stop>`
- Copy the merged-blocks files created in `output-store` over to your merged-blocks store, replacing the corrupted files.
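The repair procedure above can be sketched as a shell script. The store URLs, block range, and the `gsutil` copy step are illustrative placeholders, not part of the release; adapt them to your own deployment:

```shell
#!/usr/bin/env bash
# Hypothetical stores and range -- replace with your own values.
MERGED=gs://my-merged-blocks
FIXED=gs://my-fixed-merged-blocks

# 1. Scan for bloated bundles (multiple runs can be parallelized over smaller ranges).
fireeth tools check merged-blocks-batch "$MERGED" 0 1000000

# 2. For each reported range, rewrite fixed bundles to a separate store.
fireeth tools fix-bloated-merged-blocks "$MERGED" "$FIXED" 0:1000000

# 3. Copy the fixed files back over the corrupted ones (gsutil shown as one example).
gsutil -m cp "$FIXED/*" "$MERGED/"
```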
- Fixed a regression where `reader-node-role` was changed to `dev` by default, putting back the default `geth` value.
- Bumped Substreams to `v1.1.20` with fixes for some minor issues related to start block processing.
- Added `tools poll-rpc-blocks` command to launch an RPC-based poller that acts as a Firehose extractor node, printing base64-encoded protobuf blocks to stdout (used by the 'dev' node-type). It creates "light" blocks, without traces and ordinals.
- Added `--dev` flag to the `start` command to simplify running a local firehose+substreams stack from a development node (ex: Hardhat).
  - This flag overrides the `--reader-node-path`, instead pointing to the `fireeth` binary itself.
  - This flag overrides the `--reader-node-type`, setting it to `dev` instead of `geth`. This node type has the following default `reader-node-arguments`: `tools poll-rpc-blocks http://localhost:8545 0`
  - It also removes `node` from the list of default apps.
- Substreams: fixed metrics calculations (per-module processing-time and external calls were wrong)
- Substreams: fixed immediate EOF when streaming from block 0 to (unbounded) in dev mode
- Bumped substreams to `v1.1.18` with a regression fix for when a substreams has a start block in the reversible segment.
- Bumped substreams to `v1.1.17` with a fix for a missing decrement on the `substreams_active_requests` metric.
The `--common-auth-plugin` got back the ability to use `secret://<expected_secret>?[user_id=<user_id>]&[api_key_id=<api_key_id>]`, in which case requests are authenticated based on the `Authorization: Bearer <actual_secret>` header and continue only if `<actual_secret> == <expected_secret>`.
- Bumped substreams to `v1.1.16` with support for the metrics `substreams_active_requests` and `substreams_counter`.
- If you started reprocessing the blockchain blocks using release v1.4.14 or v1.4.15, you will need to run the following command to fix the blocks affected by another bug: `fireeth tools fix-polygon-index /your/merged/blocks /temporary/destination 0 48200000` (note that you can run multiple instances of this command in parallel to cover the range of blocks from 0 to current HEAD in smaller chunks).
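The parallel chunked runs mentioned above could be sketched like this (the chunk size and job layout are illustrative; the stores are the same placeholders as in the command above):

```shell
#!/usr/bin/env bash
# Split the 0..48,200,000 range into 10M-block chunks and fix each chunk in parallel.
for start in 0 10000000 20000000 30000000 40000000; do
  stop=$((start + 10000000))
  fireeth tools fix-polygon-index /your/merged/blocks /temporary/destination "$start" "$stop" &
done
wait  # Block until all chunks are done before swapping the fixed files in.
```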
- Fix another data issue found in polygon blocks: blocks that contain a single "system" transaction have "Index=1" for that transaction instead of "Index=0"
- (Substreams) fixed regressions for relative start-blocks for substreams (see https://github.com/streamingfast/substreams/releases/tag/v1.1.14)
If you are indexing Polygon or Mumbai chains, you will need to reprocess the chain from genesis, as your existing Firehose blocks are missing some system transactions.
As always, this can be done with multiple client nodes working in parallel on different segments of the chain if you have snapshots at various block heights.
Golang 1.21+ is now also required to build the project.
- Fixed post-processing of polygon blocks: some system transactions were not "bundled" correctly.
- (Substreams) fixed validations for invalid start-blocks (see https://github.com/streamingfast/substreams/releases/tag/v1.1.13)
- Added `tools compare-oneblock-rpc` command to perform a validation between a Firehose one-block file and blocks+trx+logs fetched from an RPC endpoint.
- The `tools print` subcommands now use hex to encode values instead of base64, making them easier to use.
> [!IMPORTANT]
> The Substreams service exposed from this version will send progress messages that cannot be decoded by Substreams clients prior to v1.1.12. Streaming of the actual data will not be affected. Clients will need to be upgraded to properly decode the new progress messages.
- Bumped substreams to `v1.1.12` to support the new progress message format. Progression now relates to stages instead of modules. You can get stage information using the `substreams info` command starting at version `v1.1.12`.
- Added `tools compare-blocks-rpc` command to perform a validation between Firehose blocks and blocks+trx+logs fetched from an RPC endpoint.
- More tolerant retry/timeouts on filesource (prevent "Context Deadline Exceeded")
This release mainly brings reader-node Firehose Protocol 2.3 support for all networks, not just Polygon. This is important for the upcoming Firehose-enabled geth versions 1.2.11 and 1.2.12 that are going to be released shortly.
Golang 1.20+ is now also required to build the project.
- Support reader node Firehose Protocol 2.3 on all networks now (and not just Polygon).
- Removed `--substreams-tier1-request-stats` and `--substreams-tier2-request-stats` (substreams request-stats are now always sent to clients).
- `tools check merged-blocks` now correctly prints missing block gaps even without `print-full` or `print-stats`.
- Now requires Go 1.20+ to compile the project.
- Substreams bumped: better "Progress" messages
- Bumped `firehose` and `substreams` libraries to fix a bug where live blocks were not metered correctly.
- Fixed: jobs would hang when flags `--substreams-state-bundle-size` and `--substreams-tier1-subrequests-size` had different values. The latter flag has been completely removed; subrequests will be bound to the state bundle size.
- Added support for continuous authentication via the grpc auth plugin (allowing cutoff triggered by the auth system).
The substreams server now accepts the `X-Sf-Substreams-Cache-Tag` header to select which Substreams state store URL should be used by the request. When performing a Substreams request, the servers will pick the state store based on the header. This enables consumers to stay on the same cache version when the operators need to bump the data version (reasons for this could be a bug in the Substreams software that caused some cached data to be corrupted or invalid).
To benefit from this, operators that have a version currently in their state store URL should move the version part from `--substreams-state-store-url` to the new flag `--substreams-state-store-default-tag`. For example, if today you have in your config:
```yaml
start:
  ...
  flags:
    substreams-state-store-url: /<some>/<path>/v3
```

You should convert to:
```yaml
start:
  ...
  flags:
    substreams-state-store-url: /<some>/<path>
    substreams-state-store-default-tag: v3
```

The substreams scheduler has been improved to reduce the number of required jobs for parallel processing. This affects backprocessing (preparing the states of modules up to a "start-block") and forward processing (preparing the states and the outputs to speed up streaming in production-mode).
Jobs on tier2 workers are now divided into "stages", each stage generating the partial states for all the modules that have the same dependencies. A substreams with a single store won't be affected, but one with 3 top-level stores, which used to run 3 jobs for every segment, now runs only a single job per segment to get all the states ready.
The `substreams-tier1` and `substreams-tier2` apps should be upgraded concurrently. Some calls will fail while versions are misaligned.
- Substreams bumped to version v1.1.9
- Authentication plugin `trust` can now specify an exclusive list of `allowed` headers (all lowercase), ex: `trust://?allowed=x-sf-user-id,x-sf-api-key-id,x-real-ip,x-sf-substreams-cache-tag`
- The `tier2` app no longer uses the `common-auth-plugin`; `trust` will always be used, so that `tier1` can pass down its headers (ex: `X-Sf-Substreams-Cache-Tag`).
- Fixed a bug in `substreams-tier1` and `substreams-tier2` which caused "live" blocks to be sent while the blocks previously received by the stream were still historic.
- Added a check for readiness of the `dauth` provider when answering `/healthz` on firehose and substreams.
- Changed `--substreams-tier1-debug-request-stats` to `--substreams-tier1-request-stats`, which enables request stats logging on Substreams tier1.
- Changed `--substreams-tier2-debug-request-stats` to `--substreams-tier2-request-stats`, which enables request stats logging on Substreams tier2.
- Fixed an occasional panic in `substreams-tier1` caused by a race condition
- Fixed the gRPC error codes for substreams tier1: `Unauthenticated` on bad auth, `Canceled` (endpoint is shutting down, please reconnect) on shutdown
- Fixed the gRPC healthcheck method on `substreams-tier1` (regression)
- Fixed the default value for flag `common-auth-plugin`: now set to `trusted://` instead of panicking on the removed `null://`.
- Substreams (@v1.1.6) is now out of the `firehose` app, and must be started using the `substreams-tier1` and `substreams-tier2` apps!
- Most substreams-related flags have been changed:
  - common: `--substreams-rpc-cache-chunk-size`, `--substreams-rpc-cache-store-url`, `--substreams-rpc-endpoints`, `--substreams-state-bundle-size`, `--substreams-state-store-url`
  - tier1: `--substreams-tier1-debug-request-stats`, `--substreams-tier1-discovery-service-url`, `--substreams-tier1-grpc-listen-addr`, `--substreams-tier1-max-subrequests`, `--substreams-tier1-subrequests-endpoint`, `--substreams-tier1-subrequests-insecure`, `--substreams-tier1-subrequests-plaintext`, `--substreams-tier1-subrequests-size`
  - tier2: `--substreams-tier2-discovery-service-url`, `--substreams-tier2-grpc-listen-addr`
- Some auth plugins have been removed; the available plugins for `--common-auth-plugin` are now `trust://` and `grpc://`. See https://github.com/streamingfast/dauth for details.
- Metering features have been added; the available plugins for `--common-metering-plugin` are `null://`, `logger://`, `grpc://`. See https://github.com/streamingfast/dmetering for details.
- Support for reader node Firehose Protocol 2.3 (for parallel processing of transactions, added to polygon 'bor' v0.4.0)
- Removed the `tools upgrade-merged-blocks` command. Normalization is now part of the console reader within 'codec', not the 'types' package, and cannot be done a posteriori.
- Updated metering to fix dependencies.
- Updated metering (bumped versions of `dmetering`, `dauth`, and `firehose` libraries).
- Fixed firehose service healthcheck on shutdown.
- Fixed panic on download-blocks-from-firehose tool
- When upgrading a substreams server to this version, you should delete all existing module caches to benefit from deterministic output
- Switch default engine from `wasmtime` to `wazero`
- Prevent reusing memory between blocks in the wasm engine to fix determinism
- Switch our store operations from bigdecimal to fixed-point decimal to fix determinism
- Sort the store deltas from `DeletePrefixes()` to fix determinism
- Implement staged module execution within a single block
- "Fail fast" on repeating requests with deterministic failures for a "blacklist period", preventing waste of resources
- The `SessionInit` protobuf message now includes `resolvedStartBlock` and `MaxWorkers`, sent back to the client
- This release brings an update of `substreams` to `v1.1.4`, which includes the following:
  - Changes the module hash computation implementation to allow reusing caches across substreams that 'import' other substreams as a dependency.
  - Faster shutdown of requests that fail deterministically
  - Fixed memory leak in RPC calls
> [!NOTE]
> This upgrade procedure applies if your Substreams deployment topology includes both `tier1` and `tier2` processes. If you have defined the config value `substreams-tier2: true` somewhere, then this applies to you; otherwise, you can ignore the upgrade procedure.

The components should be deployed to `tier1` and `tier2` simultaneously, or users will end up with backend errors saying that some partial files are not found. These errors will be resolved once both tiers are upgraded.
- Added Substreams scheduler tracing support. Enable tracing by setting the ENV variable `SF_TRACING` to one of the following:
  - `stdout://`
  - `cloudtrace://[host:port]?project_id=<project_id>&ratio=<0.25>`
  - `jaeger://[host:port]?scheme=<http|https>`
  - `zipkin://[host:port]?scheme=<http|https>`
  - `otelcol://[host:port]`
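For example, assuming a local Jaeger collector (the host and port are placeholders for your own collector), tracing could be enabled like this:

```shell
# Illustrative only: point the scheduler traces at a local Jaeger instance.
export SF_TRACING="jaeger://localhost:14268?scheme=http"
fireeth start ...
```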
- This release brings an update of `substreams` to `v1.1.3`, which includes the following:
  - Fixes an important bug that could have generated corrupted store state files. This is important for developers and operators.
  - Fixes race conditions that would return a failure when multiple identical requests are backprocessing.
  - Fixes and speed/scaling improvements around the engine.
> [!NOTE]
> This upgrade procedure applies if your Substreams deployment topology includes both `tier1` and `tier2` processes. If you have defined the config value `substreams-tier2: true` somewhere, then this applies to you; otherwise, you can ignore the upgrade procedure.
This release includes a small change in the internal RPC layer between tier1 processes and tier2 processes. This change requires an ordered upgrade of the processes to avoid errors.
The components should be deployed in this order:
- Deploy and roll out `tier1` processes first
- Deploy and roll out `tier2` processes second
If you upgrade in the wrong order, or if somehow `tier2` processes start using the new protocol without `tier1` being aware, users will end up with backend errors saying that some partial files are not found. Those will be resolved only when `tier1` processes have been upgraded successfully.
- Substreams running without a specific tier2 `substreams-client-endpoint` will now expose the tier2 service `sf.substreams.internal.v2.Substreams` so it can be used internally.
> [!WARNING]
> If you don't use dedicated tier2 nodes, make sure that you don't expose `sf.substreams.internal.v2.Substreams` to the public (from your load-balancer or using a firewall).
- Flag `substreams-partial-mode-enabled` renamed to `substreams-tier2`
- Flag `substreams-client-endpoint` now defaults to the empty string, which means the server is its own client-endpoint (as it was before the change to protocol V2)
The Substreams protocol changed from `sf.substreams.v1.Stream/Blocks` to `sf.substreams.rpc.v2.Stream/Blocks` for the client-facing service. This changes the way that Substreams clients are notified of chain reorgs.
All substreams clients need to be upgraded to support this new protocol.
See https://github.com/streamingfast/substreams/releases/tag/v1.1.1 for details.
- The `firehose-client` tool now accepts a `--limit` flag to only send that number of blocks. Get the latest block like this: `fireeth tools firehose-client <endpoint> --limit=1 -- -1 0`
This is a bug fix release for node operators that are about to upgrade to the Shanghai release. The Firehose-instrumented geth compatible with the Shanghai release introduced a new message, `CANCEL_BLOCK`. It turns out that in some circumstances, a bug in the console reader caused a panic when that message was received while no block was actively being assembled.
This release fixes this bogus behavior by simply ignoring the `CANCEL_BLOCK` message when there is no active block, which is harmless. Every node operator that upgrades to https://github.com/streamingfast/go-ethereum/releases/tag/geth-v1.11.5-fh2.2 should upgrade to this version.
> [!NOTE]
> There is no need to update the Firehose-instrumented `geth` binary; only `fireeth` needs to be bumped if you are already at the latest `geth` version.
- Fixed a bug in the console reader when seeing `CANCEL_BLOCK` in certain circumstances.
- Now using Golang 1.20 for building releases.
- Changed the default value of flag `substreams-sub-request-block-range-size` from `1000` to `10000`.
- Fixed a bug in data normalization for Polygon chain which would cause panics on certain blocks.
- Support for GCP `archive` types of snapshots.
- This release implements the new `CANCEL_BLOCK` instruction from Firehose protocol 2.2 (fh2.2), to reject blocks that failed post-validation.
- This release fixes Polygon "StateSync" transactions by grouping the calls inside an artificial transaction.
If you had previous blocks from a Polygon chain (bor), you will need to reprocess all your blocks from the node because some StateSync transactions may be missing on some blocks.
This release now supports the new Firehose node exchange format 2.2, which introduced a new exchanged message, `CANCEL_BLOCK`. This has an implication on the Firehose-instrumented Geth binary you can use with this release.
- If you use a Firehose-instrumented `Geth` binary tagged `fh2.2` (like `geth-v1.11.4-fh2.2-1`), you must use `firehose-ethereum` version `>= 1.3.6`
- If you use a Firehose-instrumented `Geth` binary tagged `fh2.1` (like `geth-v1.11.3-fh2.1`), you can use `firehose-ethereum` version `>= 1.0.0`

New releases of the Firehose-instrumented Geth binary for all chains will soon be tagged `fh2.2`, so upgrading to `>= 1.3.6` of `firehose-ethereum` will be required.
This release is required if you run on Goerli and is mostly about supporting the Shanghai fork that was activated on Goerli on March 14th.
- Added support for the `withdrawal` balance change reason in the block model; this is required for running on the most recent Goerli Shanghai hard fork.
- Added support for `withdrawals_root` on `Header` in the block model; this will be populated only if the chain has activated the Shanghai hard fork.
- `--substreams-max-fuel-per-block-module` will limit the number of wasmtime instructions for a single module in a single block.
Blocks that were migrated from v2 to v3 using `upgrade-merged-blocks` should now be considered invalid. The upgrade mechanism did not correctly fix the "caller" on DELEGATECALLs when these calls were nested under another DELEGATECALL.

You should run `upgrade-merged-blocks` again if you previously used 'v2' blocks that were upgraded to 'v3'.
This mechanism uses a leaky bucket, allowing an initial burst of X connections, then allowing a new connection every Y seconds or whenever an existing connection closes.

Use `--firehose-rate-limit-bucket-size=50` and `--firehose-rate-limit-bucket-fill-rate=1s` to allow 50 connections instantly, and another connection every second.

Note that when the server is above the limit, it waits 500ms before it returns `codes.Unavailable` to the client, forcing a minimal back-off.
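Assuming these flags are set through the regular config file rather than the command line, the example above would translate to something like this sketch:

```yaml
start:
  flags:
    # Allow an initial burst of 50 connections, then refill one token per second.
    firehose-rate-limit-bucket-size: 50
    firehose-rate-limit-bucket-fill-rate: 1s
```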
- Substreams `RpcCall` objects are now validated before being performed, to ensure they are correct.
- Substreams `RpcCall` JSON-RPC error code `-32602` is now treated as a deterministic error (invalid request).
- `tools compare-blocks` now correctly handles segment health reporting and properly prints all differences with `-diff`.
- `tools compare-blocks` now ignores 'unknown fields' in the protobuf message, unless `--include-unknown-fields=true`
- `tools compare-blocks` now ignores when a block bundle contains the 'last block of previous bundle' (a now-deprecated feature)
- Support for "requester pays" buckets on Google Storage in URLs, ex: `gs://my-bucket/path?project=my-project-id`
- Substreams were also bumped to the current March 1st develop HEAD
- Increased gRPC max received message size accepted by Firehose and Substreams gRPC endpoints to 25 MiB.
- Command `fireeth init` has been removed; this was a leftover from another time and the command was not working anyway.
- Flag `common-auto-max-procs` to optimize Go thread management using github.com/uber-go/automaxprocs
- Flag `common-auto-mem-limit-percent` to specify `GOMEMLIMIT` based on a percentage of available memory
- Updated to Substreams version `v0.2.0`; please refer to the release page for further info about Substreams changes.
- **Breaking**: Config values `substreams-stores-save-interval` and `substreams-output-cache-save-interval` have been merged together into a single value to avoid potential bugs that would arise when the two differ. The new configuration value is called `substreams-cache-save-interval`.
  - To migrate, remove usage of `substreams-stores-save-interval: <number>` and `substreams-output-cache-save-interval: <number>` if defined in your config file and replace with `substreams-cache-save-interval: <number>`. If you had two different values before, pick the bigger of the two as the new value. We are currently setting it to `1000` for Ethereum Mainnet.
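As a sketch, the migration described above would look like this (the interval values are illustrative):

```yaml
# Before:
start:
  flags:
    substreams-stores-save-interval: 500
    substreams-output-cache-save-interval: 1000

# After: keep the bigger of the two previous values.
start:
  flags:
    substreams-cache-save-interval: 1000
```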
- Fixed various issues with `fireeth tools check merged-blocks`:
  - The `stopWalk` error is no longer reported as a real `error`.
  - `Incomplete range` should now be printed more accurately.
- Release made to fix our building workflows, nothing different than v1.3.0.
- Updated to Substreams `v0.1.0`; please refer to the release page for further info about Substreams changes.

> [!WARNING]
> The state output format for `map` and `store` modules has changed internally to be more compact in Protobuf format. When deploying this new version and using the Substreams feature, previous existing state files should be deleted, or the deployment updated to point to a new store location. The state output store is defined by the `--substreams-state-store-url` flag.
- New Prometheus metric `console_reader_trx_read_count` can be used to obtain the rate of transactions read from the node over a period of time.
- New Prometheus metric `console_reader_block_read_count` can be used to obtain the rate of blocks read from the node over a period of time.
- Added `--header-only` support on `fireeth tools firehose-client`.
- Added `HeaderOnly` transform that can be used to return only the Block's header along with a few top-level fields: `Ver`, `Hash`, `Number` and `Size`.
- Added `fireeth tools firehose-prometheus-exporter` to use as a client-side monitoring tool of a Firehose endpoint.
- **Deprecated**: `LightBlock` is deprecated and will be removed in the next major version; its goal is now much better handled by the `CombinedFilter` transform, or the `HeaderOnly` transform if you require only the Block's header.
- Hotfix 'nil pointer' panic when saving uninitialized cache.
- Changed cache file format for stores and outputs (faster with vtproto) -- requires removing the existing state files.
- Various improvements to scheduling.
- Fixed `eth_call` handler not flagging `out of gas` errors as deterministic.
- Fixed memory leak in wasmtime.
- Removed the unused 'previous' one-block in merged-blocks (99 inside bundle:100).
- Fix: also prevent rare bug of bundling "very old" one-blocks in merged-blocks.
- Added `sf.firehose.v2.Fetch/Block` endpoint on firehose, allowing fetching a single block by num, num+ID or cursor.
- Added `tools firehose-single-block-client` to call that new endpoint.
- Renamed tools `normalize-merged-blocks` to `upgrade-merged-blocks`.
- Fixed `common-blocks-cache-dir` flag's description.
- Fixed `DELEGATECALL`'s `caller` (a.k.a. `from`); this requires upgrading blocks to `version: 3`.
- Fixed the `execution aborted (timeout = 5s)` hard-coded timeout value used when detecting in Substreams whether an `eth_call` error response was deterministic.
Assuming that you are running a firehose deployment v1.1.0 writing blocks to folders `/v2-oneblock`, `/v2-forked` and `/v2`, you will deploy a new setup that writes blocks to folders `/v3-oneblock`, `/v3-forked` and `/v3`.
This procedure describes an upgrade without any downtime. With proper parallelization, it should be possible to complete this upgrade within a single day.
- Launch a new reader with this code, running the instrumented geth binary: https://github.com/streamingfast/go-ethereum/releases/tag/geth-v1.10.25-fh2.1 (you can start from a backup that is close to head)
- Upgrade your merged-blocks from `version: 2` to `version: 3` using `fireeth tools upgrade-merged-blocks /path/to/v2 /path/to/v3 {start} {stop}` (you can run multiple upgrade commands in parallel to cover the whole block range)
- Create combined indexes from those new blocks with `fireeth start combined-index-builder` (you can run multiple commands in parallel to fill the block range)
- When your merged-blocks have been upgraded and the one-block files are being produced by the new reader, launch a merger
- When the reader, merger and combined-index-builder have caught up to live, you can launch the relayer(s) and firehose(s)
- When the firehoses are ready, you can switch traffic to them.
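The parallelizable upgrade step above could be sketched like this (the paths, block range, chunk size and job count are placeholders to adapt; `wait -n` requires bash 4.3+):

```shell
#!/usr/bin/env bash
# Upgrade merged-blocks from version 2 to 3 in 1M-block chunks, four jobs at a time.
for start in $(seq 0 1000000 15000000); do
  stop=$((start + 1000000))
  fireeth tools upgrade-merged-blocks /path/to/v2 /path/to/v3 "$start" "$stop" &
  # Throttle to at most 4 concurrent upgrade jobs.
  while [ "$(jobs -r | wc -l)" -ge 4 ]; do wait -n; done
done
wait
```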
- Added `SendAllBlockHeaders` param to the `CombinedFilter` transform, for when we want to prevent skipping blocks but still want to filter out trxs.
- Reduced how often `reader read statistics` is displayed, down to every 30s (previously every 5s) (and renamed the log to `reader node statistics`).
- Fix `fireeth tools download-blocks-from-firehose` tool that was not working anymore.
- Simplify `forkable` hub startup performance cases.
- Fix relayer detection of a hole in stream blocks (restart on unrecoverable issue).
- Fix possible panic in hub when calls to the one-block store are timing out.
- Fix merger slow one-block-file deletions when there are more than 10000 of them.
- The binary name has changed from `sfeth` to `fireeth` (aligned with https://firehose.streamingfast.io/references/naming-conventions)
- The repo name has changed from `sf-ethereum` to `firehose-ethereum`
- This will require reprocessing the chain to produce new blocks
- The Protobuf Block model is now tagged `sf.ethereum.type.v2` and contains the following improvements:
  - Fixed Gas Price on dynamic transactions (post-London-fork on Ethereum Mainnet, EIP-1559)
  - Added "Total Ordering" concept: an `Ordinal` field on all events within a block (trx begin/end, call, log, balance change, etc.)
  - Added `TotalDifficulty` field to Ethereum blocks
  - Fixed wrong transaction status for contract deployments that fail due to out of gas on pre-Homestead transactions (aligned with the status reported by the chain: SUCCESS, even if no contract code is set)
  - Added more instrumentation around AccessList and DynamicFee transactions, removed some elements that were useless or could not be derived from other elements in the structure, ex: gasEvents
  - Added support for finalized block numbers (moved outside the proto-ethereum block, to the firehose bstream v2 block)
- There are no more "forked blocks" in the merged-blocks bundles:
  - The merged-blocks are therefore produced only after finality has passed (before The Merge, this means after 200 confirmations).
  - One-block files close to HEAD stay in the one-blocks store for longer.
  - The blocks that do not make it into the merged-blocks (forked out because of a re-org) are uploaded to another store (`common-forked-blocks-store-url`) and kept there for a while (to allow resolving cursors).
- This will require changes in most firehose clients
- A compatibility layer has been added to still support `sf.firehose.v1.Stream/Blocks`, but only for specific values of `ForkSteps` in the request: 'irreversible' or 'new+undo'
- The Firehose Blocks protocol is now under `sf.firehose.v2` (bumped from `sf.firehose.v1`):
  - Step type `IRREVERSIBLE` renamed to `FINAL`
  - `Blocks` request now only allows 2 modes regarding steps: `NEW,UNDO` and `FINAL` (gated by the `final_blocks_only` boolean flag)
  - Blocks that are sent out can have the combined step `NEW+FINAL` to prevent sending the same blocks over and over if they are already final
- Removed the Irreversible indices completely (because the merged-blocks only contain final blocks now)
- Deprecated the "Call" and "log" indices (`xxxxxxxxxx.yyy.calladdrsig.idx` and `xxxxxxxxxx.yyy.logaddrsig.idx`), now replaced by the "combined" index
xxxxxxxxxx.yyy.calladdrsig.idxandxxxxxxxxxx.yyy.logaddrsig.idx), now replaced by "combined" index - Moved out the
sfeth tools generate-...command to a new app that can be launched withsfeth start generate-combined-index[,...]
- All config via environment variables that started with `SFETH_` now starts with `FIREETH_`
- All logs now output on stderr instead of stdout like previously
- Changed `config-file` default from `./sf.yaml` to `""`, preventing failure without this flag
- Renamed `common-blocks-store-url` to `common-merged-blocks-store-url`
- Renamed `common-oneblock-store-url` to `common-one-block-store-url`, now used by firehose and relayer apps
- Renamed `common-blockstream-addr` to `common-live-blocks-addr`
- Renamed the `mindreader` application to `reader`
- Renamed all the `mindreader-node-*` flags to `reader-node-*`
- Added `common-forked-blocks-store-url` flag, used by merger and firehose
- Changed `--log-to-file` default from `true` to `false`
- Changed default verbosity level: now all loggers are `INFO` (instead of having most of them at `WARN`); `-v` will now activate all `DEBUG` logs
- Removed `common-block-index-sizes`, `common-index-store-url`
- Removed `merger-state-file`, `merger-next-exclusive-highest-block-limit`, `merger-max-one-block-operations-batch-size`, `merger-one-block-deletion-threads`, `merger-writers-leeway`
- Added `merger-stop-block`, `merger-prune-forked-blocks-after`, `merger-time-between-store-pruning`
- Removed `mindreader-node-start-block-num`, `mindreader-node-wait-upload-complete-on-shutdown`, `mindreader-node-merge-and-store-directly`, `mindreader-node-merge-threshold-block-age`
- Removed `firehose-block-index-sizes`, `firehose-irreversible-blocks-index-bundle-sizes`, `firehose-irreversible-blocks-index-url`, `firehose-realtime-tolerance`
- Removed `relayer-buffer-size`, `relayer-merger-addr`, `relayer-min-start-offset`
- If you depend on the proto file, update `import "sf/ethereum/type/v1/type.proto"` to `import "sf/ethereum/type/v2/type.proto"`
- If you depend on the proto file, update all occurrences of `sf.ethereum.type.v1.<Something>` to `sf.ethereum.type.v2.<Something>`
- If you depend on `sf-ethereum/types` as a library, update all occurrences of `github.com/streamingfast/firehose-ethereum/types/pb/sf/ethereum/type/v1` to `github.com/streamingfast/firehose-ethereum/types/pb/sf/ethereum/type/v2`
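Since these renames are mechanical, a scripted pass can handle most of a codebase. A sketch, demonstrated on a temporary file so it is self-contained; point the `grep` at your own repository and review the diff before committing (GNU `sed -i` assumed):

```shell
# Sketch: rewrite sf.ethereum.type.v1 / sf/ethereum/type/v1 references
# to their v2 equivalents in place (demo runs in a temp directory).
workdir=$(mktemp -d)
printf 'import "sf/ethereum/type/v1/type.proto";\n' > "$workdir/example.proto"
grep -rl 'sf[/.]ethereum[/.]type[/.]v1' "$workdir" \
  | xargs -r sed -i -e 's|sf/ethereum/type/v1|sf/ethereum/type/v2|g' \
                    -e 's|sf\.ethereum\.type\.v1|sf.ethereum.type.v2|g'
cat "$workdir/example.proto"
```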
- The `reader` requires a Firehose-instrumented Geth binary with instrumentation version 2.x (tagged `fh2`)
- Because of the changes in the Ethereum block protocol, an existing deployment cannot be migrated in place:
  - You must deploy firehose-ethereum v1.0.0 in a new environment (without any prior block or index data)
  - You can put this new deployment behind a gRPC load-balancer that routes `/sf.firehose.v2.Stream/*` and `/sf.firehose.v1.Stream/*` to your different versions
- Go through the list of changed "Flags and environment variables" and adjust your deployment accordingly:
  - Determine a (shared) location for your `forked-blocks`
  - Make sure that you set the `one-block-store` and `forked-blocks-store` correctly on all the apps that now require them
  - Add the `generate-combined-index` app to your new deployment instead of the `tools` command for call/log indices
- If you want to reprocess blocks in batches while you set up a "live" deployment:
  - Run your reader node from prior data (e.g. from a snapshot)
  - Set the `--common-first-streamable-block` flag to a 100-block-aligned boundary right after where this snapshot starts (use this flag on all apps)
  - Perform batch merged-blocks reprocessing jobs
  - When all the blocks are present, set the `common-first-streamable-block` flag back to 0 on your deployment to serve the whole range
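Picking the 100-block-aligned boundary is plain integer arithmetic; a sketch (the snapshot height `1234567` is a placeholder):

```shell
# Sketch: round a placeholder snapshot height up to the next 100-block
# boundary, suitable as the --common-first-streamable-block value.
snapshot_first_block=1234567
boundary=$(( (snapshot_first_block + 99) / 100 * 100 ))
echo "$boundary"   # 1234600
```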
- The `reader` requires a Firehose-instrumented Geth binary with instrumentation version 2.x (tagged `fh2`)
- The `reader` does NOT merge block files directly anymore: you need to run it alongside a `merger`:
  - Determine a `start` and `stop` block for your reprocessing job, aligned on a 100-block boundary right after your Geth data snapshot
  - Set `--common-first-streamable-block` to your start block
  - Set `--merger-stop-block` to your stop block
  - Set `--common-one-block-store-url` to a local folder accessible to both the `merger` and `reader` apps
  - Set `--common-merged-blocks-store-url` to the final (e.g. remote) folder where you will store your merged blocks
  - Run both apps like this: `fireeth start reader,merger --...`
- You can run as many batch jobs like this in parallel as you like to produce the merged blocks, as long as you have Geth data snapshots that start at those points
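Putting the flags above together, a single batch job could look like the following sketch; every path, bucket, and block number is a placeholder to adapt to your own snapshot and storage layout:

```shell
# Sketch only: placeholder values throughout; do not copy verbatim.
fireeth start reader,merger \
  --common-first-streamable-block=1000000 \
  --merger-stop-block=1100000 \
  --common-one-block-store-url=/data/one-blocks \
  --common-merged-blocks-store-url=gs://my-bucket/merged-blocks
```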
- Run batch jobs like this:

  ```shell
  fireeth start generate-combined-index \
    --common-blocks-store-url=/path/to/blocks \
    --common-index-store-url=/path/to/index \
    --combined-index-builder-index-size=10000 \
    --combined-index-builder-start-block=0 \
    [--combined-index-builder-stop-block=10000] \
    --combined-index-builder-grpc-listen-addr=:9000
  ```
- Added `tools firehose-client` command with filter/index options
- Added `tools normalize-merged-blocks` command to remove forked blocks from merged-blocks files (it cannot transform Ethereum blocks V1 into V2 because some fields are missing in V1)
- Added substreams server support in the firehose app (alpha) through the `--substreams-enabled` flag
- The firehose gRPC endpoint now supports requests compressed using `gzip` or `zstd`
- The merger no longer exposes the `PreMergedBlocks` endpoint over gRPC, only `HealthCheck` (the relayer does not need to talk to it)
- The `--firehose-genesis-file` flag is now set automatically on `reader` nodes if their `reader-node-bootstrap-data-url` config value is set to a `genesis.json` file
- Note to other Firehose implementors: we changed all command-line flags to fit the required/optional format described here: https://en.wikipedia.org/wiki/Usage_message
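For example (a sketch; the path is a placeholder), bootstrapping a `reader` from a genesis file no longer requires passing the genesis flag explicitly:

```shell
# Sketch: because the bootstrap URL points to a genesis.json file,
# the --firehose-genesis-file flag is filled in automatically.
fireeth start reader --reader-node-bootstrap-data-url=./genesis.json
```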
- Added a Prometheus boolean metric named `ready` with label `app` to all apps (firehose, merger, mindreader-node, node, relayer, combined-index-builder)
- Removed the `firehose-blocks-store-urls` flag (the feature of using multiple stores is now deprecated, as it caused confusion and issues with block caching); use `common-blocks-store-url` instead
- Fixed a problem with the S3 provider where the S3 API can return an empty filename (we now ignore empty-filename results at consume time)
- Fixed an issue where the merger could panic on a new deployment
- Fixed an issue where the `merger` would get stuck when too many (more than 2000) one-block files were lying around with block numbers below the current bundle's high boundary
- Renamed the 4 common `atm` flags to `blocks-cache`: `--common-blocks-cache-{enabled|dir|max-recent-entry-bytes|max-entry-by-age-bytes}`
- Fixed `tools check merged-blocks` block-hole detection behavior on missing ranges (bumped `sf-tools`)
- Fixed a deadlock issue related to S3 storage error handling (bumped `dstore`)
- Added `tools download-from-firehose` command to fetch blocks and save them locally as merged-blocks files
- Added `cloud-gcp://` auth module (bumped `dauth`)
- substreams-alpha client
- gke-pvc-snapshot backup module
- Fixed a potential panic in the `merger` on a new chain
- Fixed an issue where the `merger` would get stuck when too many (more than 2000) one-block files were lying around with block numbers below the current bundle's high boundary
- Renamed the 4 common `atm` flags to `blocks-cache`: `--common-blocks-cache-{enabled|dir|max-recent-entry-bytes|max-entry-by-age-bytes}`
- Fixed `tools check merged-blocks` block-hole detection behavior on missing ranges (bumped `sf-tools`)
- Added `tools download-from-firehose` command to fetch blocks and save them locally as merged-blocks files
- Added `cloud-gcp://` auth module (bumped `dauth`)
- The default text `encoder` used to encode log entries now emits the level when coloring is disabled.
- The default value for flag `--mindreader-node-enforce-peers` is now `""`; the previous default was useful only in development, when running a local `node-manager` as either the miner or a peering node.
- Added block data file caching (called `ATM`) to reduce the memory usage of components that keep block objects in memory.
- Added transforms `LogFilter`, `MultiLogFilter`, `CallToFilter`, and `MultiCallToFilter` to only return transaction traces that match the given logs or called addresses.
- Added support for irreversibility indexes in firehose to prevent replaying reorgs when streaming old blocks.
- Added support for log and call indexes to skip old blocks that do not match any transform filter.
- Updated all Firehose stack direct dependencies.
- Updated the confusing behavior of `--common-system-shutdown-signal-delay` and its interaction with gRPC connection draining in the `firehose` component, which sometimes prevented it from shutting down.
- An error is now reported if the flag `merge-threshold-block-age` is set way too low (< 30s).
- Removed some old components that are not required by the Firehose stack directly; the repository is now as lean as it can be.
- Fixed Firehose gRPC listening address over plain text.
- Fixed automatic merging of files within the `mindreader`; it is now much more robust than before.