From 7c50264e647aa13d6902d2f38da98e2084e54763 Mon Sep 17 00:00:00 2001 From: garikbesson Date: Tue, 7 Oct 2025 13:38:31 +0100 Subject: [PATCH 1/4] NEAR Indexer concept page and tutorial wip --- docs/data-infrastructure/near-indexer.md | 44 ++++++ .../near-lake-framework.md | 2 +- .../tutorials/near-indexer.md | 138 ++++++++++++++++++ website/sidebars.js | 6 +- 4 files changed, 187 insertions(+), 3 deletions(-) create mode 100644 docs/data-infrastructure/near-indexer.md create mode 100644 docs/data-infrastructure/tutorials/near-indexer.md diff --git a/docs/data-infrastructure/near-indexer.md b/docs/data-infrastructure/near-indexer.md new file mode 100644 index 00000000000..952d31e742c --- /dev/null +++ b/docs/data-infrastructure/near-indexer.md @@ -0,0 +1,44 @@ +--- +id: near-indexer +title: NEAR Indexer +description: "NEAR Indexer is a micro-framework that provides a stream of blocks recorded on the NEAR network. It is designed to handle real-time events on the blockchain." +--- + +# NEAR Indexer + +The NEAR Indexer is a micro-framework that delivers a stream of blocks recorded on the NEAR network. It is specifically designed to handle real-time events on the blockchain. + +:::note GitHub repo + +https://github.com/near/nearcore/tree/master/chain/indexer + +::: + +--- + +## Rationale + +As scaling dApps enter NEAR’s mainnet, an issue may arise: how do they quickly and efficiently access state from our deployed smart contracts, and cut out the cruft? Contracts may grow to have complex data structures and querying the network RPC may not be the optimal way to access state data. The NEAR Indexer Framework allows for streams to be captured and indexed in a customized manner. The typical use-case is for this data to make its way to a relational database. Seeing as this is custom per project, there is engineering work involved in using this framework. 
+ +With the NEAR Indexer, developers can perform both high-level data aggregation and low-level introspection of blockchain events. + +:::note +[Data Lake](./near-lake-framework.md#data-lake) which is a source of data for [NEAR Lake Framework](./near-lake-framework.md) is feeded by a running the [NEAR Lake Indexer](https://github.com/aurora-is-near/near-lake-indexer) that is built on top of NEAR Indexer. +::: + +--- + +## How It Works + +The NEAR Indexer works by running a node that processes blocks as they are added to the blockchain. It provides a stream of these blocks, allowing developers to subscribe to and process them in real-time. + +Learn how to run it following the [tutorial](./tutorials/near-indexer.md). + +--- + +## Latency +Comparing to [NEAR Lake Framework](./near-lake-framework.md) in terms of latency the NEAR Indexer is significantly faster as it reads data directly from the blockchain the same way as RPC nodes do. + +:::info +The full comparison table you can find [here](near-lake-framework.md#comparison-with-near-indexer-framework). +::: diff --git a/docs/data-infrastructure/near-lake-framework.md b/docs/data-infrastructure/near-lake-framework.md index f681ce19ce5..3608c47cafa 100644 --- a/docs/data-infrastructure/near-lake-framework.md +++ b/docs/data-infrastructure/near-lake-framework.md @@ -88,7 +88,7 @@ $17,20 + $21,60 = $30,16 --- -## Comparison with [NEAR Indexer Framework](https://github.com/near/nearcore/tree/master/chain/indexer) +## Comparison with [NEAR Indexer Framework](near-indexer.md) NEAR Lake Framework is reading data from AWS S3, while the NEAR Indexer is running a full node and reading data from the blockchain directly in real time. 
diff --git a/docs/data-infrastructure/tutorials/near-indexer.md b/docs/data-infrastructure/tutorials/near-indexer.md new file mode 100644 index 00000000000..87a238b04fb --- /dev/null +++ b/docs/data-infrastructure/tutorials/near-indexer.md @@ -0,0 +1,138 @@ +--- +id: listen-to-realtime-events +title: Listen to Realtime Events +description: "This tutorial will guide you through building an indexer using the NEAR Indexer Framework. The indexer will listen for FunctionCalls on a specific contract and log the details of each call." +--- + +import {Github} from "@site/src/components/UI/Codetabs" +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; +import MovingForwardSupportSection from '@site/src/components/MovingForwardSupportSection'; + +In this tutorial, we will build an indexer using the NEAR Indexer Framework. The indexer will listen realtime events from NEAR blockchain for FunctionCalls on a specific contract and log the details of each call. + +The full source code for the indexer is available in the [GitHub repository](https://github.com/near-examples/near-indexer?tab=readme-ov-file). + +--- + +## Initialization + +To run the NEAR Indexer connected to a network we need to have node configs and keys prepopulated. Navigate to the directory where you cloned the example and run the following command to initialize the configuration for the desired network: + + + + ```bash + cargo run --release -- --home-dir ~/.near/localnet init + ``` + + + ```bash + cargo run --release -- --home-dir ~/.near/testnet init --chain-id testnet --download-config rpc --download-genesis + ``` + + + ```bash + cargo run --release -- --home-dir ~/.near/mainnet init --chain-id mainnet --download-config rpc --download-genesis + ``` + + + +The above command should initialize necessary configs and keys to run `localnet/testnet/mainnet` in `~/.near/(localnet|testnet|mainnet)`. + +The main configuration file for the node is `config.json`. 
+ +The above code will download the official genesis and config. The recommended config is accessible at: + +- [testnet config.json](https://s3-us-west-1.amazonaws.com/build.nearprotocol.com/nearcore-deploy/testnet/rpc/config.json) +- [mainnet config.json](https://s3-us-west-1.amazonaws.com/build.nearprotocol.com/nearcore-deploy/mainnet/rpc/config.json) + +This configuration is intended for RPC nodes. Any extra settings required for the indexer should be manually added to the configuration file config.json in your --home-dir (e.g. ~/.near/testnet/config.json). + +Configs for the specified network are in the --home-dir provided folder. We need to ensure that NEAR Indexer follows all the necessary shards, so set "tracked_shards_config": "AllShards" parameters in ~/.near/testnet/config.json. Hint: See the Tweaks section below to learn more about further configuration options. + +After that we can run NEAR Indexer. + + + + ```bash + cargo run --release -- --home-dir ~/.near/localnet run + ``` + + + ```bash + cargo run --release -- --home-dir ~/.near/testnet run + ``` + + + ```bash + cargo run --release -- --home-dir ~/.near/mainnet run + ``` + + + +--- + +## Custom Settings + +By default, nearcore is configured to do as little work as possible while still operating on an up-to-date state. Indexers may have different requirements, so there is no solution that would work for everyone, and thus we are going to provide you with the set of knobs you can tune for your requirements. + +### Shards/Accounts to Track +As already has been mentioned above, the most common tweak you need to apply is listing all the shards you want to index data from; to do that, you should ensure that "tracked_shards_config" in the config.json lists all the shard UIDs: + +```json +... +"tracked_shards_config": { + "Shards": [ + "s3.v3", + "s4.v3" + ] +}, +... +``` +Or, if you want to track specific accounts: + +```json +... 
+"tracked_shards_config": { + "Accounts": [ + "account_a", + "account_b" + ] +}, +... +``` + +
+ +### Sync Mode +You can choose Indexer Framework sync mode by setting what to stream: + +- LatestSynced - Real-time syncing, always taking the latest finalized block to stream +- FromInterruption - Starts syncing from the block NEAR Indexer was interrupted last time +- BlockHeight(u64) - Specific block height to start syncing from. + + + +
+ +### Historical Data + +Indexer Framework also exposes access to the internal APIs (see Indexer::client_actors method), so you can fetch data about any block, transaction, etc, yet by default, nearcore is configured to remove old data (garbage collection), so querying the data that was observed a few epochs before may return an error saying that the data is not found. If you only need blocks streaming, you don't need this tweak, but if you need access to the historical data right from your Indexer, consider updating "archive" setting in config.json to true: + +```json +... +"archive": true, +... +``` + +## Parsing the Block Data + +From the block data, we can access the transactions, their receipts and actions. In this example, we will look for FunctionCall actions on a specific contract and log the details of each call. + + + + \ No newline at end of file diff --git a/website/sidebars.js b/website/sidebars.js index b4115780ef7..13a75d364d2 100644 --- a/website/sidebars.js +++ b/website/sidebars.js @@ -478,6 +478,7 @@ const sidebar = { 'Indexers': [ 'data-infrastructure/indexers', 'data-infrastructure/near-lake-framework', + 'data-infrastructure/near-indexer', ], }, ] @@ -504,8 +505,9 @@ const sidebar = { 'data-infrastructure/tutorials/running-near-lake/run-lake-indexer', 'data-infrastructure/tutorials/running-near-lake/lake-start-options', 'data-infrastructure/tutorials/running-near-lake/credentials', - ] - } + ], + }, + "data-infrastructure/tutorials/listen-to-realtime-events" ], }, ] From 7f3c62b352520acc223b53e97f2e0651f07b20d9 Mon Sep 17 00:00:00 2001 From: garikbesson Date: Tue, 7 Oct 2025 13:43:05 +0100 Subject: [PATCH 2/4] fix some code links --- docs/data-infrastructure/tutorials/near-indexer.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/data-infrastructure/tutorials/near-indexer.md b/docs/data-infrastructure/tutorials/near-indexer.md index 87a238b04fb..35d726389ad 100644 --- 
a/docs/data-infrastructure/tutorials/near-indexer.md +++ b/docs/data-infrastructure/tutorials/near-indexer.md @@ -112,7 +112,7 @@ You can choose Indexer Framework sync mode by setting what to stream: - BlockHeight(u64) - Specific block height to start syncing from.
@@ -132,7 +132,7 @@ Indexer Framework also exposes access to the internal APIs (see Indexer::client_ From the block data, we can access the transactions, their receipts and actions. In this example, we will look for FunctionCall actions on a specific contract and log the details of each call. \ No newline at end of file From e3464af72ecaabbceafe74d5de0cb71ae8a1f0c0 Mon Sep 17 00:00:00 2001 From: Guillermo Alejandro Gallardo Diez Date: Tue, 7 Oct 2025 15:27:12 +0200 Subject: [PATCH 3/4] fix: fixes to the text --- docs/data-infrastructure/near-indexer.md | 43 ++++++++++------- .../tutorials/near-indexer.md | 48 ++++++++++++------- 2 files changed, 55 insertions(+), 36 deletions(-) diff --git a/docs/data-infrastructure/near-indexer.md b/docs/data-infrastructure/near-indexer.md index 952d31e742c..4e3c6101c46 100644 --- a/docs/data-infrastructure/near-indexer.md +++ b/docs/data-infrastructure/near-indexer.md @@ -6,39 +6,46 @@ description: "NEAR Indexer is a micro-framework that provides a stream of blocks # NEAR Indexer -The NEAR Indexer is a micro-framework that delivers a stream of blocks recorded on the NEAR network. It is specifically designed to handle real-time events on the blockchain. +As scaling dApps enter NEAR’s mainnet, an issue may arise: how do they quickly and efficiently access state from our deployed smart contracts, and cut out the cruft? Contracts may grow to have complex data structures and querying the network RPC may not be the optimal way to access state data. -:::note GitHub repo +The [NEAR Indexer](https://github.com/near/nearcore/tree/master/chain/indexer) is a micro-framework specifically designed to handle real-time events on the blockchain, allowing developers to capture and index streams of blocks in a customized manner. -https://github.com/near/nearcore/tree/master/chain/indexer - -::: - ---- - -## Rationale +With the NEAR Indexer, developers can perform both high-level data aggregation and low-level introspection of blockchain events. 
-As scaling dApps enter NEAR’s mainnet, an issue may arise: how do they quickly and efficiently access state from our deployed smart contracts, and cut out the cruft? Contracts may grow to have complex data structures and querying the network RPC may not be the optimal way to access state data. The NEAR Indexer Framework allows for streams to be captured and indexed in a customized manner. The typical use-case is for this data to make its way to a relational database. Seeing as this is custom per project, there is engineering work involved in using this framework. +:::tip -With the NEAR Indexer, developers can perform both high-level data aggregation and low-level introspection of blockchain events. +For those who would rather not build their own indexer, the [NEAR Lake Framework](./near-lake-framework.md) provides a simpler way to access blockchain data in real time. -:::note -[Data Lake](./near-lake-framework.md#data-lake) which is a source of data for [NEAR Lake Framework](./near-lake-framework.md) is feeded by a running the [NEAR Lake Indexer](https://github.com/aurora-is-near/near-lake-indexer) that is built on top of NEAR Indexer. ::: --- ## How It Works -The NEAR Indexer works by running a node that processes blocks as they are added to the blockchain. It provides a stream of these blocks, allowing developers to subscribe to and process them in real-time. +The NEAR Indexer works by **running a node** and processing blocks as they are added to the blockchain. The framework provides a stream of blocks, allowing developers to subscribe to and process these blocks in real time. + +:::tip Learn how to run it following the [tutorial](./tutorials/near-indexer.md). +::: + --- -## Latency +## Comparison with [NEAR Lake Framework](./near-lake-framework.md) + Comparing to [NEAR Lake Framework](./near-lake-framework.md) in terms of latency the NEAR Indexer is significantly faster as it reads data directly from the blockchain the same way as RPC nodes do. 
-:::info -The full comparison table you can find [here](near-lake-framework.md#comparison-with-near-indexer-framework). -:::
+Feature | Indexer Framework | Lake Framework
+------- | ----------------- | --------------
+Allows to follow the blocks and transactions in the NEAR Protocol | **Yes** | **Yes** (but only mainnet and testnet networks)
+Decentralized | **Yes** | No (Pagoda Inc dumps the blocks to AWS S3)
+Reaction time (end-to-end) | minimum 3.8s (estimated average 5-7s) | [minimum 3.9s (estimated average 6-8s)](#latency)
+Reaction time (framework overhead only) | 0.1s | 0.2-2.2s
+Estimated cost of infrastructure | [$500+/mo](https://near-nodes.io/rpc/hardware-rpc) | [**$20/mo**](#cost)
+Ease of maintenance | Advanced (need to follow every nearcore upgrade, and sync state) | **Easy** (deploy once and forget)
+How long will it take to start? | days (on mainnet/testnet) | **seconds**
+Ease of local development | Advanced (localnet is a good option, but testing on testnet/mainnet is too heavy) | **Easy** (see [tutorials](./tutorials/near-lake-state-changes-indexer.md))
+Programming languages that a custom indexer can be implemented with | Rust only | **Any** (currently, helper packages are released in [Python](http://pypi.org/project/near-lake-framework), [JavaScript](https://www.npmjs.com/package/near-lake-framework), and [Rust](https://crates.io/crates/near-lake-framework))
+ +--- \ No newline at end of file diff --git a/docs/data-infrastructure/tutorials/near-indexer.md b/docs/data-infrastructure/tutorials/near-indexer.md index 35d726389ad..ac11d7abad8 100644 --- a/docs/data-infrastructure/tutorials/near-indexer.md +++ b/docs/data-infrastructure/tutorials/near-indexer.md @@ -11,13 +11,18 @@ import MovingForwardSupportSection from '@site/src/components/MovingForwardSuppo In this tutorial, we will build an indexer using the NEAR Indexer Framework. The indexer will listen realtime events from NEAR blockchain for FunctionCalls on a specific contract and log the details of each call. +To get our indexer up and running we will need two steps: + +1. To [initialize](#initialization) the indexer +2. To [start it](#run) + The full source code for the indexer is available in the [GitHub repository](https://github.com/near-examples/near-indexer?tab=readme-ov-file). --- ## Initialization -To run the NEAR Indexer connected to a network we need to have node configs and keys prepopulated. Navigate to the directory where you cloned the example and run the following command to initialize the configuration for the desired network: +In order for our indexer to process blocks it needs to join the NEAR network as a node. To do that, we first need to initialize it, which will download the blockchain `genesis` config, and create a `key` for our node to communicate with other nodes: @@ -37,20 +42,24 @@ To run the NEAR Indexer connected to a network we need to have node configs and -The above command should initialize necessary configs and keys to run `localnet/testnet/mainnet` in `~/.near/(localnet|testnet|mainnet)`. +Depending on the network we want to connect to, the keys will be created in different folders (`~/.near/`). 
-The main configuration file for the node is `config.json`. +#### Config File -The above code will download the official genesis and config. The recommended config is accessible at: +A configuration file (`~/.near//config.json`) is created automatically, however, it is recommended to replace it with one of the following, intended for RPC nodes: - [testnet config.json](https://s3-us-west-1.amazonaws.com/build.nearprotocol.com/nearcore-deploy/testnet/rpc/config.json) - [mainnet config.json](https://s3-us-west-1.amazonaws.com/build.nearprotocol.com/nearcore-deploy/mainnet/rpc/config.json) -This configuration is intended for RPC nodes. Any extra settings required for the indexer should be manually added to the configuration file config.json in your --home-dir (e.g. ~/.near/testnet/config.json). +:::note Configuration Options + +See the [Custom Configuration](#custom-configuration) section below to learn more about further configuration options. -Configs for the specified network are in the --home-dir provided folder. We need to ensure that NEAR Indexer follows all the necessary shards, so set "tracked_shards_config": "AllShards" parameters in ~/.near/testnet/config.json. Hint: See the Tweaks section below to learn more about further configuration options. +--- + +## Starting the Indexer -After that we can run NEAR Indexer. +After we finish initializing the indexer, and configuring it, we can start it by running the following command: @@ -72,12 +81,23 @@ After that we can run NEAR Indexer. --- -## Custom Settings +## Parsing the Block Data + +From the block data, we can access the transactions, their receipts and actions. In this example, we will look for FunctionCall actions on a specific contract and log the details of each call. + + + +--- + +## Custom Configuration -By default, nearcore is configured to do as little work as possible while still operating on an up-to-date state. 
Indexers may have different requirements, so there is no solution that would work for everyone, and thus we are going to provide you with the set of knobs you can tune for your requirements. +By default, nearcore is configured to do as little work as possible while still operating on an up-to-date state. Indexers may have different requirements, so you might need to tweak the configuration based on yours. ### Shards/Accounts to Track -As already has been mentioned above, the most common tweak you need to apply is listing all the shards you want to index data from; to do that, you should ensure that "tracked_shards_config" in the config.json lists all the shard UIDs: + +We need to ensure that NEAR Indexer follows all the necessary shards, so by default the `"tracked_shards_config"` is set to `"AllShards"`. The most common tweak you might need to apply is listing to specific shards; to do that, lists all the shard UIDs you want to track in the `"tracked_shards_config"` section: ```json ... @@ -127,12 +147,4 @@ Indexer Framework also exposes access to the internal APIs (see Indexer::client_ ... ``` -## Parsing the Block Data - -From the block data, we can access the transactions, their receipts and actions. In this example, we will look for FunctionCall actions on a specific contract and log the details of each call. 
- - - \ No newline at end of file From 9dfb89ba005b63d76a5f35a38eca9b0fdb63b52a Mon Sep 17 00:00:00 2001 From: garikbesson Date: Fri, 10 Oct 2025 13:20:48 +0100 Subject: [PATCH 4/4] update near-indexer tutorial --- .../tutorials/near-indexer.md | 95 +++++++++++++++++-- 1 file changed, 88 insertions(+), 7 deletions(-) diff --git a/docs/data-infrastructure/tutorials/near-indexer.md b/docs/data-infrastructure/tutorials/near-indexer.md index ac11d7abad8..4e417395387 100644 --- a/docs/data-infrastructure/tutorials/near-indexer.md +++ b/docs/data-infrastructure/tutorials/near-indexer.md @@ -16,7 +16,7 @@ To get our indexer up and running we will need two steps: 1. To [initialize](#initialization) the indexer 2. To [start it](#run) -The full source code for the indexer is available in the [GitHub repository](https://github.com/near-examples/near-indexer?tab=readme-ov-file). +The full source code for the indexer example is available in the [GitHub repository](https://github.com/near-examples/near-indexer?tab=readme-ov-file). --- @@ -55,6 +55,8 @@ A configuration file (`~/.near//config.json`) is created automatically, See the [Custom Configuration](#custom-configuration) section below to learn more about further configuration options. +::: + --- ## Starting the Indexer @@ -64,17 +66,17 @@ After we finish initializing the indexer, and configuring it, we can start it by ```bash - cargo run --release -- --home-dir ~/.near/localnet run + cargo run --release -- --home-dir ~/.near/localnet --accounts bob.near --block-height 137510 run ``` ```bash - cargo run --release -- --home-dir ~/.near/testnet run + cargo run --release -- --home-dir ~/.near/testnet --accounts bob.testnet --block-height 218137510 run ``` ```bash - cargo run --release -- --home-dir ~/.near/mainnet run + cargo run --release -- --home-dir ~/.near/mainnet --accounts bob.near --block-height 167668637 run ``` @@ -87,7 +89,7 @@ From the block data, we can access the transactions, their receipts and actions. 
+ start="71" end="154" /> --- @@ -97,7 +99,7 @@ By default, nearcore is configured to do as little work as possible while still ### Shards/Accounts to Track -We need to ensure that NEAR Indexer follows all the necessary shards, so by default the `"tracked_shards_config"` is set to `"AllShards"`. The most common tweak you might need to apply is listing to specific shards; to do that, lists all the shard UIDs you want to track in the `"tracked_shards_config"` section: +We need to ensure that NEAR Indexer follows all the necessary shards, so by default the `"tracked_shards_config"` is set to `"AllShards"`. The most common tweak you might need to apply is tracking only specific shards; to do that, list all the shard UIDs you want to track in the `"tracked_shards_config"` section (`~/.near//config.json` file): ```json ... @@ -133,7 +135,86 @@ You can choose Indexer Framework sync mode by setting what to stream: + start="34" end="34" /> + +
+ +### Streaming Mode +You can choose Indexer Framework streaming mode by setting what to stream: + +- StreamWhileSyncing - Stream while node is syncing +- WaitForFullSync - Don't stream until the node is fully synced + + + +
+ +### Finality +You can choose finality level at which blocks are streamed: + +- None - `optimistic`, a block that (though unlikely) might be skipped +- DoomSlug - `near-final`, a block that is irreversible, unless at least one block producer is slashed +- Final - `final`, the block is final and irreversible. + + + +
+ +### Boot Nodes +If your node can't find any peers to connect to, you can manually specify some boot nodes in the `config.json` file. You can get a list of active peers for your network by running the following command: + + + + ```bash + curl -X POST https://rpc.testnet.near.org \ + -H "Content-Type: application/json" \ + -d '{ + "jsonrpc": "2.0", + "method": "network_info", + "params": [], + "id": "dontcare" + }' | \ + jq '.result.active_peers as $list1 | .result.known_producers as $list2 | + $list1[] as $active_peer | $list2[] | + select(.peer_id == $active_peer.id) | + "\(.peer_id)@\($active_peer.addr)"' |\ + awk 'NR>2 {print ","} length($0) {print p} {p=$0}' ORS="" | sed 's/"//g' + ``` + + + ```bash + curl -X POST https://rpc.mainnet.near.org \ + -H "Content-Type: application/json" \ + -d '{ + "jsonrpc": "2.0", + "method": "network_info", + "params": [], + "id": "dontcare" + }' | \ + jq '.result.active_peers as $list1 | .result.known_producers as $list2 | + $list1[] as $active_peer | $list2[] | + select(.peer_id == $active_peer.id) | + "\(.peer_id)@\($active_peer.addr)"' |\ + awk 'NR>2 {print ","} length($0) {print p} {p=$0}' ORS="" | sed 's/"//g' + ``` + + + +And then add the output to the `boot_nodes` section of your `config.json` file as a string: + +```json +... +"network": { + "addr": "0.0.0.0:24567", + "boot_nodes": "ed25519:8oVENgBp6zJfnwXFe...", + ... +}, +... +```