Conversation
Pull request overview
Introduces a new block-proxy service (disk-caching reverse proxy for NEAR block data with upstream fallback + dedup + metrics) and updates indexer-staking to consume blocks via nb-neardata (optionally overridden by NEARDATA_URL) instead of the MinIO/S3-based streamer.
Changes:
- Add `apps/block-proxy` (data plane + admin plane), including caching, upstream clients (S3/MinIO, fastnear, NEAR Lake), stats/metrics, and Docker packaging/docs.
- Add mainnet/testnet compose files for running block-proxy.
- Update `indexer-staking` to use `nb-neardata` and simplify configuration (replace S3 envs with optional `NEARDATA_URL`).
Reviewed changes
Copilot reviewed 29 out of 30 changed files in this pull request and generated 6 comments.
Summary per file:
| File | Description |
|---|---|
| yarn.lock | Adds dependencies for block-proxy (AWS SDK, uuid, etc.). |
| testnet-block-proxy.yml | Compose service definition for testnet block-proxy. |
| mainnet-block-proxy.yml | Compose service definition for mainnet block-proxy. |
| apps/indexer-staking/src/types/types.ts | Config type updated to include optional neardataUrl and remove S3 fields. |
| apps/indexer-staking/src/services/stream.ts | Switch streaming from nb-blocks-minio to nb-neardata with network/url inputs. |
| apps/indexer-staking/src/services/staking.ts | Update imports to nb-neardata types. |
| apps/indexer-staking/src/libs/utils.ts | Update import to nb-neardata ExecutionStatus. |
| apps/indexer-staking/src/config.ts | Replace S3 env parsing with optional NEARDATA_URL. |
| apps/indexer-staking/package.json | Swap dependency from nb-blocks-minio to nb-neardata. |
| apps/block-proxy/tsconfig.json | Adds TS config for new app with #* path aliasing. |
| apps/block-proxy/src/upstream/s3.ts | Implements S3/MinIO upstream fetcher with size checks and a timeout wrapper. |
| apps/block-proxy/src/upstream/near-lake.ts | Implements NEAR Lake upstream (assemble block.json + shards) with timeout wrapper. |
| apps/block-proxy/src/upstream/index.ts | Implements fallback chain and singleflight-style dedup map. |
| apps/block-proxy/src/upstream/fastnear.ts | Implements fastnear upstream fetcher with timeouts and size checks. |
| apps/block-proxy/src/types.ts | Defines response/stat types for admin endpoints. |
| apps/block-proxy/src/stats.ts | Implements stats aggregation and snapshots for /stats. |
| apps/block-proxy/src/state.ts | Wires cache/upstreams/stats/dedup into a shared AppState. |
| apps/block-proxy/src/server.ts | Adds data-plane API: /v0/block/:height, /v0/last_block/final, /healthz, /readyz. |
| apps/block-proxy/src/metrics.ts | Adds Prometheus metrics registry and instruments key counters/histograms. |
| apps/block-proxy/src/index.ts | App entrypoint: start servers, start eviction loop, graceful shutdown. |
| apps/block-proxy/src/config.ts | Adds env-based config with derived defaults (fastnear + NEAR Lake bucket). |
| apps/block-proxy/src/cache/path.ts | Implements sharded on-disk path strategy for cached blocks. |
| apps/block-proxy/src/cache/index.ts | Implements cache read/write + background write + TTL-based eviction loop. |
| apps/block-proxy/src/admin.ts | Adds admin-plane API: /metrics and /stats. |
| apps/block-proxy/package.json | Declares block-proxy app deps/scripts. |
| apps/block-proxy/README.md | Documents API, config, and architecture. |
| apps/block-proxy/Dockerfile | Adds multi-stage Docker build using turbo prune + build. |
| apps/block-proxy/.eslintrc.cjs | Adds ESLint config for the new app. |
| apps/block-proxy/.env.example | Provides example env configuration for block-proxy. |
| .github/workflows/workflow-build.yaml | Adds block-proxy to CI build matrix. |
```typescript
NEAR_LAKE_BUCKET: str({ default: '' }),
NEAR_LAKE_ENABLED: bool({ default: true }),
NEAR_LAKE_REGION: str({ default: 'eu-central-1' }),
```
NEAR_LAKE_ENABLED defaults to true, which means a zero-config deployment will attempt NEAR Lake (AWS S3) access and may introduce unexpected latency/credential errors. This contradicts the “safe defaults / fastnear-only works out of the box” goal; consider defaulting NEAR_LAKE_ENABLED to false and letting users opt-in explicitly (or only enabling when AWS creds are detected).
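A minimal sketch of the opt-in behavior suggested here, assuming the env names from the snippet above (the credential-detection heuristic and function name are illustrative, not the app's actual logic):

```typescript
// Sketch: default NEAR Lake to disabled; enable only on explicit opt-in,
// or when AWS credentials are present (illustrative heuristic).
interface LakeEnv {
  NEAR_LAKE_ENABLED?: string;
  AWS_ACCESS_KEY_ID?: string;
  AWS_SECRET_ACCESS_KEY?: string;
}

function nearLakeEnabled(env: LakeEnv): boolean {
  if (env.NEAR_LAKE_ENABLED !== undefined) {
    // An explicit setting always wins.
    return env.NEAR_LAKE_ENABLED === 'true';
  }
  // Otherwise opt in only when both AWS credentials are detected.
  return Boolean(env.AWS_ACCESS_KEY_ID && env.AWS_SECRET_ACCESS_KEY);
}
```

This keeps a zero-config deployment fastnear-only while still enabling NEAR Lake automatically for operators who already export AWS credentials.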
apps/block-proxy/src/state.ts (outdated)

```typescript
nearLakeEnabled: config.nearLakeEnabled,
ready: false,
s3,
s3Enabled: config.s3Enabled,
```
s3Enabled is set directly from config.s3Enabled even when S3Upstream.create(config) returns null (missing endpoint/bucket/creds). This makes /stats report S3 as enabled when it’s effectively disabled, and also makes state inconsistent. Consider deriving s3Enabled from both the flag and successful upstream construction (e.g., config.s3Enabled && s3 != null).
Suggested change:

```diff
- s3Enabled: config.s3Enabled,
+ s3Enabled: config.s3Enabled && s3 !== null,
```
apps/block-proxy/src/upstream/s3.ts (outdated)

```typescript
const result = await Promise.race([
  this.getObject(key, height),
  new Promise<never>((_, reject) =>
    setTimeout(
      () =>
        reject(
          new Error(
            `S3 fetch timed out for block ${height} after ${this.timeoutMs}ms`,
          ),
        ),
      this.timeoutMs,
    ),
  ),
]);
```
The timeout implementation uses Promise.race but does not abort the in-flight S3Client.send() call when the timeout fires. Under load, timed-out requests can continue consuming sockets/memory and accumulate in the background. Consider wiring an AbortController/abortSignal into the AWS SDK request (and clearing the timer) so the request is actually cancelled on timeout.
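One way this could be wired up, sketched under the assumption that the caller passes a function honoring an `AbortSignal` (with AWS SDK v3, `client.send(command, { abortSignal })` accepts such a signal); the wrapper name and call sites are illustrative:

```typescript
// Sketch: a timeout helper that actually cancels the underlying request.
// On timeout the signal is aborted, and the timer is always cleared so
// timed-out requests no longer accumulate in the background.
async function withAbortTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
  label: string,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await run(controller.signal);
  } catch (err) {
    // Translate an abort into the same timeout error the current code produces.
    if (controller.signal.aborted) {
      throw new Error(`${label} timed out after ${timeoutMs}ms`);
    }
    throw err;
  } finally {
    clearTimeout(timer);
  }
}
```

At the S3 call site this would look roughly like `withAbortTimeout((signal) => this.client.send(command, { abortSignal: signal }), this.timeoutMs, 'S3 fetch')` (an assumption about the surrounding code, not a verified signature of this app's internals).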
```typescript
const result = await Promise.race([
  this.fetchAssembled(height),
  new Promise<never>((_, reject) =>
    setTimeout(
      () =>
        reject(
          new Error(
            `NEAR Lake fetch timed out for block ${height} after ${this.timeoutMs}ms`,
          ),
        ),
      this.timeoutMs,
    ),
  ),
]);
```
This timeout uses Promise.race but does not abort the underlying NEAR Lake S3 requests (block.json and shard fetches). If the timeout triggers, those S3 operations can continue in the background and pile up. Consider using AbortController/abortSignal with S3Client.send() (and propagating it into getObject/fetchAssembled) so requests are cancelled when the timeout elapses.
apps/block-proxy/src/cache/index.ts (outdated)

```typescript
const evictRecursive = (dir: string): void => {
  let entries: fs.Dirent[];
  try {
    entries = fs.readdirSync(dir, { withFileTypes: true });
  } catch (err) {
    logger.warn(
      { dir, error: String(err) },
      'failed to read cache directory during eviction',
    );
    return;
  }

  for (const entry of entries) {
    const fullPath = path.join(dir, entry.name);

    if (entry.isDirectory()) {
      evictRecursive(fullPath);
      continue;
    }

    const filename = entry.name;
    let stem: string;
    if (filename.endsWith('.json.zst')) {
      stem = filename.slice(0, -9);
    } else if (filename.endsWith('.json')) {
      stem = filename.slice(0, -5);
    } else {
      continue;
    }

    const height = parseInt(stem, 10);
    if (isNaN(height)) continue;

    scanned++;

    // Only evict if within the recent block window of the tip
    const isRecent =
      currentTip >= this.recentBlockWindow
        ? height > currentTip - this.recentBlockWindow
        : height <= currentTip;

    if (!isRecent) continue;

    // Check file modification time
    let stat: fs.Stats;
    try {
      stat = fs.statSync(fullPath);
    } catch {
      continue;
    }

    const ageMs = Date.now() - stat.mtimeMs;
    if (ageMs > this.cacheTtlMs) {
      try {
        fs.unlinkSync(fullPath);
        evicted++;
        logger.debug(
          { age_secs: Math.floor(ageMs / 1000), height },
          'evicted recent block from cache',
        );
      } catch (err) {
        logger.warn(
          { error: String(err), height },
          'failed to evict cache file',
        );
      }
    }
  }
```
runEviction() walks the entire cache directory using synchronous fs operations (readdirSync/statSync/unlinkSync) on a 60s interval. With a large cache, this can block the Node event loop and directly impact request latency. Consider making eviction async (fsp.*), limiting work per tick (yielding between directories), or moving eviction to a worker/child process.
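A sketch of the async variant, assuming the same recursive-walk shape as the snippet above; `shouldEvict` stands in for the TTL and recent-window checks, and both names are illustrative rather than the app's actual API:

```typescript
import type { Dirent } from 'node:fs';
import * as fsp from 'node:fs/promises';
import * as path from 'node:path';

// Sketch: an async directory walk that yields to the event loop between
// directories, so a large cache never blocks request handling for long.
async function evictRecursiveAsync(
  dir: string,
  shouldEvict: (fullPath: string) => Promise<boolean>,
): Promise<number> {
  let entries: Dirent[];
  try {
    entries = await fsp.readdir(dir, { withFileTypes: true });
  } catch {
    return 0; // directory unreadable or already gone; skip it
  }
  let evicted = 0;
  for (const entry of entries) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      evicted += await evictRecursiveAsync(fullPath, shouldEvict);
      // Yield between directories to keep the event loop responsive.
      await new Promise((resolve) => setImmediate(resolve));
      continue;
    }
    if (await shouldEvict(fullPath)) {
      try {
        await fsp.unlink(fullPath);
        evicted++;
      } catch {
        // file may have been removed concurrently
      }
    }
  }
  return evicted;
}
```

The same TTL/window predicate from the synchronous version would be passed in as `shouldEvict`, keeping the walk strategy separate from the eviction policy.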
apps/block-proxy/src/server.ts (outdated)

```typescript
logger.error(
  { error_count: upstreamErr.errors?.length, height },
  'block request failed: all upstreams exhausted',
);
res.status(502).json({
  error: 'all upstream sources failed',
  height,
  sources: upstreamErr.errors || [],
```
GET /v0/block/:height always returns 502 when all upstreams fail, even when upstreams are returning “not found” for the requested height (e.g., fastnear 404 / S3 NoSuchKey). Since the service is intended to be URL-compatible with neardata.xyz, consider distinguishing not-found from upstream failures and returning 404 when all sources indicate absence (or propagating the most appropriate status code).
Suggested change:

```diff
- logger.error(
-   { error_count: upstreamErr.errors?.length, height },
-   'block request failed: all upstreams exhausted',
- );
- res.status(502).json({
-   error: 'all upstream sources failed',
-   height,
-   sources: upstreamErr.errors || [],
+ const errors = upstreamErr.errors || [];
+ // Determine if all upstreams reported a "not found" condition (e.g. HTTP 404, S3 NoSuchKey).
+ const allNotFound =
+   errors.length > 0 &&
+   errors.every(
+     (e) =>
+       (typeof (e as any).statusCode === 'number' &&
+         (e as any).statusCode === 404) ||
+       (typeof (e as any).code === 'string' &&
+         (e as any).code === 'NoSuchKey'),
+   );
+ if (allNotFound) {
+   logger.warn(
+     { error_count: errors.length, height },
+     'block request: block not found on any upstream',
+   );
+   res.status(404).json({
+     error: 'block not found',
+     height,
+     sources: errors,
+   });
+   return;
+ }
+ logger.error(
+   { error_count: errors.length, height },
+   'block request failed: all upstreams exhausted',
+ );
+ res.status(502).json({
+   error: 'all upstream sources failed',
+   height,
+   sources: errors,
```
Force-pushed 6485162 to e7b8564.
Pull request overview
Copilot reviewed 29 out of 30 changed files in this pull request and generated 10 comments.
apps/block-proxy/src/server.ts (outdated)

```typescript
const requestId = _req.headers['x-request-id'] || uuidv4();
res.setHeader('x-request-id', requestId);
```
x-request-id can be a string[] when a client sends the header multiple times. Currently that array can be echoed back via res.setHeader, which is usually unintended and can break downstream log correlation. Consider normalizing to a single string (e.g., first value) and coercing to string before setting the response header.
Suggested change:

```diff
- const requestId = _req.headers['x-request-id'] || uuidv4();
- res.setHeader('x-request-id', requestId);
+ const incomingRequestId = _req.headers['x-request-id'];
+ const requestId =
+   (Array.isArray(incomingRequestId)
+     ? incomingRequestId[0]
+     : incomingRequestId) || uuidv4();
+ res.setHeader('x-request-id', String(requestId));
```
apps/block-proxy/src/server.ts (outdated)

```typescript
};
const errors = upstreamErr.errors || [];

const allNotFound = errors.length > 0 && errors.every((e) => e.notFound);
```
allNotFound is computed from errors, but the error list can include non-upstream failures (e.g., a cache read error from fetchBlock). In that case, even if every upstream returned 404/notFound, this endpoint will return 502 instead of 404. Consider basing the 404 decision only on upstream sources (exclude cache) or only on errors that explicitly set notFound.
Suggested change:

```diff
- const allNotFound = errors.length > 0 && errors.every((e) => e.notFound);
+ const notFoundAnnotatedErrors = errors.filter((e) => e.notFound !== undefined);
+ const allNotFound =
+   notFoundAnnotatedErrors.length > 0 &&
+   notFoundAnnotatedErrors.every((e) => e.notFound === true);
```
apps/block-proxy/src/index.ts (outdated)

```typescript
// Start background eviction loop (every 60s)
const evictionInterval = state.cache.startEvictionLoop(state.stats);
```
The eviction loop is started unconditionally. When CACHE_ENABLED=false this will still wake up every 60s and scan the cache directory (and log), which is unnecessary overhead and can also fail due to permissions. Consider only starting the eviction interval when caching is enabled.
```typescript
startEvictionLoop(stats: StatsCollector): ReturnType<typeof setInterval> {
  return setInterval(() => {
    this.runEviction(stats).catch((err) => {
      logger.warn({ error: String(err) }, 'eviction loop error');
    });
  }, 60_000);
}
```
startEvictionLoop runs a full recursive scan of the entire cache directory every 60 seconds. With a large cache (many files), this can become a significant and constant I/O load. Consider increasing the interval, adding a cap/budget per run, or using a strategy that avoids rescanning the whole tree each time (e.g., sharded time buckets or tracking last-scan state).
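The per-run budget idea can be sketched as a small cursor helper (all names here are illustrative; real state would live on the cache object and the item list would come from the directory scan):

```typescript
// Sketch: give each eviction tick a fixed budget and remember where it
// stopped, so a large cache is scanned incrementally across ticks
// instead of in one full pass every 60 seconds.
function nextEvictionBatch<T>(
  items: T[],
  cursor: number,
  budget: number,
): { batch: T[]; nextCursor: number } {
  const start = cursor >= items.length ? 0 : cursor; // wrap after a full pass
  const batch = items.slice(start, start + budget);
  const end = start + batch.length;
  return { batch, nextCursor: end >= items.length ? 0 : end };
}
```

Each tick would then call the TTL check only on `batch`, bounding the I/O done per interval regardless of cache size.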
apps/block-proxy/src/server.ts (outdated)

```typescript
const height = parseInt(heightStr, 10);

if (isNaN(height) || height <= 0) {
  res
    .status(400)
    .json({ error: 'block height must be a positive integer' });
  return;
}
```
parseInt will accept inputs like "123abc" and treat them as height 123. If you want /v0/block/:height to strictly require a decimal integer path segment, validate the param with a digits-only check before parsing (or compare String(height) back to the original).
Suggested change:

```diff
- const height = parseInt(heightStr, 10);
- if (isNaN(height) || height <= 0) {
-   res
-     .status(400)
-     .json({ error: 'block height must be a positive integer' });
-   return;
- }
+ // Require the height path segment to be a strict positive decimal integer (no signs, no extras).
+ if (!/^[1-9]\d*$/.test(heightStr)) {
+   res
+     .status(400)
+     .json({ error: 'block height must be a positive integer' });
+   return;
+ }
+ const height = Number(heightStr);
```
```typescript
      errors.push({ error: String(err), source: 'cache' });
    }
  }
```
Cache read errors are pushed into the same errors array as upstream fetch failures. That list is later used by the HTTP layer to decide between 404 vs 502; including cache errors (which are neither "not found" nor an upstream source) can cause incorrect 502 responses. Consider tracking cache failures separately or marking them so they don't affect "all upstreams not found" classification.
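One way to separate the two error classes, sketched with assumed shapes (the `SourceError` interface and `classifyFailure` helper are illustrative, not the app's actual types):

```typescript
// Sketch: keep cache read failures out of the upstream error list so the
// 404-vs-502 decision is based only on upstream sources. Cache errors are
// still included in the response body for debugging.
interface SourceError {
  source: string;
  error: string;
  notFound?: boolean;
}

function classifyFailure(
  upstreamErrors: SourceError[],
  cacheErrors: SourceError[],
): { status: number; sources: SourceError[] } {
  const allNotFound =
    upstreamErrors.length > 0 &&
    upstreamErrors.every((e) => e.notFound === true);
  return {
    status: allNotFound ? 404 : 502,
    sources: [...upstreamErrors, ...cacheErrors],
  };
}
```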
Suggested change:

```diff
- errors.push({ error: String(err), source: 'cache' });
- }
- }
+ }
+ }
+ }
```
apps/block-proxy/src/index.ts (outdated)

```typescript
// Create cache directory (fail fast)
state.cache.ensureDir();
```
state.cache.ensureDir() is called unconditionally at startup. If CACHE_ENABLED=false, this still creates the cache directory and can fail in environments where the process intentionally has no write access. Consider only creating the directory when caching is enabled (or making it conditional on config.cacheEnabled).
Suggested change:

```diff
- // Create cache directory (fail fast)
- state.cache.ensureDir();
+ // Create cache directory (fail fast, only when caching is enabled)
+ if (config.cacheEnabled) {
+   state.cache.ensureDir();
+ }
```
```typescript
port: config.port,
s3AccessKey: mask(config.s3AccessKey),
s3Bucket: config.s3Bucket || '<unset>',
s3Enabled: config.s3Enabled,
s3Endpoint: config.s3Endpoint || '<unset>',
s3Region: config.s3Region,
s3SecretKey: mask(config.s3SecretKey),
```
logConfigSummary logs s3Enabled: config.s3Enabled, but at runtime S3 is effectively enabled only if S3_ENABLED=true and all required S3 fields are set (since S3Upstream.create can return null and state.s3Enabled becomes false). This can mislead operators during incident/debugging; consider logging the effective enablement (or failing fast when S3_ENABLED=true but config is incomplete).
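The fail-fast variant could look like the following sketch; the field names mirror the snippet above, but the function itself is illustrative:

```typescript
// Sketch: fail fast at startup when S3 is flagged on but incompletely
// configured, so logs and /stats never report an S3 upstream that can't
// actually be constructed.
interface S3Config {
  s3Enabled: boolean;
  s3Endpoint: string;
  s3Bucket: string;
  s3AccessKey: string;
  s3SecretKey: string;
}

function assertS3Config(cfg: S3Config): void {
  if (!cfg.s3Enabled) return;
  const missing = (
    ['s3Endpoint', 's3Bucket', 's3AccessKey', 's3SecretKey'] as const
  ).filter((k) => !cfg[k]);
  if (missing.length > 0) {
    throw new Error(`S3_ENABLED=true but missing: ${missing.join(', ')}`);
  }
}
```

Alternatively, logging `config.s3Enabled && s3 !== null` (the effective enablement) would keep the summary honest without changing startup behavior.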
apps/block-proxy/package.json (outdated)

```json
    "lint:check": "tsc --noEmit && eslint ./"
  },
  "dependencies": {
    "@aws-sdk/client-s3": "3.712.0",
```
This pins @aws-sdk/client-s3 to 3.712.0, which (per the lockfile) causes multiple AWS SDK versions to be installed alongside the existing 3.749.0. Consider aligning the version to the repo’s current AWS SDK version (or using a workspace-level resolution) to reduce bundle size and avoid subtle behavior differences across versions.
Suggested change:

```diff
- "@aws-sdk/client-s3": "3.712.0",
+ "@aws-sdk/client-s3": "3.749.0",
```
```dockerfile
FROM --platform=$BUILDPLATFORM base AS builder
WORKDIR /app
RUN yarn global add turbo
```
yarn global add turbo installs the turbo CLI from the public npm registry without pinning a version, so every build pulls a mutable "latest" binary that is immediately executed in the build stage. If the turbo package or its registry entry is ever compromised, an attacker could gain code execution in your build environment and tamper with the pruned workspace or downstream build artifacts that end up in the production image. Pin turbo to a specific, vetted version (for example by using turbo@<version> or a prebuilt image with turbo baked in) and avoid relying on an unversioned global install during Docker builds.
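A sketch of the pinned install; the version shown is a placeholder, not the repo's actual turbo version, and should be aligned with the `turbo` entry in the workspace's root package.json:

```dockerfile
# Pin the turbo CLI to a specific, vetted version so builds are reproducible
# and a compromised "latest" release cannot inject code into the build stage.
# 2.5.0 is illustrative -- use the version the repo already depends on.
RUN yarn global add turbo@2.5.0
```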
Force-pushed e7b8564 to b085517.