Skip to content

feat(block-proxy): add block-proxy#1362

Merged
andrewklau merged 2 commits intomainfrom
feat/block-proxy-ts
Mar 4, 2026
Merged

feat(block-proxy): add block-proxy#1362
andrewklau merged 2 commits intomainfrom
feat/block-proxy-ts

Conversation

@andrewklau
Copy link
Contributor

Block-proxy:

  • Data plane (port 3000): /v0/block/:height, /v0/last_block/final, /healthz, /readyz
  • Admin plane (port 3001): /metrics (Prometheus), /stats (JSON)
  • Fallback chain: cache -> S3/MinIO -> fastnear -> NEAR Lake
  • Singleflight dedup to prevent upstream stampede
  • Background cache writes, TTL-based eviction loop
  • README.md with full API docs and architecture diagram
  • .env.example with safe defaults (fastnear-only works out of box)

Indexer-staking:

  • Switch from nb-blocks-minio to nb-neardata
  • Replace 6 S3 env vars with optional NEARDATA_URL
  • Default URL derived from NETWORK (just works without config)

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduces a new block-proxy service (disk-caching reverse proxy for NEAR block data with upstream fallback + dedup + metrics) and updates indexer-staking to consume blocks via nb-neardata (optionally overridden by NEARDATA_URL) instead of the MinIO/S3-based streamer.

Changes:

  • Add apps/block-proxy (data plane + admin plane), including caching, upstream clients (S3/MinIO, fastnear, NEAR Lake), stats/metrics, and Docker packaging/docs.
  • Add mainnet/testnet compose files for running block-proxy.
  • Update indexer-staking to use nb-neardata and simplify configuration (replace S3 envs with optional NEARDATA_URL).

Reviewed changes

Copilot reviewed 29 out of 30 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
yarn.lock Adds dependencies for block-proxy (AWS SDK, uuid, etc.).
testnet-block-proxy.yml Compose service definition for testnet block-proxy.
mainnet-block-proxy.yml Compose service definition for mainnet block-proxy.
apps/indexer-staking/src/types/types.ts Config type updated to include optional neardataUrl and remove S3 fields.
apps/indexer-staking/src/services/stream.ts Switch streaming from nb-blocks-minio to nb-neardata with network/url inputs.
apps/indexer-staking/src/services/staking.ts Update imports to nb-neardata types.
apps/indexer-staking/src/libs/utils.ts Update import to nb-neardata ExecutionStatus.
apps/indexer-staking/src/config.ts Replace S3 env parsing with optional NEARDATA_URL.
apps/indexer-staking/package.json Swap dependency from nb-blocks-minio to nb-neardata.
apps/block-proxy/tsconfig.json Adds TS config for new app with #* path aliasing.
apps/block-proxy/src/upstream/s3.ts Implements S3/MinIO upstream fetcher with size checks and a timeout wrapper.
apps/block-proxy/src/upstream/near-lake.ts Implements NEAR Lake upstream (assemble block.json + shards) with timeout wrapper.
apps/block-proxy/src/upstream/index.ts Implements fallback chain and singleflight-style dedup map.
apps/block-proxy/src/upstream/fastnear.ts Implements fastnear upstream fetcher with timeouts and size checks.
apps/block-proxy/src/types.ts Defines response/stat types for admin endpoints.
apps/block-proxy/src/stats.ts Implements stats aggregation and snapshots for /stats.
apps/block-proxy/src/state.ts Wires cache/upstreams/stats/dedup into a shared AppState.
apps/block-proxy/src/server.ts Adds data-plane API: /v0/block/:height, /v0/last_block/final, /healthz, /readyz.
apps/block-proxy/src/metrics.ts Adds Prometheus metrics registry and instruments key counters/histograms.
apps/block-proxy/src/index.ts App entrypoint: start servers, start eviction loop, graceful shutdown.
apps/block-proxy/src/config.ts Adds env-based config with derived defaults (fastnear + NEAR Lake bucket).
apps/block-proxy/src/cache/path.ts Implements sharded on-disk path strategy for cached blocks.
apps/block-proxy/src/cache/index.ts Implements cache read/write + background write + TTL-based eviction loop.
apps/block-proxy/src/admin.ts Adds admin-plane API: /metrics and /stats.
apps/block-proxy/package.json Declares block-proxy app deps/scripts.
apps/block-proxy/README.md Documents API, config, and architecture.
apps/block-proxy/Dockerfile Adds multi-stage Docker build using turbo prune + build.
apps/block-proxy/.eslintrc.cjs Adds ESLint config for the new app.
apps/block-proxy/.env.example Provides example env configuration for block-proxy.
.github/workflows/workflow-build.yaml Adds block-proxy to CI build matrix.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +14 to +16
NEAR_LAKE_BUCKET: str({ default: '' }),
NEAR_LAKE_ENABLED: bool({ default: true }),
NEAR_LAKE_REGION: str({ default: 'eu-central-1' }),
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NEAR_LAKE_ENABLED defaults to true, which means a zero-config deployment will attempt NEAR Lake (AWS S3) access and may introduce unexpected latency/credential errors. This contradicts the “safe defaults / fastnear-only works out of the box” goal; consider defaulting NEAR_LAKE_ENABLED to false and letting users opt-in explicitly (or only enabling when AWS creds are detected).

Copilot uses AI. Check for mistakes.
nearLakeEnabled: config.nearLakeEnabled,
ready: false,
s3,
s3Enabled: config.s3Enabled,
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s3Enabled is set directly from config.s3Enabled even when S3Upstream.create(config) returns null (missing endpoint/bucket/creds). This makes /stats report S3 as enabled when it’s effectively disabled, and also makes state inconsistent. Consider deriving s3Enabled from both the flag and successful upstream construction (e.g., config.s3Enabled && s3 != null).

Suggested change
s3Enabled: config.s3Enabled,
s3Enabled: config.s3Enabled && s3 !== null,

Copilot uses AI. Check for mistakes.
Comment on lines +71 to +84
const result = await Promise.race([
this.getObject(key, height),
new Promise<never>((_, reject) =>
setTimeout(
() =>
reject(
new Error(
`S3 fetch timed out for block ${height} after ${this.timeoutMs}ms`,
),
),
this.timeoutMs,
),
),
]);
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timeout implementation uses Promise.race but does not abort the in-flight S3Client.send() call when the timeout fires. Under load, timed-out requests can continue consuming sockets/memory and accumulate in the background. Consider wiring an AbortController/abortSignal into the AWS SDK request (and clearing the timer) so the request is actually cancelled on timeout.

Copilot uses AI. Check for mistakes.
Comment on lines +104 to +117
const result = await Promise.race([
this.fetchAssembled(height),
new Promise<never>((_, reject) =>
setTimeout(
() =>
reject(
new Error(
`NEAR Lake fetch timed out for block ${height} after ${this.timeoutMs}ms`,
),
),
this.timeoutMs,
),
),
]);
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This timeout uses Promise.race but does not abort the underlying NEAR Lake S3 requests (block.json and shard fetches). If the timeout triggers, those S3 operations can continue in the background and pile up. Consider using AbortController/abortSignal with S3Client.send() (and propagating it into getObject/fetchAssembled) so requests are cancelled when the timeout elapses.

Copilot uses AI. Check for mistakes.
Comment on lines +59 to +126
const evictRecursive = (dir: string): void => {
let entries: fs.Dirent[];
try {
entries = fs.readdirSync(dir, { withFileTypes: true });
} catch (err) {
logger.warn(
{ dir, error: String(err) },
'failed to read cache directory during eviction',
);
return;
}

for (const entry of entries) {
const fullPath = path.join(dir, entry.name);

if (entry.isDirectory()) {
evictRecursive(fullPath);
continue;
}

const filename = entry.name;
let stem: string;
if (filename.endsWith('.json.zst')) {
stem = filename.slice(0, -9);
} else if (filename.endsWith('.json')) {
stem = filename.slice(0, -5);
} else {
continue;
}

const height = parseInt(stem, 10);
if (isNaN(height)) continue;

scanned++;

// Only evict if within the recent block window of the tip
const isRecent =
currentTip >= this.recentBlockWindow
? height > currentTip - this.recentBlockWindow
: height <= currentTip;

if (!isRecent) continue;

// Check file modification time
let stat: fs.Stats;
try {
stat = fs.statSync(fullPath);
} catch {
continue;
}

const ageMs = Date.now() - stat.mtimeMs;
if (ageMs > this.cacheTtlMs) {
try {
fs.unlinkSync(fullPath);
evicted++;
logger.debug(
{ age_secs: Math.floor(ageMs / 1000), height },
'evicted recent block from cache',
);
} catch (err) {
logger.warn(
{ error: String(err), height },
'failed to evict cache file',
);
}
}
}
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

runEviction() walks the entire cache directory using synchronous fs operations (readdirSync/statSync/unlinkSync) on a 60s interval. With a large cache, this can block the Node event loop and directly impact request latency. Consider making eviction async (fsp.*), limiting work per tick (yielding between directories), or moving eviction to a worker/child process.

Copilot uses AI. Check for mistakes.
Comment on lines +48 to +55
logger.error(
{ error_count: upstreamErr.errors?.length, height },
'block request failed: all upstreams exhausted',
);
res.status(502).json({
error: 'all upstream sources failed',
height,
sources: upstreamErr.errors || [],
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GET /v0/block/:height always returns 502 when all upstreams fail, even when upstreams are returning “not found” for the requested height (e.g., fastnear 404 / S3 NoSuchKey). Since the service is intended to be URL-compatible with neardata.xyz, consider distinguishing not-found from upstream failures and returning 404 when all sources indicate absence (or propagating the most appropriate status code).

Suggested change
logger.error(
{ error_count: upstreamErr.errors?.length, height },
'block request failed: all upstreams exhausted',
);
res.status(502).json({
error: 'all upstream sources failed',
height,
sources: upstreamErr.errors || [],
const errors = upstreamErr.errors || [];
// Determine if all upstreams reported a "not found" condition (e.g. HTTP 404, S3 NoSuchKey).
const allNotFound =
errors.length > 0 &&
errors.every(
(e) =>
(typeof (e as any).statusCode === 'number' &&
(e as any).statusCode === 404) ||
(typeof (e as any).code === 'string' &&
(e as any).code === 'NoSuchKey'),
);
if (allNotFound) {
logger.warn(
{ error_count: errors.length, height },
'block request: block not found on any upstream',
);
res.status(404).json({
error: 'block not found',
height,
sources: errors,
});
return;
}
logger.error(
{ error_count: errors.length, height },
'block request failed: all upstreams exhausted',
);
res.status(502).json({
error: 'all upstream sources failed',
height,
sources: errors,

Copilot uses AI. Check for mistakes.
@andrewklau andrewklau force-pushed the feat/block-proxy-ts branch 2 times, most recently from 6485162 to e7b8564 Compare March 2, 2026 07:43
@andrewklau andrewklau requested a review from Copilot March 2, 2026 07:57
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 29 out of 30 changed files in this pull request and generated 10 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +16 to +17
const requestId = _req.headers['x-request-id'] || uuidv4();
res.setHeader('x-request-id', requestId);
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x-request-id can be a string[] when a client sends the header multiple times. Currently that array can be echoed back via res.setHeader, which is usually unintended and can break downstream log correlation. Consider normalizing to a single string (e.g., first value) and coercing to string before setting the response header.

Suggested change
const requestId = _req.headers['x-request-id'] || uuidv4();
res.setHeader('x-request-id', requestId);
const incomingRequestId = _req.headers['x-request-id'];
const requestId =
(Array.isArray(incomingRequestId)
? incomingRequestId[0]
: incomingRequestId) || uuidv4();
res.setHeader('x-request-id', String(requestId));

Copilot uses AI. Check for mistakes.
};
const errors = upstreamErr.errors || [];

const allNotFound = errors.length > 0 && errors.every((e) => e.notFound);
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allNotFound is computed from errors, but the error list can include non-upstream failures (e.g., a cache read error from fetchBlock). In that case, even if every upstream returned 404/notFound, this endpoint will return 502 instead of 404. Consider basing the 404 decision only on upstream sources (exclude cache) or only on errors that explicitly set notFound.

Suggested change
const allNotFound = errors.length > 0 && errors.every((e) => e.notFound);
const notFoundAnnotatedErrors = errors.filter((e) => e.notFound !== undefined);
const allNotFound =
notFoundAnnotatedErrors.length > 0 &&
notFoundAnnotatedErrors.every((e) => e.notFound === true);

Copilot uses AI. Check for mistakes.
Comment on lines +42 to +44
// Start background eviction loop (every 60s)
const evictionInterval = state.cache.startEvictionLoop(state.stats);

Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The eviction loop is started unconditionally. When CACHE_ENABLED=false this will still wake up every 60s and scan the cache directory (and log), which is unnecessary overhead and can also fail due to permissions. Consider only starting the eviction interval when caching is enabled.

Copilot uses AI. Check for mistakes.
Comment on lines +125 to +131
startEvictionLoop(stats: StatsCollector): ReturnType<typeof setInterval> {
return setInterval(() => {
this.runEviction(stats).catch((err) => {
logger.warn({ error: String(err) }, 'eviction loop error');
});
}, 60_000);
}
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startEvictionLoop runs a full recursive scan of the entire cache directory every 60 seconds. With a large cache (many files), this can become a significant and constant I/O load. Consider increasing the interval, adding a cap/budget per run, or using a strategy that avoids rescanning the whole tree each time (e.g., sharded time buckets or tracking last-scan state).

Copilot uses AI. Check for mistakes.
Comment on lines +24 to +32
const height = parseInt(heightStr, 10);

if (isNaN(height) || height <= 0) {
res
.status(400)
.json({ error: 'block height must be a positive integer' });
return;
}

Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseInt will accept inputs like "123abc" and treat them as height 123. If you want /v0/block/:height to strictly require a decimal integer path segment, validate the param with a digits-only check before parsing (or compare String(height) back to the original).

Suggested change
const height = parseInt(heightStr, 10);
if (isNaN(height) || height <= 0) {
res
.status(400)
.json({ error: 'block height must be a positive integer' });
return;
}
// Require the height path segment to be a strict positive decimal integer (no signs, no extras).
if (!/^[1-9]\d*$/.test(heightStr)) {
res
.status(400)
.json({ error: 'block height must be a positive integer' });
return;
}
const height = Number(heightStr);

Copilot uses AI. Check for mistakes.
Comment on lines +90 to +92
errors.push({ error: String(err), source: 'cache' });
}
}
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cache read errors are pushed into the same errors array as upstream fetch failures. That list is later used by the HTTP layer to decide between 404 vs 502; including cache errors (which are neither "not found" nor an upstream source) can cause incorrect 502 responses. Consider tracking cache failures separately or marking them so they don't affect "all upstreams not found" classification.

Suggested change
errors.push({ error: String(err), source: 'cache' });
}
}
}
}
}

Copilot uses AI. Check for mistakes.
Comment on lines +20 to +21
// Create cache directory (fail fast)
state.cache.ensureDir();
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

state.cache.ensureDir() is called unconditionally at startup. If CACHE_ENABLED=false, this still creates the cache directory and can fail in environments where the process intentionally has no write access. Consider only creating the directory when caching is enabled (or making it conditional on config.cacheEnabled).

Suggested change
// Create cache directory (fail fast)
state.cache.ensureDir();
// Create cache directory (fail fast, only when caching is enabled)
if (config.cacheEnabled) {
state.cache.ensureDir();
}

Copilot uses AI. Check for mistakes.
Comment on lines +94 to +100
port: config.port,
s3AccessKey: mask(config.s3AccessKey),
s3Bucket: config.s3Bucket || '<unset>',
s3Enabled: config.s3Enabled,
s3Endpoint: config.s3Endpoint || '<unset>',
s3Region: config.s3Region,
s3SecretKey: mask(config.s3SecretKey),
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logConfigSummary logs s3Enabled: config.s3Enabled, but at runtime S3 is effectively enabled only if S3_ENABLED=true and all required S3 fields are set (since S3Upstream.create can return null and state.s3Enabled becomes false). This can mislead operators during incident/debugging; consider logging the effective enablement (or failing fast when S3_ENABLED=true but config is incomplete).

Copilot uses AI. Check for mistakes.
"lint:check": "tsc --noEmit && eslint ./"
},
"dependencies": {
"@aws-sdk/client-s3": "3.712.0",
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pins @aws-sdk/client-s3 to 3.712.0, which (per the lockfile) causes multiple AWS SDK versions to be installed alongside the existing 3.749.0. Consider aligning the version to the repo’s current AWS SDK version (or using a workspace-level resolution) to reduce bundle size and avoid subtle behavior differences across versions.

Suggested change
"@aws-sdk/client-s3": "3.712.0",
"@aws-sdk/client-s3": "3.749.0",

Copilot uses AI. Check for mistakes.

FROM --platform=$BUILDPLATFORM base AS builder
WORKDIR /app
RUN yarn global add turbo
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yarn global add turbo installs the turbo CLI from the public npm registry without pinning a version, so every build pulls a mutable "latest" binary that is immediately executed in the build stage. If the turbo package or its registry entry is ever compromised, an attacker could gain code execution in your build environment and tamper with the pruned workspace or downstream build artifacts that end up in the production image. Pin turbo to a specific, vetted version (for example by using turbo@<version> or a prebuilt image with turbo baked in) and avoid relying on an unversioned global install during Docker builds.

Copilot uses AI. Check for mistakes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@andrewklau andrewklau force-pushed the feat/block-proxy-ts branch from e7b8564 to b085517 Compare March 2, 2026 09:14
@andrewklau andrewklau merged commit bfc1ec5 into main Mar 4, 2026
1 check passed
@andrewklau andrewklau deleted the feat/block-proxy-ts branch March 4, 2026 05:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants