feat(prover-api): production-ready claim migration prover with security hardening #68
base: v2
Conversation
Split monolithic main.rs into separate modules for better maintainability:

- config.rs: Configuration loading and validation
- types.rs: Shared types (AppState, JobEntry, CachedProof, etc.)
- handlers.rs: HTTP request handlers (submit_job, job_status, health)
- prover.rs: ZK proof generation logic
- queue.rs: JobQueue and WorkerPool for concurrent proof generation
- cache.rs: Proof caching with TTL cleanup
- rate_limit.rs: Per-pubkey rate limiting
- jobs.rs: Job entry cleanup
- validation.rs: Request validation utilities

Added production features:

- Configurable worker pool size and queue capacity
- Background cleanup tasks for cache, rate limits, and stale jobs
- Max body size limiting
- Proof timeout configuration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
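The worker-pool-with-bounded-queue idea behind queue.rs can be sketched with a bounded channel feeding a fixed set of workers. This is an illustrative std-only sketch (the real service is async); `Job` and `spawn_pool` are hypothetical names, not the actual JobQueue/WorkerPool API.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Hypothetical job type standing in for a proof-generation request.
struct Job {
    id: u64,
}

fn spawn_pool(
    workers: usize,
    capacity: usize,
) -> (mpsc::SyncSender<Job>, Vec<thread::JoinHandle<()>>) {
    // sync_channel gives back-pressure: try_send fails when the queue
    // is full, which a handler can map to a "queue full" response.
    let (tx, rx) = mpsc::sync_channel::<Job>(capacity);
    // Workers share one receiver; the lock serializes recv() but jobs
    // are still processed concurrently once dequeued.
    let rx = Arc::new(Mutex::new(rx));
    let handles = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                let job = match rx.lock().unwrap().recv() {
                    Ok(j) => j,
                    Err(_) => break, // all senders dropped: shut down
                };
                // Stand-in for the expensive proof-generation step.
                let _ = job.id;
            })
        })
        .collect();
    (tx, handles)
}

fn main() {
    let (tx, handles) = spawn_pool(4, 50);
    for id in 0..10 {
        tx.try_send(Job { id }).expect("queue full");
    }
    drop(tx); // close the queue so workers exit
    for h in handles {
        h.join().unwrap();
    }
    println!("all jobs drained");
}
```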
… and IP rate limiting

Adds production hardening features:

- Load eligibility data from merkle-tree.json at startup with pubkey-based O(1) lookup
- Verify sr25519 signatures before proof generation using schnorrkel
- IP-based rate limiting (separate from pubkey rate limiting) to catch bots/scanners
- Add proof metrics tracking (completions, timeouts, background tasks)
- Update Dockerfile to bundle eligibility data with configurable path

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
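The pubkey-based O(1) lookup described above amounts to indexing the eligibility entries in a hash map at startup. A minimal sketch, assuming the real code deserializes merkle-tree.json (e.g. with serde) into such entries; `EligibilityIndex` and its methods are illustrative names:

```rust
use std::collections::HashMap;

// Hypothetical index over the merkle-tree.json entries, built once at
// startup so each request is an O(1) average-case lookup rather than a
// scan of the whole tree.
struct EligibilityIndex {
    by_pubkey: HashMap<String, u128>, // pubkey (hex) -> claimable amount
}

impl EligibilityIndex {
    fn from_entries(entries: Vec<(String, u128)>) -> Self {
        Self {
            by_pubkey: entries.into_iter().collect(),
        }
    }

    fn amount_for(&self, pubkey: &str) -> Option<u128> {
        self.by_pubkey.get(pubkey).copied()
    }
}

fn main() {
    // Hard-coded sample entries standing in for the JSON file.
    let index = EligibilityIndex::from_entries(vec![
        ("0xabc".to_string(), 1_000),
        ("0xdef".to_string(), 2_500),
    ]);
    assert_eq!(index.amount_for("0xabc"), Some(1_000));
    assert_eq!(index.amount_for("0x123"), None);
    println!("eligible amount: {:?}", index.amount_for("0xdef"));
}
```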
- Increase cache TTL from 10 minutes to 1 hour since proofs are deterministic and expensive to compute
- Reduce queue capacity from 100 to 50 for better wait-time estimation (~1 hour max with 4 workers)
- Increase jobs TTL from 10 minutes to 1 hour to let users return for their proof

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
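The TTL-based proof cache those tunables feed into can be sketched as a map from cache key to (insertion time, proof bytes), with a background task evicting expired entries. Field and method names here are illustrative, not the actual cache.rs API:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Minimal sketch of a proof cache with TTL expiry, mirroring the
// 1-hour TTL described above.
struct ProofCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, Vec<u8>)>, // key -> (inserted_at, proof)
}

impl ProofCache {
    fn new(ttl: Duration) -> Self {
        Self {
            ttl,
            entries: HashMap::new(),
        }
    }

    fn insert(&mut self, key: String, proof: Vec<u8>) {
        self.entries.insert(key, (Instant::now(), proof));
    }

    // Expired entries are treated as absent even before eviction runs.
    fn get(&self, key: &str) -> Option<&Vec<u8>> {
        self.entries.get(key).and_then(|(at, proof)| {
            if at.elapsed() < self.ttl {
                Some(proof)
            } else {
                None
            }
        })
    }

    // What the periodic cleanup task would call: drop everything past TTL.
    fn evict_expired(&mut self) {
        let ttl = self.ttl;
        self.entries.retain(|_, (at, _)| at.elapsed() < ttl);
    }
}

fn main() {
    let mut cache = ProofCache::new(Duration::from_secs(3600));
    cache.insert("job-1".into(), vec![1, 2, 3]);
    assert!(cache.get("job-1").is_some());
    cache.evict_expired();
    assert!(cache.get("job-1").is_some()); // still well within the 1h TTL
    println!("cache holds {} entries", cache.entries.len());
}
```

Because proofs are deterministic, a longer TTL trades a little memory for skipping repeat proof computations entirely.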
ProverClient::from_env() defaults to Reserved mode, which has an invalid domain for the mainnet network. Explicitly select NetworkMode::Mainnet via ProverClient::builder().network_for(NetworkMode::Mainnet) to ensure the correct mainnet RPC URL is used.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Add trust_proxy_headers config for secure IP extraction behind reverse proxies
  - When false (default), only the socket address is used for rate limiting
  - When true, respects X-Forwarded-For and X-Real-IP headers
  - Strip port from socket addresses for consistent rate limiting
- Add configurable RPC timeout for on-chain verification
  - timeout_seconds field in VerifyOnchainConfig
  - Uses RPC_TIMEOUT_SECONDS env var (default: 10s)
- Fix unbounded concurrency when proof generation times out
  - Acquire the semaphore permit before spawning the task, not inside it
  - Hold the permit until the blocking task actually completes (not just until the timeout fires)
  - Use tokio::select! instead of tokio::time::timeout to keep the handle valid
  - Add decrement_timed_out_still_running metric tracking
- Fix JobQueue::from_sender to use the shared queue_size counter
  - Prevents queue-size tracking inconsistencies
  - Add error handling for misconfigured state

Co-Authored-By: Claude Opus 4.5 <[email protected]>
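The trust_proxy_headers behaviour above can be sketched as a pure function: consult X-Forwarded-For / X-Real-IP only when the operator has explicitly opted in, and always strip the port so rate-limit keys stay stable across connections. `client_ip` and `strip_port` are illustrative names, not the service's actual helpers:

```rust
use std::net::SocketAddr;

// Keep only the IP part of "1.2.3.4:5678" or "[::1]:8080"; fall back
// to the raw string if it does not parse as a socket address.
fn strip_port(addr: &str) -> String {
    addr.parse::<SocketAddr>()
        .map(|s| s.ip().to_string())
        .unwrap_or_else(|_| addr.to_string())
}

fn client_ip(
    socket_addr: &str,
    forwarded_for: Option<&str>,
    real_ip: Option<&str>,
    trust_proxy_headers: bool,
) -> String {
    if trust_proxy_headers {
        // X-Forwarded-For can carry a chain "client, proxy1, proxy2":
        // take the first (client) hop.
        if let Some(xff) = forwarded_for {
            if let Some(first) = xff.split(',').next() {
                let first = first.trim();
                if !first.is_empty() {
                    return first.to_string();
                }
            }
        }
        if let Some(ip) = real_ip {
            return ip.trim().to_string();
        }
    }
    // Default (trust_proxy_headers = false): headers are ignored, so a
    // client cannot spoof its rate-limit key.
    strip_port(socket_addr)
}

fn main() {
    // Spoofed header is ignored when trust_proxy_headers is false.
    assert_eq!(client_ip("1.2.3.4:5678", Some("9.9.9.9"), None, false), "1.2.3.4");
    // Behind a trusted proxy, the first X-Forwarded-For hop wins.
    assert_eq!(client_ip("10.0.0.1:80", Some("9.9.9.9, 10.0.0.1"), None, true), "9.9.9.9");
    println!("ip extraction ok");
}
```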
- Add httpmock dev dependency for mocking RPC responses
- Add tests for verify_onchain_proof success and revert cases
- Add a test for check_already_claimed with claimed/unclaimed scenarios
- Minor code cleanup and formatting fixes

Co-Authored-By: Claude Opus 4.5 <[email protected]>
I'll analyze this and get back to you.
Claude finished @vutuanlinh2k2's task

Comprehensive Code Review

I've thoroughly analyzed this PR, which refactors the SP1 prover API from a monolithic file into a modular architecture with significant security improvements. Here's my detailed feedback:

✅ Architectural Improvements
🏗️ Excellent Architectural Refactoring

Strengths:
Suggestions:
🔒 Strong Security Implementation

Multi-layered security approach: Rate Limiting (
Actually Production-Ready As-Is

The code is already well-designed for production. Most of Claude's suggestions are nice-to-haves, not blockers:
The One Fix Worth Making

Timestamp consistency in `check_and_update()`:

```rust
// Lines 43-48: capture timestamp once
let now = now_ts();
let elapsed = now - entry.last_request_at;
if elapsed >= self.window_seconds {
    entry.last_request_at = now; // Use same timestamp
    entry.request_count = 1;
}
```

What's Already Solid
Conclusion

The API is production-ready now. If you want to be thorough, fix the timestamp issue - it's a 2-line change. Everything else is polish.
Capture the timestamp once at the start of check_and_update() to avoid a potential race condition where multiple calls to now_ts() could return different values.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
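A minimal sketch of how that fix looks in context, assuming a fixed-window limiter keyed by pubkey. `now_ts` and `check_and_update` mirror the names in the review snippet, but the surrounding struct layout (`Entry`, `RateLimiter`) is assumed:

```rust
use std::collections::HashMap;
use std::time::{SystemTime, UNIX_EPOCH};

fn now_ts() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_secs()
}

struct Entry {
    last_request_at: u64,
    request_count: u32,
}

struct RateLimiter {
    window_seconds: u64,
    max_requests: u32,
    entries: HashMap<String, Entry>,
}

impl RateLimiter {
    // Returns true if the request is allowed within the current window.
    fn check_and_update(&mut self, key: &str) -> bool {
        // Capture the timestamp ONCE, so the elapsed-time comparison
        // and the stored last_request_at can never disagree.
        let now = now_ts();
        let entry = self.entries.entry(key.to_string()).or_insert(Entry {
            last_request_at: now,
            request_count: 0,
        });
        let elapsed = now.saturating_sub(entry.last_request_at);
        if elapsed >= self.window_seconds {
            entry.last_request_at = now; // same timestamp as the check
            entry.request_count = 1;
            return true;
        }
        if entry.request_count < self.max_requests {
            entry.request_count += 1;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut rl = RateLimiter {
        window_seconds: 60,
        max_requests: 2,
        entries: HashMap::new(),
    };
    assert!(rl.check_and_update("pubkey-1"));
    assert!(rl.check_and_update("pubkey-1"));
    assert!(!rl.check_and_update("pubkey-1")); // third request in window rejected
    println!("rate limiter ok");
}
```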
Summary
This PR makes the SP1 prover API production-ready with comprehensive security improvements and reliability fixes:
- trust_proxy_headers config for secure IP extraction behind reverse proxies

Test plan
- cargo test -p prover-api to verify new httpmock tests pass

🤖 Generated with Claude Code