diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 745b6cb..bad8216 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -1,135 +1,5 @@ # CLAUDE.md -This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. +Use [`AGENTS.md`](../AGENTS.md) as the canonical instruction file for this repository. -## Project Overview - -This is a Kotlin library for building blockchain indexers for VeChainThor. It provides parallel processing, dependency management, and automatic retry logic for indexing blockchain data. - -## Build Commands - -```bash -# Build the project -./gradlew build - -# Run tests -./gradlew test - -# Run a specific test class -./gradlew test --tests "org.vechain.indexer.utils.IndexerOrderUtilsTest" - -# Check code coverage -./gradlew jacocoTestReport -# Coverage report: build/reports/jacoco/test/html/index.html - -# Format code (ktfmt with Google style) -./gradlew spotlessApply - -# Check formatting -./gradlew spotlessCheck - -# Clean build artifacts -./gradlew clean -``` - -## Core Architecture - -### Indexer Types - -The library provides two main indexer implementations via `IndexerFactory.build()`: - -1. **BlockIndexer**: Full block-by-block processing - - Used when `includeFullBlock()` is set or when `dependsOn()` is configured - - Can inspect transaction call data via `callDataClauses()` - - Processes reverted transactions - - Required for dependent indexers - -2. 
**LogsIndexer**: Fast event-based syncing - - Used by default when no dependencies and full block not required - - Fetches only event logs and transfer logs via Thor API - - More efficient for event-driven indexing - - Implements `fastSync()` to quickly catch up to finalized blocks - -### Dependency Management & Sequential Processing - -The `IndexerRunner` orchestrates multiple indexers using topological sorting (`IndexerOrderUtils.topologicalOrder()`): - -- All indexers placed in single group, ordered by dependencies (dependencies before dependents) -- Indexers process same block **sequentially** within group, honoring dependency order -- IndexerRunner uses channels to buffer blocks and coordinate processing -- Example: If `IndexerA` depends on `IndexerB`, then `IndexerB` processes block N before `IndexerA` processes block N -- Only single-dependency chains supported (each indexer can depend on at most one other) - -### Lifecycle & States - -Indexer states (defined in `Status` enum): -- `NOT_INITIALISED` → `INITIALISED` → `FAST_SYNCING` → `SYNCING` → `FULLY_SYNCED` -- `SHUT_DOWN`: Terminal state - -Initialization flow: -1. `initialise()`: Determines starting block, calls `rollback()` on processor -2. `fastSync()`: (LogsIndexer only) Catches up to finalized block using log events -3. `processBlock()`: Main processing loop with reorg detection - -### Reorg Detection - -Reorg detection in `BlockIndexer.checkForReorg()`: -- Compares `block.parentID` with `previousBlock.id` -- On detection: logs error, calls `rollback()`, throws `ReorgException` -- Only checks when `currentBlockNumber > startBlock` and `previousBlock != null` - -### Event Processing - -Event processing pipeline (`CombinedEventProcessor`): -1. **ABI Events**: Configured via `abis()` - loads JSON ABI files -2. **Business Events**: Configured via `businessEvents()` - custom event definitions with conditional logic -3. 
**VET Transfers**: Included by default unless `excludeVetTransfers()` is called - -Events are decoded and returned as `IndexedEvent` objects to the `IndexerProcessor.process()` method. - -### IndexerProcessor Interface - -Implementations must provide: -- `getLastSyncedBlock()`: Returns last successfully processed block (or null) -- `rollback(blockNumber)`: Reverts data for specified block -- `process(entry)`: Handles `IndexingResult.BlockResult` (full block) or `IndexingResult.LogResult` (log batch) - -## Code Style - -- **Formatting**: ktfmt with Google style, 4-space indents (enforced by Spotless) -- **Language**: Kotlin with Java 21 target -- **Testing**: JUnit 5, MockK for mocking, Strikt for assertions - -## Important Implementation Details - -### IndexerFactory Configuration - -The factory uses a builder pattern. Key methods: -- `name()`, `thorClient()`, `processor()`: Required -- `startBlock()`: Default is 0 -- `dependsOn()`: Forces BlockIndexer (needed for dependency coordination). Single-parent only. -- `includeFullBlock()`: Forces BlockIndexer (enables access to gas, reverted txs) -- `blockBatchSize()`: For LogsIndexer, controls log fetch batch size (default 100). For IndexerRunner, controls channel buffer (default 1). -- `logFetchLimit()`: Pagination limit for Thor API calls (default 1000) - -### Retry Logic - -`IndexerRunner.retryUntilSuccess()` wraps: -- Indexer initialization -- Block fetching -- Block processing - -On failure: logs error, waits 1 second, retries indefinitely (until success or cancellation). - -## Testing Notes - -- Mock Thor client for unit tests -- Use `TestableLogsIndexer` pattern to test internal sync logic -- Verify topological ordering for dependency chains in `IndexerOrderUtilsTest` -- Test reorg scenarios by providing blocks with mismatched `parentID` - -## Preferences -Be extremely concise. Sacrifice grammar for the sake of concision. -Always prefer simple solution over complex ones. -When unsure, ask for clarification. 
-run `make format` after making code changes to ensure proper formatting. \ No newline at end of file +Do not duplicate or extend project guidance here unless Claude-specific behavior genuinely requires it. diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000..9ccaec0 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,144 @@ +# Agent Guide + +This file is the canonical instruction entry point for coding agents working with `indexer-core`. + +It is intentionally lean. Use it to build a correct mental model quickly, then open the linked repo docs instead of inferring behavior from scattered source files. + +## What This Library Is + +`indexer-core` is a Kotlin library for building VeChainThor indexers. + +At a high level it provides: + +- `IndexerProcessor` as the application persistence boundary +- `IndexerFactory` as the only supported way to configure and build indexers +- `IndexerRunner` to initialise, fast-sync when possible, coordinate dependencies, and keep indexers running through retries and reorg recovery +- two runtime modes: + - `LogsIndexer` for fast log-based catch-up when you only need decoded events / transfers + - `BlockIndexer` when you need full block context or dependency ordering + +Do not ask users to construct indexers manually from implementation classes unless they are working on the library internals themselves. For normal usage, all indexers should be built with `IndexerFactory`. + +## Who This Guide Is For + +This guide is for both: + +- agents changing `indexer-core` itself +- agents helping a consumer integrate `indexer-core` into another service + +If the task is library maintenance, preserve public behavior documented in the repo docs unless the change explicitly updates that behavior. + +If the task is consumer guidance, optimize for correct mode selection and integration advice before discussing internals. + +## Required Onboarding Path + +Before making claims about library behavior, read in this order: + +1. 
[`README.md`](README.md) +2. [`docs/README.md`](docs/README.md) +3. one targeted guide based on the task: + - runtime model and lifecycle: [`docs/IndexerOverview.md`](docs/IndexerOverview.md) + - log-based mode and fast sync: [`docs/LogsIndexerOverview.md`](docs/LogsIndexerOverview.md) + - ABI loading and decoded events: [`docs/EventsAndABIHandling.md`](docs/EventsAndABIHandling.md) + - business event design: [`docs/BusinessEvents.md`](docs/BusinessEvents.md) + - upgrade / compatibility questions: [`docs/MIGRATION-8.0.0.md`](docs/MIGRATION-8.0.0.md) + +The repo markdown docs are the source of truth. Prefer them over memory, ad hoc code reading, or external copies. + +## Mental Model To Keep In Mind + +- `IndexerProcessor` is where consumers persist progress and domain data. +- The runtime may emit either `IndexingResult.LogResult` or `IndexingResult.BlockResult`; processors should handle both when relevant to the configuration. +- Startup rollback is intentional. It is a data-integrity feature, not a bug. +- Reorg recovery is part of the runtime contract. Consumers are expected to implement deterministic rollback behavior. +- Dependencies affect execution semantics, not just throughput. Adding `dependsOn(...)` changes how the runtime must coordinate indexers. + +## Mode Selection Checklist + +Use this checklist before recommending or editing indexer configuration. 
+ +Choose the default factory-built log mode when: + +- the consumer only needs decoded ABI events, business events, or VET transfers +- fastest catch-up is the priority +- there is no same-block dependency on another indexer + +Choose `includeFullBlock()` when the consumer needs: + +- full block contents +- reverted transaction visibility +- gas / fee metadata from full block processing +- clause inspection results from `callDataClauses(...)` + +Choose `dependsOn(...)` when: + +- one indexer must finish a block before another processes that same block + +Important: + +- `LogsIndexer` and `BlockIndexer` are not interchangeable modes. +- `dependsOn(...)` forces block-based execution semantics. +- `includeFullBlock()` forces block-based execution semantics. + +Choose business events when: + +- downstream consumers care about higher-level actions rather than every raw event + +Choose raw ABI events when: + +- the consumer needs each decoded event individually +- there is no stable semantic grouping worth encoding as a business event + +## Guardrails + +- Build indexers through `IndexerFactory`, not by manually wiring implementation classes in application code. +- Do not describe `LogsIndexer` and `BlockIndexer` as equivalent choices with different performance profiles. They expose different runtime behavior and different data. +- Do not treat startup rollback as suspicious behavior. It is part of the library’s safety model. +- Do not rely on stale documentation copies. The repo docs are authoritative. +- Do not present internal implementation details as stable public API unless they are explicitly documented as such. 
+ +## Common Agent Tasks + +Optimize guidance for these common tasks: + +- explaining how to integrate `indexer-core` into another service +- changing the library itself +- debugging behavior differences between `LogsIndexer` and `BlockIndexer` +- designing ABI-driven or business-event-driven indexing setups + +Documentation updates matter, but they are secondary to preserving correct runtime behavior and public guidance. + +## Verification Expectations + +When changing this library: + +- run targeted tests for the touched behavior as a minimum +- run broader `./gradlew test` when the change is cross-cutting or affects shared runtime behavior +- run formatting checks or formatting fixes when Kotlin code changes + +Minimum standard before claiming completion: + +- the changed behavior is covered by tests or an existing test path was exercised +- any affected public guidance remains consistent with the repo docs +- the response states clearly if full verification was not run + +Useful commands: + +```bash +./gradlew test +./gradlew test --tests "org.vechain.indexer.SomeTest" +./gradlew spotlessCheck +./gradlew spotlessApply +``` + +## When Working From Source + +The codebase is useful for confirmation, but agents should not need to reverse-engineer the library from source just to understand its purpose. + +Read source after the docs when you need to: + +- confirm an implementation detail +- debug a behavioral discrepancy +- update internals while preserving the documented contract + +If source and docs appear to disagree, call that out explicitly instead of silently choosing one.
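Review note: the "Mode Selection Checklist" added in `AGENTS.md` can be read as a small decision function. The Kotlin sketch below is only an illustration of that checklist. `Mode`, `Needs`, `Recommendation`, and `recommend` are hypothetical names invented here and are not part of `indexer-core`; the strings `includeFullBlock()` and `dependsOn(...)` are the factory methods named in the guide, kept as plain text rather than real calls.

```kotlin
// Hypothetical model of the mode selection checklist.
// None of these types exist in indexer-core; they only mirror the guide's wording.
enum class Mode { LOGS, BLOCK }

data class Needs(
    val fullBlockContents: Boolean = false,
    val revertedTransactions: Boolean = false,
    val gasMetadata: Boolean = false,
    val clauseInspection: Boolean = false,   // results from callDataClauses(...)
    val sameBlockDependency: Boolean = false // another indexer must finish the block first
)

data class Recommendation(val mode: Mode, val factoryCalls: List<String>)

fun recommend(needs: Needs): Recommendation {
    val calls = mutableListOf<String>()

    // Any need for full block context maps to includeFullBlock().
    if (needs.fullBlockContents || needs.revertedTransactions ||
        needs.gasMetadata || needs.clauseInspection
    ) {
        calls += "includeFullBlock()"
    }

    // Same-block ordering between indexers maps to dependsOn(...).
    if (needs.sameBlockDependency) {
        calls += "dependsOn(...)"
    }

    // Either call forces block-based execution; otherwise the default log mode applies.
    val mode = if (calls.isEmpty()) Mode.LOGS else Mode.BLOCK
    return Recommendation(mode, calls)
}

fun main() {
    println(recommend(Needs()))
    println(recommend(Needs(revertedTransactions = true)))
    println(recommend(Needs(sameBlockDependency = true)))
}
```

The sketch makes one point from the guide explicit: the default log mode is chosen only when no checklist item requires block context or dependency ordering, so the two modes are not interchangeable.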