Add formal performance benchmarks #29

@walterra

Overview

PERFORMANCE.md currently references benchmark numbers without actual test data. We need to run formal benchmarks and document real results.

Benchmark Scenarios

  1. File Ingestion

    • Small file: 10,000 docs (~1 KB each, ~10 MB file)
    • Large file: 1,000,000 docs (~1 KB each, ~1 GB file)
    • Measure: time, throughput (docs/sec), memory usage
  2. Reindexing

    • Local reindex: 10,000 docs with simple transform
    • Measure: time, throughput, memory
  3. Cross-Version Reindex

    • ES 8.x → ES 9.x: 10,000 docs
    • Measure: time, throughput, memory, and overhead compared with a same-version reindex
  4. Transform Complexity

    • No transform vs simple vs complex transform
    • Impact on throughput
  5. Buffer Size Impact

    • Test different bufferSize values (1 MB, 5 MB, 10 MB, 20 MB)
    • Document optimal settings for different document sizes
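
The ingestion scenarios above need input files of a known size. A minimal sketch of a generator follows, assuming the input format is newline-delimited JSON and that padding each document to about 1 KB with ASCII filler is representative enough; `makeDoc` and `writeNdjson` are hypothetical helper names, not part of the existing code base.

```javascript
const fs = require('node:fs');
const { once } = require('node:events');

// Build one JSON document so that the line (including its trailing newline)
// is `targetBytes` long. Only ASCII is used, so string length equals byte length.
function makeDoc(id, targetBytes = 1024) {
  const base = { id, timestamp: new Date(0).toISOString(), payload: '' };
  const overhead = JSON.stringify(base).length;
  // Reserve one byte for the newline appended by the writer.
  base.payload = 'x'.repeat(Math.max(0, targetBytes - overhead - 1));
  return JSON.stringify(base);
}

// Stream `count` documents to `filePath`, waiting for 'drain' whenever the
// write buffer is full so the generator itself does not grow in memory.
async function writeNdjson(filePath, count, targetBytes = 1024) {
  const out = fs.createWriteStream(filePath);
  for (let i = 0; i < count; i++) {
    if (!out.write(makeDoc(i, targetBytes) + '\n')) {
      await once(out, 'drain');
    }
  }
  out.end();
  await once(out, 'finish');
}
```

Calling `writeNdjson('small.ndjson', 10000)` should give a file of roughly 10 MB, and `writeNdjson('large.ndjson', 1000000)` one of roughly 1 GB (the file names are placeholders).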

Test Environment

Document hardware and software specs:

  • CPU, RAM, disk type (SSD vs HDD)
  • Elasticsearch version
  • Node.js version
  • Local vs remote ES
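
Part of the environment listed above can be captured automatically with Node's built-in `os` module. This is a sketch only: `describeEnvironment` is a hypothetical helper, and disk type, Elasticsearch version, and local vs remote ES are assumed to be supplied by hand through `extra`, since `os` does not expose them.

```javascript
const os = require('node:os');

// Collect the machine details Node can report on its own and merge in
// manually supplied fields (disk type, ES version, local vs remote).
function describeEnvironment(extra = {}) {
  const cpus = os.cpus();
  return {
    cpuModel: cpus.length > 0 ? cpus[0].model : 'unknown',
    cpuCores: cpus.length,
    totalRamGiB: Number((os.totalmem() / 1024 ** 3).toFixed(1)),
    platform: `${os.platform()} ${os.release()}`,
    nodeVersion: process.version,
    ...extra,
  };
}
```

The returned object can be written next to the results so every published number carries the environment it was measured in.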

Deliverables

  • Benchmark script in scripts/benchmark.js
  • Update PERFORMANCE.md with real results
  • Add benchmark results to CI (optional)
  • Document reproducibility steps
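
As a starting point for `scripts/benchmark.js`, a measurement wrapper could look like the sketch below. It assumes each scenario can be expressed as an async function; `measure` and `summarize` are hypothetical names, and sampling RSS every 100 ms is an arbitrary choice that may miss short spikes.

```javascript
const { performance } = require('node:perf_hooks');

// Turn raw measurements into the metrics the issue asks for:
// elapsed time, throughput in docs/sec, and peak memory.
function summarize(label, docCount, elapsedMs, peakRssBytes) {
  const seconds = elapsedMs / 1000;
  return {
    label,
    docCount,
    seconds,
    docsPerSec: Math.round(docCount / seconds),
    peakRssMiB: Number((peakRssBytes / 1024 ** 2).toFixed(1)),
  };
}

// Run one scenario while sampling resident set size in the background.
async function measure(label, docCount, run) {
  let peakRss = process.memoryUsage().rss;
  const sampler = setInterval(() => {
    peakRss = Math.max(peakRss, process.memoryUsage().rss);
  }, 100);
  const start = performance.now();
  try {
    await run();
  } finally {
    clearInterval(sampler);
  }
  const elapsedMs = performance.now() - start;
  // Take one last sample in case the run ended between two ticks.
  peakRss = Math.max(peakRss, process.memoryUsage().rss);
  return summarize(label, docCount, elapsedMs, peakRss);
}
```

A scenario would then be wrapped as `await measure('ingest-small', 10000, () => runScenario())`, where `runScenario` stands in for whatever call performs the ingestion or reindex. Running each scenario in a fresh process keeps one run's memory from inflating the next run's baseline.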

Acceptance Criteria

  • Real benchmark data replaces placeholder content in PERFORMANCE.md
  • Results are reproducible with documented steps
  • Multiple scenarios covered (file sizes, reindex, transforms)
  • Memory usage verified to stay roughly constant as input size grows (10,000 vs 1,000,000 docs)
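
The memory criterion can be turned into an automated check by comparing peak memory across input sizes. The sketch below assumes each result carries a `peakRssMiB` field and treats a spread of up to 25% as "constant"; both the field name and the tolerance are assumptions, since the issue does not define a threshold.

```javascript
// Return true when peak memory across runs stays within `tolerance`
// (relative spread between the smallest and largest peak).
function isMemoryConstant(results, tolerance = 0.25) {
  if (results.length < 2) {
    throw new Error('need at least two runs to compare');
  }
  const peaks = results.map((r) => r.peakRssMiB);
  const min = Math.min(...peaks);
  const max = Math.max(...peaks);
  return (max - min) / min <= tolerance;
}
```

For example, peaks of 100 MiB and 110 MiB for the small and large files would pass, while 100 MiB and 900 MiB would fail and suggest memory grows with input size.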

Labels: enhancement (New feature or request)