Three-way engine benchmarks

Apples-to-apples comparison of PineForge against two open-source PineScript runtimes:

  • PyneCore — Python framework that runs @pyne Python translated from Pine source via the PyneSys cloud compiler. Apache 2.0.
  • PineTS — TypeScript transpiler/runtime that runs raw .pine source in Node.js / browsers. AGPL-3.0.

Headline

Last refresh: 2026-05-26. Engine v0.6.0, PyneCore 6.4.6, PineTS 0.9.16.

Scale: 100 strategies, ~167,000 TV trades verified in total.

| | PineForge | PyneCore | TV ground truth |
|---|---|---|---|
| Strategies | 100 | 100 | 100 |
| Trades emitted | 167,381 | 253,031 | 167,301 |
| 🟢 excellent | 100 / 100 | 85 / 100 | |
| 🟢 strong | 0 / 100 | 2 / 100 | |
| 🟡 moderate | 0 / 100 | 10 / 100 | |
| 🟠 weak | 0 / 100 | 3 / 100 | |

PineForge reaches the canonical excellent tier on 100 of 100 strategies (was 48/50 in 2025-05). Median speed: 119× faster than PyneCore across the 99 strategies timed on both engines; see results/speed.md. Per-strategy detail: results/summary.md.

Reproducing

Prerequisites:

  • cmake ≥ 3.16 + C++17 compiler (clang or gcc)
  • uv (Python package manager) + Python 3.11+
  • node ≥ 20

No API keys, no downloads, no LFS. All inputs (OHLCV, .pine sources, generated.cpp, tv_trades.csv, strategy_pyne.py) are committed to the benchmarks/assets submodule.

# 1. Clone engine + assets submodule
git clone https://github.com/pineforge-4pass/pineforge-engine.git
cd pineforge-engine
git submodule update --init benchmarks/assets

# 2. Build runtime + 100 bench strategy dylibs
cmake -B build -DPINEFORGE_BUILD_TESTS=ON -DPINEFORGE_BUILD_BENCH_STRATEGIES=ON
cmake --build build --target pineforge bench_strategies -j

# 3. Install bench deps (Python via uv, Node via npm)
cd benchmarks
uv sync
npm install
cd ..

# 4. Run end-to-end (~5 min wall-time)
bash benchmarks/run_all.sh

# 5. Read the results
cat benchmarks/results/summary.md

Optional speed sweep (~30 min, GBench + per-strategy timing × 3 engines):

cmake -B build -DPINEFORGE_BUILD_SPEED_BENCH=ON
cmake --build build --target pineforge_bench -j
# run_all.sh picks it up automatically; or skip with SKIP_SPEED=1

run_all.sh env knobs:

  • SKIP_BUILD=1 — skip cmake build (use existing artifacts)
  • SKIP_PINEFORGE=1 / SKIP_PYNE=1 / SKIP_PINETS=1 — skip individual engines
  • SKIP_SPEED=1 — skip the per-strategy speed sweep
  • SKIP_REPORTS=1 — skip the comparator step

What gets reproduced

The harness produces these reports under results/:

| Report | What it compares | Engines |
|---|---|---|
| summary.md | Per-strategy tier (excellent / strong / moderate / weak / minimal) | PineForge ↔ PyneCore vs TV |
| trade_comparison.md | Per-strategy entry/exit/PnL p90 deltas | PineForge ↔ PyneCore vs TV |
| indicator_comparison.md | Per-bar indicator values (10 indicators × 41k bars) | PineForge ↔ PyneCore ↔ PineTS |
| speed.md | Per-strategy wall-time (median + p95) | PineForge (in-process) vs PyneCore (subprocess) vs PineTS canonical |

Per-strategy raw artifacts under benchmarks/assets/strategies/<NN-slug>/:

  • strategy.pine — canonical PineScript source (Apache-2.0)
  • generated.cpp — codegen output (compiled to strategy.dylib via cmake)
  • strategy_pyne.py — PyneSys cloud-compiler output (committed; no API key needed to reproduce)
  • tv_trades.csv — TradingView ground truth
  • pineforge_trades.csv / pynecore_trades.csv — engine outputs (regenerated by run_all.sh)
  • inputs.json (selective) — engine runtime_overrides (bar magnifier mode, OHLCV trim) + parity profile

Layout

benchmarks/
├── assets/                          PRIVATE git submodule with all fixtures
│   ├── data/ETHUSDT_15.csv          53,930-bar Binance ETH/USDT-USDT 15m feed (committed)
│   └── strategies/
│       ├── CMakeLists.txt           per-strategy bench_strategies cmake target
│       ├── 01-sma-cross/            see "raw artifacts" above
│       ├── 02-…  100-…/             100 strategies total
│       └── _indicators/             canonical 10-indicator script
├── runners/
│   ├── run_pineforge_canonical.cpp  PineForge canonical indicator runner (links libpineforge)
│   ├── run_pinets_canonical.mjs     PineTS canonical indicator runner (Node)
│   └── run_pynecore.py              PyneCore strategy runner (wraps `pyne run` + TV-schema normalize)
├── speed/
│   ├── pineforge_bench.cpp          Google Benchmark harness for PineForge (in-process)
│   ├── time_pynecore.py             Subprocess wall-time timer for PyneCore
│   ├── time_pinets.mjs              Subprocess wall-time timer for PineTS canonical
│   ├── aggregate.py                 Combine 3 timing JSONs → speed.md
│   └── CMakeLists.txt               GBench fetch + build config
├── results/                         Refreshed reports (committed)
├── compare.py                       Trade-list 3-way comparator (TV ↔ PineForge ↔ PyneCore)
├── compare_indicators.py            Per-bar indicator 3-way comparator
├── paths.py                         Path constants
├── pyproject.toml + uv.lock         Python deps
├── package.json                     Node deps
└── run_all.sh                       Single-command reproducer

Maintenance tasks (adding new bench slots, refreshing OHLCV, refreshing strategy_pyne.py via PyneSys, re-emitting generated.cpp via codegen) are handled by maintainer-only bench-maintenance scripts, which are not needed for reproduction.

Methodology

Trade-list comparison (PineForge ↔ PyneCore vs TradingView)

For each strategy:

  1. PineForge: cmake --build produces strategy.dylib from committed generated.cpp. scripts/run_strategy.py loads it via ctypes, pumps the OHLCV through, emits pineforge_trades.csv (TV schema).
  2. PyneCore: pyne run executes the committed strategy_pyne.py against the same OHLCV. runners/run_pynecore.py normalizes the output to TV schema → pynecore_trades.csv.
  3. TV ground truth: tv_trades.csv — exported from TradingView's broker emulator running the same .pine source on the same OHLCV.

compare.py mirrors the canonical scripts/verify_corpus.py validator (single source of truth):

  1. Aligns trades by direction + entry-time within a 1-hour gating window with a $3 entry-price gate.
  2. Trims both lists to the matched-trade window ± 1h (per-engine; eliminates edge-trade count inflation when engine and TV disagree on warmup boundary).
  3. Honors inputs.json per-strategy: parity_profile, runtime_overrides (bar magnifier mode), ohlcv_start_ms (warmup trim), expected_tier (anomaly/engine_only override), validation_overrides.expect_tv_match.
  4. Computes 4-dim p90 deltas vs TV: count, entry-price, exit-price, P&L.
  5. Classifies into 5 tiers: excellent → strong → moderate → weak → minimal. Strategies with trail_* exits get the production threshold profile (relaxed exit + PnL).
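
To make the alignment and p90 steps above concrete, here is a minimal sketch. It is not the actual compare.py code: the `Trade` dataclass, the `align_trades` and `p90` helpers, the greedy nearest-in-time pairing, and the nearest-rank percentile are all assumptions made for illustration. Only the 1-hour window and the $3 entry-price gate come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Trade:
    # Hypothetical minimal trade record; the real TV schema has more fields.
    direction: str          # "long" or "short"
    entry_time: datetime
    entry_price: float


def align_trades(tv, engine, window=timedelta(hours=1), price_gate=3.0):
    """Pair each TV trade with the closest-in-time unused engine trade.

    A pair is accepted only if direction matches, entry times are within
    `window`, and entry prices are within `price_gate` dollars.
    """
    used = set()
    pairs = []
    for t in tv:
        best, best_gap = None, None
        for i, e in enumerate(engine):
            if i in used or e.direction != t.direction:
                continue
            gap = abs(e.entry_time - t.entry_time)
            if gap > window or abs(e.entry_price - t.entry_price) > price_gate:
                continue
            if best_gap is None or gap < best_gap:
                best, best_gap = i, gap
        if best is not None:
            used.add(best)
            pairs.append((t, engine[best]))
    return pairs


def p90(values):
    """Nearest-rank 90th percentile; 0.0 for an empty input."""
    if not values:
        return 0.0
    ordered = sorted(values)
    rank = (9 * len(ordered) + 9) // 10   # integer ceil(0.9 * n)
    return ordered[rank - 1]
```

With matched pairs in hand, one dimension of the delta would be `p90([abs(e.entry_price - t.entry_price) for t, e in pairs])`. The tier thresholds themselves are not stated in this README, so the sketch stops before classification.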

Indicator-value comparison (three-way)

A single canonical script (assets/strategies/_indicators/canonical.pine) computes 10 common indicators (ta.ema, ta.sma, ta.rsi, ta.atr, ta.macd × 3, ta.bb × 3) over the full 41,307-bar feed. Each engine emits one CSV with per-bar values; compare_indicators.py reports p50 / p90 / p99 / max relative deltas across every indicator-pair.
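
As a rough illustration of the per-pair statistics, the sketch below computes relative deltas between two per-bar series and summarizes them. The function names, the symmetric relative-delta definition, the NaN handling for warmup bars, and the nearest-rank percentile are assumptions; compare_indicators.py may define these differently.

```python
import math


def relative_deltas(a, b, eps=1e-12):
    """Per-bar relative difference |a - b| / max(|a|, |b|, eps).

    Bars where either engine reports NaN (assumed to be warmup) are skipped.
    """
    out = []
    for x, y in zip(a, b):
        if math.isnan(x) or math.isnan(y):
            continue
        out.append(abs(x - y) / max(abs(x), abs(y), eps))
    return out


def nearest_rank(ordered, q):
    """Nearest-rank percentile of an ascending list; q is an integer percent."""
    rank = max(1, (q * len(ordered) + 99) // 100)   # integer ceil(q * n / 100)
    return ordered[rank - 1]


def summarize(a, b):
    """Return p50 / p90 / p99 / max relative deltas for one indicator pair."""
    deltas = sorted(relative_deltas(a, b))
    if not deltas:
        return {"p50": 0.0, "p90": 0.0, "p99": 0.0, "max": 0.0}
    return {
        "p50": nearest_rank(deltas, 50),
        "p90": nearest_rank(deltas, 90),
        "p99": nearest_rank(deltas, 99),
        "max": deltas[-1],
    }
```

Running `summarize` once per indicator column and per engine pair would yield the kind of table the report describes.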

Speed measurement

  • PineForge: Google Benchmark in-process, dlopens strategy.dylib per iteration, pumps full OHLCV, reports median + p95 over 20 iterations. Includes dlopen cost (cold-load realistic for FFI-style consumers).
  • PyneCore: Subprocess wall-time of pyne run, including Python startup + framework import. Median over 20 invocations.
  • PineTS: Subprocess wall-time of node run_pinets_canonical.mjs. Strategy backtester not implemented upstream; timing is for canonical indicator script only.
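
The subprocess side of this measurement can be sketched as follows. This is an illustrative approximation, not the contents of time_pynecore.py or aggregate.py: `time_command` and `median_speedup` are hypothetical helpers, and the command passed in is whatever the caller chooses to time.

```python
import statistics
import subprocess
import time


def time_command(argv, runs=20):
    """Median wall-time in seconds of running `argv` as a fresh process.

    Each sample includes interpreter startup and imports, which is the
    point of subprocess timing.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def median_speedup(baseline, candidate):
    """Median of per-strategy baseline/candidate ratios.

    Both arguments map strategy name -> median seconds. Only strategies
    timed by both engines contribute; the count is returned alongside.
    """
    common = sorted(set(baseline) & set(candidate))
    ratios = [baseline[name] / candidate[name] for name in common]
    return statistics.median(ratios), len(common)
```

Restricting the ratio to strategies present in both timing sets is one plausible way to arrive at a figure quoted over a subset of strategies rather than all 100.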

Mixed methodology disclosure: GBench in-process is the realistic cost for an FFI-callable native engine. Subprocess timing is the realistic cost for engines whose API entry point IS the process. Each number reflects what a real consumer of that engine would see.

Fairness

This is fair. All three engines consume the same 53,930-bar Binance ETH/USDT-USDT 15m OHLCV feed at assets/data/ETHUSDT_15.csv. The PyneCore Python is the official cloud-compiler output for the same .pine sources PineForge runs against, with no hand-translation. Where engines have configurable behaviors (commission, slippage, default qty, bar magnifier), they're set identically via the strategy(...) declaration in the .pine source plus per-strategy inputs.json::runtime_overrides.

This is not a victory lap. PineForge wins by design on the trade-list side because PineTS does not implement strategy backtesting yet (a fact of timing, not architecture). On indicator outputs all three engines have to agree to within tight tolerances, and divergences in either direction are flagged as defects. PineForge ↔ PyneCore trade-list deltas are expected because order-matching and fill semantics differ slightly — this harness measures that gap.

License

Same as the parent repository (Apache 2.0). Three pieces deserve explicit notes:

assets/data/ETHUSDT_15.csv

Binance USDT-M futures ETH/USDT-USDT 15-minute OHLCV, 53,930 bars covering 2024-10-20 → 2026-05-04 (~6 months pre-TV-export warmup + full TV chart range). Public market data — not copyrightable in the US/EU. Pinned for reproducibility; refresh via pineforge-utils/bench-maintenance/fetch_extended_ohlcv.py (maintainer-only).

assets/strategies/<NN-slug>/strategy.pine

PineScript sources are clean-room PineForge originals carrying Apache-2.0 SPDX headers. Same authorship and license as the corpus probes under corpus/validation/.

assets/strategies/<NN-slug>/strategy_pyne.py

Mechanically translated derivatives of the corresponding strategy.pine, produced by the PyneSys cloud compiler (pyne compile, PyneComp v6.4.6). Committed for reproducibility-without-API-key — anyone reproducing the benchmark uses the committed output. The PyneSys compiler is a tool (like gcc); its output does not transfer copyright to the vendor. These files inherit the underlying strategy.pine license (Apache-2.0).

PineTS (AGPL-3.0)

Running PineTS at benchmark time pulls AGPL-3.0 code into Node's process. Permissible for running the benchmark; redistributing the whole toolchain as a single binary would trigger copyleft. We publish only numerical results (CSVs, markdown tables), not PineTS source.

Full licensing context: ../LEGAL.md.