Releases · snowfluke/paperium

31 Dec 17:12

snowfluke

v2.1.0

ec765c8

v2.1.0: Entry Price, UX Improvements & Performance Boost

What's New

Entry Price Support

Limit Order Entry Pricing: XGBoost now calculates optimal entry prices for limit orders (1x ATR below current price)
Market vs Limit Orders: Automatic order type detection based on confidence (≥85% = MARKET, <85% = LIMIT)
Entry Price Display: New Entry column in signals table showing recommended entry price with discount percentage
Accurate Allocation: Position sizing and SL/TP calculations now based on entry price, not current price

Display Improvements

Cleaner Table Layout: Separated SL, TP, Trail, Est Profit, and Est Loss into individual columns
Full Rupiah Format: Changed Est P/L from confusing millions notation (+0.17M) to full Rupiah (+Rp 170,000)
Removed Shares Column: Simplified display - allocation amount is more important than share count
Max Hold Days Reminder: Added configuration display showing 5-day time stop limit
Better Allocation Summary: Removed hardcoded "(3%)" text, now shows dynamic percentages
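
The change from millions notation to full Rupiah can be illustrated with a small formatter. This is a minimal sketch: `format_rupiah` is a hypothetical name, not necessarily the helper used in `scripts/signals.py`, and the amounts are made up.

```python
def format_rupiah(amount: float) -> str:
    """Format a profit/loss amount as signed full Rupiah, e.g. +Rp 170,000."""
    sign = "+" if amount >= 0 else "-"
    # Round to whole Rupiah and use a comma as the thousands separator.
    return f"{sign}Rp {abs(amount):,.0f}"


print(format_rupiah(170_000))   # +Rp 170,000
print(format_rupiah(-52_500))   # -Rp 52,500
```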

Performance Optimizations

Batch Database Loading: Reduced from ~80 individual queries per day to 1 batch query (10-12x faster)
Skip Held Positions: No longer rescans stocks already in portfolio during signal scanning
Faster Backtest: Ensemble backtest now completes in ~5-10 minutes instead of 50+ minutes

Progress Tracking

Real-time Stats: Replaced misleading time estimates with useful metrics (active positions, closed trades)
Current Date Display: Shows which date is being processed during backtest
Accurate Progress: Better visibility into backtest execution state

Technical Details

Modified Files

`ml/xgb_inference.py`: Added entry_pct calculation (0.0 for MARKET, 1.0x ATR for LIMIT)
`scripts/signals.py`: Entry price display, separated columns, full Rupiah format, max hold days
`scripts/eval_ensemble.py`: Batch loading, skip held tickers, real-time progress stats

Configuration

Max Hold Days: 5 days (from `config.ml.tbl_horizon`)
Entry Price Logic:
- High confidence (≥85%): Enter at market price
- Moderate confidence (<85%): Enter 1x ATR below current price
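
The entry price logic above can be sketched as a single function. The name `entry_plan`, its parameters, and the returned tuple are assumptions for illustration; only the 85% threshold and the 1x ATR offset come from these notes.

```python
def entry_plan(price: float, atr: float, confidence: float,
               threshold: float = 0.85, atr_mult: float = 1.0):
    """Return (order_type, entry_price, discount_pct) for one signal."""
    if confidence >= threshold:
        # High confidence: enter at the current market price.
        return "MARKET", price, 0.0
    # Otherwise place a limit order atr_mult x ATR below the current price.
    entry = price - atr_mult * atr
    discount_pct = (price - entry) / price * 100
    return "LIMIT", entry, discount_pct


order, entry, disc = entry_plan(price=1000.0, atr=25.0, confidence=0.80)
print(f"{order} @ {entry:.0f} ({disc:.1f}% below)")  # LIMIT @ 975 (2.5% below)
```

Position size, SL, and TP would then be derived from `entry` rather than from `price`.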

Upgrade Notes

All changes are backward compatible. Existing XGBoost models from v2.0.0 will work with this release.

Full Changelog

Entry Price

Calculate entry price based on order type (MARKET/LIMIT)
Use entry price for position sizing in signals and backtest
Display entry price with percentage discount in table

Display

Separate SL/TP/Trail into individual columns with percentages
Separate Est Profit/Est Loss into individual columns
Change format from "+0.17M" to "+Rp 170,000" for Indonesian users
Remove Shares column for cleaner table
Add "Max Hold Days: 5 days (time stop)" to configuration display
Fix allocation summary labels (remove hardcoded "3%")

Performance

Add `load_all_tickers_batch()` for single-query batch loading
Filter out already-held tickers in `scan_signals()`
Optimize database queries with window functions
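
A rough sketch of the batch-loading idea, using an in-memory SQLite database. The `prices` table schema, the function signature of `load_all_tickers_batch()`, the helper `scan_candidates`, and the sample rows are all assumptions for illustration; the actual implementation in `scripts/eval_ensemble.py` may differ (for example, it reportedly uses window functions, which this sketch omits).

```python
import sqlite3
from collections import defaultdict


def load_all_tickers_batch(conn, tickers, end_date):
    """Fetch rows for every ticker up to end_date in one query."""
    placeholders = ",".join("?" for _ in tickers)
    sql = (
        "SELECT ticker, date, close FROM prices "
        f"WHERE ticker IN ({placeholders}) AND date <= ? "
        "ORDER BY ticker, date"
    )
    by_ticker = defaultdict(list)
    # One round trip to the database instead of one query per ticker.
    for ticker, date, close in conn.execute(sql, [*tickers, end_date]):
        by_ticker[ticker].append((date, close))
    return dict(by_ticker)


def scan_candidates(universe, held):
    """Drop tickers already in the portfolio before scanning."""
    held = set(held)
    return [t for t in universe if t not in held]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (ticker TEXT, date TEXT, close REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [("BBCA", "2025-01-02", 9800.0), ("BBCA", "2025-01-03", 9850.0),
     ("TLKM", "2025-01-02", 2700.0), ("TLKM", "2025-01-03", 2690.0)],
)
to_scan = scan_candidates(["BBCA", "TLKM"], held=["TLKM"])
data = load_all_tickers_batch(conn, to_scan, "2025-01-03")
print(data)
```

For a large universe, the `IN (...)` list may need chunking to stay under SQLite's bound-parameter limit.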

Progress

Replace `TimeRemainingColumn` with real-time stats display
Show "Positions: X | Trades: Y" during backtest
Show current date being processed

Full Diff: v2.0.0...v2.1.0

31 Dec 12:55

snowfluke

v2.0.0

7150628

Paperium v2.0.0 - LSTM Deep Learning Edition

Major architectural transition from traditional ML to Deep Learning for IHSG quantitative trading.

Overview

Paperium v2 represents a complete paradigm shift in our approach to stock prediction:

From: XGBoost + Hand-crafted Technical Indicators (RSI, MACD, Bollinger Bands)
To: PyTorch LSTM + Raw OHLCV Sequences

This change is based on the hypothesis that neural networks can learn better feature representations directly from raw price data than human-engineered indicators.

Architecture Changes

Model

New: 2-layer LSTM (Hidden Size: 8)
- Input: 100-day sequences of raw OHLCV data
- Output: 3-class classification (Loss/Neutral/Profit)
Old: XGBoost with 20+ hand-crafted features

Labeling System

New: Triple Barrier Method (TBL)
- ±3% price barriers with 5-day holding horizon
- Path-dependent classification
- Class 0: Hit stop-loss (-3%)
- Class 1: Time expired (neutral)
- Class 2: Hit take-profit (+3%)
Old: Simple close-to-close returns
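
The labeling rule can be sketched as follows. This is a simplified illustration that checks barriers against closing prices only; the function name and signature are hypothetical, and the real labeler may use intraday highs and lows.

```python
def triple_barrier_label(closes, barrier=0.03, horizon=5):
    """Label an entry at closes[0] using the following `horizon` closes.

    Returns 0 if the stop-loss is hit first, 2 if the take-profit is hit
    first, and 1 if neither barrier is touched before the horizon expires.
    """
    entry = closes[0]
    upper = entry * (1 + barrier)
    lower = entry * (1 - barrier)
    for price in closes[1:horizon + 1]:
        if price <= lower:
            return 0
        if price >= upper:
            return 2
    return 1


print(triple_barrier_label([100, 101, 102, 103.5, 99, 98]))   # 2
print(triple_barrier_label([100, 99, 98, 96.5, 104, 105]))    # 0
print(triple_barrier_label([100, 101, 100, 99, 101, 102]))    # 1
```

The second example shows the path dependence: the price ends above the take-profit level, but the stop-loss was touched first, so the label is 0.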

Feature Engineering

New: Raw OHLCV normalization only
- Price: Normalized relative to first day of window
- Volume: Log-normalized
Old: 20+ technical indicators (RSI, MACD, ATR, BB, OBV, etc.)
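
A minimal sketch of the normalization described above, in plain Python. Two details are assumptions, since these notes do not specify them: the reference price is taken to be the first day's close, and "log-normalized" is read as `log1p(volume)`.

```python
import math


def normalize_window(window):
    """Normalize one window of (open, high, low, close, volume) rows."""
    base = window[0][3]  # assumption: first day's close is the reference
    out = []
    for o, h, l, c, v in window:
        # Prices become ratios to the reference; volume is log-compressed.
        out.append((o / base, h / base, l / base, c / base, math.log1p(v)))
    return out


window = [(100.0, 102.0, 99.0, 100.0, 1000), (101.0, 103.0, 100.0, 102.0, 0)]
for row in normalize_window(window):
    print(row)
```

Using `log1p` keeps zero-volume days finite, which a plain `log` would not.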

New Features

Signal Generation & Allocation

Confidence-Weighted Capital Allocation: Higher confidence signals automatically receive proportionally larger allocations
- Formula: allocation_i = total_capital × (confidence_i / sum_of_confidences)
Blacklist Filtering: Automatically excludes 72 illiquid/suspended stocks
Flexible Output Modes: Show all signals or only allocated positions with P/L estimates
Live Data Fetching: --fetch-latest flag to pull current market data from Yahoo Finance
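
The allocation formula above translates directly into code. This is a sketch only: the function name `allocate` and the ticker symbols are placeholders.

```python
def allocate(total_capital, confidences):
    """Split capital across tickers in proportion to model confidence."""
    total_conf = sum(confidences.values())
    if total_conf <= 0:
        # No usable signal: allocate nothing rather than divide by zero.
        return {t: 0.0 for t in confidences}
    return {t: total_capital * c / total_conf for t, c in confidences.items()}


alloc = allocate(10_000_000, {"AAAA": 0.9, "BBBB": 0.6})
for ticker, amount in alloc.items():
    print(f"{ticker}: Rp {amount:,.0f}")
# AAAA: Rp 6,000,000
# BBBB: Rp 4,000,000
```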

Performance Optimizations

Sequence Caching: 45x speedup on training data preparation
- First run: ~45 seconds (957 tickers)
- Subsequent runs: ~1 second (cache hit)
- Intelligent cache invalidation based on DB version and config changes
Batch Progress Tracking: Real-time updates every 10 batches during training
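
One way to get cache invalidation on DB version and config changes is to derive the cache filename from a hash of both. The sketch below assumes that approach; the function names, the pickle format, and the key layout are illustrative, not taken from the project.

```python
import hashlib
import json
import pickle
from pathlib import Path


def cache_key(db_version: str, config: dict) -> str:
    """Hash everything that should invalidate the cache when it changes."""
    payload = json.dumps({"db": db_version, "cfg": config}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:16]


def load_or_build(cache_dir: Path, db_version: str, config: dict, build):
    """Return (data, hit): cached sequences if present, else rebuild and store."""
    path = cache_dir / f"sequences_{cache_key(db_version, config)}.pkl"
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f), True   # cache hit
    data = build()
    cache_dir.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(data, f)
    return data, False                    # cache miss
```

Because the key changes whenever the DB version or any config value changes, stale files are simply never read again; cleaning them up is left out of this sketch.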

User Experience

Timestamped Logging: All scripts now show [MM:SS | +Δs] timestamps
Training Dashboard: Live batch-level progress, e.g. 127/200 (63%)
Fresh vs Retrain: Choose to start new model or continue from checkpoint
Interactive CLI: Rich terminal UI with tables, panels, and progress bars
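
The `[MM:SS | +Δs]` prefix could be produced along these lines. This sketch assumes MM:SS is the time elapsed since the script started and Δ is the gap since the previous message; the class name `StepLogger` is hypothetical and is not the contents of `utils/logger.py`.

```python
import time


class StepLogger:
    """Prefix messages with elapsed time and the gap since the last message."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = self._last = clock()

    def format(self, message: str) -> str:
        now = self._clock()
        elapsed = int(now - self._start)
        delta = now - self._last
        self._last = now
        minutes, seconds = divmod(elapsed, 60)
        return f"[{minutes:02d}:{seconds:02d} | +{delta:.1f}s] {message}"


log = StepLogger()
print(log.format("Loading sequences"))
```

The injectable `clock` argument exists only to make the formatter easy to test with fixed times.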

Breaking Changes

Removed Features

Portfolio management system (max positions, owned stocks, slots)
All technical indicator calculations
XGBoost model and dependencies
morning_signals.py (replaced by signals.py)

File Changes

morning_signals.py → signals.py
Removed: ml/ensemble.py, ml/meta_labeling.py, portfolio/, strategy/
Added: utils/logger.py for timestamped logging

API Changes

signals.py now accepts:
- --capital <amount> (optional): Total capital to allocate
- --num-stock <n> (optional): Number of stocks to buy
- --fetch-latest (optional): Fetch current market data
train.py now supports:
- --retrain: Continue from best_lstm.pt
- --epochs <n>: Custom epoch count
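
For reference, the `signals.py` flags listed above map onto an `argparse` parser roughly like this. The flag names come from these notes; the value types, defaults, and help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="signals.py")
    # All three flags are optional, so each has a default.
    parser.add_argument("--capital", type=float, default=None,
                        help="Total capital to allocate")
    parser.add_argument("--num-stock", type=int, default=None,
                        help="Number of stocks to buy")
    parser.add_argument("--fetch-latest", action="store_true",
                        help="Fetch current market data before scanning")
    return parser


args = build_parser().parse_args(["--capital", "10000000", "--num-stock", "5"])
print(args.capital, args.num_stock, args.fetch_latest)  # 10000000.0 5 False
```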

Migration Guide

For Users of Gen 4 (XGBoost Version)

The XGBoost-based system is still available in the Git history. To access it:

git checkout 388d491  # Last XGBoost commit

Upgrading to v2

Pull latest code: git pull origin main
Install dependencies: uv sync
Train new LSTM model: python run.py → Option 2
Generate signals: python run.py → Option 1

Performance Metrics

Based on 2024-01-01 to 2025-09-30 backtest:

Model Accuracy: ~60% on validation set
Win Rate: 50-55% (signals that hit +3% target)
Training Time: ~5-10 minutes (with cache)
Inference Speed: ~5-10 seconds for full universe

Technical Details

Dependencies

PyTorch 2.x (MPS/CUDA/CPU auto-detection)
Rich library for terminal UI
scikit-learn for metrics
yfinance for data fetching

Data Pipeline

Fetch daily OHLCV from Yahoo Finance
Store in SQLite with indexing
Generate 100-day rolling sequences
Apply Triple Barrier Labeling
Train LSTM with early stopping

Configuration

All parameters in config.py:

Window size: 100 days
TBL horizon: 5 days
TBL barrier: 3.0%
Batch size: 64
Learning rate: 0.001
Hidden size: 8 (2 layers)
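
As an illustration, the parameters above could be grouped in a dataclass. Only the values come from these notes; apart from `tbl_horizon`, the field names and the overall layout of `config.py` are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MLConfig:
    window_size: int = 100      # days per input sequence
    tbl_horizon: int = 5        # Triple Barrier holding horizon, in days
    tbl_barrier: float = 0.03   # 3.0% take-profit / stop-loss barrier
    batch_size: int = 64
    learning_rate: float = 0.001
    hidden_size: int = 8
    num_layers: int = 2


@dataclass(frozen=True)
class Config:
    ml: MLConfig = field(default_factory=MLConfig)


config = Config()
print(config.ml.tbl_horizon)  # 5
```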

Research Reference

This implementation is inspired by research on Triple Barrier Labeling and meta-labeling:
https://arxiv.org/pdf/2504.02249v1

Contributors

Special thanks to all contributors who helped test and refine this major release.

Full Changelog: v1.0.0...v2.0.0
