Releases: snowfluke/paperium
v2.1.0: Entry Price, UX Improvements & Performance Boost
What's New
Entry Price Support
- Limit Order Entry Pricing: XGBoost now calculates optimal entry prices for limit orders (1x ATR below current price)
- Market vs Limit Orders: Automatic order type detection based on confidence (≥85% = MARKET, <85% = LIMIT)
- Entry Price Display: New Entry column in signals table showing recommended entry price with discount percentage
- Accurate Allocation: Position sizing and SL/TP calculations now based on entry price, not current price
Display Improvements
- Cleaner Table Layout: Separated SL, TP, Trail, Est Profit, and Est Loss into individual columns
- Full Rupiah Format: Changed Est P/L from confusing millions notation (+0.17M) to full Rupiah (+Rp 170,000)
- Removed Shares Column: Simplified display - allocation amount is more important than share count
- Max Hold Days Reminder: Added configuration display showing 5-day time stop limit
- Better Allocation Summary: Removed hardcoded "(3%)" text, now shows dynamic percentages
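The Rupiah format change can be illustrated with a small helper. This is a minimal sketch rather than the code in `scripts/signals.py`; the function name `format_rupiah` is hypothetical.

```python
def format_rupiah(amount: float) -> str:
    """Format a profit/loss amount as full Rupiah with an explicit sign."""
    sign = "+" if amount >= 0 else "-"
    # Round to whole Rupiah and group thousands with commas.
    return f"{sign}Rp {abs(round(amount)):,}"


print(format_rupiah(170_000))   # +Rp 170,000
print(format_rupiah(-85_500))   # -Rp 85,500
```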
Performance Optimizations
- Batch Database Loading: Reduced from ~80 individual queries per day to 1 batch query (10-12x faster)
- Skip Held Positions: No longer rescans stocks already in portfolio during signal scanning
- Faster Backtest: Ensemble backtest now completes in ~5-10 minutes instead of 50+ minutes
Progress Tracking
- Real-time Stats: Replaced misleading time estimates with useful metrics (active positions, closed trades)
- Current Date Display: Shows which date is being processed during backtest
- Accurate Progress: Better visibility into backtest execution state
Technical Details
Modified Files
- `ml/xgb_inference.py`: Added `entry_pct` calculation (0.0 for MARKET, 1.0x ATR for LIMIT)
- `scripts/signals.py`: Entry price display, separated columns, full Rupiah format, max hold days
- `scripts/eval_ensemble.py`: Batch loading, skip held tickers, real-time progress stats
Configuration
- Max Hold Days: 5 days (from `config.ml.tbl_horizon`)
- Entry Price Logic:
  - High confidence (≥85%): Enter at market price
  - Moderate confidence (<85%): Enter 1x ATR below current price
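The entry price logic above might be sketched as follows. The function name `plan_entry`, its parameters, and the returned dictionary are assumptions made for illustration; only the 85% threshold and the 1x ATR offset come from the notes.

```python
def plan_entry(price: float, atr: float, confidence: float,
               market_threshold: float = 0.85) -> dict:
    """Pick an order type and entry price from model confidence."""
    if confidence >= market_threshold:
        order_type, entry = "MARKET", price
    else:
        # LIMIT orders are placed 1x ATR below the current price.
        order_type, entry = "LIMIT", price - 1.0 * atr
    # Discount of the entry price relative to the current price, in percent.
    discount_pct = (price - entry) / price * 100.0
    return {"order_type": order_type, "entry": entry, "discount_pct": discount_pct}


print(plan_entry(price=1000.0, atr=25.0, confidence=0.90))
print(plan_entry(price=1000.0, atr=25.0, confidence=0.70))
```

Position sizing and SL/TP levels would then be derived from `entry` rather than `price`.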
Upgrade Notes
All changes are backward compatible. Existing XGBoost models from v2.0.0 will work with this release.
Full Changelog
Entry Price
- Calculate entry price based on order type (MARKET/LIMIT)
- Use entry price for position sizing in signals and backtest
- Display entry price with percentage discount in table
Display
- Separate SL/TP/Trail into individual columns with percentages
- Separate Est Profit/Est Loss into individual columns
- Change format from "+0.17M" to "+Rp 170,000" for Indonesian users
- Remove Shares column for cleaner table
- Add "Max Hold Days: 5 days (time stop)" to configuration display
- Fix allocation summary labels (remove hardcoded "3%")
Performance
- Add `load_all_tickers_batch()` for single-query batch loading
- Filter out already-held tickers in `scan_signals()`
- Optimize database queries with window functions
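A single-query batch load with a window function could look roughly like the sketch below. The `prices` table, its columns, and the signature of `load_all_tickers_batch()` are assumptions, since the notes only name the function. Window functions require SQLite 3.25 or newer.

```python
import sqlite3
from collections import defaultdict


def load_all_tickers_batch(conn, tickers, lookback):
    """Load the last `lookback` rows for every ticker in a single query."""
    placeholders = ",".join("?" for _ in tickers)
    query = f"""
        SELECT ticker, date, close FROM (
            SELECT ticker, date, close,
                   ROW_NUMBER() OVER (
                       PARTITION BY ticker ORDER BY date DESC
                   ) AS rn
            FROM prices
            WHERE ticker IN ({placeholders})
        )
        WHERE rn <= ?
        ORDER BY ticker, date
    """
    rows_by_ticker = defaultdict(list)
    for ticker, date, close in conn.execute(query, [*tickers, lookback]):
        rows_by_ticker[ticker].append((date, close))
    return dict(rows_by_ticker)


# Tiny in-memory database to exercise the query (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (ticker TEXT, date TEXT, close REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [("BBCA", f"2025-01-0{d}", 9000.0 + d) for d in range(1, 6)]
    + [("TLKM", f"2025-01-0{d}", 3000.0 + d) for d in range(1, 6)],
)
held = {"TLKM"}  # tickers already in the portfolio are skipped
candidates = [t for t in ["BBCA", "TLKM"] if t not in held]
print(load_all_tickers_batch(conn, candidates, lookback=3))
```

Filtering held tickers before the query keeps both the query and the later scan smaller.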
Progress
- Replace `TimeRemainingColumn` with real-time stats display
- Show "Positions: X | Trades: Y" during backtest
- Show current date being processed
Full Diff: v2.0.0...v2.1.0
Paperium v2.0.0 - LSTM Deep Learning Edition
Major architectural transition from traditional ML to Deep Learning for IHSG quantitative trading.
Overview
Paperium v2 represents a complete paradigm shift in our approach to stock prediction:
- From: XGBoost + Hand-crafted Technical Indicators (RSI, MACD, Bollinger Bands)
- To: PyTorch LSTM + Raw OHLCV Sequences
This change is based on the hypothesis that neural networks can learn better feature representations directly from raw price data than human-engineered indicators.
Architecture Changes
Model
- New: 2-layer LSTM (Hidden Size: 8)
  - Input: 100-day sequences of raw OHLCV data
  - Output: 3-class classification (Loss/Neutral/Profit)
- Old: XGBoost with 20+ hand-crafted features
Labeling System
- New: Triple Barrier Method (TBL)
  - ±3% price barriers with 5-day holding horizon
  - Path-dependent classification:
    - Class 0: Hit stop-loss (-3%)
    - Class 1: Time expired (neutral)
    - Class 2: Hit take-profit (+3%)
- Old: Simple close-to-close returns
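The labeling rule can be sketched in a few lines. This version checks closing prices only, which is an assumption: the notes do not say whether intraday highs and lows are used. The function name `triple_barrier_label` is hypothetical.

```python
def triple_barrier_label(closes, start, barrier=0.03, horizon=5):
    """Return 0 (stop-loss), 1 (time expired) or 2 (take-profit)."""
    entry = closes[start]
    upper = entry * (1.0 + barrier)
    lower = entry * (1.0 - barrier)
    # Walk forward day by day; the first barrier touched decides the label.
    for price in closes[start + 1 : start + 1 + horizon]:
        if price >= upper:
            return 2
        if price <= lower:
            return 0
    return 1


print(triple_barrier_label([100, 101, 104, 95, 100, 100], start=0))  # 2
print(triple_barrier_label([100, 99, 96, 105, 100, 100], start=0))   # 0
print(triple_barrier_label([100, 101, 102, 101, 100, 99], start=0))  # 1
```

The first two calls show the path dependence: both series cross both barriers within five days, and the label follows whichever barrier is hit first.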
Feature Engineering
- New: Raw OHLCV normalization only
  - Price: Normalized relative to first day of window
  - Volume: Log-normalized
- Old: 20+ technical indicators (RSI, MACD, ATR, BB, OBV, etc.)
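One possible reading of this normalization is sketched below. It assumes all four price columns are divided by the first day's close and that volume uses `log1p`; the notes do not specify either detail, and `normalize_window` is a hypothetical name.

```python
import math


def normalize_window(window):
    """Normalize one window of (open, high, low, close, volume) rows."""
    base = window[0][3]  # assumption: first day's close is the reference price
    normalized = []
    for o, h, l, c, v in window:
        prices = [p / base - 1.0 for p in (o, h, l, c)]
        # log1p keeps zero-volume days finite.
        normalized.append(prices + [math.log1p(v)])
    return normalized


window = [
    (100.0, 102.0, 99.0, 100.0, 1_000_000),
    (100.0, 105.0, 100.0, 104.0, 1_500_000),
]
for row in normalize_window(window):
    print([round(x, 4) for x in row])
```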
New Features
Signal Generation & Allocation
- Confidence-Weighted Capital Allocation: Higher confidence signals automatically receive proportionally larger allocations
  - Formula: `allocation_i = total_capital × (confidence_i / sum_of_confidences)`
- Blacklist Filtering: Automatically excludes 72 illiquid/suspended stocks
- Flexible Output Modes: Show all signals or only allocated positions with P/L estimates
- Live Data Fetching: `--fetch-latest` flag to pull current market data from Yahoo Finance
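The confidence-weighted allocation formula translates directly into code. The following is a minimal sketch; the function name `allocate_capital` and the sample tickers and confidences are illustrative only.

```python
def allocate_capital(total_capital, confidences):
    """Split capital across signals in proportion to model confidence."""
    total_confidence = sum(confidences.values())
    if total_confidence <= 0:
        raise ValueError("sum of confidences must be positive")
    return {
        ticker: total_capital * conf / total_confidence
        for ticker, conf in confidences.items()
    }


signals = {"BBCA": 0.9, "TLKM": 0.6, "ASII": 0.5}
for ticker, amount in allocate_capital(10_000_000, signals).items():
    print(f"{ticker}: Rp {amount:,.0f}")
```

Because the weights are normalized by their sum, the allocations always add up to the total capital regardless of how many signals are selected.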
Performance Optimizations
- Sequence Caching: 45x speedup on training data preparation
  - First run: ~45 seconds (957 tickers)
  - Subsequent runs: ~1 second (cache hit)
  - Intelligent cache invalidation based on DB version and config changes
- Batch Progress Tracking: Real-time updates every 10 batches during training
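Cache invalidation of this kind is often implemented by hashing everything the cached data depends on. The sketch below assumes the DB version is available as a string and the relevant settings as a dictionary; how Paperium actually derives either is not stated, and `sequence_cache_key` is a hypothetical name.

```python
import hashlib
import json


def sequence_cache_key(db_version: str, config: dict) -> str:
    """Derive a cache key that changes when the DB or config changes."""
    payload = json.dumps({"db": db_version, "config": config}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


config = {"window_size": 100, "tbl_horizon": 5, "tbl_barrier": 0.03}
key_a = sequence_cache_key("2025-09-30", config)
key_b = sequence_cache_key("2025-09-30", {**config, "window_size": 60})
print(key_a != key_b)  # True: a config change invalidates the cache
```

Using the key as part of the cache filename means a stale cache is never read; it is simply not found.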
User Experience
- Timestamped Logging: All scripts now show `[MM:SS | +Δs]` timestamps
- Training Dashboard: Live batch-level progress (127/200 (63%))
- Fresh vs Retrain: Choose to start new model or continue from checkpoint
- Interactive CLI: Rich terminal UI with tables, panels, and progress bars
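A timestamp prefix in the `[MM:SS | +Δs]` style could be produced as sketched below. This assumes `MM:SS` is time elapsed since the script started and `Δ` is the time since the previous message; `format_stamp` and `StampedLogger` are hypothetical names, not the contents of `utils/logger.py`.

```python
import time


def format_stamp(elapsed: float, delta: float) -> str:
    """Render total elapsed time and time since the previous message."""
    minutes, seconds = divmod(int(elapsed), 60)
    return f"[{minutes:02d}:{seconds:02d} | +{delta:.1f}s]"


class StampedLogger:
    """Prefix each message with elapsed time and the gap since the last one."""

    def __init__(self):
        self._start = self._last = time.perf_counter()

    def log(self, message: str) -> None:
        now = time.perf_counter()
        print(format_stamp(now - self._start, now - self._last), message)
        self._last = now


print(format_stamp(elapsed=75.2, delta=3.4), "Loading sequences")
# [01:15 | +3.4s] Loading sequences
```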
Breaking Changes
Removed Features
- Portfolio management system (max positions, owned stocks, slots)
- All technical indicator calculations
- XGBoost model and dependencies
- `morning_signals.py` (replaced by `signals.py`)
File Changes
- `morning_signals.py` → `signals.py`
- Removed: `ml/ensemble.py`, `ml/meta_labeling.py`, `portfolio/`, `strategy/`
- Added: `utils/logger.py` for timestamped logging
API Changes
- `signals.py` now accepts:
  - `--capital <amount>` (optional): Total capital to allocate
  - `--num-stock <n>` (optional): Number of stocks to buy
  - `--fetch-latest` (optional): Fetch current market data
- `train.py` now supports:
  - `--retrain`: Continue from `best_lstm.pt`
  - `--epochs <n>`: Custom epoch count
Migration Guide
For Users of Gen 4 (XGBoost Version)
The XGBoost-based system is still available in the Git history. To access it:
`git checkout 388d491 # Last XGBoost commit`
Upgrading to v2
- Pull latest code: `git pull origin main`
- Install dependencies: `uv sync`
- Train new LSTM model: `python run.py` → Option 2
- Generate signals: `python run.py` → Option 1
Performance Metrics
Based on 2024-01-01 to 2025-09-30 backtest:
- Model Accuracy: ~60% on validation set
- Win Rate: 50-55% (signals that hit +3% target)
- Training Time: ~5-10 minutes (with cache)
- Inference Speed: ~5-10 seconds for full universe
Technical Details
Dependencies
- PyTorch 2.x (MPS/CUDA/CPU auto-detection)
- Rich library for terminal UI
- scikit-learn for metrics
- yfinance for data fetching
Data Pipeline
- Fetch daily OHLCV from Yahoo Finance
- Store in SQLite with indexing
- Generate 100-day rolling sequences
- Apply Triple Barrier Labeling
- Train LSTM with early stopping
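The rolling-sequence step of the pipeline might be sketched as follows. It assumes each window must leave `horizon` future days available for labeling; the function name `rolling_windows` and that boundary handling are assumptions.

```python
def rolling_windows(rows, window_size=100, horizon=5):
    """Yield (start, window) pairs that leave room for the label horizon."""
    last_start = len(rows) - window_size - horizon
    for start in range(last_start + 1):
        yield start, rows[start : start + window_size]


rows = list(range(110))  # stand-in for 110 daily OHLCV rows
windows = list(rolling_windows(rows))
print(len(windows))  # 6
```

A ticker with fewer than `window_size + horizon` rows yields no windows at all.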
Configuration
All parameters are set in `config.py`:
- Window size: 100 days
- TBL horizon: 5 days
- TBL barrier: 3.0%
- Batch size: 64
- Learning rate: 0.001
- Hidden size: 8 (2 layers)
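For orientation, these settings could be grouped in a dataclass like the one below. Only the `config.ml.tbl_horizon` path is confirmed by the release notes; every other field name and the class layout are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MLConfig:
    window_size: int = 100       # days per input sequence
    tbl_horizon: int = 5         # max holding days (time stop)
    tbl_barrier: float = 0.03    # ±3% take-profit and stop-loss barriers
    batch_size: int = 64
    learning_rate: float = 0.001
    hidden_size: int = 8
    num_layers: int = 2


@dataclass(frozen=True)
class Config:
    ml: MLConfig = field(default_factory=MLConfig)


config = Config()
print(config.ml.tbl_horizon)  # 5
```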
Research Reference
This implementation is inspired by research on Triple Barrier Labeling and meta-labeling:
https://arxiv.org/pdf/2504.02249v1
Contributors
Special thanks to all contributors who helped test and refine this major release.
Full Changelog: v1.0.0...v2.0.0