0. Related Work
Trade execution optimization: https://www.cis.upenn.edu/~mkearns/papers/rlexec.pdf (M.Kearns)
Electronic Trading in Order-Driven Markets: Efficient Execution (M.Kearns)
Method
- Expected execution price
- x-axis: limit order relative to its own side of the market
- y-axis: “return”, i.e. the difference between the execution price and the mid-spread price at the beginning of the time period, e.g. return = (mid-spread - execution price) / mid-spread
- Risk
- x-axis: every limit order price
- y-axis: Standard deviation of returns
- Market order: sweep the sell book for the entire size at once
- Marketable limit order: transact with the top of the sell book and then leave the residual shares sitting on top of the buy book.
- Efficient Pricing Frontier
- Markowitz efficient frontier: shows trade-off between risk and return in an investment
- Risk-return profile: every possible execution strategy on a two-dimensional graph
- x-axis: standard deviation
- y-axis: returns
- Efficient pricing frontier --> the upper part of the risk-return plot (best achievable return for each level of risk)
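A minimal sketch of how the efficient pricing frontier could be extracted from per-price (risk, return) pairs: keep only strategies that are not dominated by a lower-risk, higher-return alternative. The candidate prices and the risk/return numbers below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def efficient_frontier(risks, returns):
    """Keep only non-dominated strategies: no other strategy offers
    lower (or equal) risk together with a higher return."""
    points = sorted(zip(risks, returns))      # sort by risk, ascending
    frontier, best_return = [], -np.inf
    for risk, ret in points:
        if ret > best_return:                 # strictly better return than any lower-risk point
            frontier.append((risk, ret))
            best_return = ret
    return frontier

# Illustrative data: one (std of return, mean return) pair per candidate limit price.
rng = np.random.default_rng(0)
candidate_prices = np.linspace(-0.05, 0.05, 21)   # price relative to own side of the book
risks = rng.uniform(0.001, 0.010, size=candidate_prices.size)
rets = -np.abs(candidate_prices) * 0.1 + rng.normal(0, 5e-4, size=candidate_prices.size)

print(efficient_frontier(risks, rets))
```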
Results
- Order size
- More expensive to trade larger orders
- More risky (chance of not getting executed) to trade larger orders
- Large orders require more aggressive pricing
- Possible improvement by splitting into several pieces
- Time Window
- Shorter time interval is more expensive as it requires more aggressive order pricing
- Longer time interval is less expensive but riskier
- Time of the day
- Only relevant if transacting over a long time period
- Otherwise no generalization is possible
- Market Conditions
- Cheaper to trade on high-volume days, but also riskier (surges in volume -> higher volatility -> adverse price movements more likely)
- More aggressive pricing on low-volume days
- The depth of the book may not be as significant as volume when it comes to limit order pricing
Algorithmic Challenges in Modern Financial Markets (M.Kearns) http://www.eecs.harvard.edu/~cat/cs/diss/paperlinks/ectutorial2006.pdf
Deep Reinforcement Learning Based Trading Application at JP Morgan Chase https://medium.com/@ranko.mosic/reinforcement-learning-based-trading-application-at-jp-morgan-chase-f829b8ec54f2
A reinforcement learning extension to the Almgren-Chriss framework for optimal trade execution
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games
Using real-time cluster configurations of streaming asynchronous features as online state descriptors in financial markets
Optimal Trade Execution: An Evolutionary Approach
Impact cost: Executing a large buy order at once moves the price up (a large sell order moves it down). By splitting a big order (e.g. V shares) into smaller pieces and spreading the execution over a time horizon H, the impact cost can be lessened (see the sketch below).
Opportunity cost: Arises when the price moves against us while a big order is split into pieces and its execution is delayed, so the opportunity to execute at a better price is lost.
Trade execution strategy: Optimizes the trade-off between impact cost and opportunity cost, thereby trying to achieve best execution.
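A toy sketch of the impact/opportunity trade-off when splitting V shares over a horizon H. The linear impact model, the drift parameters, and the function name are illustrative assumptions only.

```python
import numpy as np

def simulate_split_execution(V, H, mid0=100.0, impact=1e-4, drift_sigma=0.02, seed=0):
    """Execute V shares in H equal child orders under a toy model:
    each child order of size q pushes the price up by impact*q (impact cost),
    while the mid price drifts randomly between slices (opportunity cost)."""
    rng = np.random.default_rng(seed)
    q = V / H
    mid = mid0
    total_cost = 0.0
    for _ in range(H):
        exec_price = mid + impact * q          # temporary impact of this slice
        total_cost += exec_price * q
        mid += rng.normal(0.0, drift_sigma)    # price may move against us while we wait
    return total_cost / V                      # average execution price per share

# Fewer slices -> larger impact per slice; more slices -> more exposure to drift.
for H in (1, 5, 20):
    print(H, simulate_split_execution(10_000, H))
```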
Measuring execution performance:
- Bid-ask mid-spread at t of execution initialization [Kearns]
- Volume Weighted Average Price (VWAP): vwap = sum(price*volume) / sum(volume)
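A minimal sketch of both performance measures for a set of our own fills: VWAP and the shortfall against the bid-ask mid-spread at initialization. Representing fills as (price, volume) tuples is an assumption.

```python
def vwap(trades):
    """trades: list of (price, volume) tuples of our fills."""
    notional = sum(p * v for p, v in trades)
    volume = sum(v for _, v in trades)
    return notional / volume

def shortfall_vs_mid(trades, mid_at_start, side="buy"):
    """Execution performance against the bid-ask mid-spread at initialization [Kearns].
    Positive values mean we paid more (buy) / received less (sell) than the mid."""
    avg = vwap(trades)
    return (avg - mid_at_start) if side == "buy" else (mid_at_start - avg)

fills = [(100.02, 300), (100.05, 500), (100.10, 200)]
print(vwap(fills), shortfall_vs_mid(fills, mid_at_start=100.00))
```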
Backtesting: Process of executing a given strategy on historical data to determine what its performance would have been had it been used at a certain time t in the past.
- A price-only backtest would not incorporate the volume and limit orders (liquidity) actually available.
- Limit orders allow for an educated guess, whereby it is assumed that our trades are filled against the displayed limit orders; this, however, ignores the time priority of all other limit orders at the same price level (see the sketch below).
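A sketch of the "educated guess" fill rule described above: a simulated buy limit order is assumed to trade against all displayed sell-side liquidity at or below its price, ignoring time priority at each level. The book representation is an assumption.

```python
def simulate_buy_limit_fill(order_price, order_size, asks):
    """asks: list of (price, size) sell-side levels, best (lowest) price first.
    Assumes our order trades against any displayed size at or below order_price,
    ignoring the time priority of other resting orders at the same level."""
    filled, cost = 0.0, 0.0
    for price, size in asks:
        if price > order_price or filled >= order_size:
            break
        take = min(size, order_size - filled)
        filled += take
        cost += take * price
    avg_price = cost / filled if filled else None
    return filled, avg_price

book = [(100.01, 400), (100.02, 250), (100.05, 1000)]
print(simulate_buy_limit_fill(100.02, 500, book))   # fills 400 @ 100.01 and 100 @ 100.02
```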
Paper Book
Deep Reinforcement Learning for Pairs Trading
Reinforcement Learning For Automated Trading
Algorithm Trading using Q-Learning and Recurrent Reinforcement Learning
Modeling Stock Order Flows and Learning Market-Making from Data
T: 1hr basis
Multiple Kernel Learning on the Limit Order Book
Purpose: Investigates currency order books to find patterns that can be exploited with the aim of forecasting movement. SVM classification techniques with different kernels are used, along with two Multiple Kernel Learning (MKL) techniques, among them SimpleMKL.
Simulating and analyzing order book data: The queue-reactive model
“Market making” in an order book model and its impact on the spread
A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem https://arxiv.org/pdf/1706.10059v2.pdf
Cryptocurrency Portfolio Management with Deep Reinforcement Learning https://arxiv.org/pdf/1612.01277v5.pdf
Agent Inspired Trading Using Recurrent Reinforcement Learning and LSTM Neural Networks
Purpose: Buy/Hold/Sell decisions
Valuable Information
- Presence of large amounts of noise and non-stationarity in the datasets, which could cause severe problems for a value function approach.
- Recurrent reinforcement learning
- provides immediate feedback to optimize the strategy
- has the ability to produce real-valued actions or weights naturally, without resorting to discretization (which is necessary for value function approaches)
- the Sharpe Ratio and Downside Deviation Ratio can be formulated to enable on-line learning with recurrent RL
- Uses gradient ascent to optimize
- LSTM handles deep structure on feature learning and the time expansion parts
- Agent
- Risk-adjusted return using the Sharpe Ratio (mean(return) / std(return) over trading period t) or the Downside Deviation Ratio, sketched below
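A small sketch of the two risk-adjusted reward measures over a window of per-period returns; the exact normalization used in the paper may differ.

```python
import numpy as np

def sharpe_ratio(returns, eps=1e-12):
    """Mean return divided by the standard deviation of returns over the trading period."""
    r = np.asarray(returns, dtype=float)
    return r.mean() / (r.std() + eps)

def downside_deviation_ratio(returns, eps=1e-12):
    """Mean return divided by the deviation of negative returns only,
    so upside volatility is not penalized."""
    r = np.asarray(returns, dtype=float)
    downside = np.sqrt(np.mean(np.minimum(r, 0.0) ** 2)) + eps
    return r.mean() / downside

rets = [0.002, -0.001, 0.003, -0.004, 0.001]
print(sharpe_ratio(rets), downside_deviation_ratio(rets))
```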
Deep Direct Reinforcement Learning for Financial Signal Representation and Trading
Robust Optimization of Order Execution http://www.ece.ust.hk/~palomar/Publications_files/2015/FengPalomarRubio-TSP2015%20-%20Robust_Order_Execution.pdf
Purpose
- We propose using the conditional value-at-risk (CVaR) of the execution cost as the risk measure, which allows taking into consideration only the unfavorable part of the return distribution or, equivalently, unwanted high cost.
- Due to parameter estimation errors in the price model, the naive strategies given by the nominal problem may perform badly in the real market, so it is extremely important to take such estimation errors into consideration. To deal with this, we extend both the traditional mean-variance approach and our proposed CVaR approach to their robust design counterparts.
Statements
Variance:
- However, variance has been recognized as impractical since it is a symmetric measure of risk and hence also penalizes low-cost events.
- However, it is well known that variance is not an appropriate risk measure when dealing with financial returns from non-normal, negatively skewed, and leptokurtic distributions [22]
Value-at-risk:
- VaR is also known to have the limitations of lacking subadditivity and not properly describing the losses in the tail of concern [22].
- In order to overcome the inadequacy of variance or VaR, Conditional VaR (CVaR, also known in the literature as Expected Shortfall, Expected Tail Loss, Tail Conditional Expectation, and Tail VaR) has been proposed as an alternative risk measure [23]; it has the desired properties, e.g. convexity and coherence [22], and has thus been employed widely in financial engineering, see [24]–[27] for portfolio or risk management.
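A minimal sketch of VaR and CVaR estimated from sampled execution costs (higher cost = worse), illustrating why CVaR also describes the tail beyond the VaR quantile. The cost samples are synthetic.

```python
import numpy as np

def var_cvar(costs, alpha=0.95):
    """costs: samples of execution cost (higher is worse).
    VaR_alpha: the alpha-quantile of cost.
    CVaR_alpha: the expected cost conditional on being at or beyond VaR_alpha."""
    c = np.sort(np.asarray(costs, dtype=float))
    var = np.quantile(c, alpha)
    cvar = c[c >= var].mean()
    return var, cvar

rng = np.random.default_rng(1)
samples = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)   # skewed, heavy right tail
print(var_cvar(samples, alpha=0.95))
```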
Parameter estimation: *
Learning to Trade via Direct Reinforcement http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=935097
Optimal Trading Strategy in a Limit Order Market with Imperfect Liquidity https://editorialexpress.com/cgi-bin/conference/download.cgi?db_name=res_phd_2013&paper_id=271
Optimal order placement in limit order markets https://arxiv.org/abs/1210.1625
Introduction to Learning to Trade with Reinforcement Learning http://www.wildml.com/2018/02/introduction-to-learning-to-trade-with-reinforcement-learning/
- Sharpe ratio or Drawdown as reward functions (a drawdown sketch follows this list).
- Reinforcement Learning allows for end-to-end optimization and maximizes (potentially delayed) rewards.
- a strategy may work well in a bearish environment, but lose money in a bullish environment. Partly, this is due to the simplistic nature of the policy, which does not have a parameterization powerful enough to learn to adapt to changing market conditions.
- However, if we explicitly modeled the other agents in the environment, our agent could learn to exploit their strategies. In essence, we are reformulating the problem from “market prediction” to “agent exploitation”. This is much more similar to what we are doing in multiplayer games, like DotA.
- in the trading case, most states in the environment are bad, and there are only a few good ones. A naive random approach to exploration will almost never stumble upon those good state-actions pairs. A new approach is necessary here.
- There are many ways to speed up the training of Reinforcement Learning agents, including transfer learning, and using auxiliary tasks. For example, we could imagine pre-training an agent with an expert policy, or adding auxiliary tasks, such as price prediction
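Since drawdown is listed above as a possible reward function, here is a minimal sketch of maximum drawdown computed from an equity curve; the equity series is purely illustrative.

```python
import numpy as np

def max_drawdown(equity):
    """Largest peak-to-trough decline of the equity curve, as a fraction of the peak."""
    equity = np.asarray(equity, dtype=float)
    running_peak = np.maximum.accumulate(equity)
    drawdowns = (running_peak - equity) / running_peak
    return drawdowns.max()

curve = [100, 103, 101, 98, 104, 99, 107]
print(max_drawdown(curve))   # (103 - 98) / 103 ≈ 0.0485
```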
Why is machine learning in finance so hard? https://www.hardikp.com/2018/02/11/why-is-machine-learning-in-finance-so-hard/
Limit Order Book Visualisation http://parasec.net/transmission/order-book-visualisation/
Limit Order Book reconstruction, visualization and statistical analysis of the order flow https://www.ethz.ch/content/dam/ethz/special-interest/mtec/chair-of-entrepreneurial-risks-dam/documents/dissertation/master%20thesis/thesis_schroeter.pdf
Optimal Placement in a Limit Order Book
Roughly speaking, algorithmic trading is based on two different time scales: the daily or weekly scale, and a smaller (tens to hundreds of seconds) time scale. The first step is to optimally slice big orders into smaller ones on a daily basis with the goal to minimize the price impact and/or to maximize the expected utility; the second step is to optimally place the orders within seconds. The former is the well-known optimal execution problem and the latter is the much less-studied optimal placement problem.
Deep Reinforcement Learning for Optimal Order Placement in a Limit Order Book https://videos.re-work.co/videos/426-deep-reinforcement-learning-for-optimal-order-placement-in-a-limit-order-book https://docs.google.com/presentation/d/1bsK-3GTvgtpE0WJOrue1u7ZsacftzSi_JGNSnLTdayY/edit#slide=id.p