@@ -992,33 +992,77 @@ \subsubsection{Counterfactual analysis}%
992992
993993
994994
995- \subsection {Existing broker concepts }%
995+ \subsection {Existing broker implementations }%
996996\label {sub:existing_broker_concepts }
997997Before designing my own agent, it is helpful to investigate previously developed agents and their design to understand
998998the current state of research. For this, I have analyzed the papers on AgentUDE, TacTex and COLDPower, as they
999- performed well in previous tournaments. Their architectures, models and performances are summarized in the following
1000- sections. These are based on publications that describe the TacTex, COLDPower and AgentUDE agents of 2015, as these are
1001- the last publications of these brokers that are available on the \ac {PowerTAC} website. Unfortunately, the source code
1002- of these agents has not been made available, which does not allow introspection of the exact inner mechanics.
999+ performed well in previous tournaments and because their creators have published their concepts. Their architectures,
1000+ models and performances are summarized in the following sections. These are based on publications that describe the
1001+ TacTex, COLDPower and AgentUDE agents of 2015, as these are the last publications of these brokers that are available on
1002+ the \ac {PowerTAC} website. Unfortunately, the source code of these agents has not been made available, which prevents
1003+ inspection of their exact inner mechanics.
1004+
1005+ Judging from their shared binaries, all agents are based on Java and do not employ any other technologies to
1006+ perform their actions during competitions.
1007+
1008+
1009+ \subsubsection {Tariff market strategies }%
1010+ \label {ssub:tariff_market_strategies }
1011+
1012+ AgentUDE deploys an aggressive but rigid tariff market strategy, offering cheap tariffs at the beginning of the game to
1013+ trigger competing agents to react. It also places high transaction costs on the tariffs by making use of early
1014+ withdrawal penalties and bonus payments \cite []{ozdemir2017strategy }. While this may be beneficial for success in the
1015+ competition, it does not translate to real-world scenarios, as energy markets are not a round-based, finite game.
1016+
1017+ TacTex does not target tariff fees such as early withdrawal fees to make a profit. It also does not publish tariffs for
1018+ production of energy \cite []{tactexurieli2016mdp }, although this is based on a 2016 paper and it is likely that the developers have improved
1019+ their algorithms in subsequent competitions. TacTex has modelled the entire competition as a \ac {MDP} and included the
1020+ tariff market actions in this model. It selects a tariff from a set of predefined fixed-rate consumption tariffs to
1021+ reduce the action space complexity of the agent. Ultimately though, it uses \ac {RL} to decide on its tariff market
1022+ actions, reducing the possible actions based on domain knowledge.
1023+
1024+ COLDPower also deploys \ac {RL} approaches, with a Q-Learning-based agent choosing from a range of predefined changes to
1025+ its existing tariff portfolio. It can perform the following actions: \emph {maintain, lower, raise, inline, minmax, wide,
1026+ bottom }. These actions describe fixed action strategies that have been constructed based on domain knowledge. The agent
1027+ is not \emph {learning } how to behave in the market on a low level but rather on a more abstract level. It can be
1028+ compared to an \ac {RL} agent that does not learn how to perform locomotion to move a controllable body through space but
1029+ rather one that chooses the walking direction without needing to understand \emph {how } to walk. While this
1030+ leads to quick results, it may significantly limit the achievable performance, as the solution space is greatly reduced.
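+
+ As a sketch of what such a Q-Learning agent computes, and not a description of COLDPower's actual implementation, the
+ textbook Q-Learning update over the set $\mathcal{A}$ of the seven actions listed above reads
+ \[
+   Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a' \in \mathcal{A}} Q(s_{t+1}, a') - Q(s_t, a_t) \right],
+ \]
+ where the learning rate $\alpha$, the discount factor $\gamma$, the state $s_t$ and the reward $r_{t+1}$ are generic
+ placeholders. The state and reward definitions actually used by COLDPower are not given here.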
1031+
1032+ \subsubsection {Wholesale market strategies }%
1033+ \label {ssub:wholesale_market_strategies }
1034+
1035+
1036+ AgentUDE considers the wholesale market to include both demand and price prediction. For the demand prediction, AgentUDE
1037+ uses a simple weighted estimation based on the previous time-step and the demand of 24 hours before the target time-step
1038+ \cite []{ozdemir2015winner }. Their price prediction is more complex and involves a dynamic programming model based on
1039+ \cite []{tesauro2002strategic } to find \emph {similar hours } in recent history and determine current prices using
1040+ Q-Learning \cite []{ozdemir2017strategy }. Their \ac {MDP} is constructed in a way that the agent needs to determine the
1041+ limit price that minimizes costs. It only has one action dimension which describes the limit price and its environment
1042+ observation is represented by a belief function $f(s,a)$, which makes it a \ac {POMDP}. The agent uses value iteration to
1043+ solve the Bellman equations, determining the expected price. The ultimate limit prices are then determined based on a
1044+ heuristic that works by offering higher prices for ``short-term'' purchases and adjusting this to also offer higher prices
1045+ in the case of an expected higher overall trading volume \cite []{ozdemir2017strategy }.
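+
+ One way to read the weighted demand estimate mentioned above is as a convex combination of the two reference values.
+ The following form is an illustrative assumption, not the formula published by the AgentUDE authors, and it assumes
+ hourly time-steps:
+ \[
+   \hat{d}_t = w \, d_{t-1} + (1 - w) \, d_{t-24}, \qquad 0 \le w \le 1,
+ \]
+ where $d_{t-1}$ is the demand of the previous time-step, $d_{t-24}$ the demand 24 hours before the target time-step
+ $t$, and $w$ a hypothetical weight.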
1046+
1047+ TacTex considers the wholesale market actions to be part of the overall complexity-reduced \ac {MDP}. It uses a demand
1048+ predictor to determine the \ac {mWh} amount to order and places an order for exactly this amount. The
1049+ predictor is based on the actual customer models of the simulation server itself. While this surely leads to good
1050+ performance, it is debatable whether this actually benefits the research goal. The price predictor is
1051+ a linear regression model based on the bootstrap period, corrected by a bias correction based on the prediction error of
1052+ the last 24 hours \cite []{tactexurieli2016mdp }.
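+
+ A minimal sketch of such a bias-corrected prediction, assuming that the correction is the mean prediction error over
+ the last 24 hourly time-steps (the exact form of the correction is an assumption), is
+ \[
+   \hat{p}_t = \beta^\top x_t + \frac{1}{24} \sum_{k=1}^{24} \left( p_{t-k} - \beta^\top x_{t-k} \right),
+ \]
+ where $x_t$ denotes a hypothetical feature vector, $\beta$ the regression coefficients fitted on the bootstrap period
+ and $p_{t-k}$ the observed prices.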
1053+
1054+ COLDPower deploys a linear regression model to predict prices and determines the demand by ``using the energy demand
1055+ historical information'' \cite []{cuevas2015distributed }. The order is placed accordingly.
10031056
10041057
1005-
1006-
1007-
1008- \subsubsection {Decision areas }%
1009- \label {ssub:decision_areas }
1010- Of the three main markets, all agents participate actively in the tariff market, only AgentUDE participates in the
1011- balancing market and obviously every agent participates in the customer market. The way each agent approaches the
1012- customer or tariff market is very different however.
1013- % TODO STOP GOOD NIGHT
1014-
1015-
1016-
1017- \subsubsection {Decision models }%
1018- \label {ssub:decision_models }
1019-
10201058\subsubsection {Past performances }%
10211059\label {ssub:past_performances }
1060+ % TODO STOP :
1061+ % some summary on how they performed in comparison to each other in past competitions.
1062+ % summary pages 2017 http://powertac.org/log_archive/PowerTAC_2017_finals.html
1063+ % 2016 http://powertac.org/log_archive/PowerTAC_2016_finals.html
1064+ % also in-depth graphs of the balances from data, polished up with some python
1065+
10221066
10231067\chapter {Implementation }
10241068\label {cha:implementation }