@@ -779,7 +779,7 @@ \section{PowerTAC: A Competitive Simulation}%
779779% -------------------------------------------------------------------------------
780780NOTES :
781781 - pretty much complete.
782- - missing: analysis of competing broker behaviors
782+ - missing: analysis of competing broker performances
783783% -------------------------------------------------------------------------------
784784In the following chapter, I will introduce the \acf {PowerTAC}. It simulates a liberalized retail electrical energy
785785market where multiple autonomous agents compete in different markets. Firstly, a retail market where agents, or
@@ -1003,29 +1003,69 @@ \subsection{Existing broker concepts}%
10031003\label {sub:existing_broker_concepts }
10041004Before designing my own agent, it is helpful to investigate previously developed agents and their design to understand
10051005the current state of research. For this, I have analyzed the papers on AgentUDE, TacTex and COLDPower, as they
1006- performed well in previous tournaments. Their architectures, models and performances are summarized in the following
1007- sections. These are based on publications that describe the TacTex, COLDPower and AgentUDE agents of 2015, as these are
1008- the last publications of these brokers that are available on the \ac {PowerTAC} website. Unfortunatley, the source code
1009- of these agents has not been made available, which does not allow introspection of the exact inner mechanics.
1006+ performed well in previous tournaments and their creators have published their concepts. Their architectures,
1007+ models and performances are summarized in the following sections. These are based on publications that describe the
1008+ TacTex, COLDPower and AgentUDE agents of 2015, as these are the last publications of these brokers that are available on
1009+ the \ac {PowerTAC} website. Unfortunately, the source code of these agents has not been made available, which prevents
1010+ inspection of their exact inner mechanics.
1011+
1012+ Judging from their shared binaries, all agents are written in Java and do not employ any other technologies to
1013+ perform their actions during competitions.
1014+
1015+
1016+ \subsubsection {Tariff market strategies }%
1017+ \label {ssub:tariff_market_strategies }
1018+
1019+ AgentUDE deploys an aggressive but rigid tariff market strategy, offering cheap tariffs at the beginning of the game to
1020+ trigger competing agents to react. It also places high transaction costs on the tariffs by making use of early
1021+ withdrawal penalties and bonus payments \cite []{ozdemir2017strategy }. While this may be beneficial for success in the
1022+ competition, it doesn't translate to real-world scenarios, as energy markets are not a round-based, finite game.
1023+
1024+ TacTex does not target tariff fees such as early withdrawal fees to make a profit. It also doesn't publish tariffs for
1025+ production of energy \cite []{tactexurieli2016mdp }, although this is based on a 2016 paper and it is likely that the developers have improved
1026+ their algorithms in subsequent competitions. TacTex has modelled the entire competition as a \ac {MDP} and included the
1027+ tariff market actions in this model. It selects a tariff from a set of predefined fixed-rate consumption tariffs to
1028+ reduce the action space complexity of the agent. Ultimately though, it uses \ac {RL} to decide on its tariff market
1029+ actions, reducing the possible actions based on domain knowledge.
1030+
1031+ COLDPower also deploys \ac {RL} approaches with a Q-Learning based agent choosing from a range of predefined changes to
1032+ its existing tariff portfolio. It can perform the following actions: \emph {maintain, lower, raise, inline, minmax, wide,
1033+ bottom }. These actions describe fixed action strategies that have been constructed based on domain knowledge. The agent
1034+ is not \emph {learning } how to behave in the market on a low level but rather on a more abstract level. It can be
1035+ compared to an \ac {RL} agent that doesn't learn how to perform locomotion to move a controllable body through space but
1036+ rather one that may choose the direction of walking, without the need to understand \emph {how } to walk. While this
1037+ leads to quick results, it may significantly limit the achievable performance, as the solution space is greatly reduced.
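+
+ For reference, the textbook Q-Learning update is shown below. The symbols $\alpha$ (learning rate), $\gamma$ (discount
+ factor) and $r$ (reward) are generic notation introduced here and are not taken from the COLDPower publication; I only
+ assume that the action $a$ ranges over the seven portfolio actions listed above.
+ % Generic Q-Learning update; the notation is an assumption, not COLDPower's own formulation.
+ \begin{equation}
+   Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
+ \end{equation}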
1038+
1039+ \subsubsection {Wholesale market strategies }%
1040+ \label {ssub:wholesale_market_strategies }
1041+
1042+
1043+ AgentUDE treats the wholesale market as a problem of both demand and price prediction. For the demand prediction, AgentUDE
1044+ uses a simple weighted estimation based on the previous time-step and the demand of 24 hours before the target time-step
1045+ \cite []{ozdemir2015winner }. Their price prediction is more complex and involves a dynamic programming model based on
1046+ \cite []{tesauro2002strategic } to find \emph {similar hours } in recent history and determine current prices using
1047+ Q-Learning \cite []{ozdemir2017strategy }. Their \ac {MDP} is constructed in a way that the agent needs to determine the
1048+ limit price that minimizes costs. It only has one action dimension, which describes the limit price, and its environment
1049+ observation is represented by a belief function $ f(s,a)$, which makes it a \ac {POMDP}. The agent uses value iteration to
1050+ solve the Bellman equations, determining the expected price. The ultimate limit prices are then determined based on a
1051+ heuristic that offers higher prices for ``short-term'' purchases and additionally raises prices
1052+ when a higher overall trading volume is expected \cite []{ozdemir2017strategy }.
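+
+ The weighted demand estimate described above could be written as follows. This is a sketch only: the weight
+ $w \in [0, 1]$ and the symbols $d_t$ (demand at time-step $t$) and $\hat{d}_t$ (its estimate) are my own notation, I
+ assume hourly time-steps so that 24 hours correspond to 24 steps, and the exact weighting used by AgentUDE is not
+ stated here.
+ % Assumed form: convex combination of the previous time-step and the same hour one day earlier.
+ \begin{equation}
+   \hat{d}_t = w \, d_{t-1} + (1 - w) \, d_{t-24}
+ \end{equation}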
1053+
1054+ TacTex considers the wholesale market actions to be part of the overall complexity-reduced \ac {MDP}. It uses a demand
1055+ predictor to determine the amount of energy in \ac {mWh} to order and places an order for exactly this amount. The
1056+ predictor is based on the actual customer models of the simulation server itself. While this surely leads to good
1057+ performance, it is debatable whether this actually benefits the research goal. The price predictor is
1058+ a linear regression model based on the bootstrap period, adjusted by a bias correction based on the prediction error of
1059+ the last 24 hours \cite []{tactexurieli2016mdp }.
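+
+ One plausible reading of this bias correction is sketched below. The notation is hypothetical:
+ $\hat{p}^{\mathrm{lr}}_t$ denotes the raw regression output and $p_t$ the observed price, and I assume that the
+ correction is the mean prediction error over the last 24 hourly time-steps. The cited paper may define it differently.
+ % Assumption: the bias term is the mean signed error of the regression over the previous 24 time-steps.
+ \begin{equation}
+   \hat{p}_t = \hat{p}^{\mathrm{lr}}_t + \frac{1}{24} \sum_{i=1}^{24} \left( p_{t-i} - \hat{p}^{\mathrm{lr}}_{t-i} \right)
+ \end{equation}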
1060+
1061+ COLDPower deploys a linear regression model to predict prices and determines the demand by ``using the energy demand
1062+ historical information'' \cite []{cuevas2015distributed }. The order is placed accordingly.
10101063
10111064
1012-
1013-
1014-
1015- \subsubsection {Decision areas }%
1016- \label {ssub:decision_areas }
1017- Of the three main markets, all agents participate actively in the tariff market, only AgentUDE participates in the
1018- balancing market and obviously every agent participates in the customer market. The way each agent approaches the
1019- customer or tariff market is very different however.
1020- % TODO STOP GOOD NIGHT
1021-
1022-
1023-
1024- \subsubsection {Decision models }%
1025- \label {ssub:decision_models }
1026-
10271065\subsubsection {Past performances }%
10281066\label {ssub:past_performances }
1067+ % TODO STOP :
1068+ % some summary on how they performed in comparison to each other in past competitions.
10291069
10301070\chapter {Implementation }
10311071\label {cha:implementation }