src/chaps/implementation.tex: 33 additions & 6 deletions
@@ -35,11 +35,17 @@ \section{Preprocessing}
After the translation, the data is usually structured in a multi-dimensional array which can be read by numpy and
processed with Keras. First, some preprocessing can be applied with scikit-learn to analyze the structure of the data as
well as ensure the values that are fed to the \ac {NN} don't negatively impact the learning progress. The overall
-approach follows the recommendations of \citeauthor{Goodfellow-et-al-2016}.
+approach follows the recommendations of \citet{Goodfellow-et-al-2016}.
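As a minimal sketch of the kind of value scaling meant here, the following plain-Python snippet standardizes each feature column to zero mean and unit variance, which is comparable to what the `StandardScaler` of scikit-learn computes. The function name and the sample values are hypothetical and are not taken from the broker code.

```python
from statistics import fmean, pstdev


def standardize_columns(rows):
    """Scale every column of a 2-D list to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation; constant columns fall back to 1.0
    # so that the division below never divides by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - mean) / std for value, mean, std in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical raw features: [temperature in degrees Celsius, usage in kWh]
raw = [[10.0, 200.0], [20.0, 400.0], [30.0, 600.0]]
scaled = standardize_columns(raw)
```

Both columns end up on the same scale even though the raw values differ by an order of magnitude, which is the property the preprocessing step is after.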
\section{Connecting Python agents to PowerTAC}
-To connect an agent based on Python to the \ac{PowerTAC} systems, a new adapter needs to be developed. In 2018, a simple bridge was provided by the team that allowed external processes to communicate with the system through a bridge via the provided sample-broker. All messages received by the broker are written to a First in First Out pipe on the local file system and a second pipe is created to read messages from the external process. To also allow network based access, I created an alternative which is based on \ac{GRPC} to transmit the messages between the adapter and the final client. This lets many different languages communicate with the adapter via network connections \footnote{https://github.com/powertac/broker-adapter}
+To connect an agent based on Python to the \ac{PowerTAC} systems, a new adapter needs to be developed. In 2018, a simple
+bridge was provided by the team that allowed external processes to communicate with the system via the
+provided sample-broker. All messages received by the broker are written to a first-in-first-out (FIFO) pipe on the local file
+system and a second pipe is created to read messages from the external process. To also allow network-based access, I
+created an alternative which is based on \ac{GRPC} to transmit the messages between the adapter and the final client.
+This lets many different languages communicate with the adapter via network connections
+\footnote{https://github.com/powertac/broker-adapter}.
Because the programming language is different from that of the supplied sample-broker, many of the domain objects need to be redefined and some code redeveloped. The classes in \ac {PowerTAC} which are transferred between the client and the server are all annotated so that the XML serializer can translate between the XML and object variants without errors. This helps to recreate similar functionality for the needed classes in the Python environment. If the project were started again today, it might be simpler to first define a set of message types in a language such as Protocol Buffers, the underlying technology of \ac {GRPC}, but because all current systems rely on \ac {JMS} communication, it is better to manually recreate these translators. The \ac {XML} parsing libraries provided by Python can be used to parse the \ac {XML} that is received.
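To illustrate how such a translator could look, the sketch below parses one message into a Python object with the standard library module `xml.etree.ElementTree`. The element name, the attribute names and the class `TimeslotUpdate` are assumptions chosen for illustration; they are not guaranteed to match the actual wire format or the classes in the broker code.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class TimeslotUpdate:
    """Hypothetical Python counterpart of a serialized domain object."""
    id: int
    first_enabled: int
    last_enabled: int


def parse_timeslot_update(xml_text):
    """Translate one XML message into a TimeslotUpdate instance."""
    root = ET.fromstring(xml_text)
    if root.tag != "timeslot-update":
        raise ValueError(f"unexpected message type: {root.tag}")
    # Attribute values arrive as strings and are converted explicitly.
    return TimeslotUpdate(
        id=int(root.attrib["id"]),
        first_enabled=int(root.attrib["firstEnabled"]),
        last_enabled=int(root.attrib["lastEnabled"]),
    )


# Assumed example message, not captured from a real simulation.
sample = '<timeslot-update id="200" firstEnabled="361" lastEnabled="384"/>'
update = parse_timeslot_update(sample)
```

One such parsing function per message type keeps the manual translation explicit, at the cost of having to update the Python side whenever a server class changes.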
\section{Paralleling environments with Kubernetes}
@@ -56,15 +62,15 @@ \section{Agent Models}
-- high-level agent \ac {RL} problem
-While \citeauthor{tactexurieli2016mdp} have defined the entire simulation as a \ac {POMDP} with all three markets
+While \citet{tactexurieli2016mdp} have defined the entire simulation as a \ac {POMDP} (although they interpret it as a \ac {MDP} for ease of implementation) with all three markets
integrated into one problem, I believe breaking the problem into disjoint subproblems is a better approach as each of
them can be looked at in isolation and a learning algorithm can be applied to improve performance without needing to
consider other areas of decision making. One such example is the estimation of fitness for a given tariff in
a given environment. A tariff's competitiveness in a given environment is independent of the wholesale or balancing
trading strategy of the agent since the customers do not care about the profitability of the agent or how often it
-receives balancing penalties.
+receives balancing penalties. While the broker might incur large losses if a tariff is too competitive (by offering prices that are below the profitability line of the broker), such a tariff would theoretically be quite competitive and should therefore be rated as such. The question of which tariffs to actually offer on the market is a separate problem.
-\subsection{Customer Market}
+\subsection{Tariff Market}
The goal of the customer market is to get as many subscribers as possible for the most profitable tariffs the broker
offers on the market. The tariffs offered in the market compete for the limited number of customers available and every
@@ -91,10 +97,11 @@ \subsection{Customer Market}
%tariffs %TODO really, I go genetic?
\end{enumerate}
+%TODO not yet actually realized, still applicable?
\subsubsection{Tariff fitness learning}
To learn the fitness of a tariff while considering its environment, supervised learning techniques can be applied. To do
this, features need to be created from the tariff's specifications and its competitive environment. Similar work has been
-done by \citeauthor{cuevas2015distributed} who discretized the tariff market in four variables describing the
+done by \citet{cuevas2015distributed} who discretized the tariff market into four variables describing the
relationships between the competitors and their broker.
For my broker, because \ac {NN} can handle large state spaces, I create a more detailed description of the
% how to avoid overwhelming of agent? output layer must be fairly large.
%
% time, energy, money, communication dimensions (and subdimensions)
+\subsubsection{Customer demand estimation}%
+\label{ssub:customer_demand_estimation}
+
+The simplest learning component is the demand estimator. This component has no dependencies on the other learning components and can easily be trained using historical data. This is because the demand of a customer depends only on variables that are already provided in the state files of previous simulations. A customer will not use a different amount of energy if the broker implementation changes but all other variables (such as subscribed tariff, weather etc.) remain equal.
+
+To train a model that predicts the demand of customers under various conditions, a dataset of features and labels needs to be created. Because the model may also learn during the course of a running competition (allowing it to adapt to new customer patterns), a generator-based structure should be preferred. This means that a generator creates $x, y$ pairs for the model to train on.
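A generator of this kind could be sketched as follows. The record layout, the function name `demand_batches` and the sample data are hypothetical; the sketch only shows the idea of yielding $x, y$ batches indefinitely from a growing list of observations.

```python
import random


def demand_batches(records, batch_size, rng=None):
    """Yield (x, y) batches indefinitely from (features, demand) records."""
    rng = rng or random.Random(0)
    while True:
        # The length is re-read on every pass, so records appended during
        # a running competition are included from the next pass on.
        order = list(range(len(records)))
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = [records[i] for i in order[start:start + batch_size]]
            x = [features for features, _ in batch]
            y = [demand for _, demand in batch]
            yield x, y


# Hypothetical records: ([hour of day, temperature], realized demand in kWh)
records = [([h, 15.0 + h % 5], 1.0 + 0.1 * h) for h in range(24)]
batches = demand_batches(records, batch_size=8)
x, y = next(batches)
```

In practice the lists would be converted to numpy arrays before being handed to the model; plain lists are used here to keep the sketch free of dependencies.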
+
+According to the simulation specification, the customer models generate their demand pattern based on their internal structure, broker factors and game factors \cite[]{ketter2018powertac}. The preprocessing pipeline therefore generates feature-label pairs that include customer, tariff, weather, time and demand information. The realized demand is the label while all other components are part of the features that are used to train the model. The intuitive model class for demand pattern prediction is the \ac {RNN} due to the sequential nature of the problem \cite[]{EvalGRU2014}. However, as will later be shown, relatively shallow, dense, classic \ac {NN} also yield decent results.
The overall structure of the demand estimator component is shown in Figure~\ref{fig:DemandEstimator}. The model can be trained both offline based on the state files and online during the competition. This is possible because in both situations, the environment model of the agent is a continuous representation of the agent's knowledge about the world. In fact, during the state file parsing, the environment may even hold information that the agent usually cannot observe in a competition environment. This is also the case for the demand learning, as the state files hold the demand realizations of all customers while the server during the competition only transmits the usage realizations of the customers that are subscribed to the agent's tariffs. Regardless, this does not affect the ability to learn from the customers' usage patterns in either setting. During a competition, the agent may learn from the realized usage of customers after each time slot is completed. Because this process may require some resources, it is advantageous to first predict the subscribed customers' demands for the current time slot and pass this information to the wholesale component before training the model on the received meter readings
+\footnote{The component code can be found under \url{https://github.com/pascalwhoop/broker-python/tree/master/agent_components/demand}}.
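The ordering described above, predicting first and training afterwards, can be sketched as follows. The class `RunningMeanEstimator` is a toy stand-in for the actual model, and `handle_timeslot` as well as the `forward` callback are hypothetical names that do not come from the component code.

```python
class RunningMeanEstimator:
    """Toy stand-in for the demand model: predicts the mean usage so far."""

    def __init__(self):
        self._totals = {}
        self._counts = {}

    def predict(self, customer):
        count = self._counts.get(customer, 0)
        return self._totals[customer] / count if count else 0.0

    def train(self, customer, usage):
        self._totals[customer] = self._totals.get(customer, 0.0) + usage
        self._counts[customer] = self._counts.get(customer, 0) + 1


def handle_timeslot(model, subscribed, meter_readings, forward):
    # 1. Predict first so the wholesale component is not kept waiting.
    predictions = {customer: model.predict(customer) for customer in subscribed}
    forward(predictions)
    # 2. Only then learn from the usage realized in the completed time slot.
    for customer, usage in meter_readings.items():
        model.train(customer, usage)
    return predictions


model = RunningMeanEstimator()
sent = []  # collects what would be passed to the wholesale component
handle_timeslot(model, ["c1"], {"c1": 4.0}, sent.append)
handle_timeslot(model, ["c1"], {"c1": 6.0}, sent.append)
```

The prediction forwarded in the second time slot is based only on the first reading, since training on the second reading happens after the forward call.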