Commit ba7c5ef

committed
added a fair bit to implementation
1 parent 9e4f0f5 commit ba7c5ef

File tree

9 files changed: +76 −19 lines changed

src/acronyms.tex

Lines changed: 9 additions & 5 deletions
@@ -3,7 +3,11 @@
 \acro {RL} {Reinforcement Learning}
 \acro {JMS} {Java Message Service}
 \acro {XML} {Extensible Markup Language}
+\acro {JSON} {JavaScript Object Notation}
 \acro {DeepRL} {Deep Reinforcement Learning}
+\acro {JAXB} {Java Architecture for XML Binding}
+\acro {POJO} {Plain Old Java Object}
+\acro {POPO} {Plain Old Python Object}
 \acro {DU} {Distribution Utility}
 \acro {CHP} {Combined Heat and Power Unit}
 \acro {GRPC} {Google Remote Procedure Call}
@@ -15,11 +19,11 @@
 \acro {PPO} {Proximal Policy Optimization}
 \acro {POMDP} {Partially Observable Markov Decision Process}
 \acro {GPU} {Graphics Processing Unit}
-\acro {UL} {Unsupervised Learning}
-\acro {SL} {Supervised Learning}
-\acro {RNN} {Recurrent Neural Network}
-\acro {LSTM} {Long Short-Term Memory}
-\acro {CNN} {Convolutional Neural Network}
+\acro {UL} {Unsupervised Learning}
+\acro {SL} {Supervised Learning}
+\acro {RNN} {Recurrent Neural Network}
+\acro {LSTM} {Long Short-Term Memory}
+\acro {CNN} {Convolutional Neural Network}
 \acro {CPU} {Central Processing Unit}
 \acro {TF} {TensorFlow}
 \acro {TPU} {Tensor Processing Unit}

src/bibliography.bib

Lines changed: 26 additions & 0 deletions
@@ -247,6 +247,13 @@ @misc{openaigym
 Eprint = {arXiv:1606.01540},
 }

+@misc{jsonxml,
+Author = {Tom Strassner},
+Title = {XML vs JSON},
+url = {\url{http://www.cs.tufts.edu/comp/150IDS/final_papers/tstras01.1/FinalReport/FinalReport.html}},
+year = {2017}
+}
+
 @article{duan2017one,
 title={One-Shot Imitation Learning},
 author={Duan, Yan and Andrychowicz, Marcin and Stadie, Bradly and Ho, Jonathan and Schneider, Jonas and Sutskever, Ilya and Abbeel, Pieter and Zaremba, Wojciech},
@@ -268,6 +275,7 @@ @article{foerster2017learning
 year={2017}
 }

+
 @misc{tensorflow2015-whitepaper,
 title = { {TensorFlow}: Large-Scale Machine Learning on Heterogeneous Systems},
 url = {https://www.tensorflow.org/},
@@ -419,3 +427,21 @@ @inproceedings{cuevas2015distributed
 organization = {IEEE}
 }

+@article{EvalGRU2014,
+author = {Junyoung Chung and
+{\c{C}}aglar G{\"{u}}l{\c{c}}ehre and
+KyungHyun Cho and
+Yoshua Bengio},
+title = {Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling},
+journal = {CoRR},
+volume = {abs/1412.3555},
+year = {2014},
+url = {http://arxiv.org/abs/1412.3555},
+archivePrefix = {arXiv},
+eprint = {1412.3555},
+timestamp = {Wed, 07 Jun 2017 14:40:04 +0200},
+biburl = {https://dblp.org/rec/bib/journals/corr/ChungGCB14},
+bibsource = {dblp computer science bibliography, https://dblp.org}
+}
+
src/chaps/body.tex

Lines changed: 1 addition & 0 deletions
@@ -42,5 +42,6 @@ \chapter{Implementation}
 \input{chaps/implementation.tex}

 \chapter{Results}
+\input{chaps/results.tex}
 \chapter{Conclusion}

src/chaps/implementation.tex

Lines changed: 26 additions & 9 deletions
@@ -35,19 +35,36 @@ \section{Preprocessing}
 After the translation, the data is usually structured in a multi-dimensional array which can be read by numpy and
 processed with Keras. First, some preprocessing can be applied with scikit-learn to analyze the structure of the data as
 well as ensure the values that are fed to the \ac {NN} don't negatively impact the learning progress. The overall
-approach follows the recommendations of \citet{Goodfellow-et-al-2016}.
+approach follows the recommendations of \citep{Goodfellow-et-al-2016}.

 \section{Connecting Python agents to PowerTAC}

 To connect an agent based on Python to the \ac{PowerTAC} systems, a new adapter needs to be developed. In 2018, a simple
 bridge was provided by the team that allowed external processes to communicate with the system through a bridge via the
 provided sample-broker. All messages received by the broker are written to a First-In-First-Out pipe on the local file
-system and a second pipe is created to read messages from the external process. To also allow network based access, I
-created an alternative which is based on \ac{GRPC} to transmit the messages between the adapter and the final client.
-This lets many different languages communicate with the adapter via network connections
-\footnote{\url{https://github.com/powertac/broker-adapter-grpc} }
+system and a second pipe is created to read messages from the external process. This was the first approach towards opening up the simulation to other languages and development environments.

-Because the programming language is different from the supplied sample-broker, many of the domain objects need to be redefined and some code redeveloped. The classes in \ac {PowerTAC} which are transfered between the client and the server are all annotated so that the xml serializer can translate between the xml and object variants without errors. This helps to recreate a similar functionality for the needed classes in the python environment. If the project was started again today, it might have been simpler to first define a set of message types in a language such as Protocoll Buffers, the underlying technology of \ac {GRPC}, but because all current systems rely on \ac {JMS} communication, it is better to manually recreate these translators. The \ac {XML} parsing libraries provided by Python can be used to parse the \ac {XML} that is received.
+As I am interested in writing my agent using frameworks that are mainly developed and maintained in Python, and because it is helpful to also allow access to the adapter via network interfaces (permitting distributed execution of the components in, e.g., cloud environments), the bridge needs to be adapted for network-based access. In general, the following problems need to be solved:
+
+\begin{itemize}
+\item Java model classes should be reused if possible, automatically generating target-language model definitions from the Java source code to avoid duplicating semantically identical information
+\item Future developers should be able to use additional languages (such as C, R or Go) with little effort
+\item The groundwork may be laid for migrating the entire simulation to a more language-agnostic communication technology
+\end{itemize}
+
+The first approach is based on \ac{GRPC} to transmit the messages between the Java sample-broker and the final client. For this, each \texttt{handleMessage} method in the three core classes of the sample-broker passes the received message along to the \ac {GRPC} infrastructure. While previous developers have handled these messages in the Java environment, I pass them to the ultimate environment by converting them into protobuf messages, which are then sent to a connected broker that implements corresponding handler methods in the target language. The advantage of this approach is that it theoretically allows the maintainers of the project to also adopt it for the Java clients in general, which would then allow the makeshift Java \emph{bridge} to be avoided. The over-the-wire protocol is also much more efficient (as the data is sent in a binary format) and the message structure is clearly documented in the \texttt{grpc\_messages.proto} file. The disadvantage is the need to translate each \ac{POJO} into a protobuf message and vice versa. This is, however, no different from the current XStream implementation, which also requires the annotation of class files in Java to declare which properties are serialized and included in the \ac {XML} strings. If the project adopts the \ac {GRPC} based communication, the \ac {GRPC} architecture will allow the server to be addressed by any of the supported languages
+\footnote{Which as of today are: C++, Java, Python, Go, Ruby, C\#, Node.js, PHP and Dart}.
+
+A second approach is quite similar to the original bridge, but instead of writing the \ac {XML} strings to the local file system, they are passed to the final environment via \ac {GRPC} through simple messages that merely serve as wrappers for the \ac {XML} strings. While this is not elegant from an engineering perspective (\ac {GRPC} should be used on a method level and messages should not contain other message formats as strings), it is simple and may lead to quick results. A problem is that the resulting \ac {XML} then has to be parsed in the Python broker. Before the introduction of other languages, the communication was essentially an internal API and broker developers only needed to concern themselves with the handling of the Java \texttt{handleMessage} method. Therefore, no formal descriptions of the structure of the \ac {XML} messages exist. All \ac {XML} parsing would consequently be based on observable structures of the \ac {XML}, which can be extracted from the sample-broker logs, and all model classes need to be rewritten. Furthermore, agents wanting to use other programming languages would have to reimplement all of this again, with no code reuse possible.
+
+A final approach is the generation of schema definitions from the Java model classes that are transmitted between the brokers and the server. Generally, two human-readable over-the-wire structures are reasonable: \ac {XML} and \ac{JSON}. \ac {XML} messages can be formally defined using \ac {XML} Schemas, and the \ac{JAXB} project
+\footnote{\url{https://github.com/javaee/jaxb-v2}}
+offers to generate such schemas from Java class definitions. This, however, did not succeed for the \ac {PowerTAC} model definitions, which led me to ask a question on StackOverflow, a discussion platform for programming questions. The resulting answer led to the ultimate alternative, which is the generation of \ac {JSON} schemas that can then be converted into Python class files
+\footnote{\url{https://stackoverflow.com/questions/49630662/convert-java-class-structures-to-python-classes/49777613\#49777613}}.
+The choice of \ac {JSON} as the base communication protocol might also be sensible as a future choice for two reasons: Firstly, it seems to be the more popular serialization format in comparison to \ac {XML} \citep{jsonxml} due to its easy readability and because it is more data efficient. Secondly, \ac {GRPC} can also transmit data in \ac {JSON} form and protobuf messages can easily be printed as \ac {JSON}, making both alternatives more interoperable
+\footnote{\url{https://github.com/powertac/broker-adapter-grpc} }.
+
+Because the programming language is different from the supplied sample-broker, many of the domain objects need to be redefined and some code redeveloped. The classes in \ac {PowerTAC} which are transferred between the client and the server are all annotated so that the \ac {XML} serializer can translate between the \ac {XML} and object variants without errors. This helps to recreate similar functionality for the needed classes in the Python environment. If the project were started again today, it might have been simpler to first define a set of message types in a language such as Protocol Buffers, the underlying technology of \ac {GRPC}, but because all current systems rely on \ac {JMS} communication, it is better to manually recreate these translators. The \ac {XML} parsing libraries provided by Python can be used to parse the \ac {XML} that is received.
 \section{Paralleling environments with Kubernetes}

 \section{Agent Models}
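The model-class generation chosen in the final approach above can be sketched in a few lines of Python. This is only an illustration of the idea: the `TariffSpecification` fields below are invented placeholders rather than the actual PowerTAC message layout, and in the real pipeline such classes would be generated from the JSON schemas derived from the Java POJOs, not written by hand.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical Python counterpart of a Java POJO, as it might be produced
# by a JSON-schema-to-class generator. Field names are illustrative only.
@dataclass
class TariffSpecification:
    id: int
    broker: str
    rate: float

def to_json(msg: TariffSpecification) -> str:
    """Serialize a model object to a JSON string for the wire."""
    return json.dumps(asdict(msg))

def from_json(raw: str) -> TariffSpecification:
    """Reconstruct the model object from its JSON representation."""
    return TariffSpecification(**json.loads(raw))

# Round trip: the object survives serialization unchanged.
spec = TariffSpecification(id=7, broker="sample", rate=-0.12)
assert from_json(to_json(spec)) == spec
```

Because both sides of the connection would derive their classes from the same schema, this keeps the Java and Python models semantically identical without duplicating their definitions by hand.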
@@ -62,7 +79,7 @@ \section{Agent Models}

 -- high-level agent \ac {RL} problem

-While \citet{tactexurieli2016mdp} have defined the entire simulation as a \ac {POMDP} (although they interpret it as a \ac {MDP} for ease of implementation) with all three markets
+While \citep{tactexurieli2016mdp} have defined the entire simulation as a \ac {POMDP} (although they interpret it as a \ac {MDP} for ease of implementation) with all three markets
 integrated into one problem, I believe breaking the problem into disjoint subproblems is a better approach, as each of
 them can be looked at in isolation and a learning algorithm can be applied to improve performance without needing to
 consider other areas of decision making. One such example is the estimation of fitness for a given tariff in
@@ -101,7 +118,7 @@ \subsection{Tariff Market}
 \subsubsection{Tariff fitness learning}
 To learn the fitness of a tariff while considering its environment, supervised learning techniques can be applied. To do
 this, features need to be created from the tariff's specification and its competitive environment. Similar work has been
-done by \citet{cuevas2015distributed} who discretized the tariff market in four variables describing the
+done by \citep{cuevas2015distributed}, who discretized the tariff market into four variables describing the
 relationships between the competitors and their broker.

 For my broker, because \ac {NN} can handle large state spaces, I create a more detailed description of the
@@ -138,7 +155,7 @@ \subsubsection{Customer demand estimation}%

 To train a model that predicts the demand amounts of customers under various conditions, a dataset of features and labels needs to be created. Because the model may also learn during the course of a running competition (allowing the model to adapt to new customer patterns), a generator-based structure should be preferred. This means that a generator exists that creates $x, y$ pairs for the model to train on.

-According to the simulation specification, the customer models generate their demand pattern based on their internal structure, broker factors and game factors \cite[]{ketter2018powertac}. The preprocessing pipeline therefore generates feature-label pairs that include: Customer, tariff, weather, time and demand information. The realized demand is the label while all other components are part of the features that are used to train the model. The intuitive model class for demand patterns prediction are \ac {RNN} due to the sequential nature of the problem \cite[]{EvalGRU2014}. However, as will later be shown, the implementation of relatively shallow dense classic \ac {NN} also results in decent results.
+According to the simulation specification, the customer models generate their demand pattern based on their internal structure, broker factors and game factors \citep{ketter2018powertac}. The preprocessing pipeline therefore generates feature-label pairs that include customer, tariff, weather, time and demand information. The realized demand is the label, while all other components are part of the features used to train the model. The intuitive model class for demand pattern prediction is the \ac {RNN} due to the sequential nature of the problem \citep{EvalGRU2014}. However, as will later be shown, relatively shallow dense classic \ac {NN} also yield decent results.

 \begin{figure}[h]
 \centering
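The generator-based preprocessing described in the demand-estimation hunk above can be sketched as follows. The feature layout (customer, tariff, weather, time) follows the text, but the concrete field names and sample records are invented placeholders, not the actual PowerTAC preprocessing pipeline.

```python
# Minimal sketch of a generator yielding (x, y) training pairs, where the
# realized demand is the label and all other components are features.
# Record fields are hypothetical stand-ins for the real pipeline's columns.
def demand_pair_generator(records):
    """Yield (features, label) pairs from raw simulation records."""
    for rec in records:
        features = [
            rec["customer_id"],   # customer information
            rec["tariff_rate"],   # tariff information
            rec["temperature"],   # weather information
            rec["hour_of_day"],   # time information
        ]
        label = rec["demand_kwh"]  # realized demand is the label
        yield features, label

# Two made-up records standing in for simulation output.
records = [
    {"customer_id": 1, "tariff_rate": -0.10, "temperature": 18.0,
     "hour_of_day": 9, "demand_kwh": 2.4},
    {"customer_id": 2, "tariff_rate": -0.20, "temperature": 21.5,
     "hour_of_day": 19, "demand_kwh": 5.1},
]
pairs = list(demand_pair_generator(records))
```

A generator of this shape can be fed directly to Keras-style `fit` loops that accept streaming data, which is what allows the model to keep training while a competition is still running.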

src/chaps/learning.tex

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-According to \citet{russell2016artificial}, learning agents are those that
+According to \cite{russell2016artificial}, learning agents are those that
 \emph{improve their performance on future tasks after making observations about
 the world} \cite[p.693]{russell2016artificial}. Learning behavior is present in
 many species, most notably humans. To create a learning algorithm means that the

src/chaps/neuralnetworks.tex

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@
 \begin{figure}[]
 \centering
 \includegraphics[width=0.8\linewidth]{img/perceptron.png}
-\caption{Model of the perceptron, taken from \citet{russell2016artificial}.}
+\caption{Model of the perceptron, taken from \cite[]{russell2016artificial}.}
 \label{fig:perceptron}
 \end{figure}

@@ -34,7 +34,7 @@
 \begin{figure}[]
 \centering
 \includegraphics[width=0.3\linewidth]{img/multilayer_nn.png}
-\caption{Multi-layer neural network from \citet{bengio2009learning} }
+\caption{Multi-layer neural network from \cite[]{bengio2009learning}}
 \label{fig:multilayernn}
 \end{figure}


src/chaps/results.tex

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+\section{Demand Estimator}%
+\label{sec:demand_estimator}
+
+\section{Wholesale Market}%
+\label{sec:wholesale_market}
+
+

src/chaps/supervisedlearning.tex

Lines changed: 3 additions & 2 deletions
@@ -2,7 +2,7 @@
 future examples that might be of the same kind but not identical. Common
 examples of this form of learning include object recognition in images or
 time-series prediction. One of the best-known examples to date is the ImageNet
-classification algorithm by \citet{krizhevsky2012imagenet} which was one of
+classification algorithm by \cite[]{krizhevsky2012imagenet}, which was one of
 the first \ac {NN} based algorithms to break the classification high score on a
 popular image classification database. The goal is to correctly classify images
 according to a set of defined labels. If a picture of a dog is read by the \ac
@@ -19,7 +19,8 @@
 The general problem of supervised learning is as follows:

 \begin{enumerate}
-\item Generation of a \emph{training set} that holds a set of input-output pairs $(x_1,y_1),(x_2,y_2),...$
+\item Generation of a \emph{training set} that holds a set of input-output pairs \\
+$(x_1,y_1),(x_2,y_2),\ldots$
 \item Training of the algorithm against the training set
 \item Verification of results against a previously unseen \emph{test set}
 \end{enumerate}
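The three steps in the enumerate above can be shown end to end on a toy problem. This is a deliberately minimal sketch, not one of the thesis's models: the data is exactly linear and the "algorithm" is an ordinary least-squares line fit, so the verification step succeeds with zero error.

```python
# Toy instance of the supervised-learning workflow: build a training set of
# input-output pairs, fit a model, verify on previously unseen test data.
def fit_line(pairs):
    """Least-squares fit of y = a*x + b on a list of (x, y) pairs."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# 1. Generate a training set of pairs (x_i, y_i), here from y = 2x + 1.
train = [(x, 2 * x + 1) for x in range(10)]
# 2. Train the algorithm against the training set.
a, b = fit_line(train)
# 3. Verify the result against previously unseen test inputs.
test = [(20, 41), (30, 61)]
errors = [abs((a * x + b) - y) for x, y in test]
assert max(errors) < 1e-9
```

Real tasks differ only in scale: the training pairs are noisy, the model family is richer (e.g., a neural network), and the test-set error is therefore nonzero and used to detect overfitting.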

src/img/UsageEstimator.png

84.9 KB