Commit ba7c5ef

committed
added a fair bit to implementation
1 parent 9e4f0f5 commit ba7c5ef

File tree

9 files changed: +76 −19 lines changed

src/acronyms.tex

Lines changed: 9 additions & 5 deletions
@@ -3,7 +3,11 @@
 \acro {RL} {Reinforcement Learning}
 \acro {JMS} {Java Message Service}
 \acro {XML} {Extensible Markup Language}
+\acro {JSON} {JavaScript Object Notation}
 \acro {DeepRL} {Deep Reinforcement Learning}
+\acro {JAXB} {Java Architecture for XML Binding}
+\acro {POJO} {Plain Old Java Object}
+\acro {POPO} {Plain Old Python Object}
 \acro {DU} {Distribution Utility}
 \acro {CHP} {Combined Heat and Power Unit}
 \acro {GRPC} {Google Remote Procedure Call}
@@ -15,11 +19,11 @@
 \acro {PPO} {Proximal Policy Optimization}
 \acro {POMDP} {Partially Observable Markov Decision Process}
 \acro {GPU} {Graphics Processing Unit}
-\acro {UL} {Unsupervised Learning}
-\acro {SL} {Supervised Learning}
-\acro {RNN} {Recurrent Neural Network}
-\acro {LSTM} {Long Short-Term Memory}
-\acro {CNN} {Convolutional Neural Network}
+\acro {UL} {Unsupervised Learning}
+\acro {SL} {Supervised Learning}
+\acro {RNN} {Recurrent Neural Network}
+\acro {LSTM} {Long Short-Term Memory}
+\acro {CNN} {Convolutional Neural Network}
 \acro {CPU} {Central Processing Unit}
 \acro {TF} {TensorFlow}
 \acro {TPU} {Tensor Processing Unit}

src/bibliography.bib

Lines changed: 26 additions & 0 deletions
@@ -247,6 +247,13 @@ @misc{openaigym
 Eprint = {arXiv:1606.01540},
 }

+@misc{jsonxml,
+Author = {Tom Strassner},
+Title = {XML vs JSON},
+url = {\url{http://www.cs.tufts.edu/comp/150IDS/final_papers/tstras01.1/FinalReport/FinalReport.html}},
+year = {2017}
+}
+
 @article{duan2017one,
 title={One-Shot Imitation Learning},
 author={Duan, Yan and Andrychowicz, Marcin and Stadie, Bradly and Ho, Jonathan and Schneider, Jonas and Sutskever, Ilya and Abbeel, Pieter and Zaremba, Wojciech},
@@ -268,6 +275,7 @@ @article{foerster2017learning
 year={2017}
 }

+
 @misc{tensorflow2015-whitepaper,
 title = { {TensorFlow}: Large-Scale Machine Learning on Heterogeneous Systems},
 url = {https://www.tensorflow.org/},
@@ -419,3 +427,21 @@ @inproceedings{cuevas2015distributed
 organization = {IEEE}
 }

+@article{EvalGRU2014,
+author = {Junyoung Chung and
+{\c{C}}aglar G{\"{u}}l{\c{c}}ehre and
+KyungHyun Cho and
+Yoshua Bengio},
+title = {Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling},
+journal = {CoRR},
+volume = {abs/1412.3555},
+year = {2014},
+url = {http://arxiv.org/abs/1412.3555},
+archivePrefix = {arXiv},
+eprint = {1412.3555},
+timestamp = {Wed, 07 Jun 2017 14:40:04 +0200},
+biburl = {https://dblp.org/rec/bib/journals/corr/ChungGCB14},
+bibsource = {dblp computer science bibliography, https://dblp.org}
+}
+
src/chaps/body.tex

Lines changed: 1 addition & 0 deletions
@@ -42,5 +42,6 @@ \chapter{Implementation}
 \input{chaps/implementation.tex}

 \chapter{Results}
+\input{chaps/results.tex}
 \chapter{Conclusion}

src/chaps/implementation.tex

Lines changed: 26 additions & 9 deletions
@@ -35,19 +35,36 @@ \section{Preprocessing}
 After the translation, the data is usually structured in a multi-dimensional array which can be read by numpy and
 processed with Keras. First, some preprocessing can be applied with scikit-learn to analyze the structure of the data as
 well as ensure the values that are fed to the \ac {NN} don't negatively impact the learning progress. The overall
-approach follows the recommendations of \citet{Goodfellow-et-al-2016}.
+approach follows the recommendations of \citep{Goodfellow-et-al-2016}.

 \section{Connecting Python agents to PowerTAC}

 To connect an agent based on Python to the \ac{PowerTAC} systems, a new adapter needs to be developed. In 2018, a simple
 bridge was provided by the team that allowed external processes to communicate with the system through a bridge via the
 provided sample-broker. All messages received by the broker are written to a First-In-First-Out pipe on the local file
-system and a second pipe is created to read messages from the external process. To also allow network based access, I
-created an alternative which is based on \ac{GRPC} to transmit the messages between the adapter and the final client.
-This lets many different languages communicate with the adapter via network connections
-\footnote{\url{https://github.com/powertac/broker-adapter-grpc} }
+system and a second pipe is created to read messages from the external process. This was the first approach towards opening up the simulation to other languages and development environments.

-Because the programming language is different from the supplied sample-broker, many of the domain objects need to be redefined and some code redeveloped. The classes in \ac {PowerTAC} which are transfered between the client and the server are all annotated so that the xml serializer can translate between the xml and object variants without errors. This helps to recreate a similar functionality for the needed classes in the python environment. If the project was started again today, it might have been simpler to first define a set of message types in a language such as Protocoll Buffers, the underlying technology of \ac {GRPC}, but because all current systems rely on \ac {JMS} communication, it is better to manually recreate these translators. The \ac {XML} parsing libraries provided by Python can be used to parse the \ac {XML} that is received.
+As I am interested in writing my agent using frameworks that are mainly developed and maintained in Python, and because it is helpful to also allow access to the adapter via network interfaces (permitting distributed execution of the components in, e.g., cloud environments), the bridge needs to be adapted for network-based access. In general, the following problems need to be solved:
+
+\begin{itemize}
+\item Java model classes should be reused if possible, automatically generating target-language model definitions from the Java source code to avoid duplicating semantically identical information
+\item Future developers should be able to use additional languages (such as C, R or Go) with little effort
+\item The groundwork may be laid for migrating the entire simulation to a more language-agnostic communication technology
+\end{itemize}
+
+The first approach is based on \ac{GRPC} to transmit the messages between the Java sample-broker and the final client. For this, each \texttt{handleMessage} method in the three core classes of the sample-broker passes the received message along to the \ac {GRPC} infrastructure. While previous developers have handled these messages in the Java environment, I pass them to the ultimate environment by converting them into protobuf messages, which are then sent to a connected broker that implements corresponding handler methods in the target language. The advantage of this approach is that it theoretically allows the maintainers of the project to also adopt it for the Java clients in general, which would then allow the makeshift Java \emph{bridge} to be avoided. The over-the-wire protocol is also much more efficient (as the data is sent in a binary format) and the message structure is clearly documented in the \texttt{grpc\_messages.proto} file. The disadvantage is the need to translate each \ac{POJO} into a protobuf message and vice versa. This is, however, no different from the current XStream implementation, which also requires the annotation of class files in Java to declare which properties are serialized and included in the \ac {XML} strings. If the project adopts the \ac {GRPC} based communication, the \ac {GRPC} architecture will allow the server to be addressed by any of the supported languages
+\footnote{Which as of today are: C++, Java, Python, Go, Ruby, C\#, Node.js, PHP and Dart}.
+
+A second approach is quite similar to the original bridge, but instead of writing the \ac {XML} strings to the local file system, they are passed to the final environment via \ac {GRPC} through simple messages that merely serve as wrappers for the \ac {XML} strings. While this is not elegant from an engineering perspective (\ac {GRPC} should be used on a method level and messages should not contain other message formats as strings), it is simple and may lead to quick results. A problem is that the resulting \ac {XML} then has to be parsed in the Python broker. Before the introduction of other languages, the communication was essentially an internal API and broker developers only needed to concern themselves with the handling of the Java \texttt{handleMessage} method. Therefore, no formal descriptions of the structure of the \ac {XML} messages exist. All \ac {XML} parsing would consequently be based on observable structures of the \ac {XML}, which can be extracted from the sample-broker logs, and all model classes need to be rewritten. Furthermore, agents wanting to use other programming languages would have to reimplement all of this again, with no code reuse possible.
+
+A final approach is the generation of schema definitions from the Java model classes that are transmitted between the brokers and the server. Generally, two human-readable over-the-wire structures are reasonable: \ac {XML} and \ac{JSON}. \ac {XML} messages can be formally defined using \ac {XML} Schemas, and the \ac{JAXB} project
+\footnote{\url{https://github.com/javaee/jaxb-v2}}
+offers to generate such schemas from Java class definitions. This, however, did not succeed for the \ac {PowerTAC} model definitions, which led me to ask a question on StackOverflow, a discussion platform for programming questions. The resulting answer led to the ultimate alternative, which is the generation of \ac {JSON} schemas that can then be converted into Python class files
+\footnote{\url{https://stackoverflow.com/questions/49630662/convert-java-class-structures-to-python-classes/49777613\#49777613}}.
+The choice of \ac {JSON} as the base communication protocol might also be sensible as a future choice for two reasons: Firstly, it seems to be the more popular serialization format in comparison to \ac {XML} \citep{jsonxml} due to its easy readability and because it is more data efficient. Secondly, \ac {GRPC} can also transmit data in \ac {JSON} form and protobuf messages can easily be printed as \ac {JSON}, making both alternatives more interoperable
+\footnote{\url{https://github.com/powertac/broker-adapter-grpc} }.
+
+Because the programming language is different from the supplied sample-broker, many of the domain objects need to be redefined and some code redeveloped. The classes in \ac {PowerTAC} which are transferred between the client and the server are all annotated so that the \ac {XML} serializer can translate between the \ac {XML} and object variants without errors. This helps to recreate similar functionality for the needed classes in the Python environment. If the project were started again today, it might have been simpler to first define a set of message types in a language such as Protocol Buffers, the underlying technology of \ac {GRPC}, but because all current systems rely on \ac {JMS} communication, it is better to manually recreate these translators. The \ac {XML} parsing libraries provided by Python can be used to parse the \ac {XML} that is received.
 \section{Paralleling environments with Kubernetes}

 \section{Agent Models}
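The model-class generation chosen in the final approach above can be sketched in a few lines of Python. This is only an illustration of the idea: the `TariffSpecification` fields below are invented placeholders rather than the actual PowerTAC message layout, and in the real pipeline such classes would be generated from the JSON schemas derived from the Java POJOs, not written by hand.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical Python counterpart of a Java POJO, as it might be produced
# by a JSON-schema-to-class generator. Field names are illustrative only.
@dataclass
class TariffSpecification:
    id: int
    broker: str
    rate: float

def to_json(msg: TariffSpecification) -> str:
    """Serialize a model object to a JSON string for the wire."""
    return json.dumps(asdict(msg))

def from_json(raw: str) -> TariffSpecification:
    """Reconstruct the model object from its JSON representation."""
    return TariffSpecification(**json.loads(raw))

# Round trip: the object survives serialization unchanged.
spec = TariffSpecification(id=7, broker="sample", rate=-0.12)
assert from_json(to_json(spec)) == spec
```

Because both sides of the connection would derive their classes from the same schema, this keeps the Java and Python models semantically identical without duplicating their definitions by hand.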
@@ -62,7 +79,7 @@ \section{Agent Models}

 -- high-level agent \ac {RL} problem

-While \citet{tactexurieli2016mdp} have defined the entire simulation as a \ac {POMDP} (although they interpret it as a \ac {MDP} for ease of implementation) with all three markets
+While \citep{tactexurieli2016mdp} have defined the entire simulation as a \ac {POMDP} (although they interpret it as a \ac {MDP} for ease of implementation) with all three markets
 integrated into one problem, I believe breaking the problem into disjoint subproblems is a better approach, as each of
 them can be looked at in isolation and a learning algorithm can be applied to improve performance without needing to
 consider other areas of decision making. One such example is the estimation of fitness for a given tariff in
@@ -101,7 +118,7 @@ \subsection{Tariff Market}
 \subsubsection{Tariff fitness learning}
 To learn the fitness of a tariff while considering its environment, supervised learning techniques can be applied. To do
 this, features need to be created from the tariff's specification and its competitive environment. Similar work has been
-done by \citet{cuevas2015distributed} who discretized the tariff market in four variables describing the
+done by \citep{cuevas2015distributed}, who discretized the tariff market into four variables describing the
 relationships between the competitors and their broker.

 For my broker, because \ac {NN} can handle large state spaces, I create a more detailed description of the
@@ -138,7 +155,7 @@ \subsubsection{Customer demand estimation}%

 To train a model that predicts the demand amounts of customers under various conditions, a dataset of features and labels needs to be created. Because the model may also learn during the course of a running competition (allowing the model to adapt to new customer patterns), a generator-based structure should be preferred. This means that a generator exists that creates $x, y$ pairs for the model to train on.

-According to the simulation specification, the customer models generate their demand pattern based on their internal structure, broker factors and game factors \cite[]{ketter2018powertac}. The preprocessing pipeline therefore generates feature-label pairs that include: Customer, tariff, weather, time and demand information. The realized demand is the label while all other components are part of the features that are used to train the model. The intuitive model class for demand patterns prediction are \ac {RNN} due to the sequential nature of the problem \cite[]{EvalGRU2014}. However, as will later be shown, the implementation of relatively shallow dense classic \ac {NN} also results in decent results.
+According to the simulation specification, the customer models generate their demand pattern based on their internal structure, broker factors and game factors \citep{ketter2018powertac}. The preprocessing pipeline therefore generates feature-label pairs that include customer, tariff, weather, time and demand information. The realized demand is the label, while all other components are part of the features used to train the model. The intuitive model class for demand pattern prediction is the \ac {RNN} due to the sequential nature of the problem \citep{EvalGRU2014}. However, as will later be shown, relatively shallow dense classic \ac {NN} also yield decent results.

 \begin{figure}[h]
 \centering
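The generator-based preprocessing described in the demand-estimation hunk above can be sketched as follows. The feature layout (customer, tariff, weather, time) follows the text, but the concrete field names and sample records are invented placeholders, not the actual PowerTAC preprocessing pipeline.

```python
# Minimal sketch of a generator yielding (x, y) training pairs, where the
# realized demand is the label and all other components are features.
# Record fields are hypothetical stand-ins for the real pipeline's columns.
def demand_pair_generator(records):
    """Yield (features, label) pairs from raw simulation records."""
    for rec in records:
        features = [
            rec["customer_id"],   # customer information
            rec["tariff_rate"],   # tariff information
            rec["temperature"],   # weather information
            rec["hour_of_day"],   # time information
        ]
        label = rec["demand_kwh"]  # realized demand is the label
        yield features, label

# Two made-up records standing in for simulation output.
records = [
    {"customer_id": 1, "tariff_rate": -0.10, "temperature": 18.0,
     "hour_of_day": 9, "demand_kwh": 2.4},
    {"customer_id": 2, "tariff_rate": -0.20, "temperature": 21.5,
     "hour_of_day": 19, "demand_kwh": 5.1},
]
pairs = list(demand_pair_generator(records))
```

A generator of this shape can be fed directly to Keras-style `fit` loops that accept streaming data, which is what allows the model to keep training while a competition is still running.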

src/chaps/learning.tex

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-According to \citet{russell2016artificial}, learning agents are those that
+According to \cite{russell2016artificial}, learning agents are those that
 \emph{improve their performance on future tasks after making observations about
 the world} \cite[p.693]{russell2016artificial}. Learning behavior is present in
 many species, most notably humans. To create a learning algorithm means that the

src/chaps/neuralnetworks.tex

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@
 \begin{figure}[]
 \centering
 \includegraphics[width=0.8\linewidth]{img/perceptron.png}
-\caption{Model of the perceptron, taken from \citet{russell2016artificial}.}
+\caption{Model of the perceptron, taken from \cite[]{russell2016artificial}.}
 \label{fig:perceptron}
 \end{figure}

@@ -34,7 +34,7 @@
 \begin{figure}[]
 \centering
 \includegraphics[width=0.3\linewidth]{img/multilayer_nn.png}
-\caption{Multi-layer neural network from \citet{bengio2009learning} }
+\caption{Multi-layer neural network from \cite[]{bengio2009learning}}
 \label{fig:multilayernn}
 \end{figure}


src/chaps/results.tex

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+\section{Demand Estimator}%
+\label{sec:demand_estimator}
+
+\section{Wholesale Market}%
+\label{sec:wholesale_market}
+
+

src/chaps/supervisedlearning.tex

Lines changed: 3 additions & 2 deletions
@@ -2,7 +2,7 @@
 future examples that might be of the same kind but not identical. Common
 examples of this form of learning include object recognition in images or
 time-series prediction. One of the best-known examples to date is the ImageNet
-classification algorithm by \citet{krizhevsky2012imagenet} which was one of
+classification algorithm by \cite[]{krizhevsky2012imagenet}, which was one of
 the first \ac {NN} based algorithms to break the classification high score on a
 popular image classification database. The goal is to correctly classify images
 according to a set of defined labels. If a picture of a dog is read by the \ac
@@ -19,7 +19,8 @@
 The general problem of supervised learning is as follows:

 \begin{enumerate}
-\item Generation of a \emph{training set} that holds a set of input-output pairs $(x_1,y_1),(x_2,y_2),...$
+\item Generation of a \emph{training set} that holds a set of input-output pairs \\
+$(x_1,y_1),(x_2,y_2),\ldots$
 \item Training of the algorithm against the training set
 \item Verification of results against a previously unseen \emph{test set}
 \end{enumerate}
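The three steps in the enumerate above can be shown end to end on a toy problem. This is a deliberately minimal sketch, not one of the thesis's models: the data is exactly linear and the "algorithm" is an ordinary least-squares line fit, so the verification step succeeds with zero error.

```python
# Toy instance of the supervised-learning workflow: build a training set of
# input-output pairs, fit a model, verify on previously unseen test data.
def fit_line(pairs):
    """Least-squares fit of y = a*x + b on a list of (x, y) pairs."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# 1. Generate a training set of pairs (x_i, y_i), here from y = 2x + 1.
train = [(x, 2 * x + 1) for x in range(10)]
# 2. Train the algorithm against the training set.
a, b = fit_line(train)
# 3. Verify the result against previously unseen test inputs.
test = [(20, 41), (30, 61)]
errors = [abs((a * x + b) - y) for x, y in test]
assert max(errors) < 1e-9
```

Real tasks differ only in scale: the training pairs are noisy, the model family is richer (e.g., a neural network), and the test-set error is therefore nonzero and used to detect overfitting.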

src/img/UsageEstimator.png

84.9 KB