Commit 2afa67a

added implementation of demand estimator
1 parent e9ecb7e commit 2afa67a

File tree

11 files changed, +82 -13 lines changed


.gitignore

Lines changed: 1 addition & 0 deletions

@@ -8,6 +8,7 @@ tex/MITVersion/
 .DS_STORE
 node_modules/
 *.swp
+*.swo

 #latex ignores
 *.glsdefs

src/chaps/backpropagation.tex

Lines changed: 2 additions & 2 deletions

@@ -29,8 +29,8 @@
 concept of gradient descent algorithms. Because the activation function is most often \emph{soft}, to ensure
 differentiability and because a hard threshold creates a non-continuous function, the process of fitting the weights to
 minimize loss is called logistic regression \cite[p.729f.]{russell2016artificial}. For a detailed explanation of the gradient
-descent approach, I will refer to the works of \citeauthor{russell2016artificial} as well as
-\citeauthor{Goodfellow-et-al-2016}.
+descent approach, I will refer to the works of \citet{russell2016artificial} as well as
+\citet{Goodfellow-et-al-2016}.

src/chaps/body.tex

Lines changed: 4 additions & 0 deletions

@@ -20,6 +20,10 @@ \subsection{Learning Neural Networks and Backpropagation}
 \label{sec:Backpropagation}
 \input{chaps/backpropagation.tex}

+\section{Recurrent Neural Networks}%
+\label{sec:recurrent_neural_networks}
+\input{chaps/recurrentnn.tex}
+
 %TODO is this part of AI?
 \chapter{Reinforcement Learning}
 \section{Policy Search}

src/chaps/implementation.tex

Lines changed: 33 additions & 6 deletions

@@ -35,11 +35,17 @@ \section{Preprocessing}
 After the translation, the data is usually structured in a multi-dimensional array which can be read by numpy and
 processed with Keras. First, some preprocessing can be applied with scikit-learn to analyze the structure of the data as
 well as ensure the values that are fed to the \ac{NN} don't negatively impact the learning progress. The overall
-approach follows the recommendations of \citeauthor{Goodfellow-et-al-2016}.
+approach follows the recommendations of \citet{Goodfellow-et-al-2016}.
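One common preprocessing step of this kind is feature standardization, sketched here in plain Python as a minimal illustration; scikit-learn's `StandardScaler` performs the equivalent column-wise, and the example values are invented:

```python
def standardize(column):
    """Scale a feature column to zero mean and unit variance, so that
    no single feature dominates the NN's gradient updates."""
    mean = sum(column) / len(column)
    variance = sum((v - mean) ** 2 for v in column) / len(column)
    std = variance ** 0.5 or 1.0  # guard against constant columns
    return [(v - mean) / std for v in column]

# Illustrative raw feature values, e.g. a customer's hourly usage in kWh.
scaled = standardize([10.0, 20.0, 30.0])
```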

 \section{Connecting Python agents to PowerTAC}

-To connect an agent based on Python to the \ac{PowerTAC} systems, a new adapter needs to be developed. In 2018, a simple bridge was provided by the team that allowed external processes to communicate with the system through a bridge via the provided sample-broker. All messages received by the broker are written to a First in First Out pipe on the local file system and a second pipe is created to read messages from the external process. To also allow network based access, I created an alternative which is based on \ac{GRPC} to transmit the messages between the adapter and the final client. This lets many different languages communicate with the adapter via network connections \footnote{https://github.com/powertac/broker-adapter}
+To connect an agent based on Python to the \ac{PowerTAC} systems, a new adapter needs to be developed. In 2018, the
+team provided a simple bridge that allows external processes to communicate with the system via the
+provided sample-broker. All messages received by the broker are written to a First-In-First-Out pipe on the local file
+system, and a second pipe is created to read messages from the external process. To also allow network-based access, I
+created an alternative based on \ac{GRPC} to transmit the messages between the adapter and the final client.
+This lets many different languages communicate with the adapter via network
+connections\footnote{\url{https://github.com/powertac/broker-adapter-grpc}}.
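The FIFO-pipe side of such a bridge can be sketched with Python's standard library. This is a minimal illustration only: the pipe name and the XML payloads are hypothetical stand-ins, not the actual conventions of the sample-broker bridge.

```python
import os
import tempfile
import threading

def write_messages(path, messages):
    """Simulates the broker side: opening the FIFO for writing blocks
    until the external process opens it for reading."""
    with open(path, "w") as fifo:
        for msg in messages:
            fifo.write(msg + "\n")  # one serialized message per line

tmp = tempfile.mkdtemp()
out_pipe = os.path.join(tmp, "server-out")
os.mkfifo(out_pipe)  # create the named pipe on the local file system

# Broker writes in a background thread; the "external process" reads below.
writer = threading.Thread(
    target=write_messages,
    args=(out_pipe, ["<tariff-tx/>", "<timeslot-update/>"]),
)
writer.start()

with open(out_pipe) as fifo:
    received = [line.strip() for line in fifo]
writer.join()
```

A second pipe in the opposite direction would carry the agent's outgoing messages; the \ac{GRPC} variant replaces both pipes with a bidirectional network stream.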

 Because the programming language is different from the supplied sample-broker, many of the domain objects need to be redefined and some code redeveloped. The classes in \ac{PowerTAC} which are transferred between the client and the server are all annotated so that the \ac{XML} serializer can translate between the \ac{XML} and object variants without errors. This helps to recreate similar functionality for the needed classes in the Python environment. If the project were started again today, it might have been simpler to first define a set of message types in a language such as Protocol Buffers, the underlying technology of \ac{GRPC}, but because all current systems rely on \ac{JMS} communication, it is better to manually recreate these translators. The \ac{XML} parsing libraries provided by Python can be used to parse the \ac{XML} that is received.
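As a minimal illustration of such parsing with the standard library, consider the following sketch; the element and attribute names are invented stand-ins, not actual PowerTAC message fields:

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a serialized PowerTAC message.
raw = '<tariff-spec id="42" powerType="CONSUMPTION"><rate minValue="-0.12"/></tariff-spec>'

root = ET.fromstring(raw)
spec_id = int(root.get("id"))                    # attributes map to object fields
power_type = root.get("powerType")
rate = float(root.find("rate").get("minValue"))  # nested elements map to child objects
```

In the real adapter, each message type would get its own translator that mirrors the annotations on the corresponding Java domain class.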
 \section{Parallelizing environments with Kubernetes}
@@ -56,15 +62,15 @@ \section{Agent Models}

 -- high-level agent \ac{RL} problem

-While \citeauthor{tactexurieli2016mdp} have defined the entire simulation as a \ac{POMDP} with all three markets
+While \citet{tactexurieli2016mdp} have defined the entire simulation as a \ac{POMDP} (although they interpret it as an \ac{MDP} for ease of implementation) with all three markets
 integrated into one problem, I believe breaking the problem into disjoint subproblems is a better approach, as each of
 them can be looked at in isolation and a learning algorithm can be applied to improve performance without needing to
 consider other areas of decision making. One such example is the estimation of fitness for a given tariff in
 a given environment. A tariff's competitiveness in a given environment is independent of the wholesale or balancing
 trading strategy of the agent, since the customers do not care about the profitability of the agent or how often it
-receives balancing penalties.
+receives balancing penalties. While the broker might incur large losses if a tariff is too competitive (by offering prices that are below the profitability line of the broker), such a tariff would theoretically be quite competitive and should therefore be rated as such. The question of which tariffs to actually offer on the market is a separate problem.

-\subsection{Customer Market}
+\subsection{Tariff Market}

 The goal of the customer market is to get as many subscribers as possible for the most profitable tariffs the broker
 offers on the market. The tariffs offered in the market compete for the limited number of customers available and every
@@ -91,10 +97,11 @@ \subsection{Customer Market}
 %tariffs %TODO really, I go genetic?
 \end{enumerate}

+%TODO not yet actually realized, still applicable?
 \subsubsection{Tariff fitness learning}
 To learn the fitness of a tariff while considering its environment, supervised learning techniques can be applied. To do
 this, features need to be created from the tariff's specifications and its competitive environment. Similar work has been
-done by \citeauthor{cuevas2015distributed} who discretized the tariff market in four variables describing the
+done by \citet{cuevas2015distributed}, who discretized the tariff market into four variables describing the
 relationships between the competitors and their broker.

 For my broker, because \ac{NN} can handle large state spaces, I create a more detailed description of the
@@ -124,6 +131,26 @@ \subsubsection{Tariff fitness learning}
 % how to avoid overwhelming of agent? output layer must be fairly large.
 %
 % time, energy, money, communication dimensions (and subdimensions)
+\subsubsection{Customer demand estimation}%
+\label{ssub:customer_demand_estimation}
+
+The simplest learning component is the demand estimator. This component has no dependencies on the other learning components and can easily be trained using historical data. This is because the demand of a customer depends only on variables that are already provided in the state files of previous simulations: a customer will not use a different amount of energy if the broker implementation changes but all other variables (such as the subscribed tariff, the weather etc.) remain equal.
+
+To train a model that predicts the demand amounts of customers under various conditions, a dataset of features and labels needs to be created. Because the model may also learn during the course of a running competition (allowing it to adapt to new customer patterns), a generator-based structure should be preferred. This means that a generator exists that creates $x, y$ pairs for the model to train on.
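Such a generator might be sketched as follows; the field names are illustrative assumptions, since the real pipeline derives its features from the state files:

```python
def training_pairs(records):
    """Yield (x, y) pairs for the model; works for both historical
    state-file records and live meter readings during a competition."""
    for rec in records:
        # Features: everything known before the demand is realized.
        x = [rec["tariff_rate"], rec["temperature"], rec["hour_of_day"]]
        # Label: the realized demand of the customer.
        y = rec["demand"]
        yield x, y

# Hypothetical records; the real ones come from parsed state files.
records = [
    {"tariff_rate": 0.15, "temperature": 18.0, "hour_of_day": 9, "demand": 3.2},
    {"tariff_rate": 0.12, "temperature": 21.5, "hour_of_day": 14, "demand": 1.1},
]
pairs = list(training_pairs(records))
```

A Keras model can consume such a generator directly, which is what makes the same code path usable for both offline and online training.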
+
+According to the simulation specification, the customer models generate their demand pattern based on their internal structure, broker factors and game factors \cite{ketter2018powertac}. The preprocessing pipeline therefore generates feature-label pairs that include customer, tariff, weather, time and demand information. The realized demand is the label, while all other components are part of the features used to train the model. The intuitive model class for demand pattern prediction is the \ac{RNN}, due to the sequential nature of the problem \cite{EvalGRU2014}. However, as will later be shown, a relatively shallow classic dense \ac{NN} also yields decent results.
+
+\begin{figure}[h]
+	\centering
+	\includegraphics[width=0.8\linewidth]{img/UsageEstimator.png}
+	\caption{Demand Estimator structure}
+	\label{fig:DemandEstimator}
+\end{figure}
+
+The overall structure of the demand estimator component is shown in Figure~\ref{fig:DemandEstimator}. The model can be trained both offline, based on the state files, and online during the competition. This is possible because in both situations the environment model of the agent is a continuous representation of the agent's knowledge about the world. In fact, during state file parsing, the environment may even hold information that the agent usually cannot observe in a competition environment. This is also the case for demand learning, as the state files hold the demand realizations of all customers, while during a competition the server only transmits the usage realizations of the customers that are subscribed to the agent's tariffs. Regardless, this does not affect the ability to learn from the customers' usage patterns in either setting. During a competition, the agent may learn from the realized usage of customers after each time slot is completed. Because this process may require some resources, it is advantageous to first predict the subscribed customers' demands for the current time slot and pass this information to the wholesale component before training the model on the received meter readings
+\footnote{The component code can be found under \url{https://github.com/pascalwhoop/broker-python/tree/master/agent_components/demand}}.
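The per-time-slot ordering described above (predict first, then train on the arriving meter readings) can be sketched with a trivial stand-in model. The class and its running-mean "learning" are purely illustrative, not the actual estimator:

```python
class DemandEstimator:
    """Illustrative stand-in for the learned model: a running mean
    of past demand, standing in for a trained NN."""
    def __init__(self):
        self.avg = 0.0
        self.n = 0

    def predict(self, features):
        return self.avg  # trivial baseline prediction

    def train(self, features, realized_demand):
        self.n += 1
        self.avg += (realized_demand - self.avg) / self.n

est = DemandEstimator()
predictions = []
for slot, realized in enumerate([3.0, 5.0, 4.0]):
    # 1) predict first, so the wholesale component gets its input in time
    predictions.append(est.predict(None))
    # 2) then train on the meter readings received for the completed slot
    est.train(None, realized)
```

The same two-step loop applies unchanged when the stand-in is replaced by the Keras model trained via the generator pipeline.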

 \subsection{Wholesale Market}
 \subsection{Balancing Market}

src/chaps/learning.tex

Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-According to \citeauthor{russell2016artificial}, learning agents are those that
+According to \citet{russell2016artificial}, learning agents are those that
 \emph{improve their performance on future tasks after making observations about
 the world} \cite[p.693]{russell2016artificial}. Learning behavior is present in
 many species, most notably humans. To create a learning algorithm means that the

src/chaps/neuralnetworks.tex

Lines changed: 2 additions & 2 deletions

@@ -10,7 +10,7 @@
 \begin{figure}[]
 	\centering
 	\includegraphics[width=0.8\linewidth]{img/perceptron.png}
-	\caption{Model of the perceptron, taken from \citeauthor{russell2016artificial}.}
+	\caption{Model of the perceptron, taken from \citet{russell2016artificial}.}
 	\label{fig:perceptron}
 \end{figure}

@@ -34,7 +34,7 @@
 \begin{figure}[]
 	\centering
 	\includegraphics[width=0.3\linewidth]{img/multilayer_nn.png}
-	\caption{Multi-layer neural network from \citeauthor{bengio2009learning}}
+	\caption{Multi-layer neural network from \citet{bengio2009learning}}
 	\label{fig:multilayernn}
 \end{figure}

src/chaps/recurrentnn.tex

Lines changed: 32 additions & 0 deletions

@@ -0,0 +1,32 @@
+As was already noted in the previous chapter, \ac{NN} can be both acyclic and cyclic graphs. The
+\emph{vanilla} \ac{NN} is usually considered to be an acyclic feed-forward network, as it has no internal state and is
+therefore more suited to describe the concepts of how the networks operate. Especially in translation and text-to-speech
+applications though, \ac{RNN} are very popular, as they are able to act on previously seen information in a sequence of
+data. Generally, they are suitable for many applications where the data has some kind of time-dependent embedding
+\cite[p.373]{Goodfellow-et-al-2016}.
+
+A \ac{RNN} therefore computes its output based on its weights $w_i$, commonly noted as $\theta$, its current input
+$x^{(t)}$ and the previous internal state of its hidden units, $h^{(t-1)}$:
+
+\[
+	h^{(t)} = f(h^{(t-1)}, x^{(t)}, \theta)
+\]
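The recurrence above can be made concrete with a scalar-valued sketch: a single hidden unit with a $\tanh$ activation, where the weights are arbitrary illustrative values rather than learned ones:

```python
import math

def rnn_step(h_prev, x_t, w_h, w_x, b):
    # One step of h^(t) = f(h^(t-1), x^(t), theta), with f a tanh unit
    # and theta = (w_h, w_x, b).
    return math.tanh(w_h * h_prev + w_x * x_t + b)

h = 0.0                        # initial hidden state h^(0)
for x in [1.0, -0.5, 0.25]:    # the same weights are reused at every time step
    h = rnn_step(h, x, w_h=0.8, w_x=0.5, b=0.0)
```

Note that the loop applies the same transition function $f$ with the same parameters at every step, which is exactly the weight-sharing property discussed below.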
+
+The network generally learns to use $h^{(t)}$ to encode previously seen aspects relevant to the current task, although this
+encoding is inherently lossy, as the number of previous inputs (i.e. $t-1$) is arbitrary. Figure~\ref{fig:rnn_concept}
+shows this concept.
+
+\begin{figure}[]
+	\centering
+	\includegraphics[width=0.8\linewidth]{img/rnn_concept.png}
+	\caption{A recurrent neural network conceptualized. \emph{Left}: Circuit diagram where the black square represents a
+	one-time-step delay. \emph{Right}: The same network unfolded, where each node represents a particular time instance.
+	Taken from \citet{Goodfellow-et-al-2016}.}
+	\label{fig:rnn_concept}
+\end{figure}
+
+The network structure has two benefits: Firstly, it allows for arbitrary sequence lengths, as the network size is
+dependent on the time-step-specific input and not on the number of previous time steps. Secondly, the same network with
+the same weights (or, in mathematical terms, the same transition function $f$) can be used during each time step. This
+means: when a \ac{RNN} is fed a sequence of data, the weights stay the same throughout the sequence. They can be
+updated after the entire sequence has been processed.

src/chaps/supervisedlearning.tex

Lines changed: 1 addition & 1 deletion

@@ -2,7 +2,7 @@
 future examples that might be of the same kind but not identical. Common
 examples of this form of learning include object recognition in images or
 time-series prediction. One of the best-known examples to date is the ImageNet
-classification algorithm by \citeauthor{krizhevsky2012imagenet} which was one of
+classification algorithm by \citet{krizhevsky2012imagenet}, which was one of
 the first \ac{NN}-based algorithms to break a classification high-score on a
 popular image classification database. The goal is to correctly classify images
 according to a set of defined labels. If a picture of a dog is read by the \ac

src/head.tex

Lines changed: 6 additions & 1 deletion

@@ -3,7 +3,7 @@
 \usepackage[nolist,nohyperlinks]{acronym}
 \usepackage{listings}
 \input{snippets/tikz.tex}
-\usepackage[numbers]{natbib}
+\usepackage[]{natbib}
 \usepackage{float}
 \usepackage{glossaries}
 \usepackage[hyphens]{url}

@@ -16,3 +16,8 @@

 \input{acronyms.tex}

+%adapting the article class to Ketter requirements
+%\usepackage{showframe}
+\usepackage[left=5cm, top=2cm, bottom=2cm, right=2cm]{geometry}
+\usepackage{setspace}
+\onehalfspacing

src/img/Agent.png

136 KB