Commit 976c870

Pascal Brokmeier authored and committed
good night
1 parent 4d9c72f commit 976c870

1 file changed

+65
-58
lines changed

src/chaps/implementation.tex

Lines changed: 65 additions & 58 deletions
Original file line numberDiff line numberDiff line change
@@ -307,74 +307,81 @@ \section{Learning Components}
307307
such a tariff would theoretically be quite competitive and should therefore be rated as such. The question which of the
308308
tariffs to actually offer on the market is a separate problem that balances competitiveness against profitability.
309309

310-
\subsection{Tariff Market}
311-
312-
The goal of the customer market is to get as many subscribers as possible for the most profitable tariffs the broker
313-
offers on the market. The tariffs offered in the market compete for the limited number of customers available and every
314-
customer must be subscribed to some tariff. The profitability of tariffs is limited by the base tariff which is offered
315-
by the simulation as a constant offering creating an upper bound on profitability.
316-
317-
To succeed in the customer market, the agents needs to be able to generate tariffs that are competitive. This can be
318-
broken down into two subtasks: Generating valid tariffs and evaluating their competitiveness. A tariff can be verified
319-
by passing it to the \ac {PowerTAC} server which verifies the tariff. Hence, a \ac {RL} algorithm that is tasked with
320-
creating competitive tariffs can be given feedback by penalizing non-conclusive tariffs. An invalid tariff could be one
321-
that contains overlapping rates leading to an ambivalent status. The competitiveness of a tariff depends not only on the
322-
attributes of the tariff but also on the competition environment. If the broker only competes against the default
323-
tariffs, even many mediocre tariff offerings would perform well. In an environment with many competitors on the other
324-
hand, a tariff needs to be well designed to generate profits.
325-
326-
The agents learning task for the customer market is therefore designed in the following way:
327-
328-
\begin{enumerate} \item Learning to evaluate a tariffs competitiveness in relation to the competitive environment
329-
through supervised learning on the historical state logs of previous competitions \item Running a \ac {RL}
330-
algorithm which learns to choose parameters for tariffs that are valid and profitable in a given environment
331-
%\item Learning to generate valid tariff specifications through a genetic algorithm strategy, penalizing invalid
332-
%tariffs %TODO really, I go genetic?
333-
\end{enumerate}
334-
335-
%TODO not yet actually realized, still applicable?
336-
\subsubsection{Tariff fitness learning} To learn the fitness of a tariff while considering its environment, supervised
337-
learning techniques can be applied. To do this, features need to be created from the tariffs specifications and its
338-
competitive environment. Similar work has been done by \citep{cuevas2015distributed} who discretized the tariff market
339-
in four variables describing the relationships between the competitors and their broker.
340-
341-
For my broker, because \ac {NN} can handle a large state spaces, I create a more detailed description of the
342-
environment. I still have to ensure the number of input features is fixed though, so a simple copy of all competing
343-
tariffs is not a valid input for the environment description. Instead I create the following features from the tariff
344-
market:
345-
346-
\begin{description} \item[Average Charge per hour of week Timeslot]: According to \\
347-
\texttt{TariffEvaluationHelper.java}, customer models evaluate tariffs on an per-hour basis. This means they are
348-
very precise in the evaluation of potential tariff alternatives (before the application of an irrationality
349-
factor). Hence, a per-hour precision in the input is needed. \item[Variance of Charge per hour of week
350-
Timeslot] Variance of the tariffs charges per each timeslot in a week among all competitors. \item[Average and
351-
Variance of periodic payments] Description of the markets periodic payments landscape \item[Average and Variance
352-
of one-time payments] Description of the markets one-time payments landscape \item[Average and Variance of
353-
Up/Down regulation payments] 0 for tariffs without regulation capabilities \end{description}
354-
355-
Because the \ac {PowerTAC} simulation does not return profits of brokers on a per-tariff basis and because the reasons
356-
for why a broker purchased a specific amount of energy on the wholesale market are not known, it is hard to put a
357-
profitability value on a brokers tariff if said broker offers more than one tariff on the market. Therefore the
358-
evaluation of the tariff does not include the profitability of the tariff but merely the competitiveness in regards to
359-
the attractiveness of the offer from the perspective of the customers
310+
\subsection{Customer Market}
311+
\label{sub:customer_market}
312+
313+
%TODO background?
314+
%TODO not implemented
315+
%The goal of the customer market is to get as many subscribers as possible for the most profitable tariffs the broker
316+
%offers on the market. The tariffs offered in the market compete for the limited number of customers available and every
317+
%customer must be subscribed to some tariff. The profitability of tariffs is limited by the base tariff which is offered
318+
%by the simulation as a constant offering creating an upper bound on profitability.
319+
%
320+
%To succeed in the customer market, the agents needs to be able to generate tariffs that are competitive. This can be
321+
%broken down into two subtasks: Generating valid tariffs and evaluating their competitiveness. A tariff can be verified
322+
%by passing it to the \ac {PowerTAC} server which verifies the tariff. Hence, a \ac {RL} algorithm that is tasked with
323+
%creating competitive tariffs can be given feedback by penalizing non-conclusive tariffs. An invalid tariff could be one
324+
%that contains overlapping rates leading to an ambivalent status. The competitiveness of a tariff depends not only on the
325+
%attributes of the tariff but also on the competition environment. If the broker only competes against the default
326+
%tariffs, even many mediocre tariff offerings would perform well. In an environment with many competitors on the other
327+
%hand, a tariff needs to be well designed to generate profits.
328+
%
329+
%The agents learning task for the customer market is therefore designed in the following way:
330+
%
331+
%\begin{enumerate} \item Learning to evaluate a tariffs competitiveness in relation to the competitive environment
332+
% through supervised learning on the historical state logs of previous competitions \item Running a \ac {RL}
333+
% algorithm which learns to choose parameters for tariffs that are valid and profitable in a given environment
334+
% %\item Learning to generate valid tariff specifications through a genetic algorithm strategy, penalizing invalid
335+
% %tariffs %TODO really, I go genetic?
336+
%\end{enumerate}
337+
%
338+
%%TODO not yet actually realized, still applicable?
339+
%\subsubsection{Tariff fitness learning} To learn the fitness of a tariff while considering its environment, supervised
340+
%learning techniques can be applied. To do this, features need to be created from the tariffs specifications and its
341+
%competitive environment. Similar work has been done by \citep{cuevas2015distributed} who discretized the tariff market
342+
%in four variables describing the relationships between the competitors and their broker.
343+
%
344+
%For my broker, because \ac {NN} can handle a large state spaces, I create a more detailed description of the
345+
%environment. I still have to ensure the number of input features is fixed though, so a simple copy of all competing
346+
%tariffs is not a valid input for the environment description. Instead I create the following features from the tariff
347+
%market:
348+
%
349+
%\begin{description} \item[Average Charge per hour of week Timeslot]: According to \\
350+
% \texttt{TariffEvaluationHelper.java}, customer models evaluate tariffs on an per-hour basis. This means they are
351+
% very precise in the evaluation of potential tariff alternatives (before the application of an irrationality
352+
% factor). Hence, a per-hour precision in the input is needed. \item[Variance of Charge per hour of week
353+
% Timeslot] Variance of the tariffs charges per each timeslot in a week among all competitors. \item[Average and
354+
% Variance of periodic payments] Description of the markets periodic payments landscape \item[Average and Variance
355+
% of one-time payments] Description of the markets one-time payments landscape \item[Average and Variance of
356+
% Up/Down regulation payments] 0 for tariffs without regulation capabilities \end{description}
357+
%
358+
%Because the \ac {PowerTAC} simulation does not return profits of brokers on a per-tariff basis and because the reasons
359+
%for why a broker purchased a specific amount of energy on the wholesale market are not known, it is hard to put a
360+
%profitability value on a brokers tariff if said broker offers more than one tariff on the market. Therefore the
361+
%evaluation of the tariff does not include the profitability of the tariff but merely the competitiveness in regards to
362+
%the attractiveness of the offer from the perspective of the customers
360363
% large space of decision variables / dimensions
361364
%
362365
% how to avoid overwhelming of agent? output layer must be fairly large.
363366
%
364367
% time, energy, money, communication dimensions (and subdimensions)
365368
\subsubsection{Customer demand estimation}% \label{ssub:customer_demand_estimation}
366369

367-
The simplest learning component is the demand estimator. This component has no dependencies onto the other learning
368-
components and can easily be trained using historical data. This is due to the fact that the demand of a customer is
369-
only dependent on variables that are already provided in the state files of previous simulations. A customer will not
370-
use a different amount of energy if the broker implementation changes but all other variables (such as subscribed
371-
tariff, weather etc.) remain equal .
370+
This component does not depend on the other learning components and can easily be trained using historical data.
371+
It is therefore framed as a supervised learning task that maps the information known at timestep $t-n$ to a prediction of the
372+
expected energy usage at timestep $t$. Known information includes weather forecasts, historical usage, time, tariff
373+
and customer metadata.
374+
If the customer models change across games (e.g. if a customer suddenly uses ten times the energy on rainy days), the
375+
model has to adapt to this change. This can be achieved by first training the model on historical
376+
data (i.e. from the state files) and then letting it continue to learn online during the competition, based on the new
377+
customer models.
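
One way to state this learning task formally, using notation that is assumed here for illustration and is not taken from
the simulation specification: let $x_{t-n}$ be the vector of information known at timestep $t-n$, $y_t$ the energy usage
realized at timestep $t$, and $f_\theta$ the estimator with parameters $\theta$. With observations up to timestep $T$,
the estimator could for example be fitted by minimizing the mean squared error over all available pairs:
\[
    \hat{y}_t = f_\theta(x_{t-n}), \qquad
    \min_\theta \; \frac{1}{T-n} \sum_{t=n+1}^{T} \bigl( y_t - f_\theta(x_{t-n}) \bigr)^2 .
\]
Under this reading, offline training draws the pairs from the state files of past games, while online training keeps
updating $\theta$ on the same objective as new pairs arrive during a running competition. The squared error is only one
possible choice of loss.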
372378

373379
To train a model that predicts the demand amounts of customers under various conditions, a dataset of features and
374-
labels needs to be created. Because the model may also learn during the course of a running competition (allowing the
375-
model to adapt to new customer patterns), a generator based structure should be preferred. This means that a generator
376-
exists that creates $x, y$ pairs for the model to train on.
380+
labels needs to be created. Because the model may also learn during the course of a running competition, a generator-based structure should be preferred. This means that a generator
381+
creates $x, y$ pairs for the model to train on, instead of a large batch of training data being created ahead of
382+
the learning process.
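
The generator idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions,
not the broker's actual preprocessing pipeline: the function name \texttt{demand\_pairs}, the parameter \texttt{lag}
(playing the role of $n$) and the flat list-of-floats feature layout are all hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple

Features = List[float]


def demand_pairs(
    features: Sequence[Features],
    usage: Sequence[float],
    lag: int,
) -> Iterator[Tuple[Features, float]]:
    """Lazily yield (x, y) training pairs for a single customer.

    x is the information known at timestep t - lag: the feature vector of
    that timestep plus the usage observed at that timestep.
    y is the usage realized at timestep t.
    (Hypothetical layout, chosen only for illustration.)
    """
    if lag < 1:
        raise ValueError("lag must be at least 1")
    if len(features) != len(usage):
        raise ValueError("features and usage must have equal length")
    for t in range(lag, len(usage)):
        x = list(features[t - lag]) + [usage[t - lag]]
        yield x, usage[t]


if __name__ == "__main__":
    # Toy data: one feature per timestep (e.g. a forecast temperature).
    feats = [[10.0], [11.0], [12.0], [13.0]]
    used = [1.0, 1.5, 2.0, 2.5]
    for x, y in demand_pairs(feats, used, lag=2):
        print(x, y)
```

Because pairs are produced one at a time, the same function can be applied to logs of past games and to the
observations that accumulate during a running competition, without building the full dataset in memory first.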
377383

384+
%TODO STOP
378385
According to the simulation specification, the customer models generate their demand pattern based on their internal
379386
structure, broker factors and game factors \citep[]{ketter2018powertac}. The preprocessing pipeline therefore generates
380387
feature-label pairs that include: Customer, tariff, weather, time and demand information. The realized demand is the
