Commit dd3f140

Manage Strings
1 parent 00435de commit dd3f140

12 files changed

+385
-159
lines changed

docs/documentation.tex

Lines changed: 65 additions & 11 deletions
@@ -50,7 +50,8 @@ \chapter{Introduction}
5050

5151
\section{The IMP Language}
5252

53-
IMP is a typical imperative programming language. It is, therefore, composed of the classic constructs that all similar languages have:
53+
IMP is a typical imperative programming language. It is, therefore, composed of
54+
the classic constructs that all similar languages have:
5455
\begin{description}
5556
\item[Skip] An instruction that has no side effects. Its only purpose is
5657
to jump to the next instruction, without any modification to the
@@ -132,6 +133,7 @@ \subsubsection{Arithmetical Expressions}\label{sec:grammar-aexp}
132133

133134
<afactor> ::= "(" <aexp> ")"
134135
\alt `|' <array> `|'
136+
\alt `|' <cexp> `|'
135137
\alt <identifier> "[" <aexp> "]"
136138
\alt <integer> | <identifier>
137139
\end{grammar}
@@ -175,8 +177,28 @@ \subsubsection{Boolean Expressions}\label{sec:grammar-bexp}
175177
\alt <bcomparison>
176178

177179
<bcomparison> ::= <aexp> "=" <aexp> | <aexp> "<=" <aexp> | <aexp> "<" <aexp>
180+
\alt <cexp> "=" <cexp>
178181
\end{grammar}
179182
183+
\subsubsection{Character Expressions}\label{sec:grammar-cexp}
184+
185+
Strings are another addition to the IMP language. They are treated similarly
186+
to an array of characters, but they are immutable.
187+
188+
For simplicity, in this grammar the sequence of characters ``\texttt{[\^{}"]}''
189+
means ``all characters except the double quotes''. This is used only for
190+
simplicity: there are no restrictions on the characters a string can hold (with
191+
the only exception being the double quotes).
192+
193+
\begin{grammar}
194+
<cexp> ::= <cterm> "++" <cexp> | <cterm>
195+
196+
<cterm> ::= "\"" <string> "\"" | <identifier>
197+
198+
<string> ::= "[^\"]" <string> | "[^\"]"
199+
\end{grammar}
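
As an illustration of the grammar above, the following sketch derives the
expression \texttt{"ab" ++ x}, assuming \texttt{x} is a valid identifier (a
name chosen only for this example):
\[
\begin{array}{rcl}
\langle\mathit{cexp}\rangle & \Rightarrow & \langle\mathit{cterm}\rangle \; \texttt{++} \; \langle\mathit{cexp}\rangle \\
 & \Rightarrow & \texttt{"} \, \langle\mathit{string}\rangle \, \texttt{"} \; \texttt{++} \; \langle\mathit{cexp}\rangle \\
 & \Rightarrow^{*} & \texttt{"ab"} \; \texttt{++} \; \langle\mathit{cexp}\rangle \\
 & \Rightarrow & \texttt{"ab"} \; \texttt{++} \; \langle\mathit{cterm}\rangle \\
 & \Rightarrow & \texttt{"ab"} \; \texttt{++} \; \langle\mathit{identifier}\rangle \\
 & \Rightarrow^{*} & \texttt{"ab"} \; \texttt{++} \; \texttt{x}
\end{array}
\]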
200+
201+
180202
\subsubsection{Imperative Commands}\label{sec:grammar-com}
181203
182204
A grammar for the commands in $\mathrm{Com}$ is presented here. The recognized
@@ -188,6 +210,7 @@ \subsubsection{Imperative Commands}\label{sec:grammar-com}
188210
<command> ::= <assignment> | <ifThenElse> | <while> | "skip"
189211
190212
<assignment> ::= <identifier> ":=" <aexp>
213+
\alt <identifier> ":=" <cexp>
191214
\alt <identifier> ":=" <array>
192215
\alt <identifier> "[" <aexp> "]" ":=" <aexp>
193216
@@ -230,6 +253,9 @@ \section{Additions to the Syntax Given During Lectures}\label{sec:additions}
230253
\[a<b \iff \left(a\leq b\right) \land \neg\left(a = b\right)\]
231254
This operator has been defined as a shorthand to easily work with
232255
arrays.
256+
\item A new data type is supported: strings. Strings are basically arrays of
257+
characters and some of the operations defined for arrays are available
258+
on strings.
233259
\end{itemize}
234260
235261
\begin{figure}[H]
@@ -264,7 +290,7 @@ \subsection{Arrays}
264290
265291
\begin{itemize}
266292
\item An array $A=\left[a_1,a_2,\ldots,a_n\right]\in\mathbb{Z}^n$, as
267-
already stated, indexed starting from 0. In IMP (as in many other
293+
already stated, is indexed starting from 0. In IMP (as in many other
268294
languages) the notation $A[i]$ is used to get the ($i+1$)-th element of
269295
the array, so $\forall i=0,\ldots,n-1: A[i]=a_{i+1}$.
270296
\item An array $A\in\mathbb{Z}^n$ is said to be of ``size'' $n$. In IMP, to
@@ -279,6 +305,29 @@ \subsection{Arrays}
279305
\[A++B=\left[ a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_m \right]\]
280306
\end{itemize}
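
As a small worked example of the array operations above (the two arrays are
chosen only for illustration), let $A=\left[5,8\right]$ and
$B=\left[1,2,3\right]$. Then:
\[
A[0]=5,\quad A[1]=8,\quad |A|=2,\quad A++B=\left[5,8,1,2,3\right]
\]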
281307
308+
\subsubsection{Strings}
309+
310+
The IMP language has also been extended with a new grammar to manage strings.
311+
Strings, while not treated as such, can be understood as arrays of characters.
312+
The following operations are available on strings.
313+
314+
\begin{itemize}
315+
\item A string $S=\texttt{"}a_1a_2\ldots a_n\texttt{"}$, as
316+
an array, is indexed starting from 0. In IMP (as in many other
317+
languages) the notation $S[i]$ is used to get the ($i+1$)-th character
318+
of the string, so $\forall i=0,\ldots,n-1: S[i]=a_{i+1}$.
319+
\item A string $S=\texttt{"}a_1a_2\ldots a_n\texttt{"}$ is said to be of ``size'' $n$. In IMP, to
320+
get the size of a string, the same array operator
321+
$|\cdot|$ is defined such that $S\mapsto n$.
322+
\item The concatenation operation ``\texttt{++}'' is also defined on
323+
strings. Given two strings $S_1=\texttt{"}a_1a_2\ldots a_n\texttt{"}$
324+
and $S_2=\texttt{"}b_1b_2\ldots b_m\texttt{"}$ the concatenation is the
325+
operation $++$ such that:
326+
\[S_1++S_2=\texttt{"}a_1a_2\ldots a_nb_1b_2\ldots b_m\texttt{"}\]
327+
\end{itemize}
328+
329+
Note that strings are immutable: a character, once in a string, cannot be
330+
changed without rewriting the entire string.
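
As a small worked example of the string operations above (the two strings are
chosen only for illustration), let $S_1=\texttt{"ab"}$ and
$S_2=\texttt{"cd"}$. Then:
\[
S_1[0]=\texttt{a},\quad S_1[1]=\texttt{b},\quad |S_1|=2,\quad
S_1++S_2=\texttt{"abcd"},\quad |S_1++S_2|=4
\]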
282331
283332
\chapter{Design}
284333
@@ -470,11 +519,11 @@ \section{The Environment Module}
470519
\lstinline|Show|, to customize the way a variable is converted to String
471520
(especially when printed as output).
472521
473-
\lstinputlisting[firstline=17, lastline=18]{Environment.hs}
522+
\lstinputlisting[firstline=17, lastline=22]{Environment.hs}
474523
475524
Then, the environment itself is defined as a sequence of variables.
476525
477-
\lstinputlisting[firstline=20, lastline=20]{Environment.hs}
526+
\lstinputlisting[firstline=23, lastline=23]{Environment.hs}
478527
479528
A function to modify an existing environment is then defined. Given an
480529
environment and a variable, it returns a new environment that is:
@@ -485,21 +534,21 @@ \section{The Environment Module}
485534
already in the old environment (the previous value is then discarded
486535
after the update).
487536
\end{itemize}
488-
\lstinputlisting[firstline=22, lastline=26]{Environment.hs}
537+
\lstinputlisting[firstline=25, lastline=29]{Environment.hs}
489538
490539
Furthermore, a function that searches for a variable in an environment is provided.
491540
Given an environment and a string, it searches in the environment for a
492541
variable whose name is the given string and returns its value if the variable
493542
is in the environment.
494543
495-
\lstinputlisting[firstline=28, lastline=32]{Environment.hs}
544+
\lstinputlisting[firstline=31, lastline=35]{Environment.hs}
496545
497546
Finally, specialized versions of the two functions above are provided to work
498547
with arrays. The first one updates (or saves) an array; the second one retrieves
499548
all the values contained in an array in the environment.
500549
501-
\lstinputlisting[firstline=34, lastline=39]{Environment.hs}
502-
\lstinputlisting[firstline=41, lastline=48]{Environment.hs}
550+
\lstinputlisting[firstline=37, lastline=42]{Environment.hs}
551+
\lstinputlisting[firstline=44, lastline=52]{Environment.hs}
503552
504553
\section{The Parser Module}
505554
@@ -548,7 +597,7 @@ \subsection{The Parser Main Module}
548597
the implementation of the parsers simpler, because it allows assuming that no
549598
unnecessary characters are in the input string.
550599
551-
\lstinputlisting[firstline=17, lastline=28]{Parser.hs}
600+
\lstinputlisting[firstline=17, lastline=30]{Parser.hs}
552601
553602
Finally, the exported \lstinline|eval| function is provided. Here the actual
554603
execution takes place: it starts from an empty environment and parses a program
@@ -558,7 +607,7 @@ \subsection{The Parser Main Module}
558607
It should be noted that the function removes all comments and white space from
559608
the given program (in this exact order) using the previously defined functions.
560609
561-
\lstinputlisting[firstline=30]{Parser.hs}
610+
\lstinputlisting[firstline=32]{Parser.hs}
562611
563612
\subsection{The Parser Core Module}
564613
\lstinputlisting[firstline=1, lastline=1]{Parser/Core.hs}
@@ -611,9 +660,12 @@ \subsection{The Parser Fundamentals Module}
611660
\item[satisfies] A parser that reads a character and succeeds if the
612661
character satisfies a predicate (given as its first parameter).
613662
\item[symbol] A parser that reads a symbol or a keyword.
663+
\item[notsymbol] A parser that reads everything except a symbol or a
664+
keyword.
614665
\end{description}
615666
616667
\lstinputlisting[firstline=8, lastline=24]{Parser/Fundamentals.hs}
668+
\lstinputlisting[firstline=86, lastline=91]{Parser/Fundamentals.hs}
617669
618670
It then defines the parsers for natural numbers by simply applying the grammar
619671
defined in \autoref{sec:grammar-numbers}.
@@ -669,7 +721,9 @@ \subsection{The Parser Aexp Module}
669721
\lstinputlisting[firstline=3, lastline=7]{Parser/Aexp.hs}
670722
671723
It then provides the parser for the arithmetical expressions by simply following
672-
the grammar given in \autoref{sec:grammar-aexp}.
724+
the grammar given in \autoref{sec:grammar-aexp}. The same parser also
725+
implements the grammars for arrays and strings (\autoref{sec:grammar-cexp}),
726+
since splitting them out would create a circular dependency among modules.
673727
674728
\lstinputlisting[firstline=9]{Parser/Aexp.hs}
675729
