Data handling
This section shows some examples of how to handle datasets.
set verbose off # avoid detailed printouts
clear # clear memory
nulldata 3 # three observations (rows)
# Create a normally-distributed random variable
scalar mean = 4
scalar std_dev = 0.5
series y = normal(mean, std_dev)
series y_sq = y^2
series log_y = log(y)
series exp_y = exp(y)
series x = {1, 2, 3}'
series z = y - x
print y y_sq log_y exp_y x z --byobs
This returns output like the following (the exact values differ from run to run, since no seed is set):
             y         y_sq        log_y        exp_y            x
1     3.035885      9.21660     1.110503      20.8194            1
2     4.053212     16.42853     1.399510      57.5821            2
3     4.993476     24.93480     1.608132     147.4481            3

             z
1     2.035885
2     2.053212
3     1.993476
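Because the draws above are random, rerunning the script gives different numbers. A minimal sketch of making them reproducible with `set seed` follows; the seed value 1234 and the series names `y1` and `y2` are arbitrary choices made for this illustration.

```hansl
set verbose off
nulldata 3

# Fix the state of the random number generator (1234 is an arbitrary value)
set seed 1234
series y1 = normal(4, 0.5)

# Resetting the same seed should reproduce the same draws
set seed 1234
series y2 = normal(4, 0.5)

# y1 and y2 are expected to be identical, so this difference should be zero
scalar max_diff = max(abs(y1 - y2))
print y1 y2 --byobs
printf "Largest difference between y1 and y2: %g\n", max_diff
```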
Let's open a sample dataset shipped with Gretl and create a binary dummy that takes the value 1 if the series YEAR is either 1977 or 1980:
open abdata.gdt --quiet
series DUM = (YEAR == 1977 || YEAR == 1980)
print YEAR DUM -o --range=1:10 # print the first ten entries
The output is:
             YEAR          DUM
1:1          1976            0
1:2          1977            1
1:3          1978            0
1:4          1979            0
1:5          1980            1
1:6          1981            0
1:7          1982            0
1:8          1983            0
1:9          1984            0
2:1          1976            0
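As a quick sanity check on the dummy, one can count how many observations it flags and compare it against a second construction. The sketch below assumes that adding the two comparisons is an acceptable alternative here, which holds only because YEAR cannot equal 1977 and 1980 at the same time. The names `DUM_alt`, `n_flagged` and `n_diff` are introduced for this illustration.

```hansl
set verbose off
open abdata.gdt --quiet

# Dummy built with the logical OR, as in the text
series DUM = (YEAR == 1977 || YEAR == 1980)

# Alternative: the two conditions are mutually exclusive,
# so their sum is also a 0/1 series
series DUM_alt = (YEAR == 1977) + (YEAR == 1980)

# Count flagged observations and check that both versions agree
scalar n_flagged = sum(DUM)
scalar n_diff = sum(abs(DUM - DUM_alt))
printf "Observations flagged: %d\n", n_flagged
printf "Disagreements between the two constructions: %d\n", n_diff
```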
A series object in Gretl can carry metadata such as a descriptive label. One can also set the name that should appear when plotting the series. Here is an example:
nulldata 3
series y = normal()
# Add a series description
setinfo y --description="Some random number"
# Instead of 'y' showing up in a graph, show another description
setinfo y --graph-name="Cool variable"
boxplot y --output=display # See the output
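To check that the description was stored, the `labels` command can be used to print the descriptive label of a series. This is a small sketch that repeats the setup above without the plot; no output is shown because the exact layout may differ between Gretl versions.

```hansl
set verbose off
nulldata 3
series y = normal()

# Attach the same metadata as in the example above
setinfo y --description="Some random number"
setinfo y --graph-name="Cool variable"

# Print the descriptive label attached to y
labels y
```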
Suppose you have a weirdly valued dataset such as:
set verbose off
nulldata 5
series weird_values = {5, 6, 10, 20, NA}'
print weird_values --byobs
Using the replace() function, we want to replace the value 5 by 0, 6 by 1, 10 by 2, 20 by 3, and missing values (NA) by -1:
# Let’s replace values
help replace
matrix find = {5, 6, 10, 20, NA}
matrix replace_by = {0, 1, 2, 3, -1}
# Create new series y with replaced values
series y = replace(weird_values, find, replace_by)
print weird_values y --byobs
The result is:
   weird_values            y
1             5            0
2             6            1
3            10            2
4            20            3
5                         -1
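If only the missing values need a new code and the other values should stay as they are, a conditional assignment is a possible alternative to replace(). The following is a sketch under that assumption; the names `n_miss` and `filled` are introduced for this illustration.

```hansl
set verbose off
nulldata 5
series weird_values = {5, 6, 10, 20, NA}'

# Count the missing observations first
scalar n_miss = sum(missing(weird_values))
printf "Number of missing values: %d\n", n_miss

# Recode NA as -1 and leave every other value unchanged
series filled = missing(weird_values) ? -1 : weird_values
print weird_values filled --byobs
```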
Suppose you have a dataset with integer values ranging from 0 to 20, and you want to replace the numbers 0-5 by 1, 6-10 by 2, and 11-20 by 3. Here is one way to do it:
nulldata 40 # some empty dataset
# Discrete random numbers between 0 and 20
series old = randgen(i, 0, 20)
# print old --byobs
series new = NA # Initialize an empty series
# Replace 0-5 by 1
matrix find = seq(0, 5)
scalar subst = 1
series new = replace(old, find, subst)
# Replace 6-10 by 2
matrix find = seq(6, 10)
scalar subst = 2
series new = replace(new, find, subst)
# Replace 11-20 by 3
matrix find = seq(11, 20)
scalar subst = 3
series new = replace(new, find, subst)
print old new --byobs
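This yields output like the following (only the first ten of the 40 observations are shown, and the values are random):
            old          new
 1           15            3
 2           13            3
 3           10            2
 4           20            3
 5            6            2
 6            5            1
 7            5            1
 8            8            2
 9           10            2
10           14            3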
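For contiguous ranges like these, the same recoding can also be sketched with comparison arithmetic: each comparison evaluates to 0 or 1, so adding them to 1 gives the group number. This is an alternative to the three replace() passes, not the method used above, and the name `new_alt` is introduced for this illustration.

```hansl
set verbose off
nulldata 40
series old = randgen(i, 0, 20)

# Recode with three replace() passes, as in the text
matrix find = seq(0, 5)
series new = replace(old, find, 1)
matrix find = seq(6, 10)
series new = replace(new, find, 2)
matrix find = seq(11, 20)
series new = replace(new, find, 3)

# Alternative: 0-5 gives 1 + 0 + 0, 6-10 gives 1 + 1 + 0, 11-20 gives 1 + 1 + 1
series new_alt = 1 + (old > 5) + (old > 10)

# Both approaches should agree on every observation
scalar n_diff = sum(abs(new - new_alt))
printf "Observations where the two recodings differ: %d\n", n_diff
```

Note that the replace() passes work in this order only because the substitute values 1, 2 and 3 never fall inside a range that is searched in a later pass.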
In Gretl you can also create a string-valued series.