@@ -102,21 +102,54 @@ IEEE floating point format, values in the range of approximately
102
102
The exact values are given by the variables @code{realmin},
103
103
@code{realmax}, and @code{eps}, respectively.
104
104
105
- Matrix objects can be of any size, and can be dynamically reshaped and
106
- resized. It is easy to extract individual rows, columns, or submatrices
107
- using a variety of powerful indexing features. @xref {Index Expressions }.
105
+ Matrix objects can be of any size, and can be dynamically reshaped and resized.
106
+ It is easy to extract individual rows, columns, or submatrices using a variety
107
+ of powerful indexing features. @xref{Index Expressions}.
108
108
109
109
@xref{Numeric Data Types}, for more information.
110
110
111
111
@node Missing Data
112
112
@subsection Missing Data
113
113
@cindex missing data
114
114
115
- It is possible to represent missing data explicitly in Octave using
116
- @code {NA } (short for ``Not Available''). Missing data can only be
117
- represented when data is represented as floating point numbers. In this
118
- case missing data is represented as a special case of the representation
119
- of @code {NaN }.
115
+ It is possible to represent missing data explicitly in Octave using NA (short
116
+ for ``@w{Not} @w{Available}''). This is helpful in distinguishing between a
117
+ property of the data (i.e., some of it was not recorded) and calculations on
118
+ the data which generated an error (i.e., created NaN values). In short, if you
119
+ do not get the result you expect, is it your data or your algorithm?
120
+
121
+ The missing data marker is a special case of the representation of NaN.
122
+ Because of that, it can only be used with data represented by floating point
123
+ numbers; it cannot be stored in integer, logical, or char values.
124
+
125
+ In general, use NA and the test @code{isna} to describe the dataset or to
126
+ reduce the dataset to only valid entries. Numerical calculations with NA will
127
+ generally ``poison'' the results and yield NA as the output. However, this
128
+ cannot be guaranteed on all platforms, and NA may be replaced by NaN.
129
+
130
+ Example 1: Describing the dataset
131
+
132
+ @example
133
+ @group
134
+ data = [1, NA, 3];
135
+ percent_missing = 100 * sum (isna (data(:))) / numel (data);
136
+ printf ('%2.0f%% of the dataset is missing\n', percent_missing);
137
+ @print{} 33% of the dataset is missing
138
+ @end group
139
+ @end example
140
+
141
+ Example 2: Restricting calculations to valid data
142
+
143
+ @example
144
+ @group
145
+ raw_data = [1, NA, 3];
146
+ printf ('mean of raw data is %.1f\n', mean (raw_data));
147
+ @print{} mean of raw data is NA
148
+ valid_data = raw_data (! isna (raw_data));
149
+ printf ('mean of valid data is %.1f\n', mean (valid_data));
150
+ @print{} mean of valid data is 2.0
151
+ @end group
152
+ @end example
120
153
121
154
@DOCSTRING(NA)
122
155