Description
Hi netCDF friends! I have a question about netCDF-C behavior when writing NaNs to an integer variable. I tested this with netCDF 4.8.1 on my Linux machine, and I see similar results on Windows.
In MATLAB and in my netCDF-C repro program (attached), if I create a netCDF integer variable and write a NaN to it, I get inconsistent results when reading the value back. I suspect this is caused by undefined behavior in the C library when converting a floating-point type to an integer type. This Stack Overflow post is similar to what I'm seeing: https://stackoverflow.com/questions/3986795/what-is-the-result-of-casting-float-inf-inf-and-nan-to-integer-in-c.
Note that MATLAB's NaN is the IEEE arithmetic representation of NaN as a double scalar value. Also, MATLAB stores numeric values as double-precision floating-point values by default.
So for example, if I create an int16 variable, write a NaN, then read it back, I get zero. Because we're passing in NaN (a double) to ncwrite, we end up calling nc_put_var_double() and passing it the C value -nan as a double. So when it gets into the netCDF library, and netCDF is trying to assign the NaN to an int16, I think the zero is the result of undefined behavior:
% write NaN to int16 variable
nccreate('ocean1.nc','temperature','Datatype','int16','Format','netcdf4');
ncwrite('ocean1.nc','temperature',NaN);
data = ncread('ocean1.nc','temperature')
data =
int16
0
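To make the suspected cause concrete: in standard C, converting a floating-point value to an integer type is undefined when the truncated value cannot be represented in the target type, and NaN has no integer value at all. The sketch below shows one way a conversion could be guarded. The function name `checked_double_to_short` is hypothetical and is not part of netCDF-C; I am not claiming this is how the library converts values internally.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a netCDF-C function): converts a double to a
 * short only when the result is well defined in C. Returns false, leaving
 * *out untouched, for NaN, infinities, and out-of-range values. */
bool checked_double_to_short(double value, short *out)
{
    assert(out != NULL);

    if (isnan(value)) {
        return false;
    }
    /* The cast truncates toward zero, so any value strictly between
     * SHRT_MIN - 1 and SHRT_MAX + 1 still lands inside the short range. */
    if (value <= (double)SHRT_MIN - 1.0 || value >= (double)SHRT_MAX + 1.0) {
        return false;
    }
    *out = (short)value;
    return true;
}
```

With a guard like this, the caller decides what a NaN becomes (zero, a fill value, or an error) instead of getting whatever the platform's conversion instruction happens to produce.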
Note that writing a NaN to an int32 variable gives a different, non-zero result:
% write NaN to int32 variable
nccreate('ocean2.nc','temperature','Datatype','int32','Format','netcdf4');
ncwrite('ocean2.nc','temperature',NaN);
data = ncread('ocean2.nc','temperature')
data =
int32
-2147483648
However, if I first cast the NaN to an int16, the internals are a little different: we end up calling nc_put_var_short (because we cast the NaN to int16), and we pass in a value of zero (since in MATLAB, casting a NaN to an integer type always results in zero). Thus, in this case, we're simply writing a zero and reading a zero, which makes sense.
% write NaN to int16 variable but cast the NaN to int16
nccreate('ocean3.nc','temperature','Datatype','int16','Format','netcdf4');
ncwrite('ocean3.nc','temperature',int16(NaN));
data = ncread('ocean3.nc','temperature')
data =
int16
0
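For comparison, here is a rough C sketch of a MATLAB-style conversion in which NaN always becomes zero. The NaN rule is the one described above; the round-to-nearest and saturation rules are my assumption about how MATLAB's int16() treats other values. The names `matlab_like_int16` and `convert_buffer_to_int16` are hypothetical and belong to neither MATLAB nor netCDF-C.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: NaN maps to 0, out-of-range values saturate, and
 * everything else rounds to the nearest integer (halves away from zero). */
int16_t matlab_like_int16(double value)
{
    if (isnan(value)) {
        return 0;
    }
    if (value >= (double)INT16_MAX) {
        return INT16_MAX;
    }
    if (value <= (double)INT16_MIN) {
        return INT16_MIN;
    }
    /* value is now strictly inside the int16 range, so truncation is safe. */
    int32_t whole = (int32_t)value;
    double frac = value - (double)whole;
    if (frac >= 0.5) {
        whole += 1;
    } else if (frac <= -0.5) {
        whole -= 1;
    }
    return (int16_t)whole;
}

/* Applies the conversion to n elements of in, writing the results to out. */
void convert_buffer_to_int16(const double *in, int16_t *out, size_t n)
{
    assert((in != NULL && out != NULL) || n == 0);
    for (size_t i = 0; i < n; i++) {
        out[i] = matlab_like_int16(in[i]);
    }
}
```

Every input has a defined result here, so the value read back would not depend on the integer width or the platform.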
Here’s a summary of the behavior when writing NaN for all the integer types in MATLAB:
For uint8, uint16, uint32, int8, and int16, writing NaN writes zero.
For int32, writing NaN writes -2147483648.
For uint64 and int64, writing NaN writes -9223372036854775808.
My “nanTest.c” program (attached as text file) tests this with int16, int32, and int64 and shows the same behavior.
Question: Is this expected behavior in the netCDF-C library? Can you consider making the behavior consistent, such as returning zeros for all cases of writing a NaN to a variable defined as an integer, or giving a warning?
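To illustrate the kind of consistent behavior I have in mind, here is a minimal sketch of a pre-write pass that a caller could run on a double buffer before handing it to a write call such as nc_put_var_double(). The function `replace_nans_with_zero` is hypothetical, not an existing netCDF-C API, and choosing zero as the replacement is just one of the options mentioned above.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical pre-write check (not a netCDF-C function): replaces every
 * NaN in buf with 0.0 and returns how many values were replaced, so the
 * caller can emit a warning when the count is non-zero. */
size_t replace_nans_with_zero(double *buf, size_t n)
{
    assert(buf != NULL || n == 0);

    size_t replaced = 0;
    for (size_t i = 0; i < n; i++) {
        if (isnan(buf[i])) {
            buf[i] = 0.0;
            replaced++;
        }
    }
    return replaced;
}
```

A caller could log something like "N NaN values written as 0" whenever the returned count is greater than zero, which would cover the warning option as well.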