Floating point comparisons in Python with Xarrray

1 minute read

Numpy’s testing suite includes numpy.testing.assert_allclose(), which is very frequently used to compare scalars and N-dimensional arrays. Xarray can most simply be thought of as a super-Numpy with metadata bringing powerful indexing and database-like methods, while retaining the speed of Numpy.

Continuous integration for Xarray is addressed with the xarray.tests module. If you try to use numpy.testing.assert_allclose on the frequently-used xarray.Dataset, errors will result like:

TypeError: cannot directly convert an xarray.Dataset into a numpy array. Instead, create an xarray.DataArray first, either with indexing on the Dataset or by invoking the to_array() method.

The simple answer is to do something like:

import xarray
import xarray.tests

# run your simulation etc.
dat = myfunc(...)  
  
# load the reference Dataset to compare against
ref = xarray.open_dataset('ref.nc')  
  
xarray.tests.assert_allclose(ref, dat)

Why didn’t we use dat.equals(ref)? Because it is generally inappropriate to directly compare floating point numbers, including the ubiquitous IEEE 754. Numerous authors have written on this subject for decades, and it’s an essential piece of knowledge for anyone doing computer engineering work. Clive Moler’s article address the topic of floating point comparisons succinctly.

floating point comparison algorithm

Across computing languages, an algorithm suitable for comparing floating point numbers actual, desired to absolute tolerance atol and relative tolerance rtol is:

isclose = abs(actual-desired) <= max(rtol * max(abs(actual), abs(desired)), atol)

isclose is Boolean True/False (an array if actual or desired are arrays.

See Fortran assert.f90 for more specific polymorphic implementation details, adaptable to almost any programming language.

Leave a comment