Check HDF5 files for corruption

HDF5 files have no error recovery mechanism and no journaling. There is an optional per-variable Fletcher32 checksum; when it is enabled, reading corrupted data returns an error instead of silently returning bad values.

  • Checking/comparing file size alone is not an adequate check for HDF5 corruption.
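A flipped bit or an overwritten block leaves the file size unchanged, so a byte-level comparison catches what a size comparison misses. The following is a minimal sketch, assuming a known-good copy (or a digest recorded earlier) is available to compare against. The function names `sha256_of_file` and `same_content` are made up for this illustration. Note that this only detects a difference from the reference copy; it does not tell you whether either file is valid HDF5.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"" (EOF)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files have identical bytes (compared via SHA-256)."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

For example, `same_content("original.h5", "copy.h5")` can be run after a transfer; the file names here are placeholders.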

Here are a few easy techniques to check for corrupted HDF5 files.

Python

h5tester.py checks HDF5 files for corruption and optionally locates the corrupted block(s).
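Before running a full scan, a quick pre-check in plain Python can rule out files that are not HDF5 at all or whose header has been overwritten. The sketch below is not taken from h5tester.py. It only looks for the 8-byte HDF5 format signature, which the file format places at byte offset 0 or, when a user block is present, at 512, 1024, 2048 and so on. Passing this check says nothing about the data blocks themselves. The name `find_hdf5_signature` is hypothetical.

```python
import os

# 8-byte signature that starts the HDF5 superblock
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def find_hdf5_signature(path):
    """Return the byte offset of the HDF5 signature, or None if it is not found.

    Offsets tried are 0, 512, 1024, 2048, ... (doubling) up to the file size.
    """
    size = os.path.getsize(path)
    offset = 0
    with open(path, "rb") as f:
        while offset + len(HDF5_SIGNATURE) <= size:
            f.seek(offset)
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return offset
            # next candidate position: 512 after 0, then double each time
            offset = 512 if offset == 0 else offset * 2
    return None
```

A return value of `None` means the file cannot be opened as HDF5; any other value only means the header is where it should be.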

Shell

apt install hdf5-tools
h5stat file.h5

You can also print the data values in the file, which forces every value to be read:

h5dump file.h5
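To run these tools over many files, a small Python wrapper can call them and look at the exit status. This is a sketch under the assumption that the tool exits with a non-zero status when it cannot read the file, which is how the HDF5 command-line tools document their exit status; verify this with a known-bad file on your own installation. The function name `tool_reports_ok` and its `command` parameter are made up for this illustration.

```python
import subprocess


def tool_reports_ok(path, command=("h5stat",)):
    """Run `command path` and return True if it exits with status 0.

    `command` is the tool plus any options, e.g. ("h5stat",) or ("h5dump",).
    Output is discarded because h5dump can print the entire file contents.
    """
    try:
        result = subprocess.run(
            [*command, str(path)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # The tool itself is missing, which is different from a bad HDF5 file
        raise RuntimeError(f"{command[0]} not found; is hdf5-tools installed?") from None
    return result.returncode == 0
```

For example, `tool_reports_ok("file.h5", ("h5dump",))` reads every value in the file, which is slower than h5stat but touches all of the data.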

On Windows, run these checks from Cygwin or WSL.

GUI HDF5 checker

HDFView showing a corrupted variable with a red question mark

HDFView appears to use the Fletcher32 checksum, showing a red question mark when corruption is detected. Another curiosity is that the object reference of the corrupted variable is 2^32 - 1.
