Check HDF5 files for corruption

HDF5 files have no built-in error recovery mechanism. There is an optional per-dataset (per-variable) Fletcher32 checksum filter; when it is enabled, reading corrupted data returns an error. Note that checking file size alone is not an adequate test for data corruption. Here is a non-exhaustive list of easy techniques to check HDF5 files for corruption.
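As a rough illustration of the Fletcher32 option, here is a minimal sketch that assumes the h5py and NumPy packages are available. The helper names `write_with_checksum` and `has_checksum` are made up for this example and are not part of any library.

```python
import h5py
import numpy as np


def write_with_checksum(path, name, data):
    """Write `data` to a new HDF5 file with the Fletcher32 filter enabled."""
    with h5py.File(path, "w") as f:
        # Fletcher32 is a filter, so the dataset must use a chunked layout.
        # chunks=True lets h5py choose the chunk shape.
        f.create_dataset(name, data=np.asarray(data), chunks=True, fletcher32=True)


def has_checksum(path, name):
    """Return True if the named dataset was written with the Fletcher32 filter."""
    with h5py.File(path, "r") as f:
        return bool(f[name].fletcher32)
```

A dataset written without the filter has no checksum to verify, so corruption inside its raw data can go unnoticed on read.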

Python recursive directory HDF5 checker: checks the HDF5 files under a directory for corruption and optionally finds the corrupted block(s)
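The checker referenced above is not reproduced here. The following is only a sketch of the general idea, assuming h5py is installed: open every file under a directory and try to read each dataset in full. The function names `check_file` and `check_tree`, the `"<file>"` marker, and the default file suffixes are choices made for this example.

```python
from pathlib import Path

import h5py


def check_file(path):
    """Return a list of (dataset_name, error_message) for data that fails to read."""
    problems = []

    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            try:
                # Reads the whole dataset into memory; fine for a sketch,
                # but large datasets would need to be read piece by piece.
                obj[()]
            except Exception as exc:  # broad on purpose: any read failure counts
                problems.append((name, str(exc)))

    try:
        with h5py.File(path, "r") as f:
            f.visititems(visit)
    except Exception as exc:
        # The file could not be opened or its group structure could not be walked.
        problems.append(("<file>", str(exc)))
    return problems


def check_tree(root, suffixes=(".h5", ".hdf5")):
    """Recursively check HDF5 files under `root`; return {path: problems}."""
    report = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            problems = check_file(path)
            if problems:
                report[str(path)] = problems
    return report
```

Calling `check_tree("data")` on a hypothetical `data` directory returns an empty dict when every dataset reads cleanly. Read errors on raw data are most likely to surface for datasets written with the Fletcher32 filter, since that is what lets the library detect a damaged chunk. To narrow a failure down to a block, one possible approach is to read a chunked dataset one chunk at a time, for example with `Dataset.iter_chunks()` in h5py 3.x.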

Command line check HDF5 files for corruption

On Windows, you can run these checks from Cygwin or WSL. The package to install is hdf5 on Cygwin and hdf5-tools on Debian/Ubuntu-based Linux or WSL.

h5stat file.h5

GUI check HDF5 files for corruption

HDFView showing a corrupted variable with a red question mark

HDFView appears to use the Fletcher32 checksum to decide when to show a red question mark on a corrupted variable. Another curiosity is that the object reference of the corrupted variable is 2^32 - 1 (4294967295).
