Check HDF5 files for corruption


Related: HDF5 GUIs to view and edit variables in .h5 files


HDF5 files have no error recovery mechanism and are not journaled. An optional per-variable (per-dataset) Fletcher32 checksum can be enabled; when it is, reading corrupted data from that variable returns an error instead of silently returning bad values.
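As a minimal sketch of turning on the Fletcher32 checksum, assuming the h5py and NumPy packages (neither is named in this post) and a hypothetical file name and variable name:

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

with h5py.File(path, "w") as f:
    # fletcher32=True stores a checksum with every chunk of this variable.
    # The filter needs chunked storage, so h5py picks a chunk shape here.
    f.create_dataset("x", data=np.arange(1000, dtype="float64"), fletcher32=True)

with h5py.File(path, "r") as f:
    dset = f["x"]
    has_checksum = dset.fletcher32  # whether the checksum filter is enabled
    chunks = dset.chunks            # chunk shape chosen by h5py
    data = dset[...]                # reading verifies the chunk checksums

print(has_checksum, chunks, data.shape)
```

The checksum covers only the variables it was enabled for when they were created, so variables written without it get no such protection.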

  • Checking/comparing file size alone is not an adequate check for HDF5 corruption.
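To illustrate why file size is not enough, here is a small sketch that compares a file against a known-good copy by SHA-256 digest. It uses only the Python standard library; the helper name `sha256sum` and the stand-in random bytes (rather than a real HDF5 file) are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def sha256sum(path, blocksize=1 << 20):
    """Return the SHA-256 hex digest of a file, read in blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()


# Stand-in bytes instead of a real HDF5 file; the point holds for any file.
tmpdir = tempfile.mkdtemp()
original = os.path.join(tmpdir, "original.bin")
damaged = os.path.join(tmpdir, "damaged.bin")

payload = bytearray(os.urandom(4096))
with open(original, "wb") as fh:
    fh.write(payload)

payload[100] ^= 0xFF  # flip every bit of a single byte
with open(damaged, "wb") as fh:
    fh.write(payload)

same_size = os.path.getsize(original) == os.path.getsize(damaged)
same_hash = sha256sum(original) == sha256sum(damaged)
print(same_size, same_hash)
```

The two files have identical sizes but different digests. This kind of comparison only helps when a digest of the known-good file was recorded beforehand.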

HDF5 testing

Here are a few easy techniques to check for corrupted HDF5 files.

Python

This Python-based script checks HDF5 files for corruption and optionally finds the corrupted block(s) and variable(s).
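The script itself is not reproduced here. As a rough sketch of the same idea, and not the script's actual implementation, one can try to read every variable and record the ones that fail. This assumes the h5py package; the function name `find_unreadable_datasets` is hypothetical.

```python
import h5py


def find_unreadable_datasets(path):
    """Return a list of (name, error message) for items that fail to read."""
    bad = []

    def visit(name, obj):
        # Groups are only traversed; variables (datasets) are read in full.
        if isinstance(obj, h5py.Dataset):
            try:
                obj[()]  # loads the whole variable into memory
            except Exception as exc:
                bad.append((name, str(exc)))

    try:
        with h5py.File(path, "r") as f:
            f.visititems(visit)
    except Exception as exc:
        # The file could not be opened or traversed at all.
        bad.append(("<file>", str(exc)))
    return bad


if __name__ == "__main__":
    import sys

    for name, err in find_unreadable_datasets(sys.argv[1]):
        print(f"{name}: {err}")
```

Run it as `python check.py file.h5` (the script name is arbitrary). An empty result means every variable could be read, which is strong evidence of integrity only for variables stored with the Fletcher32 checksum. Because each variable is read in full, very large variables may need to be read in slices instead.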

Shell

apt install hdf5-tools
h5stat file.h5

You can also print the data values in the file:

h5dump file.h5

On Windows, run these checks from Cygwin or WSL.

GUI HDF5 checker

HDFView showing a corrupted variable with a red question mark

HDFView appears to use the Fletcher32 checksum: it shows a red question mark on a variable when corruption is detected. Another curiosity is that the object reference of the corrupted variable is 2^32 - 1 (4294967295).
