
modis.io

HDF5Base

Parent class for interaction with HDF5 files

This class serves as a parent class for ModisRawH5 and ModisSmoothH5, enabling uniform chunked read and write of datasets and attributes to and from HDF5 files.

__init__(self, filename) special

Initialize HDF5Base instance.

Creates an instance of the HDF5Base class. It is not strictly intended to be instantiated directly (although it can be), but rather through a super() call from a child class.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `filename` | `str` | Full path to the HDF5 file. | *required* |
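
A minimal usage sketch of the intended super() call; the import path and the child class `MyRawH5` are assumptions for illustration, not part of the library:

```python
from modis.io import HDF5Base  # assumed import path; adjust to the actual module


class MyRawH5(HDF5Base):
    """Hypothetical child class illustrating the intended super() call."""

    def __init__(self, filename: str):
        # Delegate the shared HDF5 setup to the parent class
        super().__init__(filename=filename)
```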

read_chunked(self, dataset, xoffset=0, xchunk=None, arr_out=None)

Read data from dataset in a chunked manner.

The chunks are iterated in a row-by-column pattern: each chunk along the row axis is yielded once the full column extent has been read into the array. The column chunking (xchunk) can be modified, while the row chunking (ychunk) is strictly defined by the dataset. To enable the nsmooth functionality, an xoffset can be provided to skip datapoints along the time dimension. If no arr_out is provided, a new array with the necessary dimensions is created.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `dataset` | `str` | Name of the dataset (expects a 2D array). | *required* |
| `xoffset` | `int` | Offset for the time dimension (x axis) in the file. | `0` |
| `xchunk` | `int` | Chunking along the time dimension. If not specified, it is read from the dataset. | `None` |
| `arr_out` | `ndarray` | Output array. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | Yields the output chunk as `np.ndarray`. |

Exceptions:

| Type | Description |
| --- | --- |
| `AssertionError` | Raised when the dataset is not found in the HDF5 file, or when the provided `arr_out` is not of the correct type or shape. |
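
A hedged sketch of reading a dataset chunk by chunk; the file path, the dataset name `"data"`, and the import path are placeholders:

```python
from modis.io import HDF5Base  # assumed import path

h5 = HDF5Base("/path/to/file.h5")  # placeholder path

# Each yielded chunk is an np.ndarray covering one row-chunk of the
# dataset across the (optionally offset) time dimension.
for chunk in h5.read_chunked("data", xoffset=0):
    print(chunk.shape)
```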

write_chunk(self, dataset, arr_in, xoffset=0, xchunk=None, yoffset=0)

Write chunk back to HDF5 file.

Writes a complete chunk back to the HDF5 file, iterating over the time dimension (x). The chunk size for x can be adjusted manually using xchunk. To implement the nupdate behaviour, xoffset can be used to skip prior datapoints in the time dimension. To write successive spatial chunks, yoffset must be provided (the default of 0 starts at the top left).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `dataset` | `str` | Name of the dataset (expects a 2D array). | *required* |
| `arr_in` | `ndarray` | Input data to be written to file. | *required* |
| `xchunk` | `int` | Chunking along the time dimension (x axis). If not specified, it is read from the dataset. | `None` |
| `xoffset` | `int` | Offset for columns (x axis) in the file. | `0` |
| `yoffset` | `int` | Offset for rows (y axis) in the file. | `0` |

Returns:

| Type | Description |
| --- | --- |
| `bool` | Returns True if the write was successful. |

Exceptions:

| Type | Description |
| --- | --- |
| `AssertionError` | Raised when the dataset is not found in the HDF5 file, or when the provided `arr_in` is not of the correct type or shape. |
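
A minimal round-trip sketch combining read_chunked and write_chunk; the file path, the dataset name, the processing step, and the assumption that the spatial (row) dimension is the first axis of each yielded chunk are all placeholders:

```python
import numpy as np
from modis.io import HDF5Base  # assumed import path

h5 = HDF5Base("/path/to/file.h5")  # placeholder path

yoffset = 0
for chunk in h5.read_chunked("data"):
    # Placeholder transformation; replace with the actual processing step
    result = np.clip(chunk, 0, None)
    # Write the processed chunk back at the same spatial position
    h5.write_chunk("data", result, yoffset=yoffset)
    yoffset += result.shape[0]  # assumes rows are the first axis of the chunk
```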

HDFHandler

Class to handle reading from MODIS HDF files.

This class enables reading specific subdatasets and attributes from the raw MODIS HDF files.

iter_handles(self)

Iterates over all open dataset handles created by open_datasets, yielding a Tuple with the index and a gdal.Dataset for each.

Returns:

| Type | Description |
| --- | --- |
| `Tuple[int, gdal.Dataset]` | Tuple with the index and the corresponding `gdal.Dataset`. |

open_datasets(self)

Opens the selected subdataset from all files within a context manager and stores the handles in a class variable. When the context manager closes, the references are removed, closing all datasets.
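
A hedged sketch of the context-manager pattern with iter_handles; the import path is assumed and the HDFHandler constructor arguments are elided because its signature is not shown on this page:

```python
from modis.io import HDFHandler  # assumed import path

handler = HDFHandler(...)  # constructor arguments elided (see the class constructor)

# open_datasets keeps the subdataset handles alive only inside the block
with handler.open_datasets():
    for index, ds in handler.iter_handles():
        print(index, ds.RasterXSize, ds.RasterYSize)
```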

read_chunk(x, **kwargs) staticmethod

Reads a chunk of an opened subdataset.

The size of the chunk being read is defined by the values passed to gdal.Dataset.ReadAsArray with kwargs and can be as big as the entire dataset.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `gdal.Dataset` | GDAL Dataset. | *required* |
| `**kwargs` | `dict` | kwargs passed on to `gdal.Dataset.ReadAsArray`. | `{}` |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | Requested chunk as `np.ndarray`. |
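
A short sketch of reading a spatial window from each open handle; the window size is an arbitrary placeholder and the HDFHandler construction is elided as above, while xoff, yoff, xsize, and ysize are standard gdal.Dataset.ReadAsArray keyword arguments:

```python
from modis.io import HDFHandler  # assumed import path

handler = HDFHandler(...)  # constructor arguments elided (see the class constructor)

with handler.open_datasets():
    for index, ds in handler.iter_handles():
        # Read a 1200 x 120 pixel window from the top-left corner;
        # the kwargs are forwarded to gdal.Dataset.ReadAsArray
        chunk = HDFHandler.read_chunk(ds, xoff=0, yoff=0, xsize=1200, ysize=120)
        print(index, chunk.shape)
```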