httomo.data.dataset_store.DataSetStoreWriter

class httomo.data.dataset_store.DataSetStoreWriter(slicing_dim: Literal[0, 1, 2], comm: mpi4py.MPI.Comm, temppath: PathLike, store_backing: DataSetStoreBacking = DataSetStoreBacking.RAM)[source]

A DataSetSink that can be used to store block-wise data in the current chunk (the portion of the data assigned to the current MPI process).

It stores data in memory by default, but if a memory allocation fails, a temporary HDF5 file is used to back the dataset instead.

The make_reader method can be used to create a DataSetStoreReader from this writer. It is intended to be called after the writer has finished, to read the data block-wise again. The reader uses the same underlying data store (HDF5 file or memory).
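
A minimal end-to-end sketch of the intended usage, based only on the signatures documented on this page. The blocks iterable and the choice of slicing dimension are assumptions for illustration; constructing DataSetBlock instances is not covered here.

    from pathlib import Path

    from mpi4py import MPI

    from httomo.data.dataset_store import DataSetStoreWriter

    # Writer for this process's chunk, with blocks sliced along dimension 0.
    writer = DataSetStoreWriter(
        slicing_dim=0,
        comm=MPI.COMM_WORLD,
        temppath=Path("/tmp"),  # only used if the store falls back to a file
    )

    for block in blocks:  # hypothetical iterable of DataSetBlock from upstream
        writer.write_block(block)
    writer.finalize()  # let a file-backed store flush and close its file

    # Re-read the same underlying store, now sliced along dimension 1.
    reader = writer.make_reader(new_slicing_dim=1)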

Methods

__init__(slicing_dim, comm, temppath[, ...])

finalize()
    Intended to be called after all blocks have been written, to give implementations a chance to flush everything to disk and close the file.

make_reader([new_slicing_dim, padding])
    Create a reader from this writer, reading from the same store.

write_block(block)
    Writes a block to the store, starting at the block's chunk_index in the current slicing dimension.

Attributes

aux_data

chunk_shape
    The shape of the chunk, i.e. the data processed in the current MPI process (whether it fits in memory or not).

comm

filename

global_index
    The start index of the chunk within the global data array.

global_shape
    The global data shape across all processes, i.e. the full dataset that will eventually be written.

is_file_based

slicing_dim
    The slicing dimension: 0, 1, or 2.

property aux_data: AuxiliaryData

property chunk_shape: Tuple[int, int, int]

    The shape of the chunk, i.e. the data processed in the current MPI process (whether it fits in memory or not).

property comm: mpi4py.MPI.Comm

property filename: Path | None

finalize()[source]

    Intended to be called after all blocks have been written, to give implementations a chance to flush everything to disk and close the file.

property global_index: Tuple[int, int, int]

    The start index of the chunk within the global data array.

property global_shape: Tuple[int, int, int]

    The global data shape across all processes, i.e. the full dataset that will eventually be written.
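
To illustrate how global_shape, chunk_shape, and global_index relate (illustrative numbers only, assuming an even split, which is decided by httomo's chunking logic rather than documented here): with two MPI processes and a global array of shape (180, 128, 160) sliced along dimension 0:

    # rank 0: writer.global_shape == (180, 128, 160)
    #         writer.chunk_shape  == (90, 128, 160)
    #         writer.global_index == (0, 0, 0)
    #
    # rank 1: writer.global_shape == (180, 128, 160)
    #         writer.chunk_shape  == (90, 128, 160)
    #         writer.global_index == (90, 0, 0)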

property is_file_based: bool

make_reader(new_slicing_dim: Literal[0, 1, 2] | None = None, padding: Tuple[int, int] | None = None) → DataSetSource[source]

    Create a reader from this writer, reading from the same store. The optional padding parameter can be used if blocks should be returned with padding slices, given as a (before, after) tuple.
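
A short sketch using only the signature above: request one padding slice on each side in the new slicing dimension, e.g. for methods that need neighbouring slices. Here writer is assumed to be a DataSetStoreWriter that has finished writing.

    # Read back sliced along dimension 1, with one extra slice before and
    # after each block in the slicing dimension.
    reader = writer.make_reader(new_slicing_dim=1, padding=(1, 1))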

property slicing_dim: Literal[0, 1, 2]

    The slicing dimension: 0, 1, or 2.

write_block(block: DataSetBlock)[source]

    Writes a block to the store, starting at the block's chunk_index, in the current slicing dimension.

    NOTE: Implementers should make sure to move the data to CPU if required; it may be on GPU when this method is called.
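
A minimal sketch of the guard an implementation of write_block might apply before copying block data into the store. The block.to_cpu() call is an assumption based on the note above; it is not documented on this page.

    def write_block(self, block: DataSetBlock) -> None:
        # The block may live on the GPU; ensure host memory before copying.
        block.to_cpu()  # assumed API: move the data to CPU if it is on GPU
        ...  # copy block.data into the chunk at the block's chunk_index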