Code Description¶
HdfMap¶
hdfmap: map objects within an HDF5 file and create a dataset namespace.
Usage¶
HdfMap from NeXus file¶
```python
from hdfmap import create_nexus_map, load_hdf

hmap = create_nexus_map('file.nxs')
with load_hdf('file.nxs') as nxs:
    address = hmap.get_address('energy')
    energy = nxs[address][()]
    string = hmap.format_hdf(nxs, "the energy is {energy:.2f} keV")
    d = hmap.get_dataholder(nxs)  # classic data table: d.scannable, d.metadata
```
Shortcuts - single file reloading class¶
```python
from hdfmap import NexusLoader

scan = NexusLoader('file.nxs')
data1, data2 = scan.get_data('dataset_name_1', 'dataset_name_2')
data = scan.eval('dataset_name_1 * 100 + 2')
string = scan.format('my data is {dataset_name_1:.2f}')
```
Shortcuts - multifile load data¶
```python
from hdfmap import hdf_data, hdf_eval, hdf_format, hdf_image

all_data = hdf_data([f"file{n}.nxs" for n in range(100)], 'dataset_name')
normalised_data = hdf_eval(filenames, 'total / Transmission / (rc / 300.)')
descriptions = hdf_format(filenames, 'Energy: {en:5.3f} keV')
image = hdf_image(filenames, index=31)
```
Copyright 2024-2025 Daniel G. Porter
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
By Dr Dan Porter, Diamond Light Source Ltd, 2024-2025
HdfLoader¶
HdfLoader contains the filename and hdfmap for an HDF file. The hdfmap contains all the dataset paths and a namespace, allowing data to be called from the file using variable names, loading only the required datasets for each operation.
E.G.¶
```python
hdf = HdfLoader('file.hdf')
data1, data2 = hdf.get_data('dataset_name_1', 'dataset_name_2')
data = hdf.eval('dataset_name_1 * 100 + 2')
string = hdf.format('my data is {dataset_name_1:.2f}')
print(hdf.summary())
```
Source code in src/hdfmap/reloader_class.py
eval(expression, default=DEFAULT)
¶
Evaluate an expression using the namespace of the hdf file.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `expression` | `str` | expression to be evaluated | *required* |
| `default` | | value returned if a variable name is not in the namespace | `DEFAULT` |

Returns:

| Type | Description |
|---|---|
| | `eval(expression)` |
Source code in src/hdfmap/reloader_class.py
find_hdf_paths(string, name_only=True, whole_word=False)
¶
Find any dataset paths that contain the given string argument.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `string` | `str` | string to find in the list of datasets | *required* |
| `name_only` | `bool` | if True, search only the name of the dataset, not the full path | `True` |
| `whole_word` | `bool` | if True, match only whole names (case-insensitive) | `False` |

Returns:

| Type | Description |
|---|---|
| `list[str]` | list of hdf paths |
Source code in src/hdfmap/reloader_class.py
find_names(string)
¶
Find any dataset names that contain the given string argument, searching names in self.combined.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `string` | `str` | string to find in the list of datasets | *required* |

Returns:

| Type | Description |
|---|---|
| `list[str]` | list of names |
Source code in src/hdfmap/reloader_class.py
format(expression, default=DEFAULT)
¶
Evaluate a formatted string expression using the namespace of the hdf file.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `expression` | `str` | expression using `{name}` format specifiers | *required* |
| `default` | | value returned if a variable name is not in the namespace | `DEFAULT` |

Returns:

| Type | Description |
|---|---|
| | `eval_hdf(f"expression")` |
Source code in src/hdfmap/reloader_class.py
get_data(*name_or_path, index=(), default=None, direct_load=False)
¶
Return data from datasets in the file, converted into datetime, str or squeezed numpy array objects. See hdfmap.eval_functions.dataset2data for more information.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name_or_path` | `str` | name or path pointing to a dataset in the hdf file | `()` |
| `index` | `slice` | index or slice of data in the hdf file | `()` |
| `default` | | value to return if name not found in the hdf file | `None` |
| `direct_load` | | return str, datetime or squeezed array if False, otherwise load data directly | `False` |

Returns:

| Type | Description |
|---|---|
| | `dataset2data(dataset)` -> datetime, str or squeezed array as required |
Source code in src/hdfmap/reloader_class.py
get_hdf_path(name_or_path)
¶
get_image(index=None)
¶
Get image data from the file, using the default image path.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `index` | `slice` | `(slice,)`, or None to take the middle image | `None` |

Returns:

| Type | Description |
|---|---|
| `ndarray` | numpy array of image |
Source code in src/hdfmap/reloader_class.py
get_scannables()
¶
get_string(*name_or_path, index=(), default='', units=False)
¶
Return data from datasets in the file, converted into a summary string. See hdfmap.eval_functions.dataset2data for more information.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name_or_path` | `str` | name or path pointing to a dataset in the hdf file | `()` |
| `index` | `slice` | index or slice of data in the hdf file | `()` |
| `default` | | value to return if name not found in the hdf file | `''` |
| `units` | | if True and the attribute 'units' is available, append it to the result | `False` |

Returns:

| Type | Description |
|---|---|
| | `dataset2str(dataset)` -> str |
Source code in src/hdfmap/reloader_class.py
NexusLoader¶
Bases: HdfLoader
NexusLoader contains the filename and hdfmap for a NeXus file. The hdfmap contains all the dataset paths and a namespace, allowing data to be called from the file using variable names, loading only the required datasets for each operation.
E.G.

```python
hdf = NexusLoader('file.hdf')
data1, data2 = hdf.get_data('dataset_name_1', 'dataset_name_2')
data = hdf.eval('dataset_name_1 * 100 + 2')
string = hdf.format('my data is {dataset_name_1:.2f}')
```
Source code in src/hdfmap/reloader_class.py
compare_maps(map1, map2)
¶
Compare two HdfMap objects
Source code in src/hdfmap/file_functions.py
create_hdf_map(hdf_filename)
¶
Create a HdfMap from an hdf file.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hdf_filename` | `str` | filename of hdf file | *required* |

Returns:

| Type | Description |
|---|---|
| `HdfMap` | HdfMap |
create_nexus_map(hdf_filename, groups=None, default_entry_only=False)
¶
Create a HdfMap from a NeXus file, loading default parameters and allowing a reduced, single-entry map.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hdf_filename` | `str` | filename of hdf file | *required* |
| `groups` | `None \| list[str]` | list of groups to collect datasets from | `None` |
| `default_entry_only` | `bool` | if True, only the first or default entry will be loaded | `False` |

Returns:

| Type | Description |
|---|---|
| `NexusMap` | NexusMap |
Source code in src/hdfmap/file_functions.py
hdf_compare(hdf_filename1, hdf_filename2, all_links=False)
¶
Compare hdf tree structure between two files.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hdf_filename1` | `str` | filename of hdf file | *required* |
| `hdf_filename2` | `str` | filename of hdf file | *required* |
| `all_links` | `bool` | if True, also show soft links | `False` |

Returns:

| Type | Description |
|---|---|
| `str` | comparison string |
Source code in src/hdfmap/hdf_loader.py
hdf_data(filenames, name_or_path, hdf_map=None, index=(), default=None, fixed_output=False)
¶
General purpose function to retrieve data from HDF files.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `filenames` | `str \| list[str]` | str or list of str - file paths | *required* |
| `name_or_path` | `str \| list[str]` | str or list of str - names or paths of HDF datasets | *required* |
| `hdf_map` | `HdfMap` | HdfMap object, or None to generate from first file | `None` |
| `index` | | dataset index or slice | `()` |
| `default` | | value to give if dataset doesn't exist in file | `None` |
| `fixed_output` | | if True, always returns list of lists | `False` |

Returns:

| Type | Description |
|---|---|
| | `list[files: list[names]]` |
Source code in src/hdfmap/file_functions.py
hdf_dataset_list(hdf_filename, all_links=True, group='/')
¶
Generate a list of all datasets in the hdf file structure.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hdf_filename` | `str` | filename of hdf file | *required* |
| `all_links` | `bool` | if True, also include soft links | `True` |
| `group` | `str` | only list datasets within this group (default root) | `'/'` |

Returns:

| Type | Description |
|---|---|
| `list[str]` | list of str addresses |
Source code in src/hdfmap/hdf_loader.py
hdf_eval(filenames, expression, hdf_map=None, default=None, fixed_output=False)
¶
Evaluate an expression using dataset names.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `filenames` | `str \| list[str]` | str or list of str - file paths | *required* |
| `expression` | `str` | expression to evaluate in each file, e.g. "roi2_sum / Transmission" | *required* |
| `hdf_map` | `HdfMap` | HdfMap object, or None to generate from first file | `None` |
| `default` | | value to give if dataset doesn't exist in file | `None` |
| `fixed_output` | | if True, always returns a list of len(filenames) | `False` |

Returns:

| Type | Description |
|---|---|
| | list of len(filenames) |
Source code in src/hdfmap/file_functions.py
hdf_find(hdf_filename, *names_or_classes, attributes=('NX_class', 'local_name'))
¶
Find groups and datasets within an hdf file matching a set of names or class names.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hdf_filename` | `str` | filename of hdf file | *required* |
| `names_or_classes` | `str` | object names or NXclass names to search for | `()` |
| `attributes` | `tuple[str]` | list of attr fields to check against names | `('NX_class', 'local_name')` |

Returns:

| Type | Description |
|---|---|
| `tuple[list[str], list[str]]` | groups[], datasets[] |
Source code in src/hdfmap/hdf_loader.py
hdf_find_first(hdf_filename, *names_or_classes, attributes=('NX_class', 'local_name'))
¶
Return the first path of an object matching a set of names or class names.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hdf_filename` | `str` | filename of hdf file | *required* |
| `names_or_classes` | `str` | object names or NXclass names to search for | `()` |
| `attributes` | `tuple[str]` | list of attr fields to check against names | `('NX_class', 'local_name')` |

Returns:

| Type | Description |
|---|---|
| `str \| None` | hdf_path, or None if no match |
Source code in src/hdfmap/hdf_loader.py
hdf_format(filenames, expression, hdf_map=None, default=None, fixed_output=False)
¶
Evaluate a string format expression using dataset names.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `filenames` | `str \| list[str]` | str or list of str - file paths | *required* |
| `expression` | `str` | format string to evaluate in each file, e.g. "the energy is {en:.2f} keV" | *required* |
| `hdf_map` | `HdfMap` | HdfMap object, or None to generate from first file | `None` |
| `default` | | value to give if dataset doesn't exist in file | `None` |
| `fixed_output` | | if True, always returns a list of len(filenames) | `False` |

Returns:

| Type | Description |
|---|---|
| | list of len(filenames) |
Source code in src/hdfmap/file_functions.py
hdf_image(filenames, index=None, hdf_map=None, fixed_output=False)
¶
Get image data from each file, using the default image path.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `filenames` | `str \| list[str]` | str or list of str - file paths | *required* |
| `index` | `slice` | index or slice of dataset volume, or None to use the middle index | `None` |
| `hdf_map` | `HdfMap` | HdfMap object, or None to generate from first file | `None` |
| `fixed_output` | | if True, always returns a list of len(filenames) | `False` |

Returns:

| Type | Description |
|---|---|
| | list of len(filenames) |
Source code in src/hdfmap/file_functions.py
hdf_linked_files(hdf_filename, group='/')
¶
Return a list of files linked to the current file, looking for all external links.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hdf_filename` | `str` | filename of hdf file | *required* |
| `group` | `str` | only look at links within this group (default root) | `'/'` |

Returns:

| Type | Description |
|---|---|
| `list[str]` | list of str filenames (usually relative file paths) |
Source code in src/hdfmap/hdf_loader.py
hdf_tree_dict(hdf_filename)
¶
Generate a summary dict of the hdf tree structure. The structure is: `{'group': {'@attrs': str, 'sub-group': {}, 'dataset': str}, ...}`
Group attributes are stored with names prefixed with '@'.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hdf_filename` | `str` | filename of hdf file | *required* |

Returns:

| Type | Description |
|---|---|
| `dict` | `{'entry': {'dataset': value}, ...}` |
Source code in src/hdfmap/hdf_loader.py
hdf_tree_string(hdf_filename, all_links=True, group='/', attributes=True)
¶
Generate a string of the hdf file structure, similar to h5ls. Uses h5py.visititems.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hdf_filename` | `str` | filename of hdf file | *required* |
| `all_links` | `bool` | if True, also show links | `True` |
| `group` | `str` | only display tree structure of this group (default root) | `'/'` |
| `attributes` | `bool` | if True, display the attributes of groups and datasets | `True` |

Returns:

| Type | Description |
|---|---|
| `str` | tree string |
Source code in src/hdfmap/hdf_loader.py
list_files(folder_directory, extension=DEFAULT_EXTENSION)
¶
Return a list of files in the directory with the given extension, as full file paths.
Source code in src/hdfmap/file_functions.py
load_hdf(hdf_filename, **kwargs)
¶
nexus_data_block(filenames, hdf_map=None, fixed_output=False)
¶
Create classic dict-like dataloader objects from nexus files.
E.G.

```python
d = nexus_data_block('filename')
d.scannable          # -> array
d.metadata.filename  # -> value
d.keys()             # -> list of items
```

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `filenames` | `str \| list[str]` | str or list of str - file paths | *required* |
| `hdf_map` | `HdfMap` | HdfMap object, or None to generate from first file | `None` |
| `fixed_output` | | if True, always returns a list of len(filenames) | `False` |

Returns:

| Type | Description |
|---|---|
| | list of len(filenames) |
Source code in src/hdfmap/file_functions.py
set_all_logging_level(level)
¶
Set the logging level of all loggers.
Logging levels (see the builtin logging module):

| Name | Value |
|---|---|
| 'notset' | 0 |
| 'debug' | 10 |
| 'info' | 20 |
| 'warning' | 30 |
| 'error' | 40 |
| 'critical' | 50 |

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `level` | `str \| int` | str level name or int level | *required* |

Returns:

| Type | Description |
|---|---|
| | None |