YAML Checker - Why use it?

The YAML checker validates your process list (see What is a process list?) saved as a YAML file. We highly recommend validating your process list with this utility before running your pipeline with HTTomo. The checker helps you identify errors in the process list and avoid problems during the run.

Usage

$ python -m httomo check YAML_CONFIG IN_DATA

Note

  • Use this check command before you use the run command to run your pipeline.

  • The YAML_CONFIG is the path to your YAML file and IN_DATA is the path to your input data.

  • IN_DATA is optional. If you provide it, the YAML checker also verifies that the paths to the data and keys in the YAML_CONFIG file match the paths and keys in the input file (IN_DATA).
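If you drive HTTomo from a script, the usage notes above can be wrapped in a small helper. The following is only a minimal sketch: the function name build_check_command is hypothetical and not part of HTTomo; only the `python -m httomo check YAML_CONFIG IN_DATA` form comes from this page.

```python
import sys
from typing import List, Optional


def build_check_command(yaml_config: str, in_data: Optional[str] = None) -> List[str]:
    """Build the argument list for `python -m httomo check`.

    Hypothetical helper, not part of HTTomo. It only assembles the command
    shown in the Usage section; it does not run it.
    """
    # sys.executable is the interpreter running this script, used in place of "python".
    command = [sys.executable, "-m", "httomo", "check", yaml_config]
    if in_data is not None:
        # IN_DATA is optional; when present the checker also compares paths and keys.
        command.append(in_data)
    return command


print(build_check_command("example.yaml")[1:])
```

The resulting list can be passed to subprocess.run so that the check happens before the run command is issued.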

For example, suppose you have the following YAML_CONFIG file saved as example.yaml:

- method: standard_tomo
  module_path: httomo.data.hdf.loaders
  parameters:
    data_path: entry1/tomo_entry/data/data
    image_key_path: entry1/tomo_entry/instrument/detector/image_key
    rotation_angles:
      data_path: /entry1/tomo_entry/data/rotation_angle
- method: normalize
  module_path: tomopy.prep.normalize
  parameters:
    cutoff: null
    averaging: mean
- method: minus_log
  module_path: tomopy.prep.normalize
  parameters: {}
- method: save_to_images
  module_path: httomolib.misc.images
  parameters:
    subfolder_name: images
    axis: auto
    file_format: tif
    bits: 8
    perc_range_min: 0.0
    perc_range_max: 100.0
    jpeg_quality: 95

And you run the YAML checker with:

$ python -m httomo check example.yaml

You will get the following output:

Checking that the YAML_CONFIG is properly indented and has valid mappings and tags...
Sanity check of the YAML_CONFIG was successfully done...

Checking that the first method in the pipeline is a loader...
Loader check successful!!


YAML validation successful!! Please feel free to use the `run` command to run the pipeline.

The YAML check succeeded here because your YAML file was properly indented and had valid mappings and tags. It also included valid parameters for each method used from TomoPy, HTTomolib, or other backends.
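One way to picture what "valid parameters" means is to compare the parameters mapping of a method against the signature of the function it names. The sketch below is an illustration under stated assumptions, not HTTomo's actual implementation: check_parameters and normalize_stand_in are hypothetical names, the stand-in signature is not TomoPy's real one, and supplied_by_framework models the assumption that the pipeline passes the data arrays itself.

```python
import inspect
from typing import Any, Callable, Dict, Iterable, List


def check_parameters(
    func: Callable[..., Any],
    parameters: Dict[str, Any],
    supplied_by_framework: Iterable[str] = (),
) -> List[str]:
    """Return a list of problems found when matching `parameters` to `func`."""
    problems: List[str] = []
    signature = inspect.signature(func)
    supplied = set(supplied_by_framework)

    for name, value in parameters.items():
        param = signature.parameters.get(name)
        if param is None:
            problems.append(f"unknown parameter: {name}")
            continue
        expected = param.annotation
        if expected is inspect.Parameter.empty or value is None:
            continue  # nothing to compare against, or a null value
        if expected is float:
            expected = (int, float)  # accept integers where a float is expected
        if isinstance(expected, (type, tuple)) and not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")

    for name, param in signature.parameters.items():
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue  # *args and **kwargs are never required
        required = param.default is inspect.Parameter.empty
        if required and name not in parameters and name not in supplied:
            problems.append(f"missing required parameter: {name}")
    return problems


def normalize_stand_in(arr, flat, dark, cutoff=None, averaging: str = "mean"):
    """Hypothetical stand-in so the example runs without TomoPy installed."""


data_args = ("arr", "flat", "dark")  # assumed to be provided by the pipeline
print(check_parameters(normalize_stand_in, {"cutoff": None, "averaging": "mean"}, data_args))
print(check_parameters(normalize_stand_in, {"averaging": 3}, data_args))
```

The first call prints an empty list and the second reports a wrong type for averaging. A real checker may rely on richer metadata than signatures, for example documented value ranges.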

However, suppose you had the following YAML_CONFIG file saved as incorrect_method.yaml:

- method: standard_tomo
  module_path: httomo.data.hdf.loaders
  parameters:
    data_path: entry1/tomo_entry/data/data
    image_key_path: entry1/tomo_entry/instrument/detector/image_key
    rotation_angles:
      data_path: /entry1/tomo_entry/data/rotation_angle
    preview:
      -
      - start: 30
        stop: 60
      -
- method: median_filters # incorrect method name
  module_path: tomopy.misc.corr
  parameters:
    size: tomo # incorrect size parameter
    axis: 0
- method: normalize
  module_path: tomopy.prep.normalize
  parameters:
    cutoff: null
    averaging: mean

If you then run the YAML checker, you get:

$ python -m httomo check incorrect_method.yaml
Checking that the YAML_CONFIG is properly indented and has valid mappings and tags...
Sanity check of the YAML_CONFIG was successfully done...

Checking that the first method in the pipeline is a loader...
Loader check successful!!

'tomopy.misc.corr/median_filters' is not a valid method. Please recheck the yaml file.

This is because median_filters is not a valid method in TomoPy; the correct name is median_filter. To make sure you pass the correct method name, refer to the documentation of the package you are using (TomoPy, HTTomolib, etc.).
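A lookup of this kind can be sketched with the standard library alone. The code below is an assumption about how such a check could work, not HTTomo's own code: find_method is a hypothetical name, and the standard-library statistics module is used as a stand-in so that the example runs without TomoPy installed.

```python
import importlib
from difflib import get_close_matches
from typing import Optional, Tuple


def find_method(module_path: str, method: str) -> Tuple[bool, Optional[str]]:
    """Report whether `method` is a callable in `module_path`.

    Returns (True, None) when the method exists, otherwise (False, suggestion),
    where suggestion is the closest public callable name or None.
    Raises ImportError if the module itself cannot be imported.
    """
    module = importlib.import_module(module_path)
    if callable(getattr(module, method, None)):
        return True, None
    candidates = [
        name
        for name in dir(module)
        if not name.startswith("_") and callable(getattr(module, name))
    ]
    matches = get_close_matches(method, candidates, n=1)
    return False, (matches[0] if matches else None)


# Stand-in example: "medians" is a misspelling of statistics.median.
print(find_method("statistics", "median"))   # (True, None)
print(find_method("statistics", "medians"))  # (False, 'median')
```

With TomoPy installed, calling find_method("tomopy.misc.corr", "median_filters") in the same way should report the method as missing; the suggestion it returns depends on the names that module exposes.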

What else do we check with the YAML checker?

  • We do a sanity check first, to make sure that the YAML_CONFIG is properly indented and has valid mappings.

For instance, we cannot have the following in a YAML file:

- method: standard_tomo
  module_path: httomo.data.hdf.loaders
  parameters:
      data_path: /entry1/tomo_entry/data/data
    image_key_path: /entry1/tomo_entry/instrument/detector/image_key
    rotation_angles:
      data_path: /entry1/tomo_entry/data/rotation_angle
    preview: [None, None, None]

This will raise a warning because data_path is not at the same indentation level as the other fields directly under the parameters field.

  • We check that the first method in the pipeline is always a loader from 'httomo.data.hdf.loaders'.

  • We check that the methods exist for the given module path.

  • We check that the parameters for each method are valid. For example, the find_center_vo method from tomopy.recon.rotation takes ratio as a parameter with a float value. If you pass a string instead, the checker raises an error. As before, always refer to the documentation of the package you are using.

  • We check that the required parameters for each method are present.

  • You can pass IN_DATA (the path to the data) along with the YAML config:

$ python -m httomo check config.yaml IN_DATA

The checker then verifies that the paths to the data and keys in the YAML_CONFIG file match the paths and keys in the input file (IN_DATA).

Suppose you have the following loader in your YAML file:

- method: standard_tomo
  module_path: httomo.data.hdf.loaders
  parameters:
    data_path: entry1/tomo_entry/data/data
    image_key_path: entry1/tomo_entry/instrument/detect/image_key # incorrect path
    rotation_angles:
      data_path: /entry1/tomo_entry/data/rotation_angle
    preview: [None, None, None]

If you provide this file together with the standard tomo data, the checker raises an error because the image key path does not match:

Checking that the YAML_CONFIG is properly indented and has valid mappings and tags...
Sanity check of the YAML_CONFIG was successfully done...

Checking that the first method in the pipeline is a loader...
Loader check successful!!

Checking that the paths to the data and keys in the YAML_CONFIG file match the paths and keys in the input file (IN_DATA)...
'entry1/tomo_entry/instrument/detect/image_key' is not a valid path to a dataset in YAML_CONFIG. Please recheck the yaml file.
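The path comparison behind this message can be illustrated without opening a data file. In the sketch below, collect_paths and missing_paths are hypothetical names, and datasets_in_file is an assumed listing of the datasets in the input file (with h5py, such a listing could be gathered using Group.visititems). Leading slashes are stripped on both sides because the loader examples on this page use both forms.

```python
from typing import Any, Dict, Iterable, List


def collect_paths(parameters: Dict[str, Any]) -> List[str]:
    """Collect every string value stored under a key ending in '_path'."""
    found: List[str] = []
    for key, value in parameters.items():
        if isinstance(value, dict):
            found.extend(collect_paths(value))  # e.g. the rotation_angles mapping
        elif key.endswith("_path") and isinstance(value, str):
            found.append(value)
    return found


def missing_paths(parameters: Dict[str, Any], dataset_paths: Iterable[str]) -> List[str]:
    """Return the configured paths that do not name a dataset in the file."""
    available = {path.strip("/") for path in dataset_paths}
    return [path for path in collect_paths(parameters) if path.strip("/") not in available]


# Loader parameters copied from the example above.
loader_parameters = {
    "data_path": "entry1/tomo_entry/data/data",
    "image_key_path": "entry1/tomo_entry/instrument/detect/image_key",
    "rotation_angles": {"data_path": "/entry1/tomo_entry/data/rotation_angle"},
    "preview": [None, None, None],
}

# Assumed contents of the input file, for illustration only.
datasets_in_file = [
    "/entry1/tomo_entry/data/data",
    "/entry1/tomo_entry/data/rotation_angle",
    "/entry1/tomo_entry/instrument/detector/image_key",
]

print(missing_paths(loader_parameters, datasets_in_file))
# ['entry1/tomo_entry/instrument/detect/image_key']
```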

We have many other checks and are constantly improving the YAML checker to make it more robust, verbose, and user-friendly. This is a user interface, so suggestions are always welcome.