Mount the Filesystem
Preface
This guide describes how to mount a session directory of the Diamond filesystem into a workflow Task. Filesystem access is achieved by performing a hostPath mount on your task container.
HostPath Mount
Session directories can be accessed using a hostPath volume mount. To configure this we must:
- Declare a hostPath volume by creating a spec.templates[].volumes entry where the path points to the session directory we intend to mount and the type is set to Directory, as:

  spec:
    templates:
      - volumes:
          - name: session
            hostPath:
              path: /dls/i03/data/2024/cm37235-2
              type: Directory

- Mount the volume by creating a spec.templates[].container.volumeMounts entry which selects the volume by name and declares a mountPath at which the volume will appear, as:

  spec:
    templates:
      - container:
          volumeMounts:
            - name: session
              mountPath: /dls/session

GPFS Access
Access configured this way goes via the Network File System (NFS). See the GPFS section if you require greater bandwidth or consistency.
NFS Access
A Workflow executing a busybox instance which prints the directory tree at /dls/session is shown below:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: hostpath-mount
spec:
  entrypoint: hostpath-mount-example
  templates:
    - name: hostpath-mount-example
      container:
        image: docker.io/library/busybox:latest
        command:
          - tree
          - /dls/session
        volumeMounts:
          - name: session
            mountPath: /dls/session
      volumes:
        - name: session
          hostPath:
            path: /dls/i03/data/2024/cm37235-2
            type: Directory
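As an optional variation on the Workflow above, a task that only reads from the session directory could mark the mount as read-only. This is a minimal sketch and not something this guide requires; readOnly is a standard Kubernetes volumeMounts field, and the rest of the fragment repeats the mount already shown.

```yaml
spec:
  templates:
    - container:
        volumeMounts:
          - name: session
            mountPath: /dls/session
            # Optional: the container cannot write through this mount.
            readOnly: true
```

With this set, any attempt by the task to write under /dls/session fails inside the container, which can guard against accidental changes to session data.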
GPFS
Some cluster nodes allow access to the filesystem via the General Parallel File System (GPFS), which offers greater bandwidth and consistency than nodes using NFS. These nodes are marked with a set of taints and labels. Hence, to schedule our workflow on a node with GPFS filesystem access, we must:
- Specify a toleration for the taint in spec.templates[].tolerations[] with a key of nodetype, an operator of Equal, and a value of cs05r_gpfs, as:

  spec:
    templates:
      - tolerations:
          - key: nodetype
            operator: Equal
            value: cs05r_gpfs

Reserved Nodes
You may need to apply an additional toleration for science_group in order to access a node with GPFS filesystem access. Please only do so if you have been given appropriate permission.
- Provide an affinity for nodes which have GPFS by adding a label selector to spec.templates[].affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[].matchExpressions[] with a key of has_gpfs03 or has_gpfs04, an operator of In, and with an entry in values of true, as:

  spec:
    templates:
      - affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: has_gpfs03
                      operator: In
                      values:
                        - "true"

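The affinity step names two possible keys, has_gpfs03 and has_gpfs04. If either would serve the task (an assumption; which one holds a given session directory is not stated here), one way to accept both is to list two nodeSelectorTerms. Kubernetes treats separate terms as alternatives, whereas multiple matchExpressions inside a single term must all match. A minimal sketch:

```yaml
spec:
  templates:
    - affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              # Terms are ORed: a node matching either one is eligible.
              - matchExpressions:
                  - key: has_gpfs03
                    operator: In
                    values:
                      - "true"
              - matchExpressions:
                  - key: has_gpfs04
                    operator: In
                    values:
                      - "true"
```

Placing both keys in the same matchExpressions list would instead require a node to carry both labels.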
Prefer GPFS
If GPFS is not a strict requirement for your task but is preferred, you can use preferredDuringSchedulingIgnoredDuringExecution, as:
spec:
  templates:
    - affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            # Kubernetes requires a weight (1-100) on each preferred term; 1 is an arbitrary choice here.
            - weight: 1
              preference:
                matchExpressions:
                  - key: has_gpfs03
                    operator: In
                    values:
                      - "true"
GPFS Access
A Workflow executing a busybox instance which prints the directory tree at /dls/session with access via GPFS is shown below:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: hostpath-mount
spec:
  entrypoint: hostpath-mount-example
  templates:
    - name: hostpath-mount-example
      container:
        image: docker.io/library/busybox:latest
        command:
          - tree
          - /dls/session
        volumeMounts:
          - name: session
            mountPath: /dls/session
      volumes:
        - name: session
          hostPath:
            path: /dls/i03/data/2024/cm37235-2
            type: Directory
      tolerations:
        - key: nodetype
          operator: Equal
          value: cs05r_gpfs
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: has_gpfs03
                    operator: In
                    values:
                      - "true"
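Where the reserved-node case mentioned earlier applies and permission has been granted, the tolerations list in the Workflow above might be extended roughly as follows. This is only a sketch: the guide names the science_group key but not its values, so example-group is a hypothetical placeholder, and the Equal operator is likewise an assumption.

```yaml
spec:
  templates:
    - tolerations:
        - key: nodetype
          operator: Equal
          value: cs05r_gpfs
        # Hypothetical value: replace example-group with the group you
        # have been given permission to use.
        - key: science_group
          operator: Equal
          value: example-group
```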