Mount the Filesystem
Preface
This guide describes how to mount a session directory from the Diamond filesystem into a workflow Task. Filesystem access is achieved by performing a hostPath mount on your task container.
HostPath Mount
Session directories can be accessed using a hostPath volume mount. To configure this we must:
- Declare a hostPath volume by creating a spec.templates[].volumes entry where the path points to the session directory we intend to mount and the type is set to Directory, as:

    spec:
      templates:
        - volumes:
            - name: session
              hostPath:
                path: /dls/i03/data/2024/cm37235-2
                type: Directory

- Mount the volume by creating a spec.templates[].container.volumeMounts entry which selects the volume by name and declares a mountPath at which the volume will appear, as:

    spec:
      templates:
        - container:
            volumeMounts:
              - name: session
                mountPath: /dls/session

GPFS Access
This access will be via the Network File System (NFS). See the GPFS section if you require greater bandwidth or consistency.
NFS Access
A Workflow executing a busybox instance which prints the directory tree at /dls/session is shown below:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: hostpath-mount
spec:
  entrypoint: hostpath-mount-example
  templates:
    - name: hostpath-mount-example
      container:
        image: docker.io/library/busybox:latest
        command:
          - tree
          - /dls/session
        volumeMounts:
          - name: session
            mountPath: /dls/session
      volumes:
        - name: session
          hostPath:
            path: /dls/i03/data/2024/cm37235-2
            type: Directory
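If a task only needs to read session data, the mount can be marked read-only so that the container cannot modify files under the session directory. The fragment below is a minimal sketch reusing the session volume from the example above. readOnly is a standard Kubernetes volumeMounts field, but whether a read-only mount suits your task is an assumption you should check, not something this guide prescribes.

```yaml
spec:
  templates:
    - container:
        volumeMounts:
          - name: session
            mountPath: /dls/session
            # Assumption: the task never writes to the session directory.
            readOnly: true
```

Tasks that write results back to the session directory should omit readOnly, which defaults to false.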
GPFS
Some cluster nodes access the filesystem via the General Parallel File System (GPFS), which offers greater bandwidth and consistency than nodes using NFS. These nodes are marked with a set of taints and labels. Hence, to schedule our workflow on a node with GPFS filesystem access, we must:
- Specify a toleration for the taint in spec.templates[].tolerations[] with a key of nodetype, an operator of Equal, and a value of cs05r_gpfs, as:

    spec:
      templates:
        - tolerations:
            - key: nodetype
              operator: Equal
              value: cs05r_gpfs

Reserved Nodes
You may need to apply an additional toleration for science_group in order to access a node with GPFS filesystem access. Please only do so if you have been given appropriate permission.
- Provide an affinity for nodes which have GPFS by adding a label selector to spec.templates[].affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[].matchExpressions[] with a key of has_gpfs03 or has_gpfs04, an operator of In, and an entry in values of true, as:

    spec:
      templates:
        - affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: has_gpfs03
                        operator: In
                        values:
                          - "true"

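The affinity step names two labels, has_gpfs03 and has_gpfs04. Kubernetes treats multiple nodeSelectorTerms as alternatives (a node need only satisfy one of them), whereas multiple matchExpressions inside a single term must all match. The fragment below is a sketch of accepting a node that carries either label. It assumes that either GPFS filesystem is acceptable for your task, which is an assumption about your data rather than something this guide states.

```yaml
spec:
  templates:
    - affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              # Terms are ORed: a node matching either term is eligible.
              - matchExpressions:
                  - key: has_gpfs03
                    operator: In
                    values:
                      - "true"
              - matchExpressions:
                  - key: has_gpfs04
                    operator: In
                    values:
                      - "true"
```

Placing both expressions under one matchExpressions list would instead require a node to carry both labels.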
Prefer GPFS
If GPFS is not a strict requirement for your task but is preferred, you can use preferredDuringSchedulingIgnoredDuringExecution, as:
spec:
  templates:
    - affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            # Kubernetes requires a weight (1-100) on each preferred term; 1 is an illustrative value.
            - weight: 1
              preference:
                matchExpressions:
                  - key: has_gpfs03
                    operator: In
                    values:
                      - "true"
GPFS Access
A Workflow executing a busybox instance which prints the directory tree at /dls/session with access via GPFS is shown below:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: hostpath-mount
spec:
  entrypoint: hostpath-mount-example
  templates:
    - name: hostpath-mount-example
      container:
        image: docker.io/library/busybox:latest
        command:
          - tree
          - /dls/session
        volumeMounts:
          - name: session
            mountPath: /dls/session
      volumes:
        - name: session
          hostPath:
            path: /dls/i03/data/2024/cm37235-2
            type: Directory
      tolerations:
        - key: nodetype
          operator: Equal
          value: cs05r_gpfs
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: has_gpfs03
                    operator: In
                    values:
                      - "true"
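For tasks that prefer GPFS but can fall back to NFS, the toleration and a preferred affinity can be combined in one Workflow. The manifest below is a sketch under stated assumptions: the names hostpath-mount-prefer-gpfs and hostpath-mount-prefer-gpfs-example and the weight of 1 are illustrative and do not come from this guide. The toleration only permits scheduling onto the tainted GPFS nodes, while the preferred affinity steers the scheduler towards them without making them mandatory.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: hostpath-mount-prefer-gpfs  # hypothetical name
spec:
  entrypoint: hostpath-mount-prefer-gpfs-example
  templates:
    - name: hostpath-mount-prefer-gpfs-example  # hypothetical name
      container:
        image: docker.io/library/busybox:latest
        command:
          - tree
          - /dls/session
        volumeMounts:
          - name: session
            mountPath: /dls/session
      volumes:
        - name: session
          hostPath:
            path: /dls/i03/data/2024/cm37235-2
            type: Directory
      # Allow, but do not force, scheduling onto tainted GPFS nodes.
      tolerations:
        - key: nodetype
          operator: Equal
          value: cs05r_gpfs
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            # weight is required (1-100); 1 is an assumed, illustrative value.
            - weight: 1
              preference:
                matchExpressions:
                  - key: has_gpfs03
                    operator: In
                    values:
                      - "true"
```

If no GPFS node is available, a Workflow written this way may still be scheduled onto an NFS node, so it should not be used where GPFS bandwidth or consistency is essential.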