Run SmartEM Agent (EPU Agent)#

The SmartEM Agent is a data collection service that monitors the output directories written by EPU (Thermo Fisher Scientific's automated cryo-EM acquisition software) and sends acquisition data to the backend service in real time.

Overview#

The EPU agent runs on EPU workstations, either as a Python script or as a bundled Windows binary. EPU workstations are typically Windows machines isolated from the main network; the connectivity they do have goes through a proxy and is restricted by an allow-list. The primary purpose of the EPU agent is to parse EPU software output from the filesystem and communicate data and events to the core backend component.

An EPU data directory is generated by the closed-source EPU software and represents one acquisition session on a cryo-electron microscope. The SmartEM Agent provides the following capabilities for processing this data:

Core Capabilities#

  • Real-time monitoring: Watch EPU directories for file changes during active acquisitions

  • Comprehensive parsing: Extract data from all EPU file types (sessions, atlases, grid squares, foil holes, micrographs)

  • Data validation: Verify EPU directory structure and completeness

  • Backend integration: Communicate with SmartEM backend via REST API and Server-Sent Events (SSE)

  • Connection health: Automatic heartbeat monitoring for reliable data transmission

  • Flexible deployment: Run in development, testing, or production modes

Agent Modes#

The agent operates in one of three modes, depending on when it is launched relative to EPU data acquisition:

  1. Pre-acquisition mode: Watcher launched before EPU starts writing - real-time monitoring only

  2. Mid-acquisition mode: Watcher launched after EPU starts writing - combines parsing existing files with real-time monitoring

  3. Post-acquisition mode: Watcher launched after EPU finishes - parses complete dataset then monitors for changes

Quick Start#

For comprehensive parameter documentation, see the CLI Reference. For troubleshooting, see the CLI Troubleshooting Guide.

Basic Directory Monitoring#

# Monitor a directory with default settings
python -m smartem_agent watch /path/to/epu/data

# Monitor with verbose output
python -m smartem_agent watch /path/to/epu/data --verbose

# Dry run for testing (no API calls)
python -m smartem_agent watch /path/to/epu/data --dry-run --verbose

Production Deployment with Backend Integration#

# Full production setup with real-time communication
python -m smartem_agent watch /data/microscopy/active_session \
  --api-url https://smartem-backend.facility.ac.uk \
  --agent-id microscope-titan-01 \
  --session-id session-20240115-001 \
  --heartbeat-interval 45 \
  --verbose

Command Categories#

The SmartEM Agent CLI is organised into three main command categories:

1. Parse Commands#

Extract and analyse data from EPU files without backend communication. Useful for development, debugging, and data validation.

2. Validate Commands#

Check EPU directory structure for completeness and compliance with expected formats.

3. Watch Commands#

Monitor directories in real-time for file changes with full backend integration.

Parsing Operations#

Parse commands extract and analyse data from EPU files without communicating with the backend API. These commands are ideal for development, debugging, data validation, and understanding EPU data structures.

Complete Directory Parsing#

Parse entire EPU directories containing multiple grids or complete acquisition sessions:

# Parse complete EPU session directory
python -m smartem_agent parse dir \
  ../smartem-decisions-test-datasets/metadata_Supervisor_20250108_101446_62_cm40593-1_EPU

# Parse different session types
python -m smartem_agent parse dir \
  ../smartem-decisions-test-datasets/metadata_Supervisor_20250114_220855_23_epuBSAd20_GrOxDDM \
  --verbose

python -m smartem_agent parse dir \
  ../smartem-decisions-test-datasets/metadata_Supervisor_20241220_140307_72_et2_gangshun \
  --verbose --verbose  # Debug level output

Individual Component Parsing#

Parse specific EPU file types to understand data structures and debug issues:

Session Files#

# Parse EPU session manifest
python -m smartem_agent parse session \
  ../smartem-decisions-test-datasets/bi37708-28-copy/Supervisor_20250129_134723_36_bi37708-28_grid7_EPU/EpuSession.dm \
  --verbose

Atlas Files#

# Parse atlas overview data
python -m smartem_agent parse atlas \
  ../smartem-decisions-test-datasets/bi37708-28-copy/atlas/Supervisor_20250129_111544_bi37708-28_atlas/Atlas/Atlas.dm \
  --verbose

Grid Square Files#

# Parse grid square manifest (XML format)
python -m smartem_agent parse gridsquare \
  ../smartem-decisions-test-datasets/epu-Supervisor_20250404_164354_31_EPU_nr27313-442/metadata_Supervisor_20250404_164354_31_EPU_nr27313-442/Images-Disc1/GridSquare_3568837/GridSquare_20250404_171012.xml

# Parse grid square metadata (DM format)
python -m smartem_agent parse gridsquare-metadata \
  ../smartem-decisions-test-datasets/epu-Supervisor_20250404_164354_31_EPU_nr27313-442/metadata_Supervisor_20250404_164354_31_EPU_nr27313-442/Metadata/GridSquare_3568837.dm \
  --verbose

# Alternative dataset example
python -m smartem_agent parse gridsquare-metadata \
  ./tests/testdata/bi37708-28/Supervisor_20250129_134723_36_bi37708-28_grid7_EPU/Metadata/GridSquare_29273435.dm

Foil Hole Files#

# Parse foil hole positioning data
python -m smartem_agent parse foilhole \
  tests/testdata/epu-dir-example/Images-Disc1/GridSquare_8999138/FoilHoles/FoilHole_9015889_20250108_154725.xml \
  --verbose

# Parse micrograph data acquired from a foil hole (stored under the grid square's Data directory)
python -m smartem_agent parse micrograph \
  ../smartem-decisions-test-datasets/epu-Supervisor_20250404_164354_31_EPU_nr27313-442/metadata_Supervisor_20250404_164354_31_EPU_nr27313-442/Images-Disc1/GridSquare_3568837/Data/FoilHole_3595930_Data_3590445_56_20250405_084025.xml
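
The examples above pair each parse subcommand with a particular kind of file. As a rough illustration, the sketch below maps a file path to the subcommand that would apply. The filename patterns are inferred only from the example paths on this page, and the function name parse_subcommand_for is hypothetical; the agent's own dispatch logic may differ.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Patterns inferred from the example paths on this page (an assumption,
# not the agent's actual rules). Order matters: the micrograph pattern
# is checked before the more general foil hole pattern.
_RULES = [
    ("session", re.compile(r"^EpuSession\.dm$")),
    ("atlas", re.compile(r"^Atlas\.dm$")),
    ("gridsquare-metadata", re.compile(r"^GridSquare_\d+\.dm$")),
    ("gridsquare", re.compile(r"^GridSquare_\d{8}_\d{6}\.xml$")),
    ("micrograph", re.compile(r"^FoilHole_\d+_Data_\d+_\d+_\d{8}_\d{6}\.xml$")),
    ("foilhole", re.compile(r"^FoilHole_\d+_\d{8}_\d{6}\.xml$")),
]


def parse_subcommand_for(path: str) -> Optional[str]:
    """Return the parse subcommand matching the file name, or None."""
    # Normalise Windows separators so the same code works on both platforms.
    name = PurePosixPath(path.replace("\\", "/")).name
    for subcommand, pattern in _RULES:
        if pattern.match(name):
            return subcommand
    return None
```

A helper like this can be handy when scripting over a dataset, for example to print which subcommand to try for each file found by a directory walk.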

Validation Operations#

Validation commands check EPU directory structure for completeness and compliance with expected formats.

Examples with Expected Outcomes#

Invalid Directories (Expected to Fail)#

These examples demonstrate directories with structural issues:

# Incomplete or malformed directories
python -m smartem_agent validate \
  ../smartem-decisions-test-datasets/bi37708-28-copy/Supervisor_20250129_114842_73_bi37708-28_grid7_EPU \
  --verbose

python -m smartem_agent validate \
  ../smartem-decisions-test-datasets/bi37708-28-copy/Supervisor_20250130_105058_11 \
  --verbose

python -m smartem_agent validate \
  ../smartem-decisions-test-datasets/bi37708-28-copy/Supervisor_20250130_145409_68

python -m smartem_agent validate \
  ../smartem-decisions-test-datasets/bi37708-28-copy/Supervisor_20250130_150924_1grid3

Valid Directories (Expected to Pass)#

These examples show properly structured EPU directories:

# Complete, well-formed directories
python -m smartem_agent validate \
  ../smartem-decisions-test-datasets/bi37708-28-copy/Supervisor_20250129_134723_36_bi37708-28_grid7_EPU \
  --verbose

python -m smartem_agent validate \
  ../smartem-decisions-test-datasets/bi37708-28-copy/Supervisor_20250130_133418_68apoferritin \
  --verbose

python -m smartem_agent validate \
  ../smartem-decisions-test-datasets/bi37708-28-copy/Supervisor_20250130_143856_44Practice

python -m smartem_agent validate \
  ../smartem-decisions-test-datasets/bi37708-28-copy/Supervisor_20250130_145409_68practice2

Understanding Validation Results#

Successful validation returns exit code 0 and confirms the directory structure is valid:

EPU project dir is structurally valid

Failed validation returns exit code 1 and lists specific issues:

Invalid EPU project dir. Found the following issues:
- Missing required file: EpuSession.dm
- Invalid directory structure: Images-Disc1 not found
- Incomplete atlas data: Atlas/Atlas.dm missing
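
When validation is scripted (for example, to check many directories in one go), the exit code and the issue list can be consumed programmatically. The following is a minimal sketch that relies only on the behaviour described above: exit code 0 on success, exit code 1 on failure, and issues printed as lines starting with "- ". The helper names are hypothetical, and reading both stdout and stderr is an assumption, since this page does not say which stream the issues are written to.

```python
import subprocess
import sys
from typing import List, Tuple


def extract_issues(output: str) -> List[str]:
    """Collect the '- ' prefixed issue lines from validate output."""
    issues = []
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            issues.append(stripped[2:].strip())
    return issues


def run_validate(epu_dir: str) -> Tuple[bool, List[str]]:
    """Run the validate command and return (is_valid, issues)."""
    result = subprocess.run(
        [sys.executable, "-m", "smartem_agent", "validate", epu_dir],
        capture_output=True,
        text=True,
    )
    # Assumption: issues may appear on either stream, so inspect both.
    combined = result.stdout + "\n" + result.stderr
    return result.returncode == 0, extract_issues(combined)
```

Calling run_validate on each candidate directory and collecting the results gives a quick pass/fail summary without reading the output by hand.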

Real-Time Monitoring (Watch Operations)#

The watch command provides real-time monitoring of EPU directories, automatically processing new files and communicating with the SmartEM backend. This is the primary operational mode for production deployments.

Basic Watch Operations#

# Simple directory monitoring (development mode)
python -m smartem_agent watch ../test-dir --log-file output.log

# Monitor with detailed logging
python -m smartem_agent watch /data/microscopy/active_session \
  --log-file /var/log/smartem/session.log \
  --log-interval 5.0 \
  --verbose

# Dry run for testing (no backend communication)
python -m smartem_agent watch ../test-dir \
  --dry-run \
  --verbose --verbose

Production Monitoring with Backend Integration#

# Full production deployment
python -m smartem_agent watch /data/microscopy/active_session \
  --api-url https://smartem-backend.facility.ac.uk \
  --agent-id microscope-titan-01 \
  --session-id session-20240115-001 \
  --heartbeat-interval 45 \
  --sse-timeout 60 \
  --log-interval 10.0 \
  --verbose

# High-frequency monitoring setup
python -m smartem_agent watch /data/high_throughput \
  --api-url http://backend:8000 \
  --agent-id facility-workstation-03 \
  --session-id batch-processing-session \
  --heartbeat-interval 30 \
  --log-interval 5.0 \
  --sse-timeout 120

Watch Operation Modes#

The watch command is designed to gracefully handle different timing scenarios relative to EPU data acquisition:

  1. Pre-acquisition mode: Watcher launched before EPU starts writing

    • Only real-time monitoring is necessary

    • Files are processed as they are created

    • Most efficient mode for active acquisitions

  2. Mid-acquisition mode: Watcher launched after EPU starts writing

    • Combines initial parsing of existing files with real-time monitoring

    • Automatically detects and processes pre-existing data

    • Seamlessly transitions to real-time monitoring

  3. Post-acquisition mode: Watcher launched after EPU finishes

    • Parses complete dataset then monitors for any changes

    • Useful for processing archived or completed datasets

    • Continues monitoring for potential updates
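
One way to see why a single watcher can cover all three scenarios is to treat the initial parse as a comparison against an empty starting state. The sketch below uses plain polling with directory snapshots. It is only an illustration under that assumption, not a description of how the agent detects changes internally, and the function names are hypothetical.

```python
import os
from typing import Dict, Set, Tuple


def snapshot(root: str) -> Dict[str, float]:
    """Map each file under root (path relative to root) to its mtime."""
    seen: Dict[str, float] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                seen[os.path.relpath(full, root)] = os.path.getmtime(full)
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue
    return seen


def diff_snapshots(
    previous: Dict[str, float], current: Dict[str, float]
) -> Tuple[Set[str], Set[str]]:
    """Return (created, modified) relative paths between two snapshots."""
    created = set(current) - set(previous)
    modified = {
        path
        for path, mtime in current.items()
        if path in previous and previous[path] != mtime
    }
    return created, modified
```

Starting with an empty previous snapshot makes every pre-existing file show up as created on the first pass, which corresponds to the mid- and post-acquisition cases. In the pre-acquisition case the first snapshot is simply empty, and later passes pick up files as EPU writes them.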

Testing with Simulated EPU Data#

For development and testing, use the fsrecorder tool to simulate realistic EPU file writing patterns:

Recording EPU Patterns#

# Record filesystem events from an existing EPU dataset
python tools/fsrecorder/fsrecorder.py record \
  ../smartem-decisions-test-datasets/epu-Supervisor_20250326_145351_30_nt33824-10_grid2_1in5dil \
  ../test-recording.tar.gz

Replaying EPU Patterns#

# Replay recorded events with accelerated timing
python tools/fsrecorder/fsrecorder.py replay \
  ../test-recording.tar.gz \
  ../test-dir \
  --fast

# Monitor the replayed data
python -m smartem_agent watch ../test-dir \
  --dry-run \
  --verbose --verbose \
  --log-interval 2.0

Quick Testing Alternative#

# For rapid testing, copy data manually (less realistic timing)
cp -r "../smartem-decisions-test-datasets/epu-Supervisor_20250326_145351_30_nt33824-10_grid2_1in5dil/"* ../test-dir/

# Monitor the copied data
python -m smartem_agent watch ../test-dir --dry-run

Important Considerations#

Critical File: The EpuSession.dm file is essential for proper operation as it:

  • Provides references to atlas data

  • Triggers instantiation of new grid entities in the datastore

  • Contains acquisition metadata required for processing

Missing EpuSession.dm: If this file is absent, grids cannot be instantiated properly and the associated data will not be processed correctly.
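
When working with copied or archived datasets, it can help to confirm that the session manifest is present before starting the watcher. A minimal sketch follows, assuming the file is named exactly EpuSession.dm as in the example paths on this page; the function name is hypothetical. For a live acquisition launched in pre-acquisition mode the file will not exist yet, so an empty result is expected there.

```python
from pathlib import Path
from typing import List


def find_session_manifests(root: str) -> List[Path]:
    """Return every EpuSession.dm found under root, sorted by path."""
    # rglob searches root and all of its subdirectories.
    return sorted(Path(root).rglob("EpuSession.dm"))
```

An empty list for a completed dataset suggests the copy is incomplete and worth re-checking before monitoring it.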

fsrecorder Tool: The tools/fsrecorder/ utility provides accurate simulation of EPU file writing patterns with proper timing and ordering, making it ideal for development and testing scenarios.

Real-Time Communication Features#

When the --agent-id and --session-id parameters are supplied, the agent establishes real-time communication with the backend:

  • Server-Sent Events (SSE): Receives instructions and commands from the backend

  • Heartbeat Monitoring: Sends periodic heartbeats to maintain connection health

  • Automatic Reconnection: Handles connection failures with exponential backoff

  • Instruction Acknowledgment: Confirms receipt and processing of backend instructions
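
For readers unfamiliar with exponential backoff, the sketch below shows the general idea: the wait between reconnection attempts grows by a constant factor until it reaches a ceiling. The base, factor and cap values are illustrative assumptions; this page does not state the values the agent uses.

```python
from itertools import islice
from typing import Iterator


def backoff_delays(
    base: float = 1.0, factor: float = 2.0, cap: float = 60.0
) -> Iterator[float]:
    """Yield reconnection delays that grow geometrically up to cap."""
    delay = base
    while True:
        yield min(delay, cap)
        # Clamp here as well so the value cannot grow without bound.
        delay = min(delay * factor, cap)


# First six delays with a 10 second ceiling (illustrative values only).
first_six = list(islice(backoff_delays(base=1.0, factor=2.0, cap=10.0), 6))
```

With these illustrative values, first_six is [1.0, 2.0, 4.0, 8.0, 10.0, 10.0]. Real implementations often add random jitter so that many clients do not reconnect at the same moment.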

Performance Tuning#

Adjust parameters based on your deployment requirements:

  • High-frequency acquisitions: Lower --log-interval (1-5 seconds), lower --heartbeat-interval (15-30 seconds)

  • Stable networks: Standard --sse-timeout (30-60 seconds)

  • Unstable networks: Higher --sse-timeout (120+ seconds), higher --heartbeat-interval (60+ seconds)

  • Development/testing: Use --dry-run to avoid backend communication
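
If several workstations need different tuning, the command line can be assembled from named profiles instead of being typed by hand. The sketch below uses only the flags shown on this page; the function name, the profile names and the specific values (picked from the ranges above) are illustrative assumptions.

```python
import sys
from typing import Dict, List, Optional

# Illustrative profiles based on the ranges suggested above.
PROFILES: Dict[str, Dict[str, float]] = {
    "high-frequency": {"heartbeat_interval": 30, "log_interval": 5.0},
    "unstable-network": {"heartbeat_interval": 60, "sse_timeout": 120},
}


def build_watch_command(
    data_dir: str,
    *,
    api_url: Optional[str] = None,
    agent_id: Optional[str] = None,
    session_id: Optional[str] = None,
    heartbeat_interval: Optional[float] = None,
    sse_timeout: Optional[float] = None,
    log_interval: Optional[float] = None,
    dry_run: bool = False,
    verbosity: int = 0,
) -> List[str]:
    """Assemble an argument list for the watch command."""
    cmd = [sys.executable, "-m", "smartem_agent", "watch", data_dir]
    options = {
        "--api-url": api_url,
        "--agent-id": agent_id,
        "--session-id": session_id,
        "--heartbeat-interval": heartbeat_interval,
        "--sse-timeout": sse_timeout,
        "--log-interval": log_interval,
    }
    for flag, value in options.items():
        if value is not None:  # only pass options that were set
            cmd += [flag, str(value)]
    if dry_run:
        cmd.append("--dry-run")
    cmd += ["--verbose"] * verbosity  # repeat the flag for more detail
    return cmd
```

For example, build_watch_command("/data/high_throughput", **PROFILES["high-frequency"]) returns a list that can be passed directly to subprocess.run.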