Task drivers in ESTEEM

The task drivers in ESTEEM provide a means of looping over multiple sets of arguments for a given task, such that a set of calculations can be performed consistently over, for example, many combinations of molecule and solvent, or many different temperatures, or a range of cluster sizes.

There is also a main driver, drivers.main(), which determines which of the other drivers should be run, based on the command-line arguments. Invoking this main driver is the standard way of using ESTEEM in scripts.

However, for more advanced functionality, you can call the other drivers manually from your scripts. This page documents the main driver and the individual task drivers.

Drivers to run ESTEEM tasks in serial or parallel on a range of inputs.

These inputs will consist of solutes, solvents or pairs of solutes and solvent, depending on the task.
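As a rough illustration of the looping idea described above, the following sketch applies one task function to every solute/solvent combination. It is not ESTEEM's implementation: run_for_all_pairs, fake_task, the temperature field and the short names are hypothetical and exist only for this example.

```python
from itertools import product
from types import SimpleNamespace


def run_for_all_pairs(task_fn, base_args, all_solutes, all_solvents):
    """Call task_fn once per (solute, solvent) pair with a fresh copy of the arguments."""
    results = {}
    for solute, solvent in product(all_solutes, all_solvents):
        # Build a new namespace per pair so one run cannot alter another's settings.
        args = SimpleNamespace(**{**vars(base_args), "solute": solute, "solvent": solvent})
        results[(solute, solvent)] = task_fn(args)
    return results


def fake_task(args):
    """Stand-in for a real calculation: report what would be run."""
    return f"{args.solute} in {args.solvent} at {args.temperature} K"


# Illustrative inputs: keys are short names, entries are full names.
all_solutes = {"cate": "catechol"}
all_solvents = {"watr": "water", "meth": "methanol"}
base_args = SimpleNamespace(temperature=300)

results = run_for_all_pairs(fake_task, base_args, all_solutes, all_solvents)
```

Building a new namespace for each pair keeps the shared settings consistent across the whole set of calculations while letting each run carry its own solute and solvent.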

esteem.drivers.main(all_solutes, all_solvents, all_solutes_tasks={}, all_solvate_tasks={}, all_clusters_tasks={}, all_spectra_tasks={}, all_qmd_tasks={}, all_mltrain_tasks={}, all_mltest_tasks={}, all_mltraj_tasks={}, make_script=<function make_sbatch>)

Main driver routine which chooses other tasks to run as appropriate.

Control of what driver is actually called depends on command-line arguments with which the script was invoked: ‘python <seed>.py <task> <seed> <target>’

all_solutes: dict of strings: Keys are shortnames of the solutes. Entries are the full names.
all_solvents: dict of strings: Keys are shortnames of the solvents. Entries are the full names.
all_solutes_tasks: namespace or class: Argument list for the Solutes task - see solutes_driver documentation below for more detail.
all_solvate_tasks: namespace or class: Argument list for the Solvate task - see solvate_driver documentation below for more detail.
all_clusters_tasks: namespace or class: Argument list for the Clusters task - see clusters_driver documentation below for more detail.
all_spectra_tasks: namespace or class: Argument list for the Spectra task - see spectra_driver documentation below for more detail.
make_script: function: Routine that can write a job submission script. Usually parallel.make_sbatch.
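The command-line pattern ‘python <seed>.py <task> <seed> <target>’ could be handled along the following lines. This is only a sketch of the dispatch idea, not the code in drivers.main(): the Invocation type, both helper functions and the task-name strings in the table are assumptions made for the example, and it assumes all three arguments are always supplied. Only the driver names on the right-hand side come from this page.

```python
from typing import NamedTuple


class Invocation(NamedTuple):
    task: str
    seed: str
    target: str


def parse_invocation(argv):
    """Split ['<seed>.py', '<task>', '<seed>', '<target>'] into named fields."""
    if len(argv) != 4:
        raise ValueError("expected: python <seed>.py <task> <seed> <target>")
    return Invocation(task=argv[1], seed=argv[2], target=argv[3])


# Hypothetical task names (keys) mapped to the drivers documented on this page.
DRIVER_FOR_TASK = {
    "solutes": "solutes_driver",
    "solvate": "solvate_driver",
    "clusters": "clusters_driver",
    "qmd": "qmd_driver",
    "mltrain": "mltrain_driver",
    "mltest": "mltest_driver",
}


def driver_name_for(invocation):
    """Look up which driver a given invocation would select."""
    try:
        return DRIVER_FOR_TASK[invocation.task]
    except KeyError:
        raise ValueError(f"unknown task: {invocation.task!r}") from None


inv = parse_invocation(["proj.py", "solutes", "proj", "implicit"])
driver = driver_name_for(inv)
```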

esteem.drivers.solutes_driver(all_solutes, all_solvents, task)

Driver to run a range of DFT/TDDFT calculations on a range of solutes. If the script is run as an array task, it performs the task for the requested solute only.

all_solutes: dict of strings

Keys are shortnames of the solutes. Entries are the full names.

all_solvents: dict of strings

Keys are shortnames of the solvents (implicit only, here). Entries are the full names.

task: SolutesTask class

Argument list for the whole Solutes job - see Solutes module documentation for more detail.

Arguments used only within the driver routine include:

task.directory: Directory prefix for where output of this particular ‘target’ of the Solutes calculation will be written

esteem.drivers.solvents_driver(all_solvents, task)

Driver to run a range of DFT/TDDFT calculations on a range of solvent molecules. See the Solutes module documentation for more info on what tasks are run.

all_solutes: dict of strings

Keys are shortnames of the solutes. Entries are the full names.

all_solvents: dict of strings

Keys are shortnames of the solvents. Entries are the full names.

task: SolutesTask class

Argument list for the whole Solutes job - see Solutes module documentation for more detail.

Arguments used only within the driver routine include:

task.directory: Directory prefix for where output of this particular ‘target’ of the Solutes calculation will be written

wrapper: namespace or class

Functions that will be used in the Solutes task - see Solutes module documentation for more detail.

esteem.drivers.solvate_driver(all_solutes, all_solvents, seed, task, make_sbatch=None)

Driver to set up and run MD on solvated boxes containing a solute molecule and solvent. If the script is run as an array task, it performs the task for the requested solute/solvent pair only, out of the nsolutes * nsolvents possible combinations.

all_solutes: dict of strings

Keys are shortnames of the solutes. Entries are the full names.

all_solvents: dict of strings

Keys are shortnames of the solvents. Entries are the full names.

task: SolvateTask class

Argument list for the whole Solvate job - see Solvate module documentation for more detail.

Arguments used within this routine include:

task.md_suffix: Directory suffix for where the MD runs will take place

task.md_geom_prefix: Directory prefix for where the geometries from the Solutes calculation should be obtained.

wrapper: class

Wrapper that will be used in the Solvate task - see Solvate module documentation for more detail.
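When run as an array task, the driver has to turn a single task index into one of the nsolutes * nsolvents combinations. A minimal sketch of such a mapping is shown here. The function name, the 0-based indexing and the solute-major ordering are assumptions for illustration, and ESTEEM may enumerate the pairs differently.

```python
from itertools import product


def pair_for_array_index(all_solutes, all_solvents, index):
    """Return the (solute, solvent) short names for a 0-based array task index."""
    # Solute-major ordering: all solvents for the first solute, then the next solute.
    pairs = list(product(all_solutes, all_solvents))
    if not 0 <= index < len(pairs):
        raise IndexError(f"index {index} outside 0..{len(pairs) - 1}")
    return pairs[index]


# Illustrative inputs: keys are short names, entries are full names.
all_solutes = {"cate": "catechol", "phen": "phenol"}
all_solvents = {"watr": "water", "meth": "methanol"}

# 2 solutes x 2 solvents gives 4 combinations, indexed 0 to 3.
first = pair_for_array_index(all_solutes, all_solvents, 0)
last = pair_for_array_index(all_solutes, all_solvents, 3)
```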

esteem.drivers.clusters_driver(all_solutes, all_solvents, seed, task, make_sbatch=None, dryrun=False)

Driver to extract isolated clusters from solvated models, for a range of solute/solvent pairs.

Takes MD results from the directory {task.md_prefix} and performs excitation calculations in the directory {solute}_{solvent}_{task.exc_suffix}.

If invoked from the base directory, rather than the excitation directory, it sets up the excitation directory and writes a job script then exits.

If invoked from the excitation directory, it performs the excitation calculations for all the extracted clusters. If the script is run as an array task, it performs the task for the requested cluster only.

Arguments

all_solutes: dict of strings

Keys are shortnames of the solutes. Entries are the full names.

all_solvents: dict of strings

Keys are shortnames of the solvents. Entries are the full names.

seed: str

Overall ‘seed’ name for the run - used in creation of job scripts for the calculation

task: ClustersTask class

Argument list for the whole clusters job - see Clusters module documentation for more detail.

Arguments used predominantly in the driver rather than the task main routine include:

task.md_prefix: Directory where MD outputs are to be found.

task.md_suffix: Suffix of MD output trajectories.

task.exc_suffix: Directory where results of Cluster excitation calculations will go.

make_sbatch: function

Function that writes a job submission script for the clusters jobs

Output:

On first run, from the base directory of the project, the script will create subdirectories for each solute-solvent pair, with path ‘{solute}_{solvent}_{exc_suffix}’. The default value of exc_suffix is ‘exc’.

To each directory, this routine will copy the trajectory file ‘{md_prefix}/{solute}_{solvent}_{md_suffix}.traj’ which consists of task.nsnaps snapshots.

It will then create a job script named ‘{solute}_{solvent}_{exc_suffix}_sub’ using make_sbatch and settings, then exit.

The user then needs to run the individual job scripts from each subdirectory, probably as a job array: the array task ID should range from 0 to task.nsnaps - 1.

The output of those runs will be the excited state energies of the clusters, which can be averaged over in the Spectra task.
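The naming scheme described in the Output notes above can be sketched as plain string formatting. The helper names and the md_prefix and md_suffix values used here are placeholders chosen for the example. Only the path patterns, the default exc_suffix of ‘exc’ and the array ID range are taken from the text.

```python
def cluster_paths(solute, solvent, md_prefix, md_suffix, exc_suffix="exc"):
    """Build the names used for one solute-solvent pair, following the patterns above."""
    stem = f"{solute}_{solvent}"
    return {
        # Subdirectory created on the first run from the base directory.
        "exc_dir": f"{stem}_{exc_suffix}",
        # Trajectory file copied into that subdirectory.
        "traj_source": f"{md_prefix}/{stem}_{md_suffix}.traj",
        # Job script written before the driver exits.
        "job_script": f"{stem}_{exc_suffix}_sub",
    }


def array_task_ids(nsnaps):
    """Array task IDs covering every snapshot: 0 to nsnaps - 1."""
    return range(nsnaps)


# Placeholder values for md_prefix and md_suffix, for illustration only.
paths = cluster_paths("cate", "meth", md_prefix="md_runs", md_suffix="solv")
ids = array_task_ids(100)
```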

esteem.drivers.qmd_driver(qmdtraj, all_solutes, all_solvents)

Driver to run calculations to generate a range of Quantum Molecular Dynamics trajectories. If the script is run as an array task, it performs the task for the requested trajectory only.

all_solutes: dict of strings

Keys are shortnames of the solutes. Entries are the full names.

all_solvents: dict of strings

Keys are shortnames of the solvents (implicit only, here). Entries are the full names.

task: QMDTrajTask class

Argument list for the whole QMD_Trajectories job - see QMD_Trajectories module documentation for more detail.

Arguments used only within the driver routine include:

task.directory: Directory prefix for where output of this particular ‘target’ of the Solutes calculation will be written

Arguments for which the driver routine will perform solute & solvent name substitution:

task.solvent:

esteem.drivers.mltrain_driver(mltrain_task, all_solutes={}, all_solvents={})

Driver to train a Neural Network (or other machine-learning approach) for a dataset. Key arguments are the wrapper, which supplies the interface to the ML model, and the specification of the trajectory data.

all_solutes: dict of strings

Keys are shortnames of the solutes. Entries are the full names.

all_solvents: dict of strings

Keys are shortnames of the solvents (implicit only, here). Entries are the full names.

task: MLTrainTask class

Argument list for the whole MLTrainTask job - see ML_Training module documentation for more detail.

Arguments used only within the driver routine include:

mltrain_task.traj_links: Expressed as a dictionary containing trajectory labels and full paths of trajectory data files, relative to the path from which the script was invoked.

Arguments for which the driver routine will perform solute & solvent name substitution:

mltrain_task.seed:
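To make the shape of traj_links more concrete, here is a hedged sketch of a dictionary mapping trajectory labels to paths, together with a helper that resolves each path against the directory the script was invoked from. The helper resolve_traj_links, the labels, the file names and the directory are all hypothetical. The exact label and file-name conventions ESTEEM expects are described in the ML_Training module documentation, not here.

```python
import os
from pathlib import Path


def resolve_traj_links(traj_links, invoked_from):
    """Resolve each labelled trajectory path relative to the invocation directory."""
    base = Path(invoked_from)
    return {
        label: Path(os.path.normpath(base / rel_path))
        for label, rel_path in traj_links.items()
    }


# Hypothetical labels and file names, for illustration only.
traj_links = {
    "A": "cate_meth_md/cate_meth_A.traj",
    "B": "../shared/cate_meth_B.traj",
}

resolved = resolve_traj_links(traj_links, "/home/user/project")
```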

esteem.drivers.mltest_driver(mltest, all_solutes, all_solvents)

Driver to run tests of a Neural Network (or other machine-learning approach) by comparing it to results from a dataset, which may have been calculated already using, e.g., a Clusters task. Key arguments are the wrapper, which supplies the interface to the ML model, and the specification of the test data.

all_solutes: dict of strings

Keys are shortnames of the solutes. Entries are the full names.

all_solvents: dict of strings

Keys are shortnames of the solvents (implicit only, here). Entries are the full names.

mltest_task: MLTestTask class

Argument list for the whole MLTestTask job - see ml_testing module documentation for more detail.

Arguments used only within the driver routine include:

mltest_task.traj_links: Expressed as a dictionary containing trajectory labels and full paths of trajectory data files, relative to the path from which the script was invoked.

Arguments for which the driver routine will perform solute & solvent name substitution:

mltest_task.calc_seed:

mltest_task.traj_prefix:

esteem.drivers.spectral_warp_driver(all_solutes, all_solvents, task, annotation=None)

Driver to calculate spectral warping parameters for a range of solute/solvent pairs.

all_solutes: dict of strings

Keys are shortnames of the solutes. Entries are the full names.

all_solvents: dict of strings

Keys are shortnames of the solvents. Entries are the full names.

task: SpectraTask class

Argument list for the whole spectra job - see Spectra module documentation for more detail.

Arguments used only in the driver include:

task.exc_suffix: Directory in which results of excitation calculations performed by the Clusters task can be found. The pattern used to find matches is: ‘{solute}_{solvent}_{exc_suffix}/{solute}_{solvent}_solv*.out’

task.warp_origin_ref_peak_range: Peak range searched when looking for ‘reference’ peaks in the origin spectrum for spectral warping.

task.warp_dest_ref_peak_range: Peak range searched when looking for ‘reference’ peaks in the destination spectrum for spectral warping.

task.warp_broad: Broadening to be applied to origin and destination spectra.

task.warp_inputformat: Format of the files to be loaded for origin and destination spectra. [TODO: May need to be adjusted to allow separate task.warp_ref_inputformat and task.warp_dest_inputformat]

task.warp_files: File pattern to search for when looking for origin and destination spectra for spectral warping.

task.merge_solutes: Dictionary in which each key is the name to merge into, and each entry is a list of the solute names that will be merged into that key.
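Two of the arguments above lend themselves to a short illustration: the file pattern built from task.exc_suffix, and the shape of task.merge_solutes. The sketch below is an assumption-laden example rather than the driver's own code. The helpers excitation_pattern and merge_map, the candidate file names and the solute names are invented for the example, and the pattern is matched against a list of strings with fnmatch instead of searching a real directory.

```python
from fnmatch import fnmatch


def excitation_pattern(solute, solvent, exc_suffix):
    """Build the search pattern '{solute}_{solvent}_{exc_suffix}/{solute}_{solvent}_solv*.out'."""
    stem = f"{solute}_{solvent}"
    return f"{stem}_{exc_suffix}/{stem}_solv*.out"


def merge_map(merge_solutes):
    """Invert {merged_name: [solute, ...]} into {solute: merged_name}."""
    return {
        name: merged
        for merged, names in merge_solutes.items()
        for name in names
    }


pattern = excitation_pattern("cate", "meth", "exc")

# Hypothetical file names, for illustration only.
candidates = [
    "cate_meth_exc/cate_meth_solv0000.out",
    "cate_meth_exc/cate_meth_solv0001.out",
    "cate_meth_exc/cate_meth_gas.out",
]
matches = [path for path in candidates if fnmatch(path, pattern)]

# Hypothetical solute names: two entries are merged into the key 'cate'.
merged = merge_map({"cate": ["cate_a", "cate_b"]})
```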