Examples of using ESTEEM
In Using ESTEEM we wrote the following short script, cate.py, to run catechol in water:
from esteem import drivers, parallel
from esteem.wrappers import nwchem, amber, onetep
# List solutes and solvents and get default arguments
all_solutes = {'cate': 'catechol'}
all_solvents = {'cycl': 'cyclohexane', 'meth': 'methanol'}
from esteem.tasks.solutes import SolutesTask
from esteem.tasks.solvate import SolvateTask
from esteem.tasks.clusters import ClustersTask
from esteem.tasks.spectra import SpectraTask
solutes_task = SolutesTask()
solvate_task = SolvateTask()
clusters_task = ClustersTask()
spectra_task = SpectraTask()
# Some simple overrides for a quick job
solutes_task.basis = '6-31G'
solutes_task.func = 'PBE'
solutes_task.directory = 'PBE'
solvate_task.boxsize = 18
solvate_task.ewaldcut = 10
solvate_task.nsnaps = 20
clusters_task.radius = 3
# Setup parallel execution of tasks
solutes_task.wrapper = nwchem.NWChemWrapper()
solutes_task.script_settings = parallel.get_default_script_settings(solutes_task.wrapper)
solutes_task.wrapper.setup()
solvate_task.wrapper = amber.AmberWrapper()
solvate_task.script_settings = parallel.get_default_script_settings(solvate_task.wrapper)
clusters_task.wrapper = onetep.OnetepWrapper()
clusters_task.script_settings = parallel.get_default_script_settings(clusters_task.wrapper)
# Run main driver
drivers.main(all_solutes, all_solvents,
             solutes_task, solvate_task, clusters_task, spectra_task,
             make_script=parallel.make_sbatch)
exit()
The following section addresses how we would use that script to run a complete explicit solvent workflow.
First, we generate job submission scripts:
$ python cate.py scripts cate
Here, scripts is the ‘task’, and cate is the ‘seed’ for the calculation (which should match the name of the script).
This will produce five job submission scripts: cate_solutes_sub, cate_solvents_sub, cate_solvate_sub, cate_clusters_sub and cate_spectra_sub.
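The naming pattern of these scripts can be sketched in a few lines of Python. This is only an illustration of the pattern described above: the helper name expected_scripts is hypothetical and not part of ESTEEM, and the list of tasks is simply copied from the five script names given in the text.

```python
def expected_scripts(seed):
    """Return the five submission script names listed above for a given seed.

    Hypothetical helper for illustration only; not part of ESTEEM.
    """
    tasks = ["solutes", "solvents", "solvate", "clusters", "spectra"]
    return [f"{seed}_{task}_sub" for task in tasks]


if __name__ == "__main__":
    # Print the names so they can be compared against the output of `ls`
    for name in expected_scripts("cate"):
        print(name)
```

Comparing this list against the contents of the working directory is a quick way to confirm that the scripts task completed.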
Solutes and Solvents Tasks
Let’s run the first two at the same time. I will assume we are using SLURM rather than PBS throughout this example:
$ sbatch cate_solutes_sub
$ sbatch cate_solvents_sub
As directed in the script above, this will launch the Solutes task for the list of solutes (1 entry: cate) and for the list of solvents (2 entries: cycl and meth).
These will create a new directory PBE_6-31G and run NWChem geometry optimisation and TDDFT calculations in there, first in gas phase then in implicit solvent. If your cluster does not allow wget to access the Chemical Structure Resolver, you will need to supply initial guess structures in PBE_6-31G/xyz for all three molecules. The output file for the geometry optimisations will be PBE_6-31G/geom/cate/geom.nwo and the resulting structure will be in PBE_6-31G/opt. Calculations will be repeated in water using the COSMO implicit solvent model to produce the final geometry PBE_6-31G/is_opt_watr.
Solvate Task
We now want to run the MD task. If we just ran:
$ sbatch cate_solvate_sub
This would launch the Solvate task twice, one after the other, for the two different solvents. A more likely scenario is that we want to run these as two separate jobs, so we can use an array task:
$ sbatch --array=0-1 cate_solvate_sub
This launches two jobs, one for catechol in cyclohexane and one for catechol in methanol. The results will go in separate directories, cate_cycl_md and cate_meth_md.
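As a rough sketch of how an array index could correspond to a solute-solvent pair, the snippet below enumerates the pairs from the dictionaries in cate.py and builds the directory names quoted above. The enumeration order (solutes in the outer loop, solvents in the inner loop) is an assumption made for illustration and may not match how ESTEEM assigns array indices internally; the function name solvate_jobs is hypothetical.

```python
from itertools import product

# Same dictionaries as in cate.py
all_solutes = {'cate': 'catechol'}
all_solvents = {'cycl': 'cyclohexane', 'meth': 'methanol'}


def solvate_jobs(solutes, solvents):
    """Map an array index to an MD directory name of the form <solute>_<solvent>_md.

    Assumption: pairs are enumerated solute-first; ESTEEM's own ordering may differ.
    """
    pairs = product(solutes, solvents)
    return {i: f"{solu}_{solv}_md" for i, (solu, solv) in enumerate(pairs)}


if __name__ == "__main__":
    for index, directory in solvate_jobs(all_solutes, all_solvents).items():
        print(index, directory)
```

With one solute and two solvents there are two pairs, which is why the array range 0-1 covers the whole job.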
There are many output files in each directory, for the different steps of setup and different MD runs, but the most important ones are the ‘trajectory’ files of snapshots, cate_cycl_solv.traj and cate_meth_solv.traj, which get used by the next step.
Clusters Task
The clusters task is next. We can run its setup task from the command line:
$ python cate.py clusters cate
This is a short task that sets up directories: in this case it will set up cate_cycl_exc and cate_meth_exc. In each it will put a script. Change into the first directory and list the contents:
$ cd cate_cycl_exc
$ ls
You should see four files: cate.xyz, cycl.xyz, cate_cycl_solv.traj and cate_cycl_exc_sub. The last of these is the job script, which can be used to launch a calculation for each of the snapshot clusters.
We would launch 10 calculations on equally-spaced snapshots from 0 to 90 (i.e. 0, 10, 20, 30, …, 90) as follows:
$ sbatch --array=0-90:10 cate_cycl_exc_sub
The result will be a ONETEP calculation for each cluster. You can check their progress with squeue and tail *.out.
Once they have finished, they will have produced files with names such as cate_cycl_solv000.out (and any other output files produced by the code), and you can run the Spectra task to generate plots.
You may want to inspect a few of these to check the behaviour is as expected.
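Before moving on, it can help to confirm that every snapshot in the array actually produced an output file. The sketch below expands a start-stop:step array range into indices and reports which outputs are missing. The helper names are hypothetical, and the three-digit zero-padded file name is an assumption inferred from the single example cate_cycl_solv000.out above; adjust the pattern if your files are named differently.

```python
from pathlib import Path


def snapshot_indices(start, stop, step):
    """Expand a SLURM-style array range start-stop:step (stop is inclusive)."""
    return list(range(start, stop + 1, step))


def missing_outputs(directory, prefix, indices):
    """Return the snapshot indices whose .out file is not present in directory.

    Assumption: outputs are named <prefix><index as 3 digits>.out,
    as suggested by the example cate_cycl_solv000.out.
    """
    directory = Path(directory)
    return [i for i in indices
            if not (directory / f"{prefix}{i:03d}.out").exists()]


if __name__ == "__main__":
    indices = snapshot_indices(0, 90, 10)
    print("Snapshots requested:", indices)
    print("Missing outputs:", missing_outputs(".", "cate_cycl_solv", indices))
```

Run from inside cate_cycl_exc, an empty list of missing outputs suggests that all array jobs wrote their output files, although it does not check that the calculations converged.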
Spectra Task
This is where the raw results from the Clusters task get turned into spectra for plotting purposes. You can run this task interactively at the command line to generate .png files on your compute cluster:
$ python cate.py spectra cate
or, if you prefer, you can transfer all the .out files from the clusters run back to another machine for interactive analysis in a notebook, by calling routines such as spectra_driver().