Executing STEMsalabim
STEMsalabim is executed on the command line and configured via input configuration files in libConfig syntax. To learn about the structure of the configuration files, please read Parameter files.
Note
Some of the configuration parameters can also be changed via command line parameters, which are described in Command line arguments.
STEMsalabim supports both threaded (shared memory) and MPI (distributed memory) parallelization. For the most efficient resource usage we recommend a hybrid approach, in which one MPI task runs per node and spawns several threads to parallelize the work within that node. (See Hybrid Parallelization model for more information on how STEMsalabim is parallelized.)
Thread-only parallelization
You can execute STEMsalabim on a single multi-core computer as follows:
$ stemsalabim --params=./my_config_file.cfg --num-threads=32
This will run the simulation configured in my_config_file.cfg on 32 cores, of which 31 are used as workers.
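If the core count of the machine is not known in advance, it can be queried at run time. The following is a minimal sketch, assuming a system where either `nproc` (GNU coreutils) or `getconf _NPROCESSORS_ONLN` is available. The helper name `detect_cores` is hypothetical and not part of STEMsalabim.

```shell
#!/bin/sh
# Hypothetical helper (not part of STEMsalabim): print the number of
# CPU cores available to the current process.
detect_cores() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        # Fallback for systems without GNU coreutils.
        getconf _NPROCESSORS_ONLN
    fi
}

threads="$(detect_cores)"

# Print the command instead of executing it, so that the sketch also
# works on machines where STEMsalabim is not installed.
echo "stemsalabim --params=./my_config_file.cfg --num-threads=${threads}"
```

Once the printed command looks right, it can be run directly in place of the `echo`.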
MPI-only parallelization
For pure MPI parallelization without spawning additional threads, STEMsalabim must be called via mpirun or mpiexec, depending on the MPI implementation available on your machine:
$ mpirun -n 32 stemsalabim --params=./my_config_file.cfg --num-threads=1 --package-size=10
This command will run the simulation in parallel on 32 MPI processes without spawning additional threads.
Note
We chose a work package size of ten times the number of threads per MPI process (which is 1 here), so that each thread calculates ten pixels on average before results are communicated via the network. A larger package size reduces management overhead but increases the amount of data sent via the network.
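The rule of thumb in this note is simple arithmetic. The sketch below spells it out; the function name `package_size` is hypothetical, and the factor of ten is taken from the examples on this page rather than from the STEMsalabim code base.

```shell
#!/bin/sh
# Hypothetical helper (not part of STEMsalabim):
# work package size = threads per MPI process * pixels per thread.
package_size() {
    threads="$1"
    pixels_per_thread="${2:-10}"   # default of 10 follows the note on this page
    echo $(( threads * pixels_per_thread ))
}

package_size 1     # pure MPI: 1 thread per process      -> prints 10
package_size 16    # hybrid:   16 threads per process    -> prints 160
```

The two calls reproduce the `--package-size` values used in the pure MPI and hybrid commands on this page.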
Hybrid parallelization
Hybrid parallelization is the recommended mode to run STEMsalabim.
For hybrid parallelization, make sure that on each node only a single MPI process is spawned and that there is no CPU pinning active, i.e., STEMsalabim needs to be able to spawn threads on different cores.
For example, if we wanted to run a simulation in parallel on 32 machines using OpenMPI and on each machine use 16 cores, we would run
$ mpirun -n 32 --bind-to none --map-by ppr:1:node:pe=16 \
stemsalabim \
--params=./my_config_file.cfg \
--num-threads=16 \
--package-size=160
The options --bind-to none --map-by ppr:1:node:pe=16 tell OpenMPI not to bind the process to anything, to start one process per node, and to reserve 16 processing elements (typically cores) for each process. Please refer to the manual of your MPI implementation to find out exactly how to run the software. On HPC clusters it is wise to contact the admin team for help with optimizing the simulation performance.
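The hybrid command depends on only two numbers: the node count and the cores per node. As a sketch, the same OpenMPI command line can be assembled from these two values and printed for inspection before it is submitted. The function name `hybrid_command` is hypothetical, and the options are exactly those from the example above; other MPI implementations will need different flags.

```shell
#!/bin/sh
# Hypothetical helper (not part of STEMsalabim): print the hybrid
# OpenMPI command line for a given node count and cores per node.
hybrid_command() {
    nodes="$1"
    cores="$2"
    package=$(( cores * 10 ))   # ten pixels per thread, as in the package size note
    echo "mpirun -n ${nodes} --bind-to none --map-by ppr:1:node:pe=${cores}" \
         "stemsalabim --params=./my_config_file.cfg" \
         "--num-threads=${cores} --package-size=${package}"
}

# 32 nodes with 16 cores each, as in the example above.
hybrid_command 32 16
```

Printing the command rather than running it makes it easy to check the values against the cluster's job script before starting a long simulation.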
Running the Si 001 example
In the source code archive you will find an examples/Si_001 folder containing a simple example that you can execute to get started. The file Si_001.xyz describes a Si sample of 2x2x36 unit cells. Please see Crystal file format for the format description.
The file Si_001.cfg contains the simulation configuration. It lists all available parameters, regardless of whether they are set to their default values. We recommend always specifying a complete set of simulation parameters in the configuration files.
You can now run the simulation:
$ /path/to/stemsalabim --params Si_001.cfg --num-threads=8
After the simulation has finished (about 3 hours on an Intel i7 CPU with 8 cores), you can analyze the results found in Si_001.nc. Please see the next page (Visualization of crystals and results) for details.