Instructor Notes
Running commands with Snakemake
Use of the --forceall flag
In the first few episodes we always run Snakemake with the --forceall
flag, but what this flag does is not explained until Ep. 04. The
rationale is that Snakemake's default behaviour of pruning the DAG leads
to learners seeing different output (typically the message “nothing to
be done”) when repeating the exact same command. This can seem strange
to learners who are used to scripting and imperative programming.
The internal rules used by Snakemake to determine which jobs in the
DAG are to be run, and which skipped, are fairly complex, but the
behaviour seen under --forceall is much simpler and more consistent:
Snakemake simply runs every job in the DAG every time. You can think of
--forceall as disabling the lazy evaluation feature of Snakemake until
we are ready to properly introduce and understand it.
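If it helps when explaining this to learners, the contrast can be sketched as a toy model in Python. This is not Snakemake's implementation: the Job class, the jobs_to_run function and the file names are hypothetical, and the sketch assumes the only criteria are missing outputs and file modification times, whereas the real rules are more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A hypothetical job: one output file built from some input files."""
    output: str
    inputs: list[str] = field(default_factory=list)


def jobs_to_run(jobs, mtimes, forceall=False):
    """Return the outputs of the jobs that would run.

    jobs must be listed in dependency order (upstream jobs first).
    mtimes maps each existing file to its modification time; a file
    that does not exist is simply absent from the mapping.
    """
    if forceall:
        # No pruning: every job in the DAG runs, every time.
        return [job.output for job in jobs]

    to_run = []
    for job in jobs:
        missing = job.output not in mtimes
        # An input is about to be regenerated by an upstream job.
        upstream_rerun = any(i in to_run for i in job.inputs)
        # An existing input is newer than the existing output.
        stale = not missing and any(
            mtimes.get(i, 0) > mtimes[job.output] for i in job.inputs
        )
        if missing or upstream_rerun or stale:
            to_run.append(job.output)
    return to_run


# Hypothetical two-step workflow: reads.fq -> trimmed.fq -> counts.txt
jobs = [Job("trimmed.fq", ["reads.fq"]), Job("counts.txt", ["trimmed.fq"])]
up_to_date = {"reads.fq": 1, "trimmed.fq": 2, "counts.txt": 3}

print(jobs_to_run(jobs, up_to_date))                 # [] i.e. "nothing to be done"
print(jobs_to_run(jobs, up_to_date, forceall=True))  # ['trimmed.fq', 'counts.txt']
```

In this model, repeating the same call with the pruning enabled gives a different answer once the outputs exist, which is exactly the surprise the early episodes avoid by always forcing every job.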
Running Python code with Snakemake
Placeholders and wildcards
Chaining rules
Metadata and parameters
Multiple inputs and outputs
How Snakemake plans jobs
Optimising workflow performance
Running on cluster and cloud
Running workflows on HPC or Cloud systems could be a whole course in itself. The topic is too important not to be mentioned here, but it is also difficult to teach because you need a cluster to work on.
If you are teaching this lesson and have access to an institutional HPC system then ideally you should liaise with the administrators of the system to set up a suitable installation of a recent Snakemake version and a profile to run jobs on the cluster job scheduler. In practice this may be easier said than done!
If you are able to demonstrate Snakemake running on cloud as part of a workshop then we’d much appreciate any feedback on how you did this and how it went.