Instructor Notes


Running commands with Snakemake


Use of the --forceall flag

In the first few episodes we always run Snakemake with the --forceall flag, but what the flag does is not explained until Ep. 04. The rationale is that Snakemake's default behaviour of pruning the DAG means learners see different output (typically the message “nothing to be done”) when they repeat exactly the same command. This can seem strange to learners who are used to scripting and imperative programming.

The internal rules Snakemake uses to decide which jobs in the DAG to run and which to skip are fairly complex, but the behaviour under --forceall is much simpler and more consistent: Snakemake runs every job in the DAG every time. You can think of --forceall as disabling Snakemake's lazy evaluation until we are ready to introduce and explain it properly.
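For instructors who want a mental model of the difference, here is a minimal Python sketch. It is a toy model and not Snakemake's actual algorithm: it only compares file modification times, whereas real Snakemake also considers other triggers. The function name plan_jobs, the job names, and the file names are all hypothetical and chosen purely for illustration.

```python
# Toy model only: this is NOT how Snakemake is implemented.
# It assumes file timestamps are the only reason to re-run a job.

def plan_jobs(jobs, mtimes, forceall=False):
    """Return the names of the jobs that would run.

    jobs:   list of (name, inputs, outputs) tuples, upstream jobs first
    mtimes: dict mapping each existing file to its modification time
    """
    to_run = []
    remade = set()  # outputs that will be regenerated during this run
    for name, inputs, outputs in jobs:
        # An output file does not exist yet.
        missing = any(f not in mtimes for f in outputs)
        # An upstream job in this run will rewrite one of our inputs.
        input_remade = any(f in remade for f in inputs)
        # An existing input is newer than an existing output.
        stale = (not missing) and any(
            mtimes.get(i, float("-inf")) > mtimes[o]
            for i in inputs
            for o in outputs
        )
        if forceall or missing or input_remade or stale:
            to_run.append(name)
            remade.update(outputs)
    return to_run


# Hypothetical two-step workflow: reads.fq -> reads.count -> summary.txt
jobs = [
    ("count_lines", ["reads.fq"], ["reads.count"]),
    ("summarise", ["reads.count"], ["summary.txt"]),
]

# All outputs exist and are newer than their inputs.
up_to_date = {"reads.fq": 1, "reads.count": 2, "summary.txt": 3}

print(plan_jobs(jobs, up_to_date))                 # lazy: nothing to be done
print(plan_jobs(jobs, up_to_date, forceall=True))  # every job, every time
```

In this sketch the lazy plan depends on the state of the files on disk, so repeating the same call after the outputs exist gives a different answer. With forceall=True the plan is always the full list of jobs, which is the consistent behaviour learners see in the early episodes.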



Running Python code with Snakemake


Placeholders and wildcards


Chaining rules


Metadata and parameters


Multiple inputs and outputs


How Snakemake plans jobs


Optimising workflow performance


Running on cluster and cloud

Running workflows on HPC or cloud systems could be a whole course in itself. The topic is too important not to mention here, but it is also hard to teach because you need a cluster to work on.

If you are teaching this lesson and have access to an institutional HPC system, then ideally you should liaise with its administrators to set up a suitable installation of a recent Snakemake version, along with a profile for running jobs through the cluster job scheduler. In practice this may be easier said than done!

If you are able to demonstrate Snakemake running on a cloud platform as part of a workshop, we would very much appreciate any feedback on how you did this and how it went.