Working with Modules and Installing Software
Overview
Teaching: 15 min
Exercises: 30 minQuestions
How do I get access to additional software?
Objectives
Understand how to load a module
Understand how to install pip packages for Python
Using & installing software
On most cluster systems, all the software installed is not immediately available for use; instead, it is loaded into your environment incrementally using a module system.
The module
command controls this.
You can get a list of available software with the module avail
command. This should return a long list of available software.
One common piece of software that isn’t installed on the Supercomputing Wales hubs (without a module) is R. If we attempt to run R
from the command line it will respond with an error:
[s.jane.doe@sl1 ~]$ R
-bash: R: command not found
The login nodes and the compute nodes have identical configurations in terms of what software is available, so you can discover if your program will run from the login nodes. (The only exception to this is if your software requires GPUs; since these are not available on the login nodes, you cannot test GPU software there.)
From the output of the module avail
command there should have been
an entry R/3.6.2
in the /apps/modules/langauges
section
near the top.
[s.jane.doe@sl1 ~]$ module load R/3.6.2
[s.jane.doe@sl1 ~]$ R
WARNING: ignoring environment value of R_HOME
R version 3.6.2 (2019-12-12) -- "Dark and Stormy Night"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> q()
Save workspace image? [y/n/c]: n
[s.jane.doe@sl1 ~]$
Legacy HPC Wales and Raven Modules
All of the old modules which were available on HPC Wales and the old Cardiff Raven system are available by running either
module load hpcw
ormodule load raven
. (Raven modules are only available on Hawk.) Please note that much of this software is outdated and may by suboptimal as it is not compiled to take advantage of the newer CPU architectures on Supercomputing Wales.
Installing Python modules
Python is a popular language for researchers to use and has modules
(aka libraries) that do all kinds of useful things, but sometimes the
module you need won’t be installed. Many modules can be installed
using the pip
(or pip3
with Python 3) command. Since we don’t have
administrative rights to the supercomputer, we can’t install these
into the shared base environment; instead, we need to create
a local one.
We recommend using Conda for this. Conda is a package manager that can create self-contained Python environments, and install various versions of Python and other related dependencies into them. Anaconda is a popular distribution of Python for scientific computing that is based on Conda, including a large number of frequently-used packages. Anaconda is available on Supercomputing Wales, and can then be used to build new Conda environments.
To load Anaconda, two commands are necessary:
> module load anaconda/2021.05
> source activate
The first of these we saw above; the second initialises Conda within your current shell.
Since we are using Conda for the first time, it is useful to add
conda-forge
to our list of Conda channels. This will give us access
to a wider range of pre-built Conda packages.
$ conda config --add channels conda-forge
With this done, we can now use Conda to create a new environment to install the packages that we want for our own work:
$ conda create -n scw_test python=3.9 mamba
This tells the conda
command to create
a new environment, to
give it the name scw_test
, and to install Python 3.9 and Mamba into
it. Mamba is an alternative version of Conda, that can solve the
complicated problem of getting compatible versions of all of your
requested packages simultaneously more quickly than Conda can.
Conda will take a little time to work out what it needs to install,
and once you confirm by typing y
, then place it in a new directory
in ~/.conda/envs
.
Once your environment is created, you can activate it so you can use it to work in with the command
$ conda activate scw_test
The prefix at the start of your prompt will now change from base
to scw_test
, to indicate the environment that you have active.
Similarly, which python
now returns
~/.conda/envs/scw_test/bin/python
, indicating that this is now
where Python will run from if you run python
.
So far we have created a relatively bare environemnt, but we can install packages into the new environment just as we can on our own machines. Let’s now install Matplotlib and a recent Tensorflow version into this environment:
$ mamba install matplotlib tensorflow-gpu\>=2.5
Mamba automatically works out which extra packages need to be present for Matplotlib and Tensorflow to work, and then prompts to install them. In the latter case, since we requested the GPU version of Tensorflow, it automatically installs all of the requisite CUDA libraries as well.
You could alternatively have specified the packages directly to the
conda create
command, and it would have been installed when the
environment was created. (This would use Conda’s solver, though,
which would take longer to work out what needed installing.)
If you wanted to create a full Anaconda Python installation to base
your environment on, you can do this with
conda create -n [name] anaconda
; we don’t do this here because
it creates a very large number of files and takes a substantial
time to download and install. In general, it’s better to only
install packages that you need, since they are less likely to
conflict with each other.
Using pip
works very similarly to Python; provided the Anaconda
module is loaded and your Conda environment is activated, then pip
will automatically install into your Conda environment and not try to
install at the system level.
For more detail on using Python with Supercomputing Wales, see the training on High Performance Python.
How to install other software yourself
For Perl modules and R packages, we encourage you to set up directories in your home and/or lab folders for installing your own copies locally. Instructions on how to install these are available from the How To pages on the Supercomputing Wales portal.
Requesting additional software
If software you need is not installed, we encourage you to do local installs in your home or project shared directory for bleeding-edge releases, software you are testing, or software used only by your project. For programs that are commonly used by your domain, field, or department, please submit a software install request. Note that due to demand and the complex nature of software installs, it may take a while for us to complete these requests.
Commercial software will require the appropriate licenses.
Exercises
Running a Python script
- Create a new file using
nano
and call itplot.py
. Give it the following contents:import matplotlib as mpl mpl.use('Agg') #set the backend to Agg to make a png file instead of displaying on screen import matplotlib.pyplot as plt plt.plot(range(10)) plt.savefig('temp.png')
- Create a new job submission script called
plot.sh
containing the following:#!/bin/bash --login ### # job name #SBATCH --job-name=plotgraph # job stdout file #SBATCH --output=plotgraph.out.%J # job stderr file #SBATCH --error=plotgraph.err.%J # maximum job time in D-HH:MM #SBATCH --time=0-00:01 # number of tasks you are requesting #SBATCH --ntasks=1 # memory per process in MB #SBATCH --mem=2 # number of nodes needed #SBATCH --nodes=1 # specify our current project # change this for your own work #SBATCH --account=scwXXXX # specify the reservation we have for the training workshop # remove this for your own work # replace XX with the code provided by your instructor #SBATCH --reservation=scwXXXX_YY ### module load anaconda/2019.03 source activate scw_test python3 plot.py
- Run the job with
sbatch plot.sh
. Did the job complete successfully? Are there any errors in the error file?- Fix the cause of the errors by adjusting (or removing) the appropriate parameter.
- Copy back the resulting file,
temp.png
using SCP or SFTP and view it on your computer.- Add the following lines to the Python script (between the
plt.plot
line and theplt.savefig
line) to make it embed the job ID into the title of the image:import os jobid = str(os.environ.get('SLURM_JOBID')) plt.title(f'Job id {jobid}')
Run the job again and look at the output image. Try using some of the other Slurm environment variables from this list, too.
Solution
- You will get errors about the amount of memory being used. These might show up as a segmentation fault rather than a memory error. For some reason this script can be very slow to run sometimes, so increasing the timeout can help too—only do this if you are getting timeout errors. Around 50MB of memory and 5 minutes should be enough.
- Your Python code should now read:
import matplotlib as mpl mpl.use('Agg') #set the backend to Agg to make a png file instead of displaying on screen import os import matplotlib.pyplot as plt plt.plot(range(10)) jobid=str(os.environ.get('SLURM_JOBID')) plt.title(f'Job id {jobid}') plt.savefig('temp.png')
Key Points
A lot of software is only available by loading extra modules. This helps prevent problems where two packages are incompatible.
Python 3 is one such package.
If you want to install pip packages use the
--user
option to store the packages in your home directory.