Putting it all together
Overview
Teaching: 5 min
Exercises: 90 min
Questions
How can I apply all of these techniques at once to a real application?
Objectives
Be able to apply testing and CI techniques to a piece of research software.
Now that we have developed a range of skills relating to testing and continuous integration, we can try putting them into practice on a real piece of research software.
The software
We are going to work with pl_curves, a piece of research software developed by
Dr Colin Sauzé at Aberystwyth University, which calculates Pareto–Lorenz (PL)
curves to show the relative abundance of different bacteria in a
community. It also calculates a Gini coefficient to show how evenly distributed
the different bacteria are. It already has tests written for most functions.
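To make the idea of a Gini coefficient and its tests concrete, here is a minimal sketch. It is not taken from pl_curves: the function name `gini`, the rank-based formula, and the test are illustrative assumptions, and the project may compute the coefficient differently (for example, from the area under the PL curve).

```python
def gini(abundances):
    """Gini coefficient of non-negative abundances (hypothetical helper).

    0 means every species is equally abundant; values near 1 mean a few
    species dominate the community.
    """
    values = sorted(abundances)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero abundance")
    # Rank-weighted sum over the ascending values (ranks start at 1).
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def test_gini_extremes():
    # Four equally abundant species: a perfectly even community.
    assert abs(gini([5, 5, 5, 5])) < 1e-12
    # One species holds everything: the maximum for n = 4 is (n - 1) / n.
    assert abs(gini([0, 0, 0, 8]) - 0.75) < 1e-12
```

Tests like `test_gini_extremes` pin down the behaviour at known edge cases, which is what lets a test suite detect changes caused by a library or Python upgrade.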
Your task
- Fork the repository. You don’t have push access to the original repository, so you will need to use your own copy of it.
- Enable GitHub Actions on your fork, as GitHub disables Actions for forks by default.
- Update the CI and Codecov badges to point to your copy of the repository. Pushing these changes should automatically run the test suite and update the badges.
- Create a virtual environment on your computer for the project, and install the project’s requirements, so you can run the test suite locally.
- Upgrade to the most recent versions of Pandas, Matplotlib and Pytest, and see whether this breaks anything. If it does, then fix the issues, and ensure that the test suite passes again.
- Currently, some of the tests for the repository fail. Work out why this is happening, and fix the issues. Check that they are fixed in the CI workflow as well.
- Currently, the code is only tested for Python versions up to 3.9. Since Python has moved on now, add 3.10 and 3.11 as targets for the CI. Do the tests pass now? If not, identify what has caused them to fail, and fix the issues you identify. This is an important reason for having a test suite: sometimes changes entirely external to your code will break your code. Without a test suite, you don’t know whether this has happened until someone points out that your new results don’t match your older ones! Having CI set up allows easy testing of multiple different versions.
- Currently the code is being tested against Ubuntu 20.04 (released April 2020). A new long term support release of Ubuntu came out in April 2022 (version 22.04). Upgrade the operating system being tested from Ubuntu 20.04 to Ubuntu 22.04. As with upgrading Python, the test suite helps us check that the code still runs on a newer operating system.
Hint: In general, before changes are made to libraries that will break existing software using those libraries, they are “deprecated” for some period of time. During this time, the software will issue a warning of the impending breakage of the function, and give advice on how to modify your code so that a) the warning will go away, and b) the software will not break when the breaking change is made in a future version.
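The deprecation mechanism described in the hint can be sketched with Python's standard `warnings` module. Everything here is illustrative: `old_mean` is a hypothetical stand-in for a deprecated library function, and the two helpers are assumptions made for this example, not part of pl_curves or any library.

```python
import warnings


def old_mean(values):
    """Hypothetical library function that is scheduled for removal."""
    warnings.warn(
        "old_mean is deprecated; use statistics.mean instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return sum(values) / len(values)


def collect_deprecations(func, *args):
    """Call func and return (result, list of deprecation messages)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones Python would normally hide.
        warnings.simplefilter("always")
        result = func(*args)
    messages = [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]
    return result, messages


def raises_when_strict(func, *args):
    """Return True if func fails once DeprecationWarning is an error."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func(*args)
        except DeprecationWarning:
            return True
    return False
```

With the first helper, the code still runs and the message tells you what to change; with the second, the same call fails, which imitates what happens after the breaking change lands. When running a real test suite, `pytest -W error::DeprecationWarning` applies the same strict filter to every test. Some libraries, pandas among them, often announce upcoming changes with `FutureWarning` instead, so it is worth reading the warnings summary that pytest prints at the end of a run.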
Key Points
Testing and CI work well together to identify problems in research software and allow them to be fixed quickly.
If anything is unclear, or you get stuck, please ask for help!