This lesson is still being designed and assembled (Pre-Alpha version)

Publishing your data analysis code

This lesson discusses how (and why) you can publish the software used to enable your research.


This lesson assumes familiarity with the Git version control system and the Python programming language, and will also touch on using the Unix shell. If these are not familiar to you, it is worth attending a Software Carpentry workshop before working through this lesson. If you are unable to attend one, you can work through the notes linked for each topic above, which cover the same material as a Software Carpentry workshop.

This lesson incorporates material from Research Software Engineering in Python by Damien Irving, Kate Hertweck, Luke Johnston, Joel Ostblom, Charlotte Wickham, and Greg Wilson, and from the supporting repository.


Setup
    Download files required for the lesson
00:00  1. Get it in Git
    Why should I make my code public?
    How do I get files into a GitHub repository?
    What should I include in a repository?
01:10  2. Structuring your repository
    How should files be organised in a repository?
    What metadata should be included, and how?
    How do I adjust the structure of a repository once it is created?
02:10  3. Documentation and automation
    How do I tell other researchers how to use my code?
    How can I make it easier for others (or me) to run my full analysis?
03:35  4. Jupyter Notebooks and automation
    How does using Jupyter Notebooks affect automation and reproducibility?
    How do I put a Jupyter Notebook into a repository?
    What changes can I make to a Jupyter Notebook to improve automation?
04:20  5. Data
    How do I get data for my code to work on?
    Where should I store data?
    Should data be published?
05:05  6. Reproducible software environments
    Why do I need to document the software environment?
    How do I document what packages and versions are needed to reproduce my work?
    How can I use environment definitions to get started on a new machine?
05:45  7. Verifying your analysis
    How can I check that my analysis is working?
    How do I verify that the environment definition is correct?
06:25  8. Publishing in open science repositories
    How is an open science repository different to something like GitHub?
    How do I create a permanent version of record of my code?
    How can I create a DOI for my code and cite it?
07:05  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.