Course introduction


FAIR research software


  • Open research means the outputs of publicly funded research are publicly accessible with no or minimal restrictions.
  • Reproducible research means the data and software are available to recreate the analysis.
  • FAIR data and software are Findable, Accessible, Interoperable and Reusable.
  • These principles support research and researchers by saving time, reducing barriers to discovery, and increasing the impact of research outputs.

Tools and practices for research software development


  • Automating your analysis with shell scripts allows you to save and reproduce your methods.
  • Version control helps you back up your work, see how data and code change over time and identify which analysis used which data and code.
  • Programming languages each have advantages and disadvantages in different situations. Choose the tools that suit your own work.
  • Integrated development environments (IDEs) automate many coding tasks, provide easy access to documentation, and can identify common errors.
  • Testing helps you check that your code is behaving as expected and will continue to do so in the future or when used by someone else.

Version control


  • A version control system is software that tracks and manages changes to a project over time.
  • Using version control aids reproducibility, since the exact state of the software that produced an output can be recovered.
  • A commit represents the smallest unit of change to a project.
  • Commit messages should clearly describe what each commit changes.
  • Logs give an overview of the history of a project.

Code readability


  • Readable code is easier to understand, maintain, debug and extend!
  • Creating functions from the smallest reusable units of code helps compartmentalise which parts of the code perform which actions.
  • Choosing descriptive variable and function names communicates their purpose more effectively.
  • Using inline comments and docstrings to describe parts of the code helps convey understanding and makes it easier to verify that the code is correct.

Code testing


  1. Code testing supports the FAIR principles by improving the accessibility and reusability of research code.
  2. Unit testing is crucial as it checks that each function works correctly.
  3. Using the pytest framework, you can write basic unit tests for Python functions to verify their correctness.
  4. Identifying and handling edge cases in unit tests is essential to ensure your code performs correctly under a variety of conditions.
  5. Test coverage can help you to identify parts of your code that require additional testing.
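
A minimal sketch of what such unit tests might look like is shown below. The `mean` function and the test names are hypothetical examples, not code from the course. pytest collects functions whose names start with `test_` and reports any failing `assert`, so the block needs no imports; the last test uses a plain try/except to stay dependency-free, although `pytest.raises` is the more usual way to check for an expected exception.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if len(values) == 0:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


def test_mean_of_typical_values():
    # Ordinary case: several positive numbers.
    assert mean([1, 2, 3, 4]) == 2.5


def test_mean_of_single_value():
    # Edge case: a single value is its own mean.
    assert mean([7]) == 7


def test_mean_of_negative_values():
    # Edge case: negative and positive values cancel out.
    assert mean([-2, 2]) == 0


def test_mean_of_empty_input_raises():
    # Edge case: empty input should fail loudly instead of dividing by zero.
    try:
        mean([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")
```

Assuming the code is saved in a file such as `test_mean.py` (a hypothetical name), running `pytest` in that directory should collect and run the four tests. If the separate pytest-cov plugin is installed, `pytest --cov` additionally reports which lines the tests exercised.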

Documenting code


  • Documentation allows users to run and understand software without having to work things out for themselves directly from the source code.
  • Software documentation supports the FAIR principles by improving the reusability of research code.
  • A (good) README, CITATION file and LICENSE file are the minimum documentation elements required to support FAIR research code.
  • Documentation can be provided to users in a variety of formats including a docs folder of Markdown files, a repository Wiki and static webpages.
  • A static documentation site can be created using the tool mkdocs.
  • Documentation frameworks such as Diataxis provide content and style guidelines that can help us write high-quality documentation.

Open project collaboration & management: Licensing; Sharing your code to encourage collaboration; Working with collaborators


  • Open source software uses copyright licenses that permit others to reuse and adapt your code or data.
  • Permissive licenses allow code to be used in other products provided the copyright statement is displayed.
  • Copyleft licenses require the source code of any modifications to be released under a copyleft license.
  • Creative Commons licenses are suitable for non-code files such as documentation and images.
  • Open source software can be sold, but you must supply the source code and the people you sell it to can give it away to somebody else.
  • Add a license file to your repository, and add a license notice to each file in case it gets detached.
  • Zenodo can be used to archive a GitHub repository and obtain a DOI for it.
  • We can include a CITATION file to tell people how to cite our code.
  • GitHub can track bugs or issues with a program.
  • Git branches can be used to allow multiple developers to work on the same part of a program in parallel.
  • The git branch command shows the list of branches and can create new branches.
  • The git switch command changes which branch we are working on.
  • The git merge command merges another branch into the current one.
  • Pull requests allow developers to work on their own branch/fork and then request other developers review their changes before they are merged.

Ethical considerations for research software


  • To act ethically, we have to be both responsible producers and consumers of research software.
  • Free and open source research software and data, along with FAIR, ethical and environmental considerations, offer viable ways to make science more transparent, reliable and accountable, so that it not only advances scientific knowledge but also respects and protects the rights and well-being of all users.

Wrap-up


  • When developing software for your research, think about how it will help you and your team, your peers, your domain or community, and the world.