Learner Profiles

Xiu


Xiu just completed a PhD in microbiology and has taken up a postdoc role in a new lab. During her PhD she mostly worked in the wet lab but also performed the data analysis for her project. Initially she tried to do this with Excel but after attending an introductory Python course she continued self-learning and wrote some analysis scripts that ran nicely on the department Linux server. Her new head-of-lab sees that Xiu has an aptitude for programming and has asked her to adopt and extend some pipeline code that the lab uses to process their mouse fecal metabarcoding results.

At present the code only runs on one old Mac in the corner of the lab and is hard to configure: several input parameters must be copied from a spreadsheet, and then three steps need to be started manually over the course of several hours. The code consists of a mixture of Bash scripts and Python written by a previous member of the team who has now left. Xiu is enthusiastic, but after two months wrestling with the codebase it just seems to be getting messier, and the last time she tried to change something significant the code crashed partway through processing and had to be reverted. She has adopted the approach of leaving the old code untouched wherever possible and writing new Python wrappers to add functionality. Xiu fears she is out of her depth working with this codebase, but she really wants to impress her new team and properly get to grips with the system.

The Snakemake Pipelines course will show Xiu how a data processing pipeline can be designed and constructed around Snakemake workflows. It will teach how to configure those workflows in a logical way, making them robust and portable. It will demonstrate how her existing Python knowledge can be applied in these new workflows. Ultimately, Xiu should have the confidence to replace old problematic code with redesigned modules, and be able to articulate to her team the benefits of this approach.

Xiu would also benefit from other lessons in The Carpentries, specifically on version control with Git.

Ahmad


Ahmad has worked for 12 years in the genotyping lab at Prestonfield University. His manager is encouraging him to develop new skills in data analysis in order for Ahmad to progress his career and help the lab adapt to work with more data-intensive techniques like RAD-seq. Ahmad has recently been on an introductory Linux course. He enjoyed this and was able to work through the course material and complete most of the exercises, but felt himself to be slow and struggling compared to others on the course, who seemed to be more computer-savvy.

Ahmad is still rather wary of putting his newfound Linux skills into practice. He mostly sticks to running the standard commands which are part of the established process within the facility. He does not yet see himself as a prospective bioinformatician or programmer but does recognise that there is an opportunity to be grasped and is a willing learner.

The Snakemake course will reinforce and build on the Linux knowledge that Ahmad gained from the earlier course. While the example data used in the course is not exactly the type that he works with, the general approach should translate to ideas that can be applied in the genotyping lab. After installing Snakemake onto the facility server, Ahmad will gain the confidence to propose these new ideas to his manager and to start trying them out.

Zoe


Zoe is a PI of a small research group that studies pesticide resistance in weevils. The group sends RNA samples for Illumina sequencing and receives back raw results. Zoe has decent experience with the tools used for data analysis, having worked hands-on with the first five datasets that were sequenced, and is known as a bit of an “R wizard”.

Following the release of a new weevil genome, Zoe is looking at the feasibility of re-analysing her previous datasets against the new genome build. A colleague suggested that Snakemake might help, and being a hands-on person she has decided to attend the course herself, along with a postdoc from her lab. She’s also considering that the re-analysis task could form the basis of a project for a bioinformatics master’s student, to be supervised by the postdoc. Either way, she would like to understand how it all works, even if the life of a PI doesn’t give much time for hands-on coding.