Introduction and setup


  • Participants can install the versions of the Bioconductor packages described in this lesson, and reproduce the exact outputs shown, only if they use the correct version of R.
  • The files used in this lesson should be downloaded to a local path that is easily accessible from an R session.
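
A quick pre-flight check along these lines may help before starting. It is only a sketch: the required R version and the data folder are not stated in this summary, so `required_r` and `data_dir` below are hypothetical placeholders to replace with the values from the lesson's setup notes.

```r
# Hypothetical placeholders: replace with the values from the lesson setup.
required_r <- "4.1.0"
data_dir <- file.path(tempdir(), "data")  # throwaway folder, for illustration only

# Compare the running R version against the required one
r_ok <- getRversion() >= required_r
if (!r_ok) {
  warning("R ", required_r, " or later is expected, found ", getRversion())
}

# Create the data folder if needed and confirm that R can see it
dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)
message("Lesson files would be read from: ", normalizePath(data_dir))
```

In a real session, a folder inside the project directory is a more practical choice than `tempdir()`, which is cleared when R exits.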

Introduction to Bioconductor


  • R packages are but one aspect of the Bioconductor project.
  • The Bioconductor project extends and complements the CRAN repository.
  • Different types of packages provide not only software, but also annotations and experimental data, and demonstrate the use of multiple packages in integrated workflows.
  • Interoperability between Bioconductor packages facilitates the writing of integrated workflows and minimizes the cognitive burden on users.
  • Educational materials from courses and conferences are archived and accessible on the Bioconductor website and YouTube channel.
  • Different channels of communication enable community members to converse and help each other, both as users and package developers.
  • The Bioconductor project is governed by scientific, technical, and advisory boards, as well as a Code of Conduct committee.

Installing Bioconductor packages


  • The BiocManager package is available from the CRAN repository.
  • BiocManager::install() is used to install and update Bioconductor packages, as well as packages from CRAN and GitHub.
  • BiocManager::valid() is used to check whether installed packages are out of date or too new for the version of Bioconductor in use.
  • BiocManager::version() reports the version of Bioconductor currently in use.
  • BiocManager::install() can also be used to update an entire R library to a specific version of Bioconductor.
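
The points above can be sketched as a short installation session. This is a minimal illustration rather than the lesson's exact commands: the package names are examples, and the version number in the commented line is a placeholder. The calls need an internet connection and modify the R library, so they are best run deliberately.

```r
# Install BiocManager from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Report the version of Bioconductor currently in use
BiocManager::version()

# Install (or update) Bioconductor packages; the names are examples only
BiocManager::install(c("S4Vectors", "Biostrings"))

# Check whether installed packages are out of date or too new
BiocManager::valid()

# Switch the whole library to a specific Bioconductor release.
# "3.19" is a placeholder; the release must match the R version in use.
# BiocManager::install(version = "3.19")
```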

Getting help


  • The browseVignettes() function is recommended to access the vignette(s) installed with each package.
  • Vignettes can also be accessed on the Bioconductor website, but beware of differences between package versions!
  • The Bioconductor main website contains general information, package documentation, and course materials.
  • The Bioconductor support site is the recommended place to contact developers and ask questions.
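
As a small illustration of the first point, the vignettes installed with a package can be listed from the console before opening them in a browser. The sketch uses the `grid` package only because it ships with R. For a Bioconductor package, substitute its name (for example "Biostrings", assuming it is installed).

```r
# Package to inspect; "grid" is used here only because it ships with R
pkg <- "grid"

# List the vignettes installed with the package, without opening a browser
vig <- vignette(package = pkg)
print(vig$results[, c("Item", "Title"), drop = FALSE])

# Open the HTML index of vignettes; only useful in an interactive session
if (interactive()) {
  browseVignettes(pkg)
}
```

Because these functions read the locally installed files, the vignettes they show always match the installed package version, which is not guaranteed for the copies on the website.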

S4 classes in Bioconductor


  • S4 classes store information in slots, and check the validity of the information every time an object is updated.
  • To ensure the continued integrity of S4 objects, users should not access slots directly, but should instead use dedicated accessor functions.
  • S4 generics invoke different implementations of the method depending on the class of the object that they are given.
  • The S4 class DataFrame extends the functionality of the base data.frame, for instance with the capacity to hold information about each column in metadata columns.
  • The S4 class Rle extends the functionality of the base vector, for instance with the capacity to encode repetitive vectors in a memory-efficient format.
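
The points above can be made concrete with a toy example. The `Gene` class, its slots, and its accessors are hypothetical and exist only for this sketch. The second half uses `Rle` and `DataFrame` and runs only if the S4Vectors package is installed.

```r
library(methods)

# Hypothetical toy class with two slots and a validity check
setClass(
  "Gene",
  representation(symbol = "character", length_bp = "numeric"),
  validity = function(object) {
    if (any(object@length_bp <= 0)) "length_bp must be positive" else TRUE
  }
)

# Accessor: a generic plus a method for the Gene class
setGeneric("symbol", function(x) standardGeneric("symbol"))
setMethod("symbol", "Gene", function(x) x@symbol)

# Setter that re-validates the object after every update
setGeneric("length_bp<-", function(x, value) standardGeneric("length_bp<-"))
setReplaceMethod("length_bp", "Gene", function(x, value) {
  x@length_bp <- value
  validObject(x)
  x
})

g <- new("Gene", symbol = "geneA", length_bp = 1200)  # made-up values
symbol(g)

# The setter refuses an invalid value, so g keeps its original length
setter_result <- tryCatch(
  { length_bp(g) <- -5; "accepted" },
  error = function(e) "rejected"
)

# Direct slot assignment only checks the slot type, so it bypasses validity
tampered <- g
tampered@length_bp <- -1
tampered_check <- tryCatch(validObject(tampered), error = function(e) "invalid")

# Rle and DataFrame, assuming S4Vectors is installed
if (requireNamespace("S4Vectors", quietly = TRUE)) {
  suppressPackageStartupMessages(library(S4Vectors))

  # Five values are stored as two runs
  chrom <- Rle(c("chr1", "chr1", "chr1", "chr2", "chr2"))
  print(runValue(chrom))
  print(runLength(chrom))

  # Metadata columns describe each column of the DataFrame
  df <- DataFrame(sample = c("s1", "s2"), count = c(10L, 25L))
  mcols(df) <- DataFrame(description = c("Sample identifier", "Read count"))
  print(mcols(df))
}
```

The contrast between `setter_result` and `tampered_check` shows why accessor functions are preferred: the setter rejects the bad value immediately, whereas direct slot assignment leaves an invalid object that is only detected when `validObject()` is called.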

Working with biological sequences


  • The Biostrings package defines classes to represent sequences of nucleotides and amino acids.
  • The Biostrings package also defines methods to efficiently process biological sequences.
  • The BSgenome package gives access to genome sequences for a range of model organisms, immediately available as Bioconductor objects.
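
A brief sketch of the first two points, assuming the Biostrings package is installed. The sequence is made up for illustration and the block does nothing if the package is missing.

```r
if (requireNamespace("Biostrings", quietly = TRUE)) {
  suppressPackageStartupMessages(library(Biostrings))

  # A made-up DNA sequence stored in a dedicated class
  dna <- DNAString("ATGGCCTTT")

  # Methods that operate directly on the sequence object
  rc <- reverseComplement(dna)   # AAAGGCCAT
  protein <- translate(dna)      # MAF
  counts <- letterFrequency(dna, letters = c("A", "C", "G", "T"))

  print(rc)
  print(protein)
  print(counts)
}
```

Storing the sequence as a `DNAString` rather than a plain character string is what lets `reverseComplement()` and `translate()` dispatch to sequence-aware methods.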

Working with genomic ranges


  • The GenomicRanges package defines classes to represent ranges of coordinates on a genomic scale.
  • The GenomicRanges package also defines methods to efficiently process genomic ranges.
  • The rtracklayer package provides functions to import and export genomic ranges from and to common genomic file formats.
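
The three points above can be sketched as follows, assuming GenomicRanges (and, for the last step, rtracklayer) is installed. The coordinates are made up, and the BED file is written to a temporary location only for illustration.

```r
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  suppressPackageStartupMessages(library(GenomicRanges))

  # Three made-up ranges on two chromosomes
  gr <- GRanges(
    seqnames = c("chr1", "chr1", "chr2"),
    ranges = IRanges(start = c(10, 50, 10), end = c(40, 80, 30)),
    strand = c("+", "-", "*")
  )

  # A query region on chr1 that spans the end of the first range
  # and the start of the second
  query <- GRanges("chr1", IRanges(start = 35, end = 55))

  n_hits <- countOverlaps(query, gr)  # 2
  hits <- findOverlaps(query, gr)
  print(n_hits)
  print(gr[subjectHits(hits)])

  # Round trip through a BED file, assuming rtracklayer is installed
  if (requireNamespace("rtracklayer", quietly = TRUE)) {
    bed_file <- tempfile(fileext = ".bed")
    rtracklayer::export(gr, bed_file, format = "bed")
    gr_back <- rtracklayer::import(bed_file, format = "bed")
    print(gr_back)
  }
}
```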