Tools for Reproducible Science
Introducing a series of online courses to build core skills in Biological Data Science.
Course type
R-Markdown (June 10)
Script, video streaming, video download
R-Markdown lets you write reports, papers, web pages and slides from RStudio, combining R code, R output, text, figures and tables. In this session we cover:
- The Markdown syntax
- Writing R chunks and controlling their behaviour
- Controlling the layout and output options with the YAML header
- Writing a technical report, output as an HTML or Word document
Git and GitHub (June 17)
Script, video streaming, video download
Git is a powerful version control system (similar to “track changes”, but far more capable) that lets you record every change made to a project, return to a previous state of your work, share your work and collaborate with others.
Making R-packages (June 24)
Script, video streaming, video download
In this session we build a small package from scratch, containing time-series temperature data and functions to analyse them. We practice the cycle of writing code, generating documentation, running checks and installing the package. We then post a public version of the package on GitHub so that anyone can install it.
Workflow Management and Snakemake (July 1)
Script, video streaming, video download
This lesson covers workspace and workflow management for bioinformatics. Despite the boring title, this is a true computational biology superpower: instead of a disorganised mess of scripts and strangely named data files, you'll have a tidy, organised, shareable and reproducible workspace for each project. We cover:
- Workspace organisation and workflows
- Snakemake: each step is a rule
- Snakemake: rule graphs
- Snakemake: a toy bioinformatic workflow
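As a rough illustration of the "each step is a rule" and "rule graph" ideas, here is a minimal sketch in plain Python. It is not Snakemake syntax and does not use the Snakemake API. The rule names and file names are invented for the example. It only shows how declaring inputs and outputs for each step is enough to work out which step depends on which, and in what order the steps can run.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy rules: each rule declares the files it reads and writes.
RULES = {
    "trim_reads": {"input": ["raw.fastq"], "output": ["trimmed.fastq"]},
    "align_reads": {"input": ["trimmed.fastq", "genome.fa"], "output": ["aligned.bam"]},
    "call_variants": {"input": ["aligned.bam", "genome.fa"], "output": ["variants.vcf"]},
}


def build_rule_graph(rules):
    """Map each rule to the set of rules that produce its input files."""
    # Which rule creates each output file?
    producer = {out: name for name, rule in rules.items() for out in rule["output"]}
    # Inputs that no rule produces (e.g. raw data) add no dependency.
    return {
        name: {producer[f] for f in rule["input"] if f in producer}
        for name, rule in rules.items()
    }


def execution_order(rules):
    """Return rule names so that every rule comes after the rules it depends on."""
    return list(TopologicalSorter(build_rule_graph(rules)).static_order())


graph = build_rule_graph(RULES)
order = execution_order(RULES)
print(order)
```

Here `align_reads` depends on `trim_reads` because it reads `trimmed.fastq`, and `call_variants` depends on `align_reads` because it reads `aligned.bam`, so the printed order is `trim_reads`, `align_reads`, `call_variants`. A workflow manager builds a similar dependency graph from the declared inputs and outputs, so the run order does not have to be scripted by hand.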
Snakemake Part 2 (July 8)
Script, video streaming, video download
We'll finish the last part of last week's content on workflow management basics, and then cover some more advanced and extremely useful features of the Snakemake ecosystem:
- Snakemake: config files and metadata
- Snakemake: automatic interaction with clusters and queuing systems
- Versioning workspaces with Git
- Managing installation and versioning of software per-workspace with conda environments
Making Maps in R with ggplot (July 15)
Script, video streaming, video download
This workshop will be a practical introduction to the basics of making maps in R. Maps can be complex and often seem annoying to work with, but with a little familiarity with R you can create useful and attractive maps quickly and easily. We will focus on some common use cases and not assume anything beyond ggplot2 basics covered previously in this series.
If you need a refresher on ggplot2 basics, see our April course content and its video recording (HR download or streaming).
Prerequisites: For these workshops we assume a basic knowledge of R programming and bash scripting. If needed, please see our introductions to R coding and to bioinformatics.