Peter Alping
February 19, 2020
Basics
Start using these tools in your own projects today!
Advanced (if there is time)
Version control: any system that keeps track of changes in files and folders,
making it possible to review these changes and roll back to previous versions
Find out more about Git at: https://git-scm.com
Workflow management: any system that lets you specify and run a workflow,
e.g. running a set of instructions in a specific order
Find out more about Snakemake at: https://snakemake.readthedocs.io
(SAS Enterprise Guide has something similar)
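To make "running a set of instructions in a specific order" concrete, here is a minimal Python sketch. It only illustrates the idea and is not how Snakemake is implemented. The step names (clean_data, fit_model, write_report) and the run_order helper are hypothetical, and the sketch assumes Python 3.9 or later for the graphlib module.

```python
from graphlib import TopologicalSorter


def run_order(dependencies):
    """Return the steps in an order where each step follows its prerequisites.

    `dependencies` maps each step name to the set of steps it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical three-step analysis:
# the report needs the model, and the model needs the cleaned data.
workflow = {
    "write_report": {"fit_model"},
    "fit_model": {"clean_data"},
    "clean_data": set(),
}

order = run_order(workflow)
print(order)
```

A workflow manager adds more on top of this ordering, such as tracking which files each step reads and writes.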
A colleague has asked you to review their work
and gives you access to their Git repository
# Clone repository
git clone "path/to/repository.git"
# Enter project directory
cd repository
# Run Snakemake
snakemake
Commit changes and create Snakemake rules as we go
# Initialize a new repository
git init
# Check if any files have been modified
git status
# Check the difference to previous versions
git diff
# Stage changes before commit
git add "path/to/file"
# Commit with a short message
git commit -m "Commit message"
# If you have a remote repo, start by getting any changes
git pull
# Push to remote repository
git push
# A basic rule inside Snakefile
rule my_rule:
    input:
        "path/to/input_1.txt",
        "path/to/input_2.txt",
    output:
        "path/to/output.txt"
    script:
        "path/to/script.py"
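What the script does is up to you. As a sketch only, the file referenced under script: could look something like the following. Snakemake provides an object named snakemake to Python scripts run this way, with the rule's inputs and outputs available as snakemake.input and snakemake.output. The concatenate function and the choice to join the two input files are assumptions made for illustration.

```python
from pathlib import Path


def concatenate(input_paths, output_path):
    """Write the contents of all input files, in order, to output_path."""
    output_path = Path(output_path)
    # Create the output directory if it does not exist yet
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with output_path.open("w", encoding="utf-8") as out:
        for path in input_paths:
            out.write(Path(path).read_text(encoding="utf-8"))


# Snakemake injects `snakemake` when it runs this file via `script:`.
# The check lets the file also be imported or run on its own without failing.
if "snakemake" in globals():
    concatenate(snakemake.input, snakemake.output[0])
```

Keeping the work in a plain function makes the script easy to test outside of Snakemake.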
Run rule from the command line:
# Check rule with a dry run
snakemake my_rule -n
# Run rule
snakemake my_rule
# Force the rule to run
snakemake my_rule -f
# Run in parallel (-j), keep going (-k), give reason (-r)
snakemake my_rule -rkj
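Forcing a run (-f) is useful because a rule is normally skipped when its output is already up to date. A rough sketch of that kind of check, reduced to file modification times in the style of make. This is a simplification and not Snakemake's actual implementation, and needs_rerun is a hypothetical helper.

```python
import os


def needs_rerun(input_paths, output_path):
    """Return True if the output is missing or older than any input."""
    if not os.path.exists(output_path):
        return True
    output_time = os.path.getmtime(output_path)
    # Any input modified after the output means the output is stale
    return any(os.path.getmtime(p) > output_time for p in input_paths)
```

With a check like this, editing an input file is enough to make the next run rebuild the output, and a forced run skips the check.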
Version control with Git can help keep track of changes in code
and text without having to manually keep an archive of old files
Workflow management with Snakemake can help organize the steps
from data to final report, and share these with collaborators or reviewers
You can easily start using both today in your existing projects