\documentclass{article}
\usepackage{fullpage}
\usepackage{listings}
\usepackage{graphicx}
\frenchspacing
\setlength{\parindent}{0pt}
\pagestyle{empty}
\renewcommand\_{\textunderscore\,}
\let\tempitemize=\itemize
\renewcommand\itemize{
\vspace{-5pt}
\tempitemize
\setlength{\itemsep}{0pt}
}
\let\tempenditemize=\enditemize
\renewcommand\enditemize{
\tempenditemize
}
\title{NGS Introduction Course}
\author{Jeroen Laros, Michiel van Galen}
\begin{document}
\maketitle
\thispagestyle{empty}
\begin{figure}[h!]
\centering
\includegraphics[width=0.55\textwidth]{pipelines-font}
\end{figure}
% Pipeline
\section{A simple analysis pipeline}
Log in to the virtual machine as described in the first practical. Finish the
exercises of that practical first, then continue with this one.
Analysis pipelines range from a few very simple lines of code to complex
frameworks running on multiple computing nodes. The idea behind them is always
the same: make your analysis easier to run, while at the same time making it
reproducible, documented and easier to share.
Today we will use bash to create a basic pipeline. Without knowing it, you have
already worked with bash: everything you have typed into the terminal so far
was basically a bash command. To create a pipeline, all we need to do is create
a file with the ``.sh'' extension and list the steps we want to include, in the
order in which they should run.
\medskip
First, let's see how to make your script executable. To do this, we need to
change the file permissions, so that you can run the file directly without
having to specify the interpreter. In Linux we use chmod for this.
\medskip
\begin{lstlisting}
$ chmod +x file.sh
\end{lstlisting}
\medskip
One last piece of information your system is missing is which interpreter it
should use. We store this in the very first line of the script, which is
written like this: ``\#!/bin/bash''.
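\medskip
The two ingredients can be tried out together as in the sketch below. The file
name ``first.sh'' and the message are made up for this example, and printf is
only used to create the file from the command line; a text editor works just
as well.
\medskip
\begin{lstlisting}
# Create a two-line script: the interpreter line plus one command.
printf '#!/bin/bash\necho "It works"\n' > first.sh
# Without execute permission, the interpreter has to be named explicitly.
bash first.sh
# After chmod, the script can be run directly.
chmod +x first.sh
./first.sh
\end{lstlisting}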
\medskip
This is all you need to know to write your first pipeline. Bash offers a lot
more possibilities than just using Linux power tools, navigating and starting
other software. For example, to make the pipeline easier to re-use with other
data, we can give a bash script parameters, just like Bowtie, where you can
supply your fastq file. This is done by using the reserved variable \$1 inside
the script. Everywhere in your script, this variable will be replaced with the
first argument you give on the command line.
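\medskip
As a small illustration of \$1, the sketch below creates a hypothetical script
``count\_lines.sh'' and runs it on a tiny fastq file. Both file names and the
single read are invented for this example. The ``cat'' construct only serves to
create the script from the command line; everything between the two EOF markers
ends up in the file.
\medskip
\begin{lstlisting}
# Create a tiny fastq file holding a single made-up read.
printf '@read1\nACGT\n+\nIIII\n' > example.fastq

# Create the script.
cat > count_lines.sh << 'EOF'
#!/bin/bash
# $1 is replaced by the first argument given on the command line.
echo "Number of lines in $1:"
wc -l "$1"
EOF

# Make the script executable and run it with the fastq file as argument.
chmod +x count_lines.sh
./count_lines.sh example.fastq
\end{lstlisting}
\medskip
Running the same script with a different file name as argument reports on that
file instead, without any change to the script itself.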
\medskip
The next exercise will give you an idea on how to structure a simple bash
pipeline. Complete the exercise and try to understand what is going on in the
script:
\medskip
\begin{itemize}
\item Write the lines you see below to a file ``Hello.sh''
\item Make the file executable: ``chmod +x Hello.sh''
\item Run it with an argument: ``./Hello.sh yourname'' and see what it does
\item Comments start with \# and are used to document your work
\end{itemize}
\medskip
\begin{lstlisting}
#!/bin/bash
echo "Hello $1, the date is:"
# The next line will print the date.
date
\end{lstlisting}
\medskip
With this knowledge you should be able to write your own short bash pipeline.
Combine it with what you have learned in the first practical and try to write
a script that chains the tools from that practical into a pipeline:
\medskip
\begin{itemize}
\item The pipeline should perform the following steps:
\begin{itemize}
\item Align a fastq file to the mtDNA
\item Convert the SAM to a sorted BAM
\item Call variants
\item Annotate the variants
\end{itemize}
\item Keep these things in mind when writing the script:
\begin{itemize}
\item The fastq file should be the first argument
\item Use comments to explain what happens
\item Make it executable
\item Bonus: can you think of more arguments to add, for example for the
annotation step?
\end{itemize}
\end{itemize}
\medskip
Hints and summary:
\begin{itemize}
\item A comment line starts with a \#
\item \$1 is the first argument, \$2 the second, etc.
\item Make your pipeline executable with ``chmod +x file''
\end{itemize}
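\medskip
The hints above can be put together in a skeleton like the one below. It is
only a sketch: the script name ``pipeline.sh'', the second argument and the
echo lines are placeholders invented for this example, and each echo still has
to be replaced by the actual command from the first practical. As before, the
``cat'' construct only creates the file; the quotes around the first EOF keep
\$1 and \$2 from being filled in while the file is written.
\medskip
\begin{lstlisting}
cat > pipeline.sh << 'EOF'
#!/bin/bash
# $1: the fastq file to analyse.
# $2: a hypothetical second argument, for example for the annotation step.

# Step 1: align the reads to the mtDNA.
echo "Aligning $1"
# Step 2: convert the SAM file to a sorted BAM file.
echo "Sorting the alignment of $1"
# Step 3: call variants.
echo "Calling variants for $1"
# Step 4: annotate the variants.
echo "Annotating the variants with $2"
EOF

chmod +x pipeline.sh
./pipeline.sh reads.fastq notes
\end{lstlisting}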
\medskip
Now you know how to write a basic pipeline. In bash you can also work with
conditions, loops and error handling, but this is beyond the scope of this
course. If you would like to know more, there is plenty to read on the web, or
you can visit one of our other courses. This is the end of the practical.
\medskip
Thank you for participating!