Commit 8ffe0da0 authored by Laros

Finished optimisation lecture.

parent ec0dd1c2
\item First write \emph{readable} and \emph{maintainable} code.
\item Only optimise when needed.
\end{itemize}
\bigskip
\onslide<2->{
However:
\begin{itemize}
\item Some things should \emph{always} be considered.
\item Inefficient code is usually not more readable.
\end{itemize}
}
\vfill
\epigraph{Premature optimization is the root of all evil.}{Donald Knuth}
\subsection{Popular view}
\begin{pframe}
Buy faster computers, buy more computers.
\bigskip
However:
\begin{itemize}
\item At best a linear speedup.
\item Adds a lot of complexity to your program.
\item Costly (\euro\,5,000 for one new node).
\end{itemize}
\bigskip
\pause
Related opinion: ``\textit{Don't worry about speed, everything will be solved
in ten years (once computers become faster).}''
\end{pframe}
\section{Complexity theory}
\subsection{Terminology}
\begin{pframe}
Study of the \emph{time} and \emph{memory} complexity of algorithms.
\bigskip
\begin{table}[]
\begin{center}
Correlation & Complexity\\
\hline
Constant & $\mathcal{O}(1)$\\
Logarithmic & $\mathcal{O}(\mathrm{log\ } n)$\\
Linear & $\mathcal{O}(n)$\\
Quadratic & $\mathcal{O}(n^2)$\\
Exponential & $\mathcal{O}(2^n)$
\begin{lstlisting}[language=python, caption={Find the optimal pair of
values.}]
def optimal_pair(f, list_of_values):
    optimum = 0

    for value_1 in list_of_values:
        for value_2 in list_of_values:
            if f(value_1, value_2) > optimum:
                optimum = f(value_1, value_2)

    return optimum
\end{lstlisting}
The amount of time grows faster than the length of the input (quadratically,
in this case).
Algorithm & Complexity\\
\hline
Intersecting sorted regions & $\mathcal{O}(n)$\\
Searching in a sorted list & $\mathcal{O}(\mathrm{log\ }n)$\\
Sorting & $\mathcal{O}(n\mathrm{\ log\ }n)$\\
Pairwise alignment & $\mathcal{O}(n^2)$\\
\textit{De novo} assembly & $\mathcal{O}(2^n)$\\
\end{center}
\caption{Known complexities.}
\end{table}
\end{pframe}
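\begin{pframe}
To illustrate the logarithmic entry above: a minimal sketch of binary search
in Python (the function name \lstinline{binary_search} is ours).
\begin{lstlisting}[language=python, caption={Searching in a sorted list
(sketch).}]
def binary_search(sorted_values, query):
    low, high = 0, len(sorted_values)

    while low < high:
        middle = (low + high) // 2
        if sorted_values[middle] < query:
            low = middle + 1   # The query is in the right half.
        else:
            high = middle      # The query is in the left half.

    return low  # Position of the query (or its insertion point).
\end{lstlisting}
Every iteration halves the search space, so at most about
$\mathrm{log_2\ }n$ iterations are needed.
\end{pframe}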
\section{}
\begin{pframe}
For a lot of algorithms, the complexity is known.
\begin{itemize}
\item If your implementation behaves differently, something is wrong.
\end{itemize}
\end{pframe}
\section{Applications of complexity theory}
\subsection{Data structures and bottlenecks}
\begin{pframe}
Some things we should always try to do:
\begin{itemize}
\item Find an algorithm with minimal complexity.
\item Choose the right \emph{data structures}.
\end{itemize}
\bigskip
This is difficult, even for experienced programmers.
\begin{itemize}
\item Do not hesitate to ask.
\item Discuss your solution (in detail).
\end{itemize}
\bigskip
These things will give you more speedup than parallelisation.
\end{pframe}
\subsection{Complexity reduction}
\begin{pframe}
Overlap of regions.
\bigskip
\begin{figure}[]
\begin{center}
\fbox{
\end{center}
\caption{Intersection of two sets of regions.}
\end{figure}
Calculate the \emph{overlap} of two lists of regions.
\begin{itemize}
\item Pops up in many different forms.
\end{itemize}
\end{pframe}
\begin{pframe}
\begin{table}[]
\begin{tabular}{llrll}
& Region 1 & Region 2 & & Overlap\\
\hline
\onslide<1-5>{$\rightarrow$} & $(10, 20)$ & $(5, 30)$ &
\onslide<1,2,6,10,14>{$\leftarrow$} & \onslide<2->{$(10, 20)$}\\
\onslide<6-9>{$\rightarrow$} & $(25, 40)$ & $(35, 45)$ &
\onslide<3,7,11,15>{$\leftarrow$} & \onslide<6->{$(25, 30)$}
\onslide<7->{\ $(35, 40)$}\\
\onslide<10-13>{$\rightarrow$} & $(55, 70)$ & $(60, 65)$ &
\onslide<4,8,12,16>{$\leftarrow$} & \onslide<12->{$(60, 65)$}\\
\onslide<14-17>{$\rightarrow$} & $(85, 95)$ & $(75, 80)$ &
\onslide<5,9,13,17>{$\leftarrow$} & \\
\end{tabular}
\caption{An intersection algorithm of complexity $\mathcal{O}(n^2)$.}
\end{table}
Notice that interesting things happen only when the arrows are close.
\end{pframe}
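\begin{pframe}
The table above corresponds to the naive approach: compare every region in
one list with every region in the other. A minimal sketch in Python (the
function name is ours; regions are \lstinline{(start, end)} tuples):
\begin{lstlisting}[language=python, caption={Quadratic intersection
(sketch).}]
def intersect_quadratic(regions_1, regions_2):
    overlap = []

    for start_1, end_1 in regions_1:
        for start_2, end_2 in regions_2:
            # Record the overlapping part, if there is one.
            if max(start_1, start_2) < min(end_1, end_2):
                overlap.append(
                    (max(start_1, start_2), min(end_1, end_2)))

    return overlap
\end{lstlisting}
\end{pframe}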
\begin{pframe}
Based on the previous observation:
\begin{itemize}
\item Regions that are far apart will never overlap.
\end{itemize}
\bigskip
An algorithm that tries to ``keep close'':
\begin{enumerate}
\item Consider the first regions in both lists.
\item See whether the two regions have an overlap.\label{item:loop}
\item In the list whose current region has the smallest end coordinate, move
on to the next region.
\item Go back to Step~\ref{item:loop}.
\end{enumerate}
\end{pframe}
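\begin{pframe}
A minimal sketch of this ``keep close'' algorithm in Python, assuming both
lists are sorted by start coordinate (the function name is ours):
\begin{lstlisting}[language=python, caption={Linear intersection of two
sorted lists of regions (sketch).}]
def intersect_sorted(regions_1, regions_2):
    overlap = []
    i = j = 0

    while i < len(regions_1) and j < len(regions_2):
        start_1, end_1 = regions_1[i]
        start_2, end_2 = regions_2[j]

        # Record the overlapping part, if there is one.
        if max(start_1, start_2) < min(end_1, end_2):
            overlap.append((max(start_1, start_2), min(end_1, end_2)))

        # Advance in the list whose current region ends first.
        if end_1 < end_2:
            i += 1
        else:
            j += 1

    return overlap
\end{lstlisting}
Every step discards one region, so two lists of length $n$ and $m$ take at
most $n + m$ steps.
\end{pframe}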
\begin{pframe}
\begin{table}[]
\begin{tabular}{llrll}
& Region 1 & Region 2 & & Overlap\\
\hline
\onslide<1,2>{$\rightarrow$} & $(10, 20)$ & $(5, 30)$ &
\onslide<1,2,3>{$\leftarrow$} & \onslide<2->{$(10, 20)$}\\
\onslide<3,4>{$\rightarrow$} & $(25, 40)$ & $(35, 45)$ &
\onslide<4,5>{$\leftarrow$} & \onslide<3->{$(25, 30)$}
\onslide<4->{\ $(35, 40)$}\\
\onslide<5,6,7>{$\rightarrow$} & $(55, 70)$ & $(60, 65)$ &
\onslide<6>{$\leftarrow$} & \onslide<6->{$(60, 65)$}\\
\onslide<8->{$\rightarrow$} & $(85, 95)$ & $(75, 80)$ &
\onslide<7,8->{$\leftarrow$} & \\
\end{tabular}
\caption{An intersection algorithm of complexity $\mathcal{O}(n)$.}
\end{table}
\onslide<9>{
This optimisation will outweigh parallelisation.
\begin{itemize}
\item One computer running this algorithm will outperform a cluster
running the other algorithm.
\end{itemize}
}
\end{pframe}
\subsection{Data structures}
\begin{pframe}
Closely related to the previous discussion.
\bigskip
Use the simplest data structure that suits your needs.
\begin{itemize}
\item Complicated data structures have more overhead.
\end{itemize}
\bigskip
Note that simplest is not always the \emph{easiest}.
\bigskip
This can give (way) more speedup than parallelisation.
\end{pframe}
\begin{pframe}
\begin{figure}[]
\begin{center}
\fbox{
\begin{picture}(80, 30)
\put(5, 4){Germany}
\put(5, 9){Japan}
\put(5, 14){Ghana}
\put(5, 19){Peru}
\put(5, 24){Canada}
\put(60, 4){Tokyo}
\put(60, 9){Ottawa}
\put(60, 14){Lima}
\put(60, 19){Berlin}
\put(60, 24){Accra}
\put(37, 24){$f()$}
\qbezier(25, 6)(25, 6)(55, 21)
\qbezier(25, 11)(25, 11)(55, 6)
\qbezier(25, 16)(25, 16)(55, 26)
\qbezier(25, 21)(25, 21)(55, 16)
\qbezier(25, 26)(25, 26)(55, 11)
\end{picture}
}
\end{center}
\caption{Hashing function.}
\end{figure}
A \emph{dictionary} or \emph{hash}.
\begin{itemize}
\item Very fast data structure for \emph{key-value pairs}.
\end{itemize}
\end{pframe}
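\begin{pframe}
In Python such a key-value mapping is a \lstinline{dict}; a minimal sketch
using the countries and capitals from the figure:
\begin{lstlisting}[language=python, caption={Constant-time lookup in a
dictionary (sketch).}]
capital = {
    "Canada": "Ottawa",
    "Peru": "Lima",
    "Ghana": "Accra",
    "Japan": "Tokyo",
    "Germany": "Berlin",
}

# One lookup takes (amortised) constant time, whatever the size.
print(capital["Japan"])
\end{lstlisting}
\end{pframe}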
\begin{pframe}
\begin{figure}[]
\begin{center}
\begin{picture}(80, 10)
\put(0, 0){\line(1, 0){80}}
\put(0, 10){\line(1, 0){80}}
\put(0, 0){\line(0, 1){10}}
\put(10, 0){\line(0, 1){10}}
\put(20, 0){\line(0, 1){10}}
\put(30, 0){\line(0, 1){10}}
\put(40, 0){\line(0, 1){10}}
\put(50, 0){\line(0, 1){10}}
\put(60, 0){\line(0, 1){10}}
\put(70, 0){\line(0, 1){10}}
\put(80, 0){\line(0, 1){10}}
\put(4, 3.5){4}
\put(14, 3.5){5}
\put(24, 3.5){8}
\put(34, 3.5){1}
\put(44, 3.5){0}
\put(54, 3.5){7}
\put(64, 3.5){2}
\put(74, 3.5){3}
\end{picture}
\end{center}
\caption{Direct memory indexing.}
\end{figure}
But if you have keys ranging from $0$ to $n$, use a simple \emph{list} or
\emph{array}.
\bigskip
Linear speedup.
\end{pframe}
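\begin{pframe}
When the keys are simply $0, 1, \ldots, n$, a plain list does the same job
with less overhead; a minimal sketch using the cells from the figure:
\begin{lstlisting}[language=python, caption={Direct indexing in a list
(sketch).}]
values = [4, 5, 8, 1, 0, 7, 2, 3]

# The key is the position itself: no hashing, just a memory offset.
print(values[5])
\end{lstlisting}
\end{pframe}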
\begin{pframe}
Using a \emph{set} data structure.
\bigskip
\begin{lstlisting}[language=python, caption={Find gene names in some text.}]
for word in text:
    for gene in genes:
        if word == gene:
            print word
\end{lstlisting}
\begin{lstlisting}[language=python, caption={Find gene names in some text
using sets.}]
print set(text) & set(genes)
\end{lstlisting}
Quadratic speedup.
\end{pframe}
\subsection{Bottlenecks}
\begin{pframe}
The $90/10$ law: \textit{$90\%$ of the execution time of a computer program
is spent executing 10\% of the code.}
\bigskip
Finding \emph{bottlenecks} in your code (focus on the $10\%$).
\bigskip
You might have a good idea already.
\begin{itemize}
\item The most complicated part of your program.
\item The most used part of your program.
\end{itemize}
\bigskip
But it is often hard to find.
\end{pframe}
\subsection{Profilers}
\begin{pframe}
Use a \emph{profiler} to see which part takes up the most time.
\bigskip
\begin{lstlisting}[language=none, caption={Profiling output.}]
ncalls tottime cumtime filename:line(function)
1 0.002 35.276 tssv:3(<module>)
1 0.000 34.776 tssv.py:437(main)
1 0.058 34.767 tssv.py:336(tssv)
7272 0.185 34.599 tssv.py:121(alignPair)
14544 0.100 34.408 sg_align.py:71(align)
14544 25.350 33.433 sg_align.py:22(_align)
11092934 7.192 7.192 {min}
895601 0.958 0.958 {range}
\end{lstlisting}
The culprit in this example is the function ``\lstinline{_align}''.
\end{pframe}
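\begin{pframe}
Output like that on the previous slide (trimmed to a few columns here) can
be produced with Python's built-in \lstinline{cProfile} module:
\begin{lstlisting}[language=none, caption={Running the profiler (sketch).}]
python -m cProfile -s cumulative tssv.py
\end{lstlisting}
Or, from within a program:
\begin{lstlisting}[language=python, caption={Profiling a single call
(sketch).}]
import cProfile

cProfile.run("main()", sort="cumulative")
\end{lstlisting}
\end{pframe}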
\section{Combining languages}
\subsection{Example: TSSV}
% - Example TSSV (one day of work saves two nodes at 20,000 euro each).
\begin{pframe}
\begin{lstlisting}[language=python, caption={Semi-global alignment in
Python.}]
def _align(matrix, xSize, ySize, seq1, seq2):
    for x in range(1, xSize):
        for y in range(1, ySize):
            matrix[x][y] = min(matrix[x - 1][y] + 1,
                matrix[x][y - 1] + 1,
                matrix[x - 1][y - 1] +
                int(seq1[x - 1] != seq2[y - 1]))
\end{lstlisting}
The complexity is already minimal.
\end{pframe}
\begin{pframe}
\begin{lstlisting}[language=C, caption={Semi-global alignment in C.}]
void _align(int **matrix, int x_size, int y_size,
            char *seq1, char *seq2) {
  int x,
\end{figure}
\end{pframe}
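\begin{pframe}
One common way to combine the two languages (not necessarily the one used
here) is to compile the C version into a shared library and load it with
the standard \lstinline{ctypes} module; a minimal sketch with assumed file
names:
\begin{lstlisting}[language=none, caption={Building a shared library
(assumed file name).}]
gcc -O2 -shared -fPIC -o sg_align.so sg_align.c
\end{lstlisting}
\begin{lstlisting}[language=python, caption={Loading it from Python
(sketch).}]
import ctypes

# The inner loop now runs in C; converting the Python arguments to
# C arrays is omitted here.
sg_align = ctypes.CDLL("./sg_align.so")
\end{lstlisting}
\end{pframe}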
\section{Questions?}
\lastpagetemplate
\begin{pframe}
\bigskip
\bigskip
Johan den Dunnen
\end{center}
\end{pframe}
\end{document}