Commit 7a45299c, authored Oct 13, 2013 by Laros

Updated the skeleton presentation, added handouts.

Parent: 8331c471
Showing 2 changed files with 332 additions and 4 deletions:
skeleton/skeleton.tex (+259, -4)
skeleton/skeleton_handouts.tex (+73, -0)
skeleton/skeleton.tex

\documentclass[slidestop]{beamer}

-\title{Introduction to Version Control}
+\title{Analysis projects skeleton}
\providecommand{\myConference}{Git course}
\providecommand{\myDate}{Monday, October 14, 2013}
\author{Jeroen F. J. Laros}
...
@@ -32,12 +32,263 @@
% First page of the presentation.
\section{Introduction}

\begin{frame}
-  \frametitle{}
+  \frametitle{Shared projects.}
  Most of us work on multiple projects with multiple people.
  \bigskip
  That is why it is convenient to:
  \begin{itemize}
    \item Have everything in one place.
    \begin{itemize}
      \item Data.
      \item Code.
      \item Documentation.
    \end{itemize}
    \item Have the same structure for all projects.
  \end{itemize}
\end{frame}

\begin{frame}
  \frametitle{Project skeleton.}
  That is why we made a \emph{skeleton} project.
  \bigskip
  Usage:
  \begin{itemize}
    \item Make a \emph{fork} (copy) of the skeleton project.
    \item Rename the project.
    \item Do a checkout.
    \item Start working with it.
  \end{itemize}
\end{frame}

\section{Starting a project}

\begin{fframe}
  \frametitle{Forking}
  Make a new analysis project.
  \begin{itemize}
    \item Go to the ``Project skeleton'' project page on our GitLab server.
    \item Click ``Fork'' to fork it to a new project.
    \item Go to ``Settings'' to rename the new project.
    \begin{itemize}
      \item Change both the project name and the repository path.
    \end{itemize}
  \end{itemize}
  \vfill
  \permfoot{https://git.lumc.nl/lgtc-bioinformatics/project-skeleton}
\end{fframe}

\begin{frame}
  \frametitle{Configuration}
  Configure your project.
  \begin{itemize}
    \item Choose whether to make your project public.
    \begin{itemize}
      \item Public by default.
      \item Public really means public.
    \end{itemize}
    \item Add the people that work on this project.
  \end{itemize}
\end{frame}
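
\begin{frame}[fragile]
  \frametitle{Checking out (sketch).}
  After forking and renaming, the ``Do a checkout'' step could look roughly
  like this. The URL and the directory name are placeholders that are not
  part of the course material; copy the real URL from your project page.
\begin{lstlisting}[language=none, caption=Cloning a forked project (illustrative).]
# <url-of-your-fork> and <project-name> are placeholders.
$ git clone <url-of-your-fork>
$ cd <project-name>
$ git status
\end{lstlisting}
\end{frame}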

\section{Project structure}

\begin{frame}
  \frametitle{Global overview.}
  Project layout:
  \begin{itemize}
    \item analysis
    \item data
    \item doc
    \item src
  \end{itemize}
  \bigskip
  Ideally, every directory in the project has a \bt{README} file.
\end{frame}
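
\begin{frame}[fragile]
  \frametitle{Creating the layout by hand (sketch).}
  A fork of the skeleton presumably already provides this layout. Purely as
  an illustration, the same structure could be created in an empty
  repository as follows; the commit message is an arbitrary example.
\begin{lstlisting}[language=none, caption=Directories with README files (illustrative).]
# Assumes an empty Git repository; a fork should not need this.
$ mkdir analysis data doc src
$ touch README analysis/README data/README
$ touch doc/README src/README
$ git add README analysis data doc src
$ git commit -m "Added project layout."
\end{lstlisting}
\end{frame}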

\begin{frame}
  \frametitle{The top-level \bt{\,README\,\,} file.}
  This file contains general information about the project, for example:
  \begin{itemize}
    \item Who leads the project.
    \item Who participates in the project.
    \item The number of hours people have spent on this project.
  \end{itemize}
\end{frame}

\begin{frame}
  \frametitle{The \bt{\,doc\,\,} directory.}
  Documentation on the project:
  \begin{itemize}
    \item Annotated sample lists.
    \item Goal of the project.
    \item Related work and literature.
    \begin{itemize}
      \item You may want to note who provided the documentation.
    \end{itemize}
  \end{itemize}
\end{frame}

\begin{frame}
  \frametitle{The \bt{\,data\,\,} directory.}
  Used to store all raw data.
  \bigskip
  The \bt{README} contains:
  \begin{itemize}
    \item Description of the delivered data.
    \begin{itemize}
      \item Sequencing centre.
      \item Platform.
      \item Molecular type.
      \item Owner.
      \item Gatherer.
    \end{itemize}
    \item Description of other data.
    \begin{itemize}
      \item Perhaps you already got BAM files.
      \begin{itemize}
        \item Who aligned it?
        \item Which aligner?
      \end{itemize}
    \end{itemize}
  \end{itemize}
\end{frame}

\begin{frame}
  \frametitle{The \bt{\,analysis\,\,} directory.}
  All analysis-related files are stored here:
  \begin{itemize}
    \item Run scripts.
    \item Makefiles.
    \item Result files.
  \end{itemize}
  \bigskip
  Try to separate self-contained parts of the analysis into their own
  subdirectories and document dependencies in a \bt{README} file.
  \begin{itemize}
    \item Normal data analysis.
    \item $k$-mer analysis.
  \end{itemize}
\end{frame}

\begin{frame}
  \frametitle{The \bt{\,src\,\,} directory.}
  Any custom scripts and specific software versions for this project.
  \bigskip
  When these scripts are useful for other projects, move them to their own
  repository.
\end{frame}

\section{Working with large files}

\begin{fframe}
  \frametitle{Git is not designed for massive files.}
  Some problems with large files:
  \begin{itemize}
    \item Limited storage on the server.
    \item Checking out a repository would take a long time.
  \end{itemize}
  \bigskip
  We do want to have some way to track our input and output data. This can be
  done with \bt{git-annex}.
  \vfill
  \permfoot{http://git-annex.branchable.com/}
\end{fframe}

\begin{frame}[fragile]
  \frametitle{Git annex.}
  \begin{itemize}
    \item Manage large files without storing their contents in Git.
    \item Store file checksums.
    \item Prevent files from being deleted accidentally.
  \end{itemize}
  \bigskip
  \pause
  You first have to enable this for your repository.
\begin{lstlisting}[language=none, caption=Enable git-annex.]
$ git annex init "<name>"
\end{lstlisting}
\end{frame}

\begin{frame}[fragile]
  \frametitle{Adding big files.}
\begin{lstlisting}[language=none, caption=Adding files.]
$ git annex add <filename>
$ git commit
\end{lstlisting}
  \bigskip
  In a clone, this file will be visible, but not really present.
\begin{lstlisting}[language=none, caption=Make a file available.]
$ file <filename>
<filename>: broken symbolic link to ...
$ git annex get <filename>
\end{lstlisting}
\end{frame}

\begin{frame}[fragile]
  \frametitle{Removing files.}
  As long as there are enough copies available, you can remove files.
\begin{lstlisting}[language=none, caption=A failing drop command.]
$ git annex drop <filename>
drop bigfile (unsafe)
git-annex: drop: 1 failed
\end{lstlisting}
  \bigskip
  It is actually quite well protected.
\begin{lstlisting}[language=none, caption=rm fails too.]
$ rm -rf <repository>
rm: cannot remove <repository>/.git/annex/objects/...
\end{lstlisting}
\end{frame}
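
\begin{frame}[fragile]
  \frametitle{Checking copies (sketch).}
  When a drop is refused, it can help to check where copies of a file are
  recorded. A minimal sketch, assuming a reasonably recent version of
  git-annex; output is omitted because it differs per repository.
\begin{lstlisting}[language=none, caption=Inspecting copies (illustrative).]
# List the repositories known to hold the file.
$ git annex whereis <filename>
# Optionally require two copies before a drop is allowed.
$ git annex numcopies 2
\end{lstlisting}
\end{frame}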

\begin{frame}[fragile]
  \frametitle{Sync your results.}
  Let the other repositories know what you have done.
\begin{lstlisting}[language=none, caption=Synchronise with the other repositories.]
$ git annex sync
\end{lstlisting}
\end{frame}

\begin{frame}[fragile]
  \frametitle{Working together on the same clone.}
  If you need to work with other people on the same repository clone on the
  Shark cluster, you can use the following commands to give group access:
\begin{lstlisting}[language=none, caption=Give group access to a clone.]
$ find . -type d -exec chmod 775 {} \;
$ find . -type f -exec chmod 664 {} \;
\end{lstlisting}
\end{frame}
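
\begin{frame}[fragile]
  \frametitle{Group access (alternative sketch).}
  As a possible alternative that is not part of the original course material,
  Git can be asked to keep its own files group-writable, and the group bits
  of the working tree can be added in one go. Whether this fits the setup on
  the Shark cluster is an assumption.
\begin{lstlisting}[language=none, caption=Group access via Git settings (illustrative).]
# New files inside .git become group-writable.
$ git config core.sharedRepository group
# Add group read/write; X adds execute only to directories
# and to files that are already executable.
$ chmod -R g+rwX .
\end{lstlisting}
\end{frame}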

\section{Questions?}
\lastpagetemplate
-\begin{frame}
+\begin{fframe}
  \begin{center}
    Acknowledgements:
    \bigskip
...
@@ -48,6 +299,10 @@
    Zuotian Tatum
  \end{center}
-\end{frame}
+  \vfill
+  \permfoot{http://git-annex.branchable.com/}
+  \permfoot{https://git.lumc.nl/lgtc-bioinformatics/project-skeleton}
+\end{fframe}
\end{document}

skeleton/skeleton_handouts.tex (new file, 0 → 100644)

\documentclass{article}
\usepackage{fullpage}
\usepackage{listings}

\frenchspacing
\setlength{\parindent}{0pt}
\pagestyle{empty}

\begin{document}

\begin{center}
  {\bf Git Introduction Course}
  Project skeleton practical.
\end{center}
\bigskip

\subsubsection*{Git annex.}
First, we make an empty repository:
\begin{lstlisting}
$ mkdir annex_project
$ cd annex_project
$ git init
$ git annex init "Original repository."
\end{lstlisting}
\bigskip
Now add a ``big file'' and annex it.
\begin{lstlisting}
$ dd if=/dev/urandom of=bigfile.dat count=1024
$ git annex add bigfile.dat
$ git commit -m "Added big file."
\end{lstlisting}
\bigskip
Now, we clone the project and let this repository know where the clone is.
\begin{lstlisting}
$ git clone . ../annex_clone
$ git remote add annex_clone ../annex_clone
$ cd ../annex_clone
$ git annex init "Cloned repository."
\end{lstlisting}
\bigskip
With the ``\texttt{file}'' command you can now see that
``\texttt{bigfile.dat}'' is not present in this repository.
\begin{lstlisting}
$ git pull
$ git annex get bigfile.dat
\end{lstlisting}
\bigskip
You can now remove the big data file from the original repository.
\begin{lstlisting}
$ cd ../annex_project
$ git annex drop bigfile.dat
\end{lstlisting}
But you cannot remove it from the clone.
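
To see this for yourself, a minimal sketch (not part of the original
practical) is to try the drop in the clone and then ask where copies are
recorded. The exact messages depend on your git-annex version, so no output
is shown here.
\begin{lstlisting}
# Illustrative check; run after the drop in the original repository.
$ cd ../annex_clone
$ git annex drop bigfile.dat
$ git annex whereis bigfile.dat
\end{lstlisting}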

\subsubsection*{Project skeleton.}
Search for the ``Project skeleton''.
\begin{itemize}
  \item \emph{Hint:} Click the ``Public area'' icon and use the search option.
\end{itemize}
\bigskip
Suppose you are going to do an RNASeq analysis. You have the following files:
\begin{itemize}
  \item read\_1.fq
  \item read\_2.fq
  \item Makefile
\end{itemize}
\end{document}