# Welcome to Biopet
###### (Bio Pipeline Execution Tool)

## Introduction

Biopet (Bio Pipeline Execution Tool) packages several functionalities:

 1. Tools for working on sequencing data
 1. Pipelines to analyse sequencing data
 1. Running analyses on a computing cluster (Open Grid Engine)
 1. Running analyses on your local desktop computer

### System Requirements

Biopet is built on top of GATK Queue, which requires `java` to be installed on the analysis machine(s).

For end-users:

 * [Java 7 JVM](http://www.oracle.com/technetwork/java/javase/downloads/index.html) or [OpenJDK 7](http://openjdk.java.net/install/) 
 * [Cran R 3.1.1](http://cran.r-project.org/)
 * [GATK](https://www.broadinstitute.org/gatk/download)

For developers:

 * [OpenJDK 7](http://openjdk.java.net/install/) 
 * [Cran R 3.1.1](http://cran.r-project.org/)
 * [Maven 3.2](http://maven.apache.org/download.cgi)
 * [GATK + Queue](https://www.broadinstitute.org/gatk/download)
 * [IntelliJ](https://www.jetbrains.com/idea/) or [Netbeans > 8.0](https://netbeans.org/)

## How to use

### Running a pipeline

- Help:
~~~
java -jar Biopet(version).jar (pipeline of interest) -h
~~~
- Local:
~~~
java -jar Biopet(version).jar (pipeline of interest) (pipeline options) -run
~~~
- Cluster:
    - Note that `-qsub` is cluster specific (Sun Grid Engine)
~~~
java -jar Biopet(version).jar (pipeline of interest) (pipeline options) -qsub -jobParaEnv YourParallelEnv -run
~~~
- DryRun:
    - A dry run can be performed to check whether the scheduling and creation of the pipeline jobs work as expected. Omit `-run` to perform a dry run: nothing is executed, only the job commands are created. If the dry run succeeds, it is a good indication that your actual run will succeed as well.
    - Each pipeline is available as an option inside the jar file `Biopet[version].jar`, which is located in the target directory and can be started with `java -jar <pipelineJarFile>`.

~~~
java -jar Biopet(version).jar (pipeline of interest) (pipeline options) 
~~~
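
The difference between a dry run and a real run can be sketched with a small shell helper. This is only an illustration under stated assumptions: the function name `biopet_cmd` and the `BIOPET_JAR` variable are hypothetical, `Biopet.jar` stands in for the versioned jar file, and `Flexiprep` is used only as an example pipeline name. The helper prints the command instead of executing it, so it is safe to try on any machine.

```shell
#!/bin/sh
# Hypothetical helper: assemble a Biopet command line and print it.
# BIOPET_JAR is an assumed variable; point it at your Biopet jar file.
BIOPET_JAR="${BIOPET_JAR:-Biopet.jar}"

biopet_cmd() {
    mode="$1"        # "dry" (no -run) or "run" (append -run)
    pipeline="$2"    # pipeline of interest
    shift 2          # remaining arguments are pipeline options
    set -- java -jar "$BIOPET_JAR" "$pipeline" "$@"
    if [ "$mode" = "run" ]; then
        # Only a real run gets the -run flag; without it only job commands are created.
        set -- "$@" -run
    fi
    echo "$*"
}

# Print the dry-run command first, then the command for the real run.
biopet_cmd dry Flexiprep
biopet_cmd run Flexiprep
```

Checking the dry-run form first and adding `-run` only once the generated jobs look right mirrors the workflow described above.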
    

### SHARK Compute Cluster specific

On the SHARK compute cluster, a module is available that loads the necessary dependencies.

    $ module load biopet/v0.2.0

With the module loaded, the `java -jar Biopet-<version>.jar` prefix can be omitted and `biopet` can be started using:

    $ biopet

### Running pipelines

    $ biopet pipeline <pipeline_name>


- [Flexiprep](pipelines/flexiprep)
- [Mapping](pipelines/mapping)
- [Gatk Variantcalling](https://git.lumc.nl/biopet/biopet/wikis/GATK-Variantcalling-Pipeline)
- BamMetrics
- Basty
- GatkBenchmarkGenotyping
- GatkGenotyping
- GatkPipeline
- GatkVariantRecalibration
- GatkVcfSampleCompare
- [Gentrap](pipelines/gentrap)
- [Sage](pipelines/sage)
- Yamsvp (Under development)

__Note that each pipeline needs a config file written in JSON format; see [config](config.md) and [How To! Config](https://git.lumc.nl/biopet/biopet/wikis/Config).__


Multiple configs can be passed to a pipeline, for example the sample, settings and executables configs. Of these, the sample and settings configs are mandatory.

- [Here](config) one can find how to create a sample and settings config
- More info can be found here: [How To! Config](https://git.lumc.nl/biopet/biopet/wikis/Config)
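
Because the configs are plain JSON, a syntax check before starting a pipeline can catch a malformed file early. The following is a minimal sketch, assuming `python3` is available on the machine. The function name `check_json` and the file names in the usage comment are hypothetical, and the check only verifies JSON syntax, not whether the keys are ones Biopet expects.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given file parses as JSON.
# Relies on the json.tool module from the Python standard library.
check_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}

# Example usage (file names are placeholders):
#   check_json samples.json && check_json settings.json && echo "configs look valid"
```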





### Running a tool

    $ biopet tool <tool_name>

  - BedToInterval
  - BedtoolsCoverageToCounts
  - BiopetFlagstat
  - CheckAllelesVcfInBam
  - ExtractAlignedFastq
  - FastqSplitter
  - FindRepeatsPacBio
  - MpileupToVcf
  - SageCountFastq
  - SageCreateLibrary
  - SageCreateTagCounts
  - VcfFilter
  - VcfToTsv
  - WipeReads

## Developers

### Compiling Biopet

1. Clone biopet with `git clone git@git.lumc.nl:biopet/biopet.git biopet`
2. Go to the biopet directory
3. Run `mvn_install_queue.sh`; this installs the Queue jars into the local Maven repository
4. Alternatively, download `queue.jar` from the GATK website
5. Run `mvn verify` to compile and package, or `mvn install` to also install the jars into the local Maven repository
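
The steps above can be sketched as a shell script. This is an illustration, not an official build script: the `step` and `build_biopet` helpers and the `DRY_RUN` variable are hypothetical, and the script assumes `mvn_install_queue.sh` sits in the root of the cloned repository and is executable. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/bin/sh
# Hypothetical wrapper around the build steps listed above.
# DRY_RUN=1 prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_biopet() {
    step git clone git@git.lumc.nl:biopet/biopet.git biopet || return 1
    step cd biopet || return 1
    # Installs the Queue jars into the local Maven repository
    # (assumed to be an executable script in the repository root).
    step ./mvn_install_queue.sh || return 1
    # Compile and package; use "mvn install" to also install the Biopet jars locally.
    step mvn verify
}

build_biopet
```

Set `DRY_RUN=0` to actually execute the steps once the printed commands look right.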


## About 
Go to the [about page](about)

## License

See: [License](license.md)