Commit 72936783 authored by sajvanderzeeuw

Changes in documentation of GATK, BASTY and BastyGenerateFasta

parent 026b2015
......@@ -16,32 +16,44 @@ Biopet is build on top of GATK Queue, which requires having `java` installed on
For end-users:
* Java 7 JVM
* Minimum 2 GB RAM, more when analysis is also run on this machine.
* [Java 7 JVM](http://www.oracle.com/technetwork/java/javase/downloads/index.html) or [OpenJDK 7](http://openjdk.java.net/install/)
* [Cran R 3.1.1](http://cran.r-project.org/)
* [GATK](https://www.broadinstitute.org/gatk/download)
For developers:
* OpenJDK 7 or Oracle-Java JDK 7
* Minimum of 4 GB RAM {todo: provide more accurate estimation on building}
* [OpenJDK 7](http://openjdk.java.net/install/)
* [Cran R 3.1.1](http://cran.r-project.org/)
* Maven 3
* [Maven 3.2](http://maven.apache.org/download.cgi)
* [GATK + Queue](https://www.broadinstitute.org/gatk/download)
* IntelliJ or Netbeans 8.0 for development
* [IntelliJ](https://www.jetbrains.com/idea/) or [Netbeans > 8.0](https://netbeans.org/)
## How to use
### Running a pipeline
- Help: `java -jar Biopet(version).jar (pipeline of interest) -h`
- Local: `java -jar Biopet(version).jar (pipeline of interest) (pipeline options) -run`
- Cluster: `java -jar Biopet(version).jar (pipeline of interest) (pipeline options) -qsub -jobParaEnv BWA -run`
- DryRun: `java -jar Biopet(version).jar (pipeline of interest) (pipeline options)`
- DryRun (shark): `java -jar Biopet(version).jar (pipeline of interest) (pipeline options) -qsub -jobParaEnv BWA`
- Help:
~~~
java -jar Biopet(version).jar (pipeline of interest) -h
~~~
- Local:
~~~
java -jar Biopet(version).jar (pipeline of interest) (pipeline options) -run
~~~
- Cluster:
- Note that `-qsub` is cluster specific (Sun Grid Engine)
~~~
java -jar Biopet(version).jar (pipeline of interest) (pipeline options) -qsub* -jobParaEnv YourParallelEnv -run
~~~
- DryRun:
- A dry run checks whether the pipeline's jobs can be scheduled and created. Nothing is executed; only the job commands are generated. If the dry run succeeds, that is a good indication that the actual run will succeed as well.
- Each pipeline is available as an option inside the jar file Biopet[version].jar, which is located in the target directory and can be started with `java -jar <pipelineJarFile>`
~~~
java -jar Biopet(version).jar (pipeline of interest) (pipeline options)
~~~
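The difference between a dry run and a real run is only the trailing `-run` flag. The following is a minimal sketch of a wrapper that makes this explicit. The function name `biopet_cmd` and its `dry`/`run` modes are hypothetical and not part of Biopet, and the `pipeline <name>` argument form is assumed from the GATK-pipeline example (`java -jar Biopet.0.2.0.jar pipeline gatkPipeline ...`). The function only prints the command, so it can be reviewed before anything is executed.

```shell
# Hypothetical helper, not part of Biopet: print the command line for a
# dry run or a real run.
# Usage: biopet_cmd dry|run JAR PIPELINE [PIPELINE OPTIONS...]
biopet_cmd() {
    mode=$1
    jar=$2
    pipeline=$3
    shift 3
    cmd="java -jar $jar pipeline $pipeline"
    # Append any pipeline options unchanged.
    if [ "$#" -gt 0 ]; then
        cmd="$cmd $*"
    fi
    case "$mode" in
        dry) ;;                       # no -run: job commands are created, nothing is executed
        run) cmd="$cmd -run" ;;       # -run: actually execute the jobs
        *)   echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    printf '%s\n' "$cmd"
}

# Review the dry-run command first, then the real one.
biopet_cmd dry Biopet.0.2.0.jar gatkPipeline -config MySamples.json -outDir myOutDir
biopet_cmd run Biopet.0.2.0.jar gatkPipeline -config MySamples.json -outDir myOutDir
```

Printing the command rather than executing it keeps the sketch safe to try on a machine without Biopet installed; piping the output to `sh` would run it.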
### Shark Compute Cluster specific
In the SHARK compute cluster, a module is available to load the necessary dependencies.
......@@ -104,8 +116,6 @@ There are multiple configs that can be passed to a pipeline, for example the sam
- VcfToTsv
- WipeReads
## Developers
### Compiling Biopet
......@@ -117,10 +127,9 @@ There are multiple configs that can be passed to a pipeline, for example the sam
5. Run `mvn verify` to compile and package, or run `mvn install` to also install the jars in the local Maven repository
## About
Go to the [about page](about)
## License
See: [License](license)
See: [License](license.md)
......@@ -3,7 +3,7 @@
## Introduction
The GATK-pipeline is built for variant calling on NGS data (preferably Illumina data).
It uses the <a href="https://www.broadinstitute.org/gatk/guide/best-practices" target="_blank">best practices</a> of GATK in terms of their approach to variant calling.
It is based on the <a href="https://www.broadinstitute.org/gatk/guide/best-practices" target="_blank">best practices</a> of GATK in terms of their approach to variant calling.
The pipeline accepts ```.fastq & .bam``` files as input.
## Tools for this pipeline
......@@ -46,16 +46,69 @@ To run the pipeline:
~~~
java -jar Biopet.0.2.0.jar pipeline gatkPipeline -run -config MySamples.json -config MySettings.json -outDir myOutDir
~~~
For LUMC/researchSHARK users there is a module available that sets all your environment settings and default executables/settings.
To check if your pipeline can create all the jobs (dry run) remove the `-run`:
~~~
module load Biopet/0.2.0
biopet pipeline gatkPipeline -run -config MySamples.json -config MySettings.json -outDir myOutDir
java -jar Biopet.0.2.0.jar pipeline gatkPipeline -config MySamples.json -config MySettings.json -outDir myOutDir
~~~
## Examine results
## Results
### Result files
~~~
.
└── samples
    ├── my_sample1
    │   ├── run_lib1
    │   │   ├── chunks
    │   │   │   ├── 1
    │   │   │   │   └── flexiprep
    │   │   │
    │   │   ├── flexiprep
    │   │   │   ├── input.R1.fastqc
    │   │   │   │   └── input.R1_fastqc
    │   │   │   │       ├── Icons
    │   │   │   │       └── Images
    │   │   │   ├── input.R1.qc.fastqc
    │   │   │   │   └── input.R1.qc_fastqc
    │   │   │   │       ├── Icons
    │   │   │   │       └── Images
    │   │   │   ├── input.R2.fastqc
    │   │   │   │   └── input.R2_fastqc
    │   │   │   │       ├── Icons
    │   │   │   │       └── Images
    │   │   │   └── input.R2.qc.fastqc
    │   │   │       └── input.R2.qc_fastqc
    │   │   │           ├── Icons
    │   │   │           └── Images
    │   │   └── metrics
    │   ├── run_lib2
    │   │   ├── chunks
    │   │   │   ├── 1
    │   │   │   │   └── flexiprep
    │   │   │
    │   │   ├── flexiprep
    │   │   │   ├── input.R1.fastqc
    │   │   │   │   └── input.R1_fastqc
    │   │   │   │       ├── Icons
    │   │   │   │       └── Images
    │   │   │   ├── input.R1.qc.fastqc
    │   │   │   │   └── input.R1.qc_fastqc
    │   │   │   │       ├── Icons
    │   │   │   │       └── Images
    │   │   │   ├── input.R2.fastqc
    │   │   │   │   └── input.R2_fastqc
    │   │   │   │       ├── Icons
    │   │   │   │       └── Images
    │   │   │   └── input.R2.qc.fastqc
    │   │   │       └── input.R2.qc_fastqc
    │   │   │           ├── Icons
    │   │   │           └── Images
    │   │   └── metrics
    │   └── variantcalling
~~~
### Best practice
......
......@@ -10,15 +10,19 @@ Which makes it very easy to look at the variations between certain species or st
* <a href="http://sco.h-its.org/exelixis/software.html" target="_blank">RAxml</a>
* <a href="https://github.com/sanger-pathogens/Gubbins" target="_blank">Gubbins</a>
## Example
To run for a specific species, please do not forget to create the proper index files:
## Requirements
To run for a specific species, please do not forget to create the proper index files.
The index files are created from the supplied reference:
* ```.dict``` (can be produced with <a href="http://broadinstitute.github.io/picard/" target="_blank">Picard tool suite</a>)
* ```.fai``` (can be produced with <a href="http://samtools.sourceforge.net/samtools.shtml" target="_blank">Samtools faidx</a>)
* ```.idxSpecificForAligner``` (depending on which aligner is used, one should create a suitable index specific to that aligner.
Each aligner has its own way of creating index files. Therefore, the options for creating the index files can be found in the aligner itself)
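
As a rough illustration of the first two index files, the sketch below prints the commands that would create the `.fai` and `.dict` files for a reference. The function name `index_commands`, the file name `reference.fa`, and the default `picard.jar` location are assumptions for illustration. `samtools faidx` and Picard's `CreateSequenceDictionary` are the tools meant here, but the exact Picard invocation depends on the installed version (older releases ship a separate `CreateSequenceDictionary.jar`). The aligner-specific index is left out because it differs per aligner.

```shell
# Hypothetical helper: print the commands that would create the .fai and
# .dict index files for a reference FASTA.
# Usage: index_commands REFERENCE.fa [PICARD_JAR]
index_commands() {
    ref=$1
    picard_jar=${2:-picard.jar}       # assumed location of the Picard jar
    dict="${ref%.*}.dict"             # reference.fa -> reference.dict
    # samtools writes REFERENCE.fa.fai next to the reference.
    printf 'samtools faidx %s\n' "$ref"
    # Picard writes the sequence dictionary to the path given with O=.
    printf 'java -jar %s CreateSequenceDictionary R=%s O=%s\n' \
        "$picard_jar" "$ref" "$dict"
}

index_commands reference.fa
```

The commands are printed rather than run so they can be checked against the installed tool versions first.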
For the help screen:
## Example
#### For the help screen:
~~~
java -jar Biopet.0.2.0.jar pipeline basty -h
~~~
......@@ -30,14 +34,6 @@ Note that one should first create the appropriate [configs](../config.md).
java -jar Biopet.0.2.0.jar pipeline basty -run -config MySamples.json -config MySettings.json -outDir myOutDir
~~~
For LUMC/researchSHARK users there is a module available that sets all your environment settings and default executables/settings.
~~~
module load Biopet/0.2.0
biopet pipeline basty -run -config MySamples.json -config MySettings.json -outDir myOutDir
~~~
## Result files
The output files this pipeline produces are:
......
......@@ -57,25 +57,6 @@ java -jar Biopet-0.2.0.jar tool BastyGenerateFasta --inputVcf myVCF.vcf --bamFil
--outputName NiceTool --outputConsensusVariants myConsensusVariants.fasta
~~~
For LUMC/researchSHARK users there is a module available that sets all your environment settings and default executables/settings.
~~~
module load Biopet/0.2.0
# Minimal example for option: outputVariants (VCF based)
biopet tool BastyGenerateFasta --inputVcf myVCF.vcf \
--outputName NiceTool --outputVariants myVariants.fasta
# Minimal example for option: outputConsensus (BAM based)
biopet tool BastyGenerateFasta --bamFile myBam.bam \
--outputName NiceTool --outputConsensus myConsensus.fasta
# Minimal example for option: outputConsensusVariants
biopet tool BastyGenerateFasta --inputVcf myVCF.vcf --bamFile myBam.bam \
--outputName NiceTool --outputConsensusVariants myConsensusVariants.fasta
~~~
## Output
* FASTA containing variants only
......
......@@ -2,13 +2,11 @@ site_name: Biopet user manual
pages:
- ['index.md', 'Home']
- ['config.md', 'Config']
- ['pipelines/basty.md', 'pipelines', 'Basty']
- ['pipelines/GATK-pipeline.md', 'pipelines', 'GATK-pipeline']
- ['pipelines/flexiprep.md', 'pipelines', 'Flexiprep']
- ['pipelines/gentrap.md', 'pipelines', 'Gentrap']
- ['pipelines/mapping.md', 'pipelines', 'Mapping']
- ['pipelines/sage.md', 'pipelines', 'Sage']
- ['pipelines/yamsvp.md', 'pipelines', 'Yamsvp']
- ['pipelines/basty.md', 'Pipelines', 'Basty']
- ['pipelines/GATK-pipeline.md', 'Pipelines', 'GATK-pipeline']
- ['pipelines/flexiprep.md', 'Pipelines', 'Flexiprep']
- ['pipelines/mapping.md', 'Pipelines', 'Mapping']
- ['pipelines/sage.md', 'Pipelines', 'Sage']
- ['tools/SamplesTsvToJson.md','tools','SamplesTsvToJson']
- ['tools/BastyGenerateFasta.md','tools','BastyGenerateFasta']
- ['cluster/oge.md', 'OpenGridEngine']
......