Commit 25d3f569 authored by bow

Last docs update

parent 55fc13c4
@@ -27,7 +27,7 @@ $ biopet
This will show you a list of tools and pipelines that you can use straight away. You can also execute `biopet pipeline` to show only the available pipelines, or `biopet tool` to show only the tools. Be aware that this is actually a shell function that calls `java` on the system-wide Biopet JAR file.
~~~
$ java -jar <path/to/current/biopet/release.jar>
~~~
The actual path will vary from version to version, which is controlled by which module you loaded.
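As an illustration, such a wrapper might be defined roughly like this (a sketch only; `BIOPET_JAR` is an assumed variable name, not necessarily what the module actually uses):

~~~
# Sketch of the biopet wrapper function; the module would point
# BIOPET_JAR at the release JAR of the version you loaded.
BIOPET_JAR=/path/to/current/biopet/release.jar

biopet() {
    java -jar "$BIOPET_JAR" "$@"
}
~~~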
@@ -35,15 +35,15 @@ The actual path will vary from version to version, which is controlled by which
Almost all of the pipelines have a common usage pattern with a similar set of flags, for example:
~~~
$ biopet pipeline <pipeline_name> -config <path/to/config.json> -qsub -jobParaEnv BWA -retry 2
~~~
The command above will do a *dry* run of a pipeline using a config file, as if the command were submitted to the SHARK cluster (the `-qsub` flag) under the `BWA` parallel environment (the `-jobParaEnv BWA` flag). We also set the maximum number of retries for failing jobs to two (via the `-retry 2` flag). Doing a dry run first is a good idea to ensure that your real run proceeds smoothly. It may not catch all errors, but if the dry run fails you can be sure that the real run will never succeed.
If the dry run proceeds without problems, you can then do the real run by using the `-run` flag:
~~~
$ biopet pipeline <pipeline_name> -config <path/to/config.json> -qsub -jobParaEnv BWA -retry 2 -run
~~~
It is usually a good idea to do the real run inside `screen` or `nohup` to prevent the job from terminating when you log out of SHARK. In practice, running `biopet` directly is also fine. Keep in mind that each pipeline has its own expected config layout. You can read more about the general structure of our config files [here](general/config.md). For the specific structure that each pipeline accepts, please consult the respective pipeline page.
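For example, a real run of the `shiva` pipeline wrapped in `nohup` could look like this (the log file name here is arbitrary):

~~~
# Start the run in the background, immune to hangups, with all output
# captured in a log file so the job survives logging out of SHARK.
nohup biopet pipeline shiva -config myconfig.json -qsub -jobParaEnv BWA -retry 2 -run \
    > shiva_run.log 2>&1 &
# $! holds the PID of the background job, handy for checking on it later
echo "pipeline started with PID $!"
~~~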
# Bam2Wig
## Introduction
Bam2Wig is a small pipeline consisting of three steps that converts BAM files into track coverage files: bigWig, wiggle, and TDF. While this seems like a simple task, at the time of writing there are no command-line tools that can do the conversion in one go. Thus, the Bam2Wig pipeline was written.
## Configuration
The configuration file required by Bam2Wig is minimal: a single JSON file containing an `output_dir` entry:
~~~
{"output_dir": "/path/to/output/dir"}
~~~
## Running Bam2Wig
As with other pipelines, you can run the Bam2Wig pipeline by invoking the `pipeline` subcommand. General help is available via the `-h` flag:
~~~
$ java -jar </path/to/biopet.jar> pipeline bam2wig -h
~~~
If you are on SHARK, you can also load the `biopet` module and execute `biopet pipeline` instead:
~~~
$ module load biopet/v0.3.0
$ biopet pipeline bam2wig
~~~
To run the pipeline:
~~~
biopet pipeline bam2wig -config </path/to/config.json> -qsub -jobParaEnv BWA -run
~~~
## Output Files
The pipeline generates three output track files: a bigWig file, a wiggle file, and a TDF file.
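As an illustration, for an input file `sample.bam` the output directory could end up looking something like this (the exact file names are an assumption, not guaranteed by the pipeline):

~~~
/path/to/output/dir
├── sample.bw    # bigWig
├── sample.wig   # wiggle
└── sample.tdf   # TDF
~~~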
@@ -7,7 +7,7 @@ A pipeline for aligning bacterial genomes and detecting structural variations on them,
which makes it very easy to look at the variations between certain species or strains.
### Tools for this pipeline
* [Shiva](../pipelines/shiva.md)
* [BastyGenerateFasta](../tools/BastyGenerateFasta.md)
* <a href="http://sco.h-its.org/exelixis/software.html" target="_blank">RAxml</a>
* <a href="https://github.com/sanger-pathogens/Gubbins" target="_blank">Gubbins</a>
@@ -45,7 +45,7 @@ The output files this pipeline produces are:
* The output from the tool [BastyGenerateFasta](../tools/BastyGenerateFasta.md)
* FASTA containing variants only
* FASTA containing all the consensus sequences, based on a minimum coverage (default: 8) that can be modified in the config
* A phylogenetic tree based on the variants called with the Shiva pipeline generated with the tool [BastyGenerateFasta](../tools/BastyGenerateFasta.md)
~~~
# Carp
## Introduction
Carp is a pipeline for analyzing ChIP-seq NGS data. By default it uses the BWA MEM aligner and the MACS2 peak caller to align ChIP-seq data and call peaks, and it allows you to run all your samples (control or otherwise) in one go.
## Configuration File
### Sample Configuration
The layout of the sample configuration for Carp is basically the same as for our other multi-sample pipelines, for example:
~~~
{
  "samples": {
    "sample_X": {
      "control": ["sample_Y"],
      "libraries": {
        "lib_one": {
          "R1": "/absolute/path/to/first/read/pair.fq",
          "R2": "/absolute/path/to/second/read/pair.fq"
        }
      }
    },
    "sample_Y": {
      "libraries": {
        "lib_one": {
          "R1": "/absolute/path/to/first/read/pair.fq",
          "R2": "/absolute/path/to/second/read/pair.fq"
        },
        "lib_two": {
          "R1": "/absolute/path/to/first/read/pair.fq",
          "R2": "/absolute/path/to/second/read/pair.fq"
        }
      }
    }
  }
}
~~~
What's important there is that you can specify the control ChIP-seq experiment(s) for a given sample. These controls are usually ChIP-seq runs from input DNA and/or from treatment with nonspecific binding proteins such as IgG. In the example above, we are specifying `sample_Y` as the control for `sample_X`.
### Pipeline Settings Configuration
For the pipeline settings, some values are required while others are optional. The required settings are:
1. `output_dir`: path to output directory (if it does not exist, Carp will create it for you).
2. `reference`: this must point to a reference FASTA file; a `.dict` sequence dictionary for that FASTA file must exist in the same directory.
The optional settings are:
1. `aligner`: which aligner to use (`bwa` or `bowtie`)
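Putting the required and optional settings together, a minimal settings file could look like this (all paths here are placeholders):

~~~
{
  "output_dir": "/path/to/output/dir",
  "reference": "/path/to/reference.fasta",
  "aligner": "bwa"
}
~~~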
## Running Carp
As with other pipelines in the Biopet suite, Carp can be run by specifying the pipeline after the `pipeline` subcommand:
~~~
java -jar </path/to/biopet.jar> pipeline carp -config </path/to/config.json> -qsub -jobParaEnv BWA -run
~~~
If you already have the `biopet` environment module loaded, you can also simply call `biopet`:
~~~
biopet pipeline carp -config </path/to/config.json> -qsub -jobParaEnv BWA -run
~~~
It is also a good idea to specify retries (we recommend `-retry 3` up to `-retry 5`) so that cluster glitches do not interfere with your pipeline runs.
## Getting Help
If you have any questions on running Carp, suggestions on how to improve the overall flow, or requests for your favorite ChIP-seq related program to be added, feel free to post an issue to our issue tracker at [https://git.lumc.nl/biopet/biopet/issues](https://git.lumc.nl/biopet/biopet/issues).
@@ -119,9 +119,3 @@ The results from this pipeline will be a fastq file which is depending on the op
├── mySample_01.R2.qc.fastq.gz
└── mySample_01.R2.qc.fastq.gz.md5
~~~
@@ -112,16 +112,16 @@ Thus, an example settings configuration is as follows:
## Running Gentrap
As with other pipelines in the Biopet suite, Gentrap can be run by specifying the pipeline after the `pipeline` subcommand:
~~~
java -jar </path/to/biopet.jar> pipeline gentrap -config </path/to/config.json> -qsub -jobParaEnv BWA -run
~~~
If you already have the `biopet` environment module loaded, you can also simply call `biopet`:
~~~
biopet pipeline gentrap -config </path/to/config.json> -qsub -jobParaEnv BWA -run
~~~
It is also a good idea to specify retries (we recommend `-retry 3` up to `-retry 5`) so that cluster glitches do not interfere with your pipeline runs.
# Mapping
## Introduction
The mapping pipeline has been created for NGS users who want to align their data with the most commonly used alignment programs.
The pipeline performs a quality control (QC) on the raw fastq files with our [Flexiprep](flexiprep.md) pipeline.
After the QC, the pipeline simply maps the reads with the chosen aligner. The resulting BAM files are coordinate-sorted and indexed for downstream analysis.
----
## Tools for this pipeline:
* [Flexiprep](flexiprep.md)
@@ -16,14 +17,13 @@ After the QC, the pipeline simply maps the reads with the chosen aligner. The re
* <a href="https://github.com/alexdobin/STAR" target="_blank">Star-2pass</a>
* <a href="http://broadinstitute.github.io/picard/" target="_blank">Picard tool suite</a>
----
## Example
Note that one should first create the appropriate [configs](../general/config.md).
For the help menu:
~~~
java -jar </path/to/biopet.jar> pipeline mapping -h
Arguments for Mapping:
-R1,--input_r1 <input_r1> R1 fastq file
@@ -52,7 +52,7 @@ Arguments for Mapping:
To run the pipeline:
~~~
java -jar </path/to/biopet.jar> pipeline mapping -run --config mySettings.json \
-R1 myReads1.fastq -R2 myReads2.fastq -outDir myOutDir -OutputName myReadsOutput \
-R hg19.fasta -RGSM mySampleName -RGLB myLib1
~~~
@@ -62,8 +62,6 @@ To perform a dry run simply remove `-run` from the commandline call.
----
## Result files
~~~
├── OutDir
@@ -73,8 +71,3 @@ To perform a dry run simply remove `-run` from the commandline call.
   ├── flexiprep
└── metrics
~~~
Toucan
======
Introduction
-----------
@@ -3,8 +3,10 @@ pages:
- ['index.md', 'Home']
- ['general/config.md', 'General', 'Config']
- ['pipelines/basty.md', 'Pipelines', 'Basty']
- ['pipelines/bam2wig.md', 'Pipelines', 'Bam2Wig']
- ['pipelines/carp.md', 'Pipelines', 'Carp']
- ['pipelines/gentrap.md', 'Pipelines', 'Gentrap']
- ['pipelines/shiva.md', 'Pipelines', 'Shiva']
- ['pipelines/flexiprep.md', 'Pipelines', 'Flexiprep']
- ['pipelines/mapping.md', 'Pipelines', 'Mapping']
- ['pipelines/sage.md', 'Pipelines', 'Sage']