Bam2Wig is a small pipeline consisting of three steps that is used to convert BAM files into track coverage files: bigWig, wiggle, and TDF. While this seems like a task for a single tool, at the time of writing, there are no command line tools that can do such a conversion in one go. Thus, the Bam2Wig pipeline was written.
Bam2Wig is a small pipeline consisting of three steps that are used to convert BAM files into track coverage files: bigWig, wiggle, and TDF. While this seems like a task for a single tool, at the time of writing, there are no command line tools that can do such a conversion in one go. Thus, the Bam2Wig pipeline was written.
## Configuration
The required configuration file for Bam2Wig is really minimal, only a single JSON file containing an `output_dir` entry:
~~~
{"output_dir": "/path/to/output/dir"}
~~~
For technical reasons, single sample pipelines, such as this mapping pipeline, do **not** take a sample config.
Input files are instead given on the command line as a flag.
Bam2Wig requires you to set the `--bamfile` command line argument to point to the to-be-converted BAM file.
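
The config described above can also be generated and checked from a script. The following is a minimal sketch in Python; the helper names `write_bam2wig_config` and `load_bam2wig_config` are hypothetical and not part of the pipeline itself, and the only key assumed is the `output_dir` entry shown above.

```python
import json


def write_bam2wig_config(config_path, output_dir):
    """Write a JSON config containing only the `output_dir` entry (hypothetical helper)."""
    config = {"output_dir": output_dir}
    with open(config_path, "w") as handle:
        json.dump(config, handle, indent=2)
    return config


def load_bam2wig_config(config_path):
    """Read the config back and check that `output_dir` is present (hypothetical helper)."""
    with open(config_path) as handle:
        config = json.load(handle)
    if "output_dir" not in config:
        raise ValueError("config must contain an 'output_dir' entry")
    return config
```

Calling `write_bam2wig_config("bam2wig.json", "/path/to/output/dir")` would produce a file equivalent to the JSON example above; the file name here is only an illustration.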
## Running Bam2Wig
As with other pipelines, you can run the Bam2Wig pipeline by invoking the `pipeline` subcommand. There is also a general help available which can be invoked using the `-h` flag:
@@ -78,7 +78,7 @@ For the pipeline settings, there are some values that you need to specify while
1. `output_dir`: path to output directory (if it does not exist, Gentrap will create it for you).
2. `aligner`: which aligner to use (`gsnap` or `tophat`).
3. `reference`: this must point to a reference FASTA file, and the same directory must contain a `.dict` file of the FASTA file.
3. `reference_fasta`: this must point to a reference FASTA file, and the same directory must contain a `.dict` file of the FASTA file.
4. `expression_measures`: this entry determines which expression measurement modes Gentrap will run. You can choose zero or more from the following: `fragments_per_gene`, `bases_per_gene`, `bases_per_exon`, `cufflinks_strict`, `cufflinks_guided`, and/or `cufflinks_blind`. If you only wish to align, you can set the value to an empty list (`[]`).
5. `strand_protocol`: this determines whether your library is prepared with a specific stranded protocol or not. Two protocols are currently supported: `dutp` for dUTP-based protocols and `non_specific` for non-strand-specific protocols.
6. `annotation_refflat`: contains the path to an annotation refFlat file of the entire genome.
...
...
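
The allowed values listed above lend themselves to a quick sanity check before a run. The following is a minimal sketch in Python, not part of Gentrap: the function name `validate_gentrap_settings` is hypothetical, the key list covers only the settings visible above (further settings may be required), and the renamed `reference_fasta` key is assumed.

```python
# Allowed values, taken from the settings list above.
ALIGNERS = {"gsnap", "tophat"}
STRAND_PROTOCOLS = {"dutp", "non_specific"}
EXPRESSION_MEASURES = {
    "fragments_per_gene", "bases_per_gene", "bases_per_exon",
    "cufflinks_strict", "cufflinks_guided", "cufflinks_blind",
}
# Assumption: only the settings shown above; the full list may be longer.
LISTED_KEYS = (
    "output_dir", "aligner", "reference_fasta",
    "expression_measures", "strand_protocol", "annotation_refflat",
)


def validate_gentrap_settings(settings):
    """Return a list of problems found; an empty list means none were detected."""
    problems = [f"missing setting: {key}" for key in LISTED_KEYS if key not in settings]

    aligner = settings.get("aligner")
    if aligner is not None and aligner not in ALIGNERS:
        problems.append(f"unknown aligner: {aligner}")

    protocol = settings.get("strand_protocol")
    if protocol is not None and protocol not in STRAND_PROTOCOLS:
        problems.append(f"unknown strand protocol: {protocol}")

    measures = settings.get("expression_measures")
    if measures is not None:
        unknown = sorted(set(measures) - EXPRESSION_MEASURES)
        if unknown:
            problems.append("unknown expression measures: " + ", ".join(unknown))

    return problems
```

An empty `expression_measures` list passes the check, matching the align-only case described above.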
@@ -100,7 +100,7 @@ Thus, an example settings configuration is as follows:
As with other pipelines, you can run the Sage pipeline by invoking the `pipeline` subcommand. There is also a general help available which can be invoked using the `-h` flag:
...
...
@@ -27,13 +38,12 @@ As with other pipelines, you can run the Sage pipeline by invoking the `pipeline
This pipeline is built for variant calling on NGS data (preferably Illumina data).
It is based on the <a href="https://www.broadinstitute.org/gatk/guide/best-practices" target="_blank">best practices</a> of GATK in terms of there approach to variant calling.
It is based on the <a href="https://www.broadinstitute.org/gatk/guide/best-practices" target="_blank">best practices</a> of GATK in terms of their approach to variant calling.
The pipeline accepts ```.fastq & .bam``` files as input.
----
...
...
@@ -26,9 +26,9 @@ Note that one should first create the appropriate [configs](../general/config.md
### Full pipeline
The full pipeline can start from a fastq or a bam file. This pipeline will include preprocess steps for the bam files.
The full pipeline can start from a fastq or a bam file. This pipeline will include pre-process steps for the bam files.