Commit ab38bbe2 authored by sajvanderzeeuw

putted config in general folder

parent 3f15ac31
# How to create configs
### The sample config
The sample config should be in [__JSON__](http://www.json.org/) format and is nested as follows:
- The first (top-level) field should have the key __"samples"__
- The second level, inside each sample, should contain the __"libraries"__
- The third level, inside each library, contains __"R1"__, __"R2"__ or __"bam"__
- The fastq input files can be provided either compressed or uncompressed
#### Example sample config
~~~
{
"samples":{
"Sample_ID1":{
"libraries":{
"MySeries_1":{
"R1":"Your_R1.fastq.gz",
"R2":"Your_R2.fastq.gz"
}
}
}
}
}
~~~
- For BAM files as input one should use a config like this:
~~~
{
"samples":{
"Sample_ID_1":{
"libraries":{
"Lib_ID_1":{
"bam":"MyFirst.bam"
},
"Lib_ID_2":{
"bam":"MySecond.bam"
}
}
}
}
}
~~~
Note that the tool [SamplesTsvToJson](../tools/SamplesTsvToJson.md) can generate the sample config for you, which avoids the risk of creating a wrongly formatted JSON file.
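The nesting described above (samples, then libraries, then `R1`/`R2` or `bam`) can also be produced programmatically. The sketch below is only an illustration: the helper `make_sample_config` is a hypothetical name and not part of Biopet, and it relies solely on Python's standard `json` module.

```python
import json


def make_sample_config(sample_id, library_id, r1=None, r2=None, bam=None):
    """Build a sample config dict: samples -> libraries -> input files.

    Hypothetical helper for illustration only; not part of Biopet.
    """
    files = {}
    if r1 is not None:
        files["R1"] = r1
    if r2 is not None:
        files["R2"] = r2
    if bam is not None:
        files["bam"] = bam
    if not files:
        raise ValueError("a library needs at least one of R1, R2 or bam")
    return {"samples": {sample_id: {"libraries": {library_id: files}}}}


config = make_sample_config(
    "Sample_ID1", "MySeries_1", r1="Your_R1.fastq.gz", r2="Your_R2.fastq.gz"
)
# json.dumps always emits syntactically valid JSON for a plain dict like this
print(json.dumps(config, indent=2))
```

Building the structure as a dictionary and letting `json.dumps` serialise it sidesteps hand-written brace and comma mistakes.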
### The settings config
The settings config enables a user to alter almost all settings available in the tools used by a given pipeline.
This config file should be written in JSON format. It can contain setup settings such as references for the tools used,
whether the pipeline should use chunking, or memory limits for certain programs; almost everything can be adjusted through this config file.
One can set global variables containing settings for all tools used in the pipeline, or set tool-specific options one layer deeper in the JSON file.
E.g. in the example below the setting is altered only for Picard tools and not globally.
~~~
"picard": { "validationstringency": "LENIENT" }
~~~
Global setting examples are:
~~~
"java_gc_timelimit": 98,
"numberchunks": 25,
"chunking": true
~~~
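To make the difference between global and tool-specific settings concrete, here is a minimal Python sketch. It assumes that a tool-specific value takes precedence over a global one with the same key; that resolution order, and the helper name `lookup`, are assumptions made for illustration and may not match how Biopet itself resolves settings.

```python
import json

# The Picard-specific and global settings from the snippets above
settings = json.loads("""
{
  "picard": { "validationstringency": "LENIENT" },
  "java_gc_timelimit": 98,
  "numberchunks": 25,
  "chunking": true
}
""")


def lookup(settings, tool, key, default=None):
    """Return the tool-specific value if present, else the global one.

    Assumed resolution order, for illustration only.
    """
    tool_section = settings.get(tool, {})
    if isinstance(tool_section, dict) and key in tool_section:
        return tool_section[key]
    return settings.get(key, default)


picard_stringency = lookup(settings, "picard", "validationstringency")
picard_chunking = lookup(settings, "picard", "chunking")
other_stringency = lookup(settings, "samtofastq", "validationstringency")
```

Under this assumption, `picard_stringency` is `"LENIENT"`, `picard_chunking` falls back to the global `true`, and `other_stringency` stays unset because the Picard setting is one layer deeper and therefore not global.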
----
#### Example settings config
~~~
{
"reference": "/data/LGTC/projects/vandoorn-melanoma/data/references/hg19_nohap/ucsc.hg19_nohap.fasta",
"dbsnp": "/data/LGTC/projects/vandoorn-melanoma/data/references/hg19_nohap/dbsnp_137.hg19_nohap.vcf",
"joint_variantcalling": false,
"haplotypecaller": { "scattercount": 100 },
"multisample": { "haplotypecaller": { "scattercount": 1000 } },
"picard": { "validationstringency": "LENIENT" },
"library_variantcalling_temp": true,
"target_bed_temp": "/data/LGTC/projects/vandoorn-melanoma/analysis/target.bed",
"min_dp": 5,
"bedtools": {"exe":"/share/isilon/system/local/BEDtools/bedtools-2.17.0/bin/bedtools"},
"bam_to_fastq": true,
"baserecalibrator": { "memory_limit": 8, "vmem":"16G" },
"samtofastq": {"memory_limit": 8, "vmem": "16G"},
"java_gc_timelimit": 98,
"numberchunks": 25,
"chunking": true,
"haplotypecaller": { "scattercount": 1000 }
}
~~~
### JSON validation
To check whether the created JSON file is correct there are multiple options. The simplest is to use [this](http://jsonformatter.curiousconcept.com/)
website. It is also possible to validate with Python or Scala, but this requires some more knowledge.
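As a rough sketch of the Python route, the snippet below uses only the standard `json` module. The function names `reject_duplicates` and `validate_json` are hypothetical. Besides syntax errors it also rejects repeated keys, because Python's `json` module otherwise silently keeps the last value when a key occurs twice (as `haplotypecaller` does in the example settings config above).

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises on repeated keys within one object."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError("duplicate key: %r" % key)
        seen[key] = value
    return seen


def validate_json(text):
    """Return (True, None) if text is valid JSON without duplicate keys,
    otherwise (False, error message)."""
    try:
        json.loads(text, object_pairs_hook=reject_duplicates)
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        return False, str(err)
    return True, None


ok_good, _ = validate_json('{"chunking": true, "numberchunks": 25}')
ok_trailing, _ = validate_json('{"chunking": true,}')  # trailing comma
ok_dup, msg_dup = validate_json(
    '{"haplotypecaller": {"scattercount": 100},'
    ' "haplotypecaller": {"scattercount": 1000}}'
)
```

For a quick syntax-only check of a file, `python -m json.tool MySettings.json` pretty-prints valid JSON and reports an error otherwise.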
\ No newline at end of file
......@@ -85,12 +85,12 @@ Using this option, the `java -jar Biopet-<version>.jar` can be ommited and `biop
- [Sage](pipelines/sage)
- Yamsvp (Under development)
__Note that each pipeline needs a config file written in JSON format see [config](config.md) & [How To! Config](https://git.lumc.nl/biopet/biopet/wikis/Config) __
__Note that each pipeline needs a config file written in JSON format see [config](general/config.md) & [How To! Config](https://git.lumc.nl/biopet/biopet/wikis/Config) __
Multiple configs can be passed to a pipeline, for example the sample, settings and executables configs; of these, the sample and settings configs are mandatory.
- [Here](config) one can find how to create a sample and settings config
- [Here](general/config.md) one can find how to create a sample and settings config
- More info can be found here: [How To! Config](https://git.lumc.nl/biopet/biopet/wikis/Config)
### Running a tool
......
......@@ -28,7 +28,7 @@ The pipeline accepts ```.fastq & .bam``` files as input.
## Example
Note that one should first create the appropriate [configs](../config.md).
Note that one should first create the appropriate [configs](../general/config.md).
To get the help menu:
~~~
......
......@@ -30,7 +30,7 @@ java -jar Biopet.0.2.0.jar pipeline basty -h
~~~
#### Run the pipeline:
Note that one should first create the appropriate [configs](../config.md).
Note that one should first create the appropriate [configs](../general/config.md).
~~~
java -jar Biopet.0.2.0.jar pipeline basty -run -config MySamples.json -config MySettings.json -outDir myOutDir
......
......@@ -26,7 +26,7 @@ Arguments for Flexiprep:
As the example above shows, there are options to skip trimming or clipping,
since sometimes you may not want to perform these tasks, e.g.
if there are no adapters present in your .fastq. Note that the pipeline also works on unpaired reads where one should only provide R1.+
if there are no adapters present in your .fastq. Note that the pipeline also works on unpaired reads where one should only provide R1.
To start the pipeline (remove `-run` for a dry run):
......@@ -36,11 +36,91 @@ java -jar Biopet-0.2.0.jar pipeline Flexiprep -run -outDir myDir \
-library myLibname -config mySettings.json
~~~
## Result files
The result of this pipeline is a fastq file which, depending on the options, is either clipped and trimmed, only clipped,
only trimmed, or has had no quality control at all. The pipeline also outputs 2 Fastqc runs: one before and one after quality control.
### Example output
~~~
.
├── mySample_01.qc.summary.json
├── mySample_01.qc.summary.json.out
├── mySample_01.R1.contams.txt
├── mySample_01.R1.fastqc
│   ├── mySample_01.R1_fastqc
│   │   ├── fastqc_data.txt
│   │   ├── fastqc_report.html
│   │   ├── Icons
│   │   │   ├── error.png
│   │   │   ├── fastqc_icon.png
│   │   │   ├── tick.png
│   │   │   └── warning.png
│   │   ├── Images
│   │   │   ├── duplication_levels.png
│   │   │   ├── kmer_profiles.png
│   │   │   ├── per_base_gc_content.png
│   │   │   ├── per_base_n_content.png
│   │   │   ├── per_base_quality.png
│   │   │   ├── per_base_sequence_content.png
│   │   │   ├── per_sequence_gc_content.png
│   │   │   ├── per_sequence_quality.png
│   │   │   └── sequence_length_distribution.png
│   │   └── summary.txt
│   └── mySample_01.R1.qc_fastqc.zip
├── mySample_01.R1.qc.fastq.gz
├── mySample_01.R1.qc.fastq.gz.md5
├── mySample_01.R2.contams.txt
├── mySample_01.R2.fastqc
│   ├── mySample_01.R2_fastqc
│   │   ├── fastqc_data.txt
│   │   ├── fastqc_report.html
│   │   ├── Icons
│   │   │   ├── error.png
│   │   │   ├── fastqc_icon.png
│   │   │   ├── tick.png
│   │   │   └── warning.png
│   │   ├── Images
│   │   │   ├── duplication_levels.png
│   │   │   ├── kmer_profiles.png
│   │   │   ├── per_base_gc_content.png
│   │   │   ├── per_base_n_content.png
│   │   │   ├── per_base_quality.png
│   │   │   ├── per_base_sequence_content.png
│   │   │   ├── per_sequence_gc_content.png
│   │   │   ├── per_sequence_quality.png
│   │   │   └── sequence_length_distribution.png
│   │   └── summary.txt
│   └── mySample_01.R2_fastqc.zip
├── mySample_01.R2.fastq.md5
├── mySample_01.R2.qc.fastqc
│   ├── mySample_01.R2.qc_fastqc
│   │   ├── fastqc_data.txt
│   │   ├── fastqc_report.html
│   │   ├── Icons
│   │   │   ├── error.png
│   │   │   ├── fastqc_icon.png
│   │   │   ├── tick.png
│   │   │   └── warning.png
│   │   ├── Images
│   │   │   ├── duplication_levels.png
│   │   │   ├── kmer_profiles.png
│   │   │   ├── per_base_gc_content.png
│   │   │   ├── per_base_n_content.png
│   │   │   ├── per_base_quality.png
│   │   │   ├── per_base_sequence_content.png
│   │   │   ├── per_sequence_gc_content.png
│   │   │   ├── per_sequence_quality.png
│   │   │   └── sequence_length_distribution.png
│   │   └── summary.txt
│   └── mySample_01.R2.qc_fastqc.zip
├── mySample_01.R2.qc.fastq.gz
└── mySample_01.R2.qc.fastq.gz.md5
~~~
# Examine results
## Result files
## Best practice
......
......@@ -3,7 +3,7 @@
# Invocation
# Example
Note that one should first create the appropriate [configs](../config.md).
Note that one should first create the appropriate [configs](../general/config.md).
# Testcase A
......
......@@ -19,7 +19,7 @@ After the QC, the pipeline simply maps the reads with the chosen aligner. The re
----
## Example
Note that one should first create the appropriate [configs](../config.md).
Note that one should first create the appropriate [configs](../general/config.md).
For the help menu:
~~~
......
......@@ -12,7 +12,7 @@ The Sage pipeline has been created to process SAGE data, which requires a differ
# Example
Note that one should first create the appropriate [configs](../config.md).
Note that one should first create the appropriate [configs](../general/config.md).
To get the help menu:
~~~
......
......@@ -3,7 +3,7 @@
# Invocation
# Example
Note that one should first create the appropriate [configs](../config.md).
Note that one should first create the appropriate [configs](../general/config.md).
# Testcase A
......
......@@ -9,7 +9,7 @@ those regions with the BAM file. On those extracted regions the tool will perfor
## Example
To get the help menu:
~~~
java -jar Biopet-0.2.0-DEV-801b72ed.jar tool FindRepeatsPacBio -h
java -jar Biopet-0.2.0.jar tool FindRepeatsPacBio -h
Usage: FindRepeatsPacBio [options]
-l <value> | --log_level <value>
......@@ -26,7 +26,7 @@ Usage: FindRepeatsPacBio [options]
To run the tool:
~~~
java -jar Biopet-0.2.0.jar tool FindRepeatsPacBio --inputBam myInputbam.bam \
--inputBed myRepeatRegions.bed > mySummary.txt
~~~
Since the program prints its output to stdout by default, we can use `>` to redirect it to a text file.
......
site_name: Biopet user manual
site_name: Biopet User Manual
pages:
- ['index.md', 'Home']
- ['config.md', 'Config']
- ['general/config.md', 'General', 'Config']
- ['pipelines/basty.md', 'Pipelines', 'Basty']
- ['pipelines/GATK-pipeline.md', 'Pipelines', 'GATK-pipeline']
- ['pipelines/flexiprep.md', 'Pipelines', 'Flexiprep']
......@@ -20,6 +20,7 @@ pages:
- ['tools/MpileupToVcf.md', 'Tools', 'MpileupToVcf']
- ['tools/sagetools.md', 'Tools', 'Sagetools']
- ['tools/WipeReads.md', 'Tools', 'WipeReads']
#- ['developing/Setup.md', 'Developing', 'Setting up your local development environment']
- ['about.md', 'About']
- ['license.md', 'License']
#theme: readthedocs
......