putted config in general folder

Commit ab38bbe2 authored 10 years ago by sajvanderzeeuw
--- a/docs/general/config.md
+++ b/docs/general/config.md
+# How to create configs
+
+### The sample config
+
+The sample config should be in [__JSON__](http://www.json.org/) format
+
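+- The top-level field must have the key __"samples"__; each entry under it is a sample ID
+- Each sample must contain the field __"libraries"__; each entry under it is a library ID
+- Each library contains __"R1"__ and, for paired-end data, __"R2"__, or it contains __"bam"__
+- The fastq input files can be provided either zipped or unzipped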
+
+#### Example sample config
+~~~
+    {  
+       "samples":{  
+          "Sample_ID1":{  
+             "libraries":{  
+                "MySeries_1":{  
+                   "R1":"Your_R1.fastq.gz",
+                   "R2":"Your_R2.fastq.gz"
+                }
+             }
+          }
+       }
+    }
+~~~
+
+- For BAM files as input one should use a config like this:
+  
+~~~
+    {
+       "samples":{  
+          "Sample_ID_1":{  
+             "libraries":{  
+                "Lib_ID_1":{  
+                   "bam":"MyFirst.bam"
+                },
+                "Lib_ID_2":{  
+                   "bam":"MySecond.bam"
+                }
+             }
+          }
+       }
+    }
+~~~
+
+
+Note that there is a tool called [SamplesTsvToJson](../tools/SamplesTsvToJson.md), which lets a user generate the sample config and so avoid writing a wrongly formatted JSON file by hand.
+
+
+### The settings config
+The settings config lets a user alter almost all settings of the tools used in a given pipeline.
+This config file should be written in JSON format. It can contain setup settings such as references for the tools used,
+whether the pipeline should use chunking, or memory limits for certain programs; almost everything can be adjusted through this config file.
+One can set global variables containing settings for all tools used in the pipeline, or set tool-specific options one layer deeper in the JSON file.
+E.g. in the example below the setting is altered only for Picard and not globally.
+
+~~~
+"picard": { "validationstringency": "LENIENT" } 
+~~~
+
+Global setting examples are:
+~~~
+"java_gc_timelimit": 98,
+"numberchunks": 25,
+"chunking": true
+~~~
+
+
+----
+
+#### Example settings config
+~~~
+{
+        "reference": "/data/LGTC/projects/vandoorn-melanoma/data/references/hg19_nohap/ucsc.hg19_nohap.fasta",
+        "dbsnp": "/data/LGTC/projects/vandoorn-melanoma/data/references/hg19_nohap/dbsnp_137.hg19_nohap.vcf",
+        "joint_variantcalling": false,
+        "haplotypecaller": { "scattercount": 100 },
+        "multisample": { "haplotypecaller": { "scattercount": 1000 } },
+        "picard": { "validationstringency": "LENIENT" },
+        "library_variantcalling_temp": true,
+        "target_bed_temp": "/data/LGTC/projects/vandoorn-melanoma/analysis/target.bed",
+        "min_dp": 5,
+        "bedtools": {"exe":"/share/isilon/system/local/BEDtools/bedtools-2.17.0/bin/bedtools"},
+        "bam_to_fastq": true,
+        "baserecalibrator": { "memory_limit": 8, "vmem":"16G" },
+        "samtofastq": {"memory_limit": 8, "vmem": "16G"},
+        "java_gc_timelimit": 98,
+        "numberchunks": 25,
+        "chunking": true,
+        "haplotypecaller": { "scattercount": 1000 }
+}
+~~~
+
+### JSON validation
+
+To check whether the created JSON file is correct there are multiple options; the simplest is to use [this](http://jsonformatter.curiousconcept.com/)
+website. It is also possible to validate with Python or Scala, but this requires some more knowledge.
\ No newline at end of file
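
Reviewer note on the JSON validation section of `docs/general/config.md`: the Python route mentioned there could look roughly like the sketch below. It uses only the standard `json` module. The function names (`reject_duplicate_keys`, `load_config`, `check_sample_config`) are hypothetical and not part of Biopet, and the structural checks mirror only the rules listed in the new doc (`samples`, then `libraries`, then `R1`/`R2` or `bam`); Biopet's own validation may well differ.

```python
import json


def reject_duplicate_keys(pairs):
    """Build a dict from parsed key/value pairs, failing on repeated keys.

    The json module silently keeps the last value when a key is repeated,
    so a hook is needed to surface that mistake.
    """
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def load_config(text):
    """Parse JSON text; raises ValueError on syntax errors or duplicate keys."""
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


def check_sample_config(config):
    """Return a list of problems in a sample config (empty list if none).

    Assumption: the layout is samples -> sample ID -> libraries -> library ID,
    as described in the doc; this is not Biopet's actual validator.
    """
    samples = config.get("samples") if isinstance(config, dict) else None
    if not isinstance(samples, dict) or not samples:
        return ['top-level key "samples" is missing or empty']

    problems = []
    for sample_id, sample in samples.items():
        libraries = sample.get("libraries") if isinstance(sample, dict) else None
        if not isinstance(libraries, dict) or not libraries:
            problems.append(f'{sample_id}: "libraries" is missing or empty')
            continue
        for lib_id, lib in libraries.items():
            # A library needs at least "R1" (fastq input) or "bam".
            if not isinstance(lib, dict) or not (lib.keys() & {"R1", "bam"}):
                problems.append(f'{sample_id}/{lib_id}: needs "R1" or "bam"')
    return problems
```

A caller would read the config file, pass its text to `load_config`, and print whatever `check_sample_config` returns. Returning a list of problems instead of raising on the first one is a design choice so that a user can fix several mistakes in one pass.
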
--- a/docs/index.md
+++ b/docs/index.md
@@ -85,12 +85,12 @@ Using this option, the `java -jar Biopet-<version>.jar` can be ommited and `biop
 - [Sage](pipelines/sage)
 - Yamsvp (Under development)

-__Note that each pipeline needs a config file written in JSON format see [config](config.md) & [How To! Config](https://git.lumc.nl/biopet/biopet/wikis/Config) __
+__Note that each pipeline needs a config file written in JSON format see [config](general/config.md) & [How To! Config](https://git.lumc.nl/biopet/biopet/wikis/Config) __


 There are multiple configs that can be passed to a pipeline, for example the sample, settings and executables wherefrom sample and settings are mandatory.

- [Here](config) one can find how to create a sample and settings config
+- [Here](general/config.md) one can find how to create a sample and settings config
 - More info can be found here: [How To! Config](https://git.lumc.nl/biopet/biopet/wikis/Config)

 ### Running a tool

--- a/docs/pipelines/GATK-pipeline.md
+++ b/docs/pipelines/GATK-pipeline.md
@@ -28,7 +28,7 @@ The pipeline accepts ```.fastq & .bam``` files as input.

 ## Example

-Note that one should first create the appropriate [configs](../config.md).
+Note that one should first create the appropriate [configs](../general/config.md).

 To get the help menu:
 ~~~

--- a/docs/pipelines/basty.md
+++ b/docs/pipelines/basty.md
@@ -30,7 +30,7 @@ java -jar Biopet.0.2.0.jar pipeline basty -h
 ~~~

 #### Run the pipeline:
-Note that one should first create the appropriate [configs](../config.md).
+Note that one should first create the appropriate [configs](../general/config.md).

 ~~~
 java -jar Biopet.0.2.0.jar pipeline basty -run -config MySamples.json -config MySettings.json -outDir myOutDir

--- a/docs/pipelines/flexiprep.md
+++ b/docs/pipelines/flexiprep.md
@@ -26,7 +26,7 @@ Arguments for Flexiprep:

 As we can see in the above example we provide the options to skip trimming or clipping 
 since sometimes you want to have the possibility to not perform these tasks e.g.
-if there are no adapters present in your .fastq. Note that the pipeline also works on unpaired reads where one should only provide R1.+
+if there are no adapters present in your .fastq. Note that the pipeline also works on unpaired reads where one should only provide R1.


 To start the pipeline (remove `-run` for a dry run):
@@ -36,11 +36,91 @@ java -jar Biopet-0.2.0.jar pipeline Flexiprep -run -outDir myDir \
 -library myLibname -config mySettings.json
 ~~~

+## Result files
+Depending on the options, the pipeline produces a fastq file that is clipped and trimmed, only clipped,
+ only trimmed, or has had no quality control at all. The pipeline also outputs two FastQC runs: one before and one after quality control.

+### Example output
+
+~~~
+.
+├── mySample_01.qc.summary.json
+├── mySample_01.qc.summary.json.out
+├── mySample_01.R1.contams.txt
+├── mySample_01.R1.fastqc
+│   ├── mySample_01.R1_fastqc
+│   │   ├── fastqc_data.txt
+│   │   ├── fastqc_report.html
+│   │   ├── Icons
+│   │   │   ├── error.png
+│   │   │   ├── fastqc_icon.png
+│   │   │   ├── tick.png
+│   │   │   └── warning.png
+│   │   ├── Images
+│   │   │   └── warning.png
+│   │   ├── Images
+│   │   │   ├── duplication_levels.png
+│   │   │   ├── kmer_profiles.png
+│   │   │   ├── per_base_gc_content.png
+│   │   │   ├── per_base_n_content.png
+│   │   │   ├── per_base_quality.png
+│   │   │   ├── per_base_sequence_content.png
+│   │   │   ├── per_sequence_gc_content.png
+│   │   │   ├── per_sequence_quality.png
+│   │   │   └── sequence_length_distribution.png
+│   │   └── summary.txt
+│   └── mySample_01.R1.qc_fastqc.zip
+├── mySample_01.R1.qc.fastq.gz
+├── mySample_01.R1.qc.fastq.gz.md5
+├── mySample_01.R2.contams.txt
+├── mySample_01.R2.fastqc
+│   ├── mySample_01.R2_fastqc
+│   │   ├── fastqc_data.txt
+│   │   ├── fastqc_report.html
+│   │   ├── Icons
+│   │   │   ├── error.png
+│   │   │   ├── fastqc_icon.png
+│   │   │   ├── tick.png
+│   │   │   └── warning.png
+│   │   ├── Images
+│   │   │   ├── duplication_levels.png
+│   │   │   ├── kmer_profiles.png
+│   │   │   ├── per_base_gc_content.png
+│   │   │   ├── per_base_n_content.png
+│   │   │   ├── per_base_quality.png
+│   │   │   ├── per_base_sequence_content.png
+│   │   │   ├── per_sequence_gc_content.png
+│   │   │   ├── per_sequence_quality.png
+│   │   │   └── sequence_length_distribution.png
+│   │   └── summary.txt
+│   └── mySample_01.R2_fastqc.zip
+├── mySample_01.R2.fastq.md5
+├── mySample_01.R2.qc.fastqc
+│   ├── mySample_01.R2.qc_fastqc
+│   │   ├── fastqc_data.txt
+│   │   ├── fastqc_report.html
+│   │   ├── Icons
+│   │   │   ├── error.png
+│   │   │   ├── fastqc_icon.png
+│   │   │   ├── tick.png
+│   │   │   └── warning.png
+│   │   ├── Images
+│   │   │   ├── duplication_levels.png
+│   │   │   ├── kmer_profiles.png
+│   │   │   ├── per_base_gc_content.png
+│   │   │   ├── per_base_n_content.png
+│   │   │   ├── per_base_quality.png
+│   │   │   ├── per_base_sequence_content.png
+│   │   │   ├── per_sequence_gc_content.png
+│   │   │   ├── per_sequence_quality.png
+│   │   │   └── sequence_length_distribution.png
+│   │   └── summary.txt
+│   └── mySample_01.R2.qc_fastqc.zip
+├── mySample_01.R2.qc.fastq.gz
+└── mySample_01.R2.qc.fastq.gz.md5
+~~~

-# Examine results

-## Result files

 ## Best practice


--- a/docs/pipelines/gentrap.md
+++ b/docs/pipelines/gentrap.md
@@ -3,7 +3,7 @@
 # Invocation

 # Example
-Note that one should first create the appropriate [configs](../config.md).
+Note that one should first create the appropriate [configs](../general/config.md).

 # Testcase A


--- a/docs/pipelines/mapping.md
+++ b/docs/pipelines/mapping.md
@@ -19,7 +19,7 @@ After the QC, the pipeline simply maps the reads with the chosen aligner. The re
 ----

 ## Example
-Note that one should first create the appropriate [configs](../config.md).
+Note that one should first create the appropriate [configs](../general/config.md).

 For the help menu:
 ~~~

--- a/docs/pipelines/sage.md
+++ b/docs/pipelines/sage.md
@@ -12,7 +12,7 @@ The Sage pipeline has been created to process SAGE data, which requires a differ


 # Example
-Note that one should first create the appropriate [configs](../config.md).
+Note that one should first create the appropriate [configs](../general/config.md).

 To get the help menu:
 ~~~

--- a/docs/pipelines/yamsvp.md
+++ b/docs/pipelines/yamsvp.md
@@ -3,7 +3,7 @@
 # Invocation

 # Example
-Note that one should first create the appropriate [configs](../config.md).
+Note that one should first create the appropriate [configs](../general/config.md).

 # Testcase A


--- a/docs/tools/FindRepeatsPacBio.md
+++ b/docs/tools/FindRepeatsPacBio.md
@@ -9,7 +9,7 @@ those regions with the BAM file. On those extracted regions the tool will perfor
 ## Example
 To get the help menu:
 ~~~
- java -jar Biopet-0.2.0-DEV-801b72ed.jar tool FindRepeatsPacBio -h
+java -jar Biopet-0.2.0.jar tool FindRepeatsPacBio -h
 Usage: FindRepeatsPacBio [options]

  -l <value> | --log_level <value>
@@ -26,7 +26,7 @@ Usage: FindRepeatsPacBio [options]

 To run the tool:
 ~~~
- java -jar Biopet-0.2.0.jar tool FindRepeatsPacBio --inputBam myInputbam.bam \
+java -jar Biopet-0.2.0.jar tool FindRepeatsPacBio --inputBam myInputbam.bam \
 --inputBed myRepeatRegions.bed > mySummary.txt
 ~~~
 Since the default output of the program is printed in stdout we can use > to write the output to a text file.

--- a/mkdocs.yml
+++ b/mkdocs.yml
-site_name: Biopet user manual
+site_name: Biopet User Manual
 pages:
 - ['index.md', 'Home']
- ['config.md', 'Config']
+- ['general/config.md', 'General', 'Config']
 - ['pipelines/basty.md', 'Pipelines', 'Basty']
 - ['pipelines/GATK-pipeline.md', 'Pipelines', 'GATK-pipeline']
 - ['pipelines/flexiprep.md', 'Pipelines', 'Flexiprep']
@@ -20,6 +20,7 @@ pages:
 - ['tools/MpileupToVcf.md', 'Tools', 'MpileupToVcf']
 - ['tools/sagetools.md', 'Tools', 'Sagetools']
 - ['tools/WipeReads.md', 'Tools', 'WipeReads']
+#- ['developing/Setup.md', 'Developing', 'Setting up your local development environment']
 - ['about.md', 'About']
 - ['license.md', 'License']
 #theme: readthedocs