Merge branch 'feature-docs-developers-0.5.0' into 'develop'

Feature docs developers 0.5.0

Some fixes to the docs (last minute edits)

See merge request !273

Commit 22bcad0f authored 9 years ago by Peter van 't Hof
--- a/docs/developer/example-pipeline.md
+++ b/docs/developer/example-pipeline.md
@@ -94,33 +94,65 @@ class HelloPipeline(val root: Configurable) extends QScript with SummaryQScript

  // This method is the actual pipeline
  def biopetScript: Unit = {
+    // Executing a tool like FastQC, calling the extension in `nl.lumc.sasc.biopet.extensions.Fastqc`

-    // Executing a tool like FastQC
-    val shiva = new Shiva(this)
-    shiva.init()
-    shiva.biopetScript()
-    addAll(shiva.functions)
+    val fastqc = new Fastqc(this)
+    fastqc.fastqfile = config("fastqc_input")
+    fastqc.output = new File(outputDir, "fastqc.txt")
+    add(fastqc)

-    /* Only required when using [[SummaryQScript]] */
-    addSummaryQScript(shiva)
-
-    // From here you can use the output files of shiva as input file of other jobs
  }
 }

-//TODO: Replace object Name, must be the same as the class of the pipeline
 object HelloPipeline extends PipelineCommand

 ```

+Looking at the pipeline, you can see that it inherits from `QScript`, the fundamental class that gives access to the Queue scheduling system. In addition, the `SummaryQScript` trait adds a layer of functions to handle and create summary files from pipeline output.
+In `class HelloPipeline(val root: Configurable)`, our pipeline is called `HelloPipeline` and takes a `root` with configuration options passed down to Biopet via a JSON file specified on the command line (`--config`).
+
+```
+  def biopetScript: Unit = {
+  }
+```
+
+One can start adding pipeline components in `biopetScript`; this is the programmatic equivalent of the `main` method in most popular programming languages. For example, one can add a QC tool such as `FastQC` to the pipeline, as in the example shown above.
+The pipeline is set up within the pipeline itself; fine-tuning is always possible by overriding settings in the following way:
+ 
+```
+    val fastqc = new Fastqc(this)
+    fastqc.fastqfile = config("fastqc_input")
+    fastqc.output = new File(outputDir, "fastqc.txt")
+    
+    // Change the kmers setting to 9; wrap the value in `Some()` because `fastqc.kmers` is an `Option`.
+    fastqc.kmers = Some(9)
+    
+    add(fastqc)
+
+```
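
The `Some(9)` wrapping is easier to see in isolation. An `Option` field lets a wrapper distinguish "not set, let the tool use its own default" from "explicitly set". The following is a minimal sketch of that idea in plain Scala, with no Biopet dependency. `OptionSketch`, `FastqcSketch` and `cmdLine` are hypothetical names used for illustration only and are not Biopet's `Fastqc` extension; only the `--kmers` flag comes from FastQC's own command line.

```scala
// Hypothetical stand-in for a tool wrapper; NOT Biopet's Fastqc extension.
object OptionSketch {
  class FastqcSketch(val input: String) {
    // None means "not configured": the flag is omitted and the tool decides.
    var kmers: Option[Int] = None

    // Build the argument list; the Option contributes either 0 or 2 arguments.
    def cmdLine: Seq[String] = {
      val kmersFlag = kmers.toSeq.flatMap(k => Seq("--kmers", k.toString))
      (Seq("fastqc") ++ kmersFlag) :+ input
    }
  }
}
```

With `kmers` left at `None` the flag is absent from `cmdLine`; after `kmers = Some(9)` the list gains `--kmers 9`. Assigning a bare `9` would not compile, which is why the override wraps the value.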




 ### Config setup

+For our new pipeline, one should set up the (default) config options.
+
+Since our pipeline is called `HelloPipeline`, the root of its config options will be called `hellopipeline` (lowercase).
+
+```json
+{
+    "output_dir": "/home/user/mypipelineoutput",
+    "hellopipeline": {
+        
+    }
+}
+
+```
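
To illustrate how a pipeline-specific root such as `hellopipeline` can sit next to global keys such as `output_dir`, here is a simplified model of a scoped lookup in plain Scala. It is a sketch under assumptions, not Biopet's actual config resolution: `ConfigRootSketch`, `rootName` and `lookup` are hypothetical names, and the fallback from the pipeline root to the top level is assumed for the example.

```scala
// Simplified model of a scoped config lookup; NOT Biopet's implementation.
object ConfigRootSketch {
  // The config root is the pipeline class name in lowercase.
  def rootName(pipelineClassName: String): String = pipelineClassName.toLowerCase

  // Look under the pipeline root first, then at the top level (assumed fallback).
  def lookup(config: Map[String, Any], pipeline: String, key: String): Option[Any] = {
    val scoped = config.get(rootName(pipeline)) match {
      case Some(m: Map[_, _]) => m.asInstanceOf[Map[String, Any]].get(key)
      case _                  => None
    }
    scoped.orElse(config.get(key))
  }
}
```

In this model, a key placed inside the `hellopipeline` block is found through the pipeline root, while `output_dir` is found at the top level.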
+
+
 ### Test pipeline

 ### Summary output

-### Reporting output (opt)
\ No newline at end of file
+### Reporting output (optional)
\ No newline at end of file
--- a/docs/general/config.md
+++ b/docs/general/config.md
@@ -8,12 +8,14 @@ The sample config should be in [__JSON__](http://www.json.org/) or [__YAML__](ht
 - Second field should contain the __"libraries"__
 - Third field contains __"R1" or "R2"__ or __"bam"__
 - The fastq input files can be provided zipped and unzipped
+- `output_dir` is a required setting that should be set either in a `config.json` or on the invocation command via `-cv output_dir=<path/to/outputdir>`. The default is to place the pipeline output in the current working directory.
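
The two ways of supplying `output_dir` can be sketched as a small resolution function. This is a minimal illustration in plain Scala, not Biopet's argument parser: `OutputDirSketch`, `cvPairs` and `outputDir` are hypothetical names, and letting a `-cv` value take precedence over the config file is an assumption made for the example. The fallback to the current working directory follows the description above.

```scala
import java.io.File

// Hypothetical helpers; NOT Biopet's argument handling.
object OutputDirSketch {
  // Collect "-cv key=value" pairs from an argument list.
  def cvPairs(args: Seq[String]): Map[String, String] =
    args.sliding(2).collect {
      case Seq("-cv", kv) if kv.contains("=") =>
        val idx = kv.indexOf('=')
        kv.substring(0, idx) -> kv.substring(idx + 1)
    }.toMap

  // Assumed precedence: command line, then config file, then the working directory.
  def outputDir(args: Seq[String], fileConfig: Map[String, String]): File = {
    val path = cvPairs(args).get("output_dir")
      .orElse(fileConfig.get("output_dir"))
      .getOrElse(".")
    new File(path)
  }
}
```

The flat `Map[String, String]` stands in for the parsed config file; a real config is nested, so this only models the top-level `output_dir` key.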

 #### Example sample config

 ###### yaml:

 ``` yaml
+output_dir: /home/user/myoutputdir
 samples:
  Sample_ID1:
    libraries:
@@ -26,6 +28,7 @@ samples:

 ``` json
    {  
+       "output_dir": "/home/user/myoutputdir",
       "samples":{  
          "Sample_ID1":{  
             "libraries":{  

--- a/docs/pipelines/basty.md
+++ b/docs/pipelines/basty.md
@@ -52,7 +52,7 @@ Specific configuration options additional to Basty are:
 ```json

 {
-    output_dir: </path/to/out_directory>,
+    "output_dir": </path/to/out_directory>,
    "shiva": {
        "variantcallers": ["freeBayes"]
    },

--- a/docs/pipelines/flexiprep.md
+++ b/docs/pipelines/flexiprep.md
@@ -30,7 +30,7 @@ Note that the pipeline also works on unpaired reads where one should only provid
 To start the pipeline (remove `-run` for a dry run):

 ``` bash
-java -jar Biopet-0.2.0.jar pipeline Flexiprep -run -outDir myDir \
+biopet pipeline Flexiprep -run -outDir myDir \
 -R1 myFirstReadPair -R2 mySecondReadPair -sample mySampleName \
 -library myLibname -config mySettings.json
 ```

--- a/docs/pipelines/gears.md
+++ b/docs/pipelines/gears.md
-# Flexiprep
+# Gears

 ## Introduction
 Gears is a metagenomics pipeline. (``GE``nome ``A``nnotation of ``R``esidual ``S``equences). One can use this pipeline to identify contamination in sequencing runs on either raw FastQ files or BAM files.

--- a/docs/releasenotes/release_notes_0.5.0.md
+++ b/docs/releasenotes/release_notes_0.5.0.md
 # Release notes Biopet version 0.5.0

-* Our QC and mapping pipeline now use piping for the most used aligners and QC tools
- * This decreases the disk usage and run time
-* Improvements in the reporting framework
-* Added metagenomics pipeline: [Gears](../pipelines/gears.md)
-* Development envoirment within the LUMC now get tested with Jenkins
- * Added integration tests Flexiprep
- * Added integration tests Mapping
- * Added integration tests Shiva
- * Added integration tests Toucan
+## General Code changes
+
+* Upgraded to Queue 3.4, and with it the htsjdk library to 1.132
+* Our `QC` and `Mapping` pipelines now use piping for the most used aligners and QC tools
+    * Reducing I/O over the network
+    * Reducing the disk usage (storage) and run time
 * Added version command for Star
+* Separation of the `biopet`-framework into: `Core`, `Extensions`, `Tools` and `Utils`
+* Optimized unit testing
+* Unit test coverage on `Tools` increased
+* Workaround: Added R-script files of Picard to biopet to fix picard jobs (files are not packaged in maven dependency)
+* Added external example for developers
+
+## Functionality
+
+* Retries of pipelines and tools are now enabled by default
+* Improvements in the reporting framework, allowing custom reporting elements for specific pipelines
+* Fixed reports when metrics of Flexiprep are skipped
+* Added metagenomics pipeline: [Gears](../pipelines/gears.md)
 * Added single sample variantcalling with bcftools
-* Splitting the Framework into: Core, Extensions, Tools and Utils
-* Fixed reports when Metrics of Flexiprep is skipped
-* Upgrade to Queue 3.4, with this also the htsjdk library to 1.132
-* Added key support for GATK jobs
-* Optimizing unit testing
-* Unit test coverage on Tools increased
-* Retry is now default enabled
+* Added ET + key support for GATK job invocation; the phone-home feature is disabled when a key is supplied
 * Added more debug information in the `.log` directory when `-l debug` is enabled
-* Shiva: added support for GenotypeConcordance tool to check against a Golden Standard
-* Workaround: Added Rscript files of picard to biopet to fix picard jobs (files are not packaged in maven dependency)
-* Shiva: fixed a lot of small bugs when developing integration tests
-* Gentrap: Better error handeling on missing annotation files
-* Shiva: Workaround: Fixed a dependency on rerun, with this change there can be 2 bam files in the samples folder
-* Added external example for developers
+* [Shiva](../pipelines/shiva.md): added support for `GenotypeConcordance` tool to check against a Golden Standard
+* [Shiva](../pipelines/shiva.md): fixed a lot of small bugs when developing integration tests
+* [Shiva](../pipelines/shiva.md): Workaround: Fixed a dependency on rerun; with this change there can be 2 BAM files in the samples folder
+* [Gentrap](../pipelines/gentrap.md): Improved error handling on missing annotation files
+
+## Infrastructure changes
+
+* Development environment within the LUMC now gets tested with Jenkins
+    * Added integration tests for Flexiprep
+    * Added integration tests for Gears
+    * Added integration tests for Mapping
+    * Added integration tests for Shiva
+    * Added integration tests for Toucan
--- a/external-example/src/main/scala/org/example/group/pipelines/HelloPipeline.scala
+++ b/external-example/src/main/scala/org/example/group/pipelines/HelloPipeline.scala
@@ -32,7 +32,6 @@ class HelloPipeline(val root: Configurable) extends QScript with SummaryQScript
    fastqc.output = new File(outputDir, "fastqc.txt")
    add(fastqc)

-    // From here you can use the output files of shiva as input file of other jobs
  }
 }


--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -46,7 +46,8 @@ pages:
    - Example tool: 'developer/example-tool.md'
    - Example pipeable: 'developer/example-pipeable.md'
    - Scala docs:
-      - 0.4.0: 'developer/code-style.md'
+      - 0.5.0: '/sasc/scaladocs/v0.5.0.0'
+      - 0.4.0: '/sasc/scaladocs/v0.4.0.0'
 #- ['developing/Setup.md', 'Developing', 'Setting up your local development environment']
 #theme: readthedocs
 repo_url: https://github.com/biopet/biopet