diff --git a/docs/cluster/oge.md b/docs/cluster/oge.md
index 138b8f0041731eb0e10527d6e438c727d2f1150a..420bc52359b5bc959a43c78179d845e5aebc5d0a 100644
--- a/docs/cluster/oge.md
+++ b/docs/cluster/oge.md
@@ -1,10 +1,15 @@
 # Introduction
+
 Within the LUMC we have a compute cluster which runs on the Sun Grid Engine (SGE). This cluster currently consists of around
 600 cores and several terabytes of memory. The Sun Grid Engine (SGE) enables the cluster to schedule all the jobs coming from
 different users in a fair way. So Resources are shared equally between multiple users.
 
 # Sun Grid Engine
+Oracle Grid Engine, or Sun Grid Engine, is computer cluster software also known as a batch-queuing system. These systems help
+cluster users distribute and fairly schedule their jobs across the different computers in the cluster.
+# Open Grid Engine
 
-# Open Grid Engine
\ No newline at end of file
+The Open Grid Engine (OGE) is based on the Sun Grid Engine but is completely open source. It does support commercial
+batch-queuing systems.
\ No newline at end of file
diff --git a/docs/developer/example-pipeable.md b/docs/developer/example-pipeable.md
index ad3a4597bc772f8c9fbdc08cc52d8115e2bc2252..b310008662e9c93451c37bc951c1101816824185 100644
--- a/docs/developer/example-pipeable.md
+++ b/docs/developer/example-pipeable.md
@@ -1,2 +1,29 @@
-## Pipeable commands
+# Pipeable commands
+
+## Introduction
+
+Since the release of Biopet v0.5.0 we support piping of programs/tools to decrease disk usage and run time. Here we make use of
+[fifo piping](http://www.gnu.org/software/libc/manual/html_node/FIFO-Special-Files.html#FIFO-Special-Files), which enables a
+developer to very easily implement piping for most pipeable tools.
+
+## Example
+
+``` scala
+    val pipe = new BiopetFifoPipe(this, (zcatR1._1 :: (if (paired) zcatR2.get._1 else None) ::
+      Some(gsnapCommand) :: Some(ar._1) :: Some(reorderSam) :: Nil).flatten)
+    pipe.threadsCorrection = -1
+    zcatR1._1.foreach(x => pipe.threadsCorrection -= 1)
+    zcatR2.foreach(_._1.foreach(x => pipe.threadsCorrection -= 1))
+    add(pipe)
+    ar._2
+```
+
+* In the above example we define the variable ***pipe***. This is the place to define which jobs should be piped together. In
+  this case we perform a zcat on the input files, after which GSNAP alignment and Picard ReorderSam are run. The final output
+  of this job will be a SAM file; all intermediate files are removed as soon as the job has finished without any error codes.
+* With the second command, ***pipe.threadsCorrection = -1***, we make sure the total number of assigned cores is not too high.
+  This ensures that the job can still be scheduled on the compute cluster.
+* So we hope you can appreciate that in the above example we decrease the total number of assigned cores by 2. This is done
+  by the command ***zcatR1._1.foreach(x => pipe.threadsCorrection -= 1)***.
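+
+As a minimal sketch of the same pattern (***zcatJob*** and ***alignJob*** are hypothetical placeholders for pipeable functions
+defined elsewhere in a pipeline; everything else reuses only the constructor and ***threadsCorrection*** field shown above):
+
+``` scala
+    // Pipe a decompression job into an alignment job through a FIFO and register them as a single job.
+    val fifo = new BiopetFifoPipe(this, zcatJob :: alignJob :: Nil)
+    // The zcat mostly waits on the pipe, so its core is not counted towards the job's total.
+    fifo.threadsCorrection = -1
+    add(fifo)
+```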
+
\ No newline at end of file
diff --git a/docs/general/about.md b/docs/general/about.md
index 7c90d291d7c0e45a5205fc2c99de5b9fceeb5b38..57b320514e3281d67cedd75923f07e1485a6c3b1 100644
--- a/docs/general/about.md
+++ b/docs/general/about.md
@@ -20,8 +20,8 @@ As of the 0.5.0 release, the following people (sorted by last name) have contrib
 - Wibowo Arindrarto
 - Sander Bollen
 - Peter van 't Hof
-- Wai Yi Leung
 - Leon Mei
+- Wai Yi Leung
 - Sander van der Zeeuw
 
 
@@ -29,4 +29,4 @@ As of the 0.5.0 release, the following people (sorted by last name) have contrib
 
 Check our website at: [SASC](https://sasc.lumc.nl/)
 
-We are also reachable through email: [SASC mail](mailto:sasc@lumc.nl)
+Or send us an email: [SASC mail](mailto:sasc@lumc.nl)
\ No newline at end of file
diff --git a/docs/general/config.md b/docs/general/config.md
index 8af9f2a23732b8050f61d3e1e749a8a8415f88f1..bff11cc661d7ce0b8bbb59f2b70ad0a6b63e5640 100644
--- a/docs/general/config.md
+++ b/docs/general/config.md
@@ -68,9 +68,9 @@ This config file should be written in either JSON or YAML format. It can contain
 deeper into the JSON file. E.g. in the example below the settings for Picard tools
 are altered only for Picard and not global.
 
-~~~
+``` json
 "picard": { "validationstringency": "LENIENT" }
-~~~
+```
 
 Global setting examples are:
 ~~~
@@ -89,7 +89,7 @@ E.g. pipelines and tools which uses FASTA references should now set value `"refe
 Additionally, we can set `"reference_name"` for the name to be used (e.g. `"hg19"`). If unset, Biopet will default to `unknown`.
 It is also possible to set the `"species"` flag. Again, we will default to `unknown` if unset.
 #### Example settings config
-~~~
+``` json
 {
     "reference_fasta": "/references/hg19_nohap/ucsc.hg19_nohap.fasta",
     "reference_name": "hg19_nohap",
@@ -111,7 +111,7 @@ It is also possible to set the `"species"` flag. Again, we will default to `unkn
     "chunking": true,
     "haplotypecaller": { "scattercount": 1000 }
 }
-~~~
+```
 
 ### JSON validation
 
diff --git a/docs/general/license.md b/docs/general/license.md
index 69a97f3463e5238a2bd5b84633c8b4f1c5299634..99a1259533c0f0366e243c2dd5ce22f85b87aa03 100644
--- a/docs/general/license.md
+++ b/docs/general/license.md
@@ -17,7 +17,7 @@ license, please contact us to obtain a separate license.
 
 Private release:
 ~~~bash
-Due to the license issue with GATK, this part of Biopet can only be used inside the
+Due to a license issue with GATK, this part of Biopet can only be used inside the
 LUMC. Please refer to https://git.lumc.nl/biopet/biopet/wikis/home for instructions
 on how to use this protected part of biopet or contact us at sasc@lumc.nl
 ~~~
diff --git a/docs/general/requirements.md b/docs/general/requirements.md
index 0105f7ccc29dcbd5def4b6b49a6bb1235031d858..0ac500d7b5c95be5b4047e0c58c7b3302e63f59f 100644
--- a/docs/general/requirements.md
+++ b/docs/general/requirements.md
@@ -6,6 +6,8 @@ For end-users:
 
 * [Java 7 JVM](http://www.oracle.com/technetwork/java/javase/downloads/index.html) or [OpenJDK 7](http://openjdk.java.net/install/)
 * [Cran R 2.15.3](http://cran.r-project.org/)
+* It is strongly advised to run Biopet pipelines on a compute cluster, since the amount of resources needed usually cannot be
+  provided by a local machine. Note that this does not mean it is impossible!
 
 For developers:
 
diff --git a/docs/pipelines/carp.md b/docs/pipelines/carp.md
index 6f5ab622a89956180ffb185083de6b817189242d..ff3aef353b8c2caa1ad3da0f2a089e80dfdda368 100644
--- a/docs/pipelines/carp.md
+++ b/docs/pipelines/carp.md
@@ -39,7 +39,8 @@ The layout of the sample configuration for Carp is basically the same as with ou
 }
 ~~~
 
-What's important there is that you can specify the control ChIP-seq experiment(s) for a given sample. These controls are usually ChIP-seq runs from input DNA and/or from treatment with nonspecific binding proteins such as IgG. In the example above, we are specifying `sample_Y` as the control for `sample_X`.
+What's important here is that you can specify the control ChIP-seq experiment(s) for a given sample. These controls are usually
+ChIP-seq runs from input DNA and/or from treatment with nonspecific binding proteins such as IgG. In the example above, we are specifying `sample_Y` as the control for `sample_X`.
 
 ### Pipeline Settings Configuration
 
@@ -51,7 +52,8 @@ For the pipeline settings, there are some values that you need to specify while
 While optional settings are:
 
 1. `aligner`: which aligner to use (`bwa` or `bowtie`)
-
+2. `macs2`: Here only the callpeak mode is implemented, but one can set all the options from
+   [macs2 callpeak](https://github.com/taoliu/MACS/#call-peaks).
 
 ## Running Carp
 
 As with other pipelines in the Biopet suite, Carp can be run by specifying the pipeline after the `pipeline` subcommand:
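For the `macs2` settings added in the carp.md hunk above, a settings fragment in the same style as the Picard example in
config.md might look like the sketch below. The keys mirror the `macs2 callpeak` flags `--nomodel` and `--gsize`; whether the
Biopet macs2 wrapper accepts these exact key names is an assumption, so treat the fragment as illustrative only.

``` json
{
    "macs2": {
        "nomodel": true,
        "gsize": "hs"
    }
}
```

Other `macs2 callpeak` options would presumably be nested under the `macs2` key in the same way.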