Commit a04d1ce0 authored by Sander van der Zeeuw's avatar Sander van der Zeeuw

changes in docs 0.5.0

parent 839c80aa
# Introduction
Within the LUMC we have a compute cluster that runs the Sun Grid Engine (SGE). The cluster currently consists of around 600
cores and several terabytes of memory. SGE schedules the jobs submitted by different users in a fair way, so resources are
shared equally between multiple users.
# Sun Grid Engine
Oracle Grid Engine, formerly Sun Grid Engine, is a computer cluster software system, also known as a batch-queuing system. Such
systems distribute and fairly schedule the users' jobs across the different computers in the cluster.
# Open Grid Engine
The Open Grid Engine (OGE) is based on the Sun Grid Engine but is completely open source. It supports the same batch-queuing
features as its commercial counterparts.
# Pipeable commands
## Introduction
Since the release of Biopet v0.5.0 we support piping of programs/tools to decrease disk usage and run time. We make use of
[FIFO piping](http://www.gnu.org/software/libc/manual/html_node/FIFO-Special-Files.html#FIFO-Special-Files), which enables a
developer to implement piping for most pipeable tools with very little effort.
## Example
``` scala
// Collect the jobs that should be piped together, e.g. zcat on the input file(s),
// the GSNAP alignment and Picard ReorderSam, and wrap them in one FIFO pipe
val pipe = new BiopetFifoPipe(this, (zcatR1._1 :: (if (paired) zcatR2.get._1 else None) ::
  Some(gsnapCommand) :: Some(ar._1) :: Some(reorderSam) :: Nil).flatten)
// Correct the total number of cores requested for the combined job
pipe.threadsCorrection = -1
// Lower the core count by one more for each zcat job that is actually present
zcatR1._1.foreach(x => pipe.threadsCorrection -= 1)
zcatR2.foreach(_._1.foreach(x => pipe.threadsCorrection -= 1))
// Submit the FIFO pipe as a single job
add(pipe)
ar._2
```
* In the above example we define the variable ***pipe***. This is where we define which jobs should be piped together. In this
case we perform a zcat on the input files, after which the GSNAP alignment and Picard ReorderSam are performed. The final output
of this job will be a SAM file; all intermediate files are removed as soon as the job finishes without any error codes.
* The second command, ***pipe.threadsCorrection = -1***, makes sure the total number of assigned cores is not too high, so that
the job can still be scheduled on the compute cluster.
* In addition, the example decreases the total number of assigned cores by 2, one for each zcat job, with the command
***zcatR1._1.foreach(x => pipe.threadsCorrection -= 1)*** and its counterpart for ***zcatR2***. A stripped-down sketch of this
pattern is given below.
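Reduced to the essentials, the pattern could look as follows. This is only a sketch: ***cmdA*** and ***cmdB*** are hypothetical pipeable jobs assumed to be defined elsewhere in a Biopet QScript; only the `BiopetFifoPipe` constructor, `threadsCorrection` and `add` from the example above are reused.
``` scala
// Hypothetical sketch: pipe two already-defined pipeable jobs together.
// cmdA and cmdB are assumed to be pipeable command-line functions.
val pipe = new BiopetFifoPipe(this, List(cmdA, cmdB))
// Lower the total core request so the combined job stays easy to schedule
pipe.threadsCorrection = -1
// Add the FIFO pipe as one job; intermediate data flows through a FIFO
// instead of being written to disk
add(pipe)
```
All commands in such a pipe run concurrently and exchange data through FIFOs rather than through intermediate files on disk.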
@@ -20,8 +20,8 @@ As of the 0.5.0 release, the following people (sorted by last name) have contrib
- Wibowo Arindrarto
- Sander Bollen
- Peter van 't Hof
- Wai Yi Leung
- Leon Mei
- Sander van der Zeeuw
@@ -29,4 +29,4 @@ As of the 0.5.0 release, the following people (sorted by last name) have contrib
Check our website at: [SASC](https://sasc.lumc.nl/)
Or send us an email: [SASC mail](mailto:sasc@lumc.nl)
@@ -68,9 +68,9 @@ This config file should be written in either JSON or YAML format. It can contain
deeper into the JSON file. E.g. in the example below the settings for Picard tools are altered only for Picard and not globally.
``` json
"picard": { "validationstringency": "LENIENT" }
```
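To make the scoping explicit, here is a minimal sketch of how a global setting and a Picard-only setting can sit side by side in one config; the `output_dir` key is used here purely as an illustration of a top-level (global) setting:
``` json
{
  "output_dir": "/path/to/output",
  "picard": { "validationstringency": "LENIENT" }
}
```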
Global setting examples are:
@@ -89,7 +89,7 @@ E.g. pipelines and tools which use FASTA references should now set value `"refe
Additionally, we can set `"reference_name"` for the name to be used (e.g. `"hg19"`). If unset, Biopet will default to `unknown`.
It is also possible to set the `"species"` flag. Again, we will default to `unknown` if unset.
#### Example settings config
``` json
{
"reference_fasta": "/references/hg19_nohap/ucsc.hg19_nohap.fasta",
"reference_name": "hg19_nohap",
@@ -111,7 +111,7 @@ It is also possible to set the `"species"` flag. Again, we will default to `unkn
"chunking": true,
"haplotypecaller": { "scattercount": 1000 }
}
```
### JSON validation
@@ -17,7 +17,7 @@ license, please contact us to obtain a separate license.
Private release:
~~~bash
Due to a license issue with GATK, this part of Biopet can only be used inside the
LUMC. Please refer to https://git.lumc.nl/biopet/biopet/wikis/home for instructions
on how to use this protected part of biopet or contact us at sasc@lumc.nl
~~~
@@ -6,6 +6,8 @@ For end-users:
* [Java 7 JVM](http://www.oracle.com/technetwork/java/javase/downloads/index.html) or [OpenJDK 7](http://openjdk.java.net/install/)
* [Cran R 2.15.3](http://cran.r-project.org/)
* It is strongly advised to run Biopet pipelines on a compute cluster, since the amount of resources needed usually cannot be
provided by a local machine. Note that this does not mean it is impossible!
For developers:
@@ -39,7 +39,8 @@ The layout of the sample configuration for Carp is basically the same as with ou
}
~~~
What's important here is that you can specify the control ChIP-seq experiment(s) for a given sample. These controls are usually
ChIP-seq runs from input DNA and/or from treatment with nonspecific binding proteins such as IgG. In the example above, we are specifying `sample_Y` as the control for `sample_X`.
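As a rough sketch of what such a sample section could look like (the `control` key and the library layout shown here are assumptions based on the description above, not a verbatim excerpt from the Carp documentation):
``` json
{
  "samples": {
    "sample_X": {
      "control": ["sample_Y"],
      "libraries": {
        "lib_one": { "R1": "/path/to/sample_X_R1.fq.gz" }
      }
    },
    "sample_Y": {
      "libraries": {
        "lib_one": { "R1": "/path/to/sample_Y_R1.fq.gz" }
      }
    }
  }
}
```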
### Pipeline Settings Configuration
@@ -51,7 +52,8 @@ For the pipeline settings, there are some values that you need to specify while
While optional settings are:
1. `aligner`: which aligner to use (`bwa` or `bowtie`)
2. `macs2`: here only the callpeak mode is implemented, but one can set all the options from
[macs2 callpeak](https://github.com/taoliu/MACS/#call-peaks); a sketch of such a settings block follows this list
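A minimal sketch of how these optional settings could be combined in the settings config; the macs2 option names shown (`gsize`, `qvalue`) are taken from macs2 callpeak and are only meant as an illustration:
``` json
{
  "aligner": "bwa",
  "macs2": { "gsize": "hs", "qvalue": 0.01 }
}
```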
## Running Carp
As with other pipelines in the Biopet suite, Carp can be run by specifying the pipeline after the `pipeline` subcommand: