The command above will do a *dry* run of a pipeline using a config file, as if the command were submitted to the SHARK cluster (the `-qsub` flag) to the `BWA` parallel environment (the `-jobParaEnv BWA` flag). We also set the maximum number of retries for failing jobs to two (via the `-retry 2` flag). Doing a dry run is a good idea to ensure that your real run proceeds smoothly. It may not catch all the errors, but if the dry run fails you can be sure that the real run will never succeed.
The command above will do a *dry* run of a pipeline using a config file, as if the command were submitted to the SHARK cluster (the `-qsub` flag) to the `BWA` parallel environment (the `-jobParaEnv BWA` flag). The `-jobQueue all.q` flag ensures that the proper queue is used. We also set the maximum number of retries for failing jobs to two (via the `-retry 2` flag). Doing a dry run is a good idea to ensure that your real run proceeds smoothly. It may not catch all the errors, but if the dry run fails you can be sure that the real run will never succeed.
If the dry run proceeds without problems, you can then do the real run by using the `-run` flag:
It is usually a good idea to do the real run using `screen` or `nohup` to prevent the job from terminating when you log out of SHARK. In practice, using `biopet` as-is is also fine. What you need to keep in mind is that each pipeline has its own expected config layout. You can read more about the general structure of our config files [here](general/config.md). For the specific structure that each pipeline accepts, please consult the respective pipeline page.
@@ -22,7 +23,8 @@ This pipeline is used to analyse a group of samples. This pipeline only accepts
| Key | Type | default | Function |
| --- | ---- | ------- | -------- |
| gears_use_kraken | Boolean | true | Run fastq file with kraken |
| gears_use_centrifuge | Boolean | true | Run fastq files with centrifuge |
| gears_use_kraken | Boolean | false | Run fastq files with kraken |
| gears_use_qiime_closed | Boolean | false | Run fastq files with qiime with the closed reference module |
| gears_use_qiime_open | Boolean | false | Run fastq files with qiime with the open reference module |
| gears_use_qiime_rtax | Boolean | false | Run fastq files with qiime with the rtax module |
...
...
@@ -65,7 +67,7 @@ Command line flags for Gears are:
| -sample | --sampleid | String (**required**) | Name of sample |
| -library | --libid | String (optional) | Name of library |
If `-R2` is given, the pipeline will assume a paired-end setup. `-bam` is mutualy exclusive with the `-R1` and `-R2` flags. Either specify `-bam` or `-R1` and/or `-R2`.
If `-R2` is given, the pipeline will assume a paired-end setup. `-bam` is mutually exclusive with the `-R1` and `-R2` flags. Either specify `-bam` or `-R1` and/or `-R2`.
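The input rule above can be sketched as a small check. This is only an illustration of the rule as described, not the pipeline's own validation code; the function name `check_inputs` and its return labels are hypothetical.

```python
def check_inputs(bam=None, r1=None, r2=None):
    """Classify an input combination according to the rule described above.

    Hypothetical helper: returns "bam", "single-end" or "paired-end",
    and raises ValueError for combinations the text rules out.
    """
    has_fastq = r1 is not None or r2 is not None
    if bam is not None and has_fastq:
        # -bam is mutually exclusive with -R1 and -R2
        raise ValueError("-bam cannot be combined with -R1/-R2")
    if bam is not None:
        return "bam"
    if not has_fastq:
        raise ValueError("specify either -bam or -R1 and/or -R2")
    # Giving -R2 makes the setup paired-end
    return "paired-end" if r2 is not None else "single-end"


print(check_inputs(r1="reads_1.fq", r2="reads_2.fq"))
```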
When extension is .yml or .yaml output is in yaml format, otherwise in json. When not given output goes to stdout as yaml.
~~~
A user provides a tab-separated (TSV) file with sample-specific properties, which the tool parses into JSON format.
For example, a user wants to add certain properties to the description of a sample, such as the treatment a sample received. Then a TSV file with an extra column called treatment is provided.
The resulting JSON file will have the 'treatment' property in it as well. The order of the columns is not relevant to the end result.
The resulting file will have the 'treatment' property in it as well. The order of the columns is not relevant to the end result.
The tag files work the same way, except that their values are placed under the key `tags`.
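To make the TSV-to-config idea concrete, here is a minimal Python sketch of how rows from several TSV files could be merged into one nested structure. It is not the tool's implementation. It assumes each file has a header line with a `sample` column and an optional `library` column; those column names, the function name `tsv_to_samples` and the inline TSV strings are assumptions made for illustration.

```python
import csv
import io
import json


def tsv_to_samples(*tsv_texts):
    """Merge TSV tables into a nested {"samples": {...}} dictionary.

    Assumed layout: a header line with a 'sample' column and an
    optional 'library' column; every other column becomes a property.
    """
    samples = {}
    for text in tsv_texts:
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        for row in reader:
            sample = samples.setdefault(row.pop("sample"), {})
            library_id = row.pop("library", None)
            target = sample
            if library_id:
                # Properties on a row with a library belong to that library
                target = sample.setdefault("libraries", {}).setdefault(library_id, {})
            # Skip empty cells so they do not overwrite earlier values
            target.update({key: value for key, value in row.items() if value})
    return {"samples": samples}


libraries_tsv = (
    "sample\tlibrary\tbam\n"
    "Sample_ID_1\tLib_ID_1\tMyFirst.bam\n"
    "Sample_ID_2\tLib_ID_2\tMySecond.bam\n"
)
properties_tsv = (
    "sample\ttreatment\n"
    "Sample_ID_1\theatshock\n"
    "Sample_ID_2\theatshock\n"
)

config = tsv_to_samples(libraries_tsv, properties_tsv)
print(json.dumps(config, indent=2))
```

Because properties are looked up by column name, reordering the columns in either input does not change the result.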
#### Example
~~~ json
{
    "samples": {
        "Sample_ID_1": {
            "treatment": "heatshock",
            "libraries": {
                "Lib_ID_1": {
                    "bam": "MyFirst.bam"
                }
            }
        },
        "Sample_ID_2": {
            "treatment": "heatshock",
            "libraries": {
                "Lib_ID_2": {
                    "bam": "MySecond.bam"
                }
            }
        }
    }
}
~~~
#### Sample definition
To get the above example out of the tool one should provide 2 TSV files as follows:
...
...
@@ -83,3 +58,45 @@ Basically anything you want to pass to your pipeline is possible.