diff --git a/docs/cluster/oge.md b/docs/cluster/oge.md
index c5218ad0970c0371b5cee7120a54ee15bb17d391..0e9b639286507d41ac69e41f3b2c9abef9d317ac 100644
--- a/docs/cluster/oge.md
+++ b/docs/cluster/oge.md
@@ -1,6 +1,6 @@
 # Introduction

-
-
+Within the LUMC we have a compute cluster which runs on the Sun Grid Engine (SGE). The cluster currently has around 600 cores and multiple terabytes of memory.
+The SGE enables the cluster to schedule all the jobs coming from different users in a fair way, so resources are shared equally between multiple users.

 # Sun Grid Engine
diff --git a/docs/examples/gentrap_example.json b/docs/examples/gentrap_example.json
index ddf0b6e5474bbb2d47e2cbf9e8a5ab400c5f4bff..1e69c4d6f92234cfa39f1b5872ba73f98aeff174 100644
--- a/docs/examples/gentrap_example.json
+++ b/docs/examples/gentrap_example.json
@@ -26,12 +26,12 @@
   "expression_measures": ["fragments_per_gene", "bases_per_gene", "bases_per_exon"],
   "strand_protocol": "non_specific",
   "aligner": "gsnap",
-  "reference": "/share/isilon/system/local/Genomes-new-27-10-2011/H.Sapiens/hg19_nohap/gsnap/reference.fa",
+  "reference": "/path/to/Genome/H.Sapiens/hg19_nohap/gsnap/reference.fa",
   "annotation_gtf": "/path/to/data/annotation/ucsc_refseq.gtf",
   "annotation_bed": "/path/to/data/annotation/ucsc_refseq.bed",
   "annotation_refflat": "/path/to/data/annotation/ucsc_refseq.refFlat",
   "gsnap": {
-    "dir": "/share/isilon/system/local/Genomes-new-27-10-2011/H.Sapiens/hg19_nohap/gsnap",
+    "dir": "/path/to/genome/H.Sapiens/hg19_nohap/gsnap",
     "db": "hg19_nohap",
     "quiet_if_excessive": true,
     "npaths": 1
diff --git a/docs/general/config.md b/docs/general/config.md
index bebee860cf2c45aa06a54945bc17828031f2a1bf..c4490e7b4abea681aeade8498cc636fc6ed246eb 100644
--- a/docs/general/config.md
+++ b/docs/general/config.md
@@ -7,7 +7,7 @@ The sample config should be in [__JSON__](http://www.json.org/) or [__YAML__](ht
 - First field should have the key __"samples"__
 - Second field should contain the __"libraries"__
 - Third field contains __"R1" or "R2"__ or __"bam"__
--- The fastq input files can be provided zipped and un zipped
+- The fastq input files can be provided zipped and unzipped

 #### Example sample config

@@ -57,8 +57,8 @@ Note that there is a tool called [SamplesTsvToJson](../tools/SamplesTsvToJson.md

 ### The settings config
 The settings config enables a user to alter the settings for almost all settings available in the tools used for a given pipeline.
-This config file should be written in JSON format. It can contain setup settings like references for the tools used,
-if the pipeline should use chunking or setting memory limits for certain programs almost everything can be adjusted trough this config file.
+This config file should be written in JSON format.
+It can contain setup settings like references, cutoffs, program modes, program-specific memory limits, whether chunking should be used, and many more. One can even set program executables here if, for some reason, the user does not want to use the system's default tools.
 One could set global variables containing settings for all tools used in the pipeline or set tool specific options one layer deeper into the JSON file.
 E.g. in the example below the settings for Picard tools are altered only for Picard and not global.

@@ -77,10 +77,11 @@ Global setting examples are:
 ----

 #### References
-Pipelines and tools that use references should now use the reference module. This gives some more fine-grained control over references.
-E.g. pipelines and tools that use a fasta references file should now set value `reference_fasta`.
-Additionally, we can set `reference_name` for the name to be used (e.g. `hg19`). If unset, Biopet will default to `unknown`.
-It is also possible to set the `species` flag. Again, we will default to `unknown` if unset.
+Pipelines and tools that use references should now use the reference module.
+This gives more fine-grained control over references and enables a user to curate the references in a structured way.
+E.g. pipelines and tools which use FASTA references should now set the value `"reference_fasta"`.
+Additionally, we can set `"reference_name"` for the name to be used (e.g. `"hg19"`). If unset, Biopet will default to `unknown`.
+It is also possible to set the `"species"` flag. Again, we will default to `unknown` if unset.
 #### Example settings config
 ~~~
 {
@@ -108,5 +109,5 @@ It is also possible to set the `species` flag. Again, we will default to `unknow

 ### JSON validation

-To check if the JSON file created is correct we can use multiple options the simplest way is using [this](http://jsonformatter.curiousconcept.com/)
-website. It is also possible to use Python or Scala for validating but this requires some more knowledge.
\ No newline at end of file
+To check if the created JSON file is correct there are several possibilities: the simplest way is using [this](http://jsonformatter.curiousconcept.com/)
+website. It is also possible to use Python, Scala or any other programming language for validating JSON files but this requires some more knowledge.
\ No newline at end of file
diff --git a/docs/tools/SamplesTsvToJson.md b/docs/tools/SamplesTsvToJson.md
index dc17f44568eb4e5252ec5aaa5a42cb90d6746294..84a33413e9d98bd110d02952aaa46a4891cf59e1 100644
--- a/docs/tools/SamplesTsvToJson.md
+++ b/docs/tools/SamplesTsvToJson.md
@@ -58,7 +58,7 @@ To get the above example out of the tool one should provide 2 TSV files as follo

 ----

-| samples | library | bam |
+| sample | library | bam |
 | ------- | ------- | --------- |
 |Sample_ID_1 |Lib_ID_1 |MyFirst.bam |
 |Sample_ID_2 |Lib_ID_2 |MySecond.bam |
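
Note on the "JSON validation" section changed in `docs/general/config.md`: the text mentions that Python can be used to validate a config but gives no example. A minimal sketch of what that could look like is shown here, using only the standard-library `json` module. The function name `validate_json` is hypothetical and not part of Biopet; the check only confirms that the file is syntactically valid JSON, not that its keys make sense for any pipeline.

```python
import json


def validate_json(path):
    """Check whether the file at `path` parses as JSON.

    Returns (True, None) on success, or (False, message) where the
    message points at the line and column of the first syntax error.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            json.load(handle)
    except json.JSONDecodeError as err:
        # lineno/colno/msg are attributes of json.JSONDecodeError
        return False, f"{path}:{err.lineno}:{err.colno}: {err.msg}"
    return True, None
```

For a quick check without writing any code, the standard library also ships a command-line entry point: `python -m json.tool settings.json` (the file name here is only a placeholder) prints the parsed document or reports the first syntax error.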