diff --git a/.gitignore b/.gitignore index 4f40f8c4d9b2bf4eb9d898bbc0691a20c5663dd0..77de2f25e244b88f845d817edc590690e679258e 100644 --- a/.gitignore +++ b/.gitignore @@ -13,3 +13,4 @@ target/ public/target/ protected/target/ site/ +*.sc \ No newline at end of file diff --git a/biopet-aggregate/pom.xml b/biopet-aggregate/pom.xml index e29997cc6d6e3cd65e557393d3153d63bba30468..d105b1ef54883992abee5b83277df325c62a0686 100644 --- a/biopet-aggregate/pom.xml +++ b/biopet-aggregate/pom.xml @@ -30,6 +30,7 @@ <module>../public/toucan</module> <module>../public/shiva</module> <module>../public/basty</module> + <module>../public/tinycap</module> <module>../public/biopet-utils</module> <module>../public/biopet-tools</module> <module>../public/biopet-tools-extensions</module> diff --git a/docs/general/config.md b/docs/general/config.md index d5cda2ca1341a0eeff92bffabbe5944bd69f51f0..e9b9d5e78b2140bcc3029ee9cc4fa76278127148 100644 --- a/docs/general/config.md +++ b/docs/general/config.md @@ -12,7 +12,7 @@ The sample config should be in [__JSON__](http://www.json.org/) or [__YAML__](ht #### Example sample config -###### yaml: +###### YAML: ``` yaml output_dir: /home/user/myoutputdir @@ -24,7 +24,7 @@ samples: R2: R2.fastq.gz ``` -###### json: +###### JSON: ``` json { diff --git a/docs/pipelines/carp.md b/docs/pipelines/carp.md index 1391afbf6b21238a9257ba0f94e9146117317162..f3e1f31b3579a228a2240c7476dfc9a687a039d5 100644 --- a/docs/pipelines/carp.md +++ b/docs/pipelines/carp.md @@ -12,7 +12,9 @@ Please refer [to our mapping pipeline](mapping.md) for information about how the ### Sample Configuration -The layout of the sample configuration for Carp is basically the same as with our other multi sample pipelines, for example: +The layout of the sample configuration for Carp is basically the same as for our other multi-sample pipelines; it may be formatted in either ```json``` or ```yaml```. +Below we show an example for each format. 
One should appreciate that multiple libraries can be used if a sample is sequenced on multiple lanes; each library is noted with its own library id in the config file. + ~~~ json { @@ -31,7 +33,7 @@ The layout of the sample configuration for Carp is basically the same as with ou "lib_one": { "R1": "/absolute/path/to/first/read/pair.fq", "R2": "/absolute/path/to/second/read/pair.fq" - } + }, "lib_two": { "R1": "/absolute/path/to/first/read/pair.fq", "R2": "/absolute/path/to/second/read/pair.fq" @@ -42,8 +44,50 @@ The layout of the sample configuration for Carp is basically the same as with ou } ~~~ +~~~ yaml +samples: + sample_X: + control: + - sample_Y + libraries: + lib_one: + R1: /absolute/path/to/first/read/pair.fq + R2: /absolute/path/to/second/read/pair.fq + sample_Y: + libraries: + lib_one: + R1: /absolute/path/to/first/read/pair.fq + R2: /absolute/path/to/second/read/pair.fq + lib_two: + R1: /absolute/path/to/first/read/pair.fq + R2: /absolute/path/to/second/read/pair.fq +~~~ + What's important here is that you can specify the control ChIP-seq experiment(s) for a given sample. These controls are usually ChIP-seq runs from input DNA and/or from treatment with nonspecific binding proteins such as IgG. In the example above, we are specifying `sample_Y` as the control for `sample_X`. +**Please note** that the control is given in the form of a ```list```, because one sometimes wants to use multiple control samples; this can be achieved by passing the sample names of the control samples as a list to the field **control** in the config file. 
+In ```json``` this will become: + +~~~ json +{ + "samples": { + "sample_X": { + "control": ["sample_Y","sample_Z"] + } + } + } + ~~~ + +In ```yaml``` this is a bit different and will look like this: + +~~~ yaml +samples: + sample_X: + control: + - sample_Y + - sample_Z +~~~ + ### Pipeline Settings Configuration diff --git a/docs/pipelines/flexiprep.md b/docs/pipelines/flexiprep.md index cd1e3f8400e9b9bcce0e03fa8ee335e812bb562c..2140398ab88dd7ae89d99b23e9539b2a4dd4ad44 100644 --- a/docs/pipelines/flexiprep.md +++ b/docs/pipelines/flexiprep.md @@ -44,8 +44,8 @@ Command line flags for Flexiprep are: | Flag (short)| Flag (long) | Type | Function | | ------------ | ----------- | ---- | -------- | -| -R1 | --input_r1 | Path (**required**) | Path to input fastq file | -| -R2 | --input_r2 | Path (optional) | Path to second read pair fastq file. | +| -R1 | --inputR1 | Path (**required**) | Path to input fastq file | +| -R2 | --inputR2 | Path (optional) | Path to second read pair fastq file. | | -sample | --sampleid | String (**required**) | Name of sample | | -library | --libid | String (**required**) | Name of library | diff --git a/docs/pipelines/mapping.md b/docs/pipelines/mapping.md index f04efebc60425764eeb38cbeae90ca9f4751865b..79d2a57494ca3fb24e369515dcfed47b883bf556 100644 --- a/docs/pipelines/mapping.md +++ b/docs/pipelines/mapping.md @@ -28,8 +28,8 @@ Command line flags for the mapping pipeline are: | Flag (short)| Flag (long) | Type | Function | | ------------ | ----------- | ---- | -------- | -| -R1 | --input_r1 | Path (**required**) | Path to input fastq file | -| -R2 | --input_r2 | Path (optional) | Path to second read pair fastq file. | +| -R1 | --inputR1 | Path (**required**) | Path to input fastq file | +| -R2 | --inputR2 | Path (optional) | Path to second read pair fastq file. 
| | -sample | --sampleid | String (**required**) | Name of sample | | -library | --libid | String (**required**) | Name of library | diff --git a/docs/pipelines/shiva.md b/docs/pipelines/shiva.md index b2dcd2138196c31fb1f274f23468890f510f51d6..019635e33250b0c46ae162b61705167a57d4d816 100644 --- a/docs/pipelines/shiva.md +++ b/docs/pipelines/shiva.md @@ -135,6 +135,17 @@ To view all possible config options please navigate to our Gitlab wiki page Since Shiva uses the [Mapping](mapping.md) pipeline internally, mapping config values can be specified as well. For all the options, please see the corresponding documentation for the mapping pipeline. +### Exome variant calling + +If one calls variants with Shiva on exome samples and an ```amplicon_bed``` file is available, the user can add this file to the config file. + When the file is given, the coverage over the positions in the bed file will be calculated, as well as the number of variants at each position. If there is interest + in a specific region of the genome/exome, one can supply multiple ```regionOfInterest.bed``` files with the option ```regions_of_interest``` (in list/array format). + + A short recap: the option ```amplicon_bed``` can only be given once and should correspond to the amplicon kit used to obtain the exome data. + The option ```regions_of_interest``` can contain multiple bed files in ```list``` format and can contain any region a user wants. If multiple regions are given, + the pipeline will make a coverage plot over each bed file separately. + + ### Modes Shiva furthermore supports three modes. The default and recommended option is `multisample_variantcalling`. 
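The ```amplicon_bed``` and ```regions_of_interest``` options described above could be combined in a config roughly as follows. This is only a sketch: the paths and file names are hypothetical, not taken from the pipeline's documentation.

~~~ json
{
  "amplicon_bed": "/absolute/path/to/amplicon_kit.bed",
  "regions_of_interest": [
    "/absolute/path/to/first_regionOfInterest.bed",
    "/absolute/path/to/second_regionOfInterest.bed"
  ]
}
~~~

Note that ```amplicon_bed``` takes a single path, while ```regions_of_interest``` is a list, so any number of bed files can be supplied there.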
diff --git a/docs/pipelines/tinycap.md b/docs/pipelines/tinycap.md new file mode 100644 index 0000000000000000000000000000000000000000..9d9d374adb82aa77686bf540e0883ffdcecbaa63 --- /dev/null +++ b/docs/pipelines/tinycap.md @@ -0,0 +1,109 @@ +# TinyCap + +## Introduction + +``TinyCap`` is an analysis pipeline meant to process smallRNA captures. We use a fixed aligner in this pipeline: `bowtie`. +By default, we allow one fragment to align to up to 5 different locations on the genome. In most cases, the shorter +the sequence, the less 'unique' its pattern is, so multiple **"best"** alignments are possible. +To avoid a **'first-occurrence found, align-to'** bias towards the reference genome, we allow the aligner +to report more alignment positions. + +After alignment, `htseq-count` is responsible for the quantification of transcripts. +One should supply two annotation files for this to happen: + +- a miRBase GFF3 file with all annotated and curated miRNA for the genome of interest. [visit miRBase](http://www.mirbase.org/ftp.shtml) +- Ensembl (Gene sets) in GTF format. [visit Ensembl](http://www.ensembl.org/info/data/ftp/index.html) + +Count tables are generated per sample and an aggregation per (run) project is created in the top-level folder of the project. + + +## Starting the pipeline + +```bash +biopet pipelines tinycap [options] \ -config <path-to>/settings_tinycap.json \ -config <path-to>/sample_sheet.json \ -l DEBUG \ -qsub \ -jobParaEnv BWA \ -run ``` + +## Example + +Note that one should first create the appropriate [configs](../general/config.md). 
+ +The pipeline-specific (minimal) config looks like: + +```json +{ + "output_dir": "<path-to>/outputdirectory", + "reference_name": "GRCh38", + "species": "H.sapiens", + "annotation_gff": "<path-to>/data/annotation/mirbase-21-hsa.gff3", + "annotation_refflat": "<path-to>/data/annotation/ucsc_ensembl_83_38.refFlat", + "annotation_gtf": "<path-to>/data/annotation/ucsc_ensembl_83_38.gtf" +} +``` + + +### Advanced config: + +One can specify other options, such as `bowtie` (alignment) options and clipping and trimming options for `sickle` and `cutadapt`. +```json +"bowtie": { + "chunkmbs": 256, # this is a performance option, keep it high (256) as many alternative alignments are possible + "seedmms": 3, + "seedlen": 25, + "k": 5, # take and report the best 5 alignments + "best": true # sort by best hit +}, +"sickle": { + "lengthThreshold": 8 # minimum length to keep after trimming +}, +"cutadapt": { + "error_rate": 0.2, # recommended: 0.2, allows more mismatches in the adapter to be clipped off (ratio) + "minimum_length": 8, # minimum length to keep after clipping; setting it lower will cause multiple alignments afterwards + "q": 30, # minimum quality over the read after clipping in order to keep and report the read + "default_clip_mode": "both", # clip from: front/end/both (5'/3'/both), depending on the protocol. Setting `both` makes clipping take more time, but is safest to do on short sequences such as smallRNA. + "times": 2 # in case of chimeric reads/adapters, how many times cutadapt should try to remove an adapter sequence +} +``` + +The settings above are quite strict and aggressive on the clipping with `cutadapt`. By default the option `sensitiveAdapterSearch` is turned on in the TinyCap pipeline: + +```json +"fastqc": { + "sensitiveAdapterSearch": true +} +``` + +This setting enables aggressive adapter trimming: (partial) adapters found by `FastQC` are all passed to `Cutadapt`. This is especially relevant for sequencing techniques and sample preparations that yield 
short +sequences (76bp - 100bp). Turning of this option will still produce sensible results. + + + +## Examine results + +### Result files + +- `counttables_smallrna.tinycap.tsv` +- `counttables_mirna.tinycap.tsv` + + +### Tested setups + +The pipeline is designed and tested with sequences produced by: Illumina HiSeq 2000/2500, Illumina MiSeq. Both on single-end sequences. +Whenever a run is performed in Paired End mode, one should use the `R1` only. For analysis of (long) non-coding RNA, one should use `Gentrap`, this pipeline is optimized for Paired End RNA analysis. + + +Wetlab-Protocol: NEB SmallRNA kit and TruSeq SmallRNA kits were used for the data generated to test this pipeline. + + +## References + +- [Cutadapt](https://github.com/marcelm/cutadapt) +- [HTSeqCount](http://www-huber.embl.de/HTSeq/doc/overview.html) +- [Bowtie1](http://bowtie-bio.sourceforge.net/index.shtml) + diff --git a/docs/releasenotes/release_notes_0.6.0.md b/docs/releasenotes/release_notes_0.6.0.md index 24cdc8357a778ae6c6e1a18c1735b26185c9e8fd..762a11ecf272fe6081a8d2ba95bfe76da7da5c10 100644 --- a/docs/releasenotes/release_notes_0.6.0.md +++ b/docs/releasenotes/release_notes_0.6.0.md @@ -20,6 +20,14 @@ * Added trimming of reverse complement adapters (flexiprep does this automatic) * Added [Tinycap](../pipelines/tinycap.md) for smallRNA analysis * [Gentrap](../pipelines/gentrap.md): Refactoring changed the "expression_measures" options +* Fixed biopet logging +* Added sample tagging +* Seqstat now reports histogram of read lengths +* Fixed bug in seqstat when having multiple sizes exists in the fastq file +* Added variant plots for targets to report of Shiva +* Adapter feed to cutadapt now use only that parts that are reported by fastqc and not the full sequence +* Added a reference selector when fasta file can't be found. 
Users now get a list of available species and genomes in the config +* Fixed bcftools with IUPAC symbols ## Infrastructure changes diff --git a/external-example/pom.xml b/external-example/pom.xml index c799ac3cca14b7f1459294136f02f97a998f1b47..56d50a68ab9ec0af643cf92da638715095f1ebdb 100644 --- a/external-example/pom.xml +++ b/external-example/pom.xml @@ -11,7 +11,7 @@ <artifactId>ExternalExample</artifactId> <!--TODO: replace version, for a new pipeline it's advised to start with '0.1.0-SNAPSHOT' --> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <properties> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> @@ -40,19 +40,19 @@ <artifactId>BiopetCore</artifactId> <!--TODO: replace version of pipeline to a fixed version --> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> </dependency> <dependency> <groupId>nl.lumc.sasc</groupId> <artifactId>BiopetExtensions</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> </dependency> <dependency> <groupId>nl.lumc.sasc</groupId> <artifactId>Shiva</artifactId> <!--TODO: replace version of pipeline to a fixed version --> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> </dependency> </dependencies> diff --git a/mkdocs.yml b/mkdocs.yml index 689c08275838e7df8947aaba1ab0b8e2d90ac68d..e6475c8369d1d85ab155307bd672fc40e68ff9c6 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -17,6 +17,7 @@ pages: - Mapping (Alignment): 'pipelines/mapping.md' - Sage: 'pipelines/sage.md' - Shiva (variantcalling): 'pipelines/shiva.md' + - TinyCap (smallRNA): 'pipelines/tinycap.md' - Toucan (Annotation): 'pipelines/toucan.md' - Tools: - SamplesTsvToJson: 'tools/SamplesTsvToJson.md' diff --git a/pom.xml b/pom.xml index c8313ae54dbc2c9af3e0cc8fcaaa5ffa41a0fbf7..e383c77a4d26cee678f1c3446134c8582c60b28b 100644 --- a/pom.xml +++ b/pom.xml @@ -9,7 +9,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - 
<version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>public</relativePath> </parent> diff --git a/protected/biopet-gatk-extensions/pom.xml b/protected/biopet-gatk-extensions/pom.xml index bf0ce809bb4e3ff1178e4a95686e380c1f45b601..667dd080caa402dbc3168a73ed77be40c64f30f8 100644 --- a/protected/biopet-gatk-extensions/pom.xml +++ b/protected/biopet-gatk-extensions/pom.xml @@ -15,7 +15,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>BiopetGatk</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/protected/biopet-gatk-pipelines/pom.xml b/protected/biopet-gatk-pipelines/pom.xml index 2cd542a4a3a2dfe4d58f09cf4a28609344e764c4..f1f5df4c43b52e1219541c5907069654d5086843 100644 --- a/protected/biopet-gatk-pipelines/pom.xml +++ b/protected/biopet-gatk-pipelines/pom.xml @@ -15,7 +15,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>BiopetGatk</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/Shiva.scala b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/Shiva.scala index 1c82fa8e7d461dd0932ee0acf85320b5666d026b..830e4a7350c2af6e249123c7c7883f222cdfab0a 100644 --- a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/Shiva.scala +++ b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/Shiva.scala @@ -6,9 +6,9 @@ package nl.lumc.sasc.biopet.pipelines.gatk import nl.lumc.sasc.biopet.core.PipelineCommand -import nl.lumc.sasc.biopet.utils.config.Configurable import nl.lumc.sasc.biopet.extensions.gatk.broad._ -import nl.lumc.sasc.biopet.pipelines.shiva.{ ShivaVariantcallingTrait, ShivaTrait } +import nl.lumc.sasc.biopet.pipelines.shiva.ShivaTrait +import 
nl.lumc.sasc.biopet.utils.config.Configurable import org.broadinstitute.gatk.queue.QScript /** diff --git a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCaller.scala b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCaller.scala index f18b16812b48375bb7e1d57969ef9b1db7ba5691..be299d9a9ca43567ba7f0f81123ed29cc4e2031e 100644 --- a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCaller.scala +++ b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCaller.scala @@ -1,3 +1,8 @@ +/** + * Due to the license issue with GATK, this part of Biopet can only be used inside the + * LUMC. Please refer to https://git.lumc.nl/biopet/biopet/wikis/home for instructions + * on how to use this protected part of biopet or contact us at sasc@lumc.nl + */ package nl.lumc.sasc.biopet.pipelines.gatk.variantcallers import nl.lumc.sasc.biopet.pipelines.shiva.variantcallers.Variantcaller diff --git a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCallerAllele.scala b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCallerAllele.scala index a4bff551c7bff4a6376d16d8b0401b008a6aad8e..d743b4a528bf0803ba305287a993c758bc8d1d8a 100644 --- a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCallerAllele.scala +++ b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCallerAllele.scala @@ -1,3 +1,8 @@ +/** + * Due to the license issue with GATK, this part of Biopet can only be used inside the + * LUMC. 
Please refer to https://git.lumc.nl/biopet/biopet/wikis/home for instructions + * on how to use this protected part of biopet or contact us at sasc@lumc.nl + */ package nl.lumc.sasc.biopet.pipelines.gatk.variantcallers import nl.lumc.sasc.biopet.pipelines.shiva.variantcallers.Variantcaller diff --git a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCallerGvcf.scala b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCallerGvcf.scala index cf289cb789e4c5ad944b30eb0b71b163cc1115ff..1491135cc566d954baa3793335f3fb6f27c3c645 100644 --- a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCallerGvcf.scala +++ b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/HaplotypeCallerGvcf.scala @@ -1,3 +1,8 @@ +/** + * Due to the license issue with GATK, this part of Biopet can only be used inside the + * LUMC. 
Please refer to https://git.lumc.nl/biopet/biopet/wikis/home for instructions + * on how to use this protected part of biopet or contact us at sasc@lumc.nl + */ package nl.lumc.sasc.biopet.pipelines.gatk.variantcallers import nl.lumc.sasc.biopet.pipelines.shiva.variantcallers.Variantcaller diff --git a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/UnifiedGenotyper.scala b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/UnifiedGenotyper.scala index b71273b284c153ed628565d23c8ca0dffd7bd010..5cf58b69df5c7d03e4c2a8d8621c623deb9a7937 100644 --- a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/UnifiedGenotyper.scala +++ b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/UnifiedGenotyper.scala @@ -1,3 +1,8 @@ +/** + * Due to the license issue with GATK, this part of Biopet can only be used inside the + * LUMC. 
Please refer to https://git.lumc.nl/biopet/biopet/wikis/home for instructions + * on how to use this protected part of biopet or contact us at sasc@lumc.nl + */ package nl.lumc.sasc.biopet.pipelines.gatk.variantcallers import nl.lumc.sasc.biopet.pipelines.shiva.variantcallers.Variantcaller diff --git a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/UnifiedGenotyperAllele.scala b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/UnifiedGenotyperAllele.scala index 61bb63ae3f897cc29bd37fc4c8af53f874faad28..43278defac34fbd5e3298c29587de51147b82f0e 100644 --- a/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/UnifiedGenotyperAllele.scala +++ b/protected/biopet-gatk-pipelines/src/main/scala/nl/lumc/sasc/biopet/pipelines/gatk/variantcallers/UnifiedGenotyperAllele.scala @@ -1,3 +1,8 @@ +/** + * Due to the license issue with GATK, this part of Biopet can only be used inside the + * LUMC. 
Please refer to https://git.lumc.nl/biopet/biopet/wikis/home for instructions + * on how to use this protected part of biopet or contact us at sasc@lumc.nl + */ package nl.lumc.sasc.biopet.pipelines.gatk.variantcallers import nl.lumc.sasc.biopet.pipelines.shiva.variantcallers.Variantcaller diff --git a/protected/biopet-protected-package/pom.xml b/protected/biopet-protected-package/pom.xml index e6228ebb6f54f15b653fb53872f7736a1833d43e..03b88654a2ff63177615f1299a9e058420080ac4 100644 --- a/protected/biopet-protected-package/pom.xml +++ b/protected/biopet-protected-package/pom.xml @@ -15,7 +15,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>BiopetGatk</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/protected/pom.xml b/protected/pom.xml index 644d1c08504f15831f7c72ec2b1d1e4949c958fa..fa84a64573890a57aa3cafca124da515f19053de 100644 --- a/protected/pom.xml +++ b/protected/pom.xml @@ -11,7 +11,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>BiopetRoot</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> <artifactId>BiopetGatk</artifactId> diff --git a/public/bam2wig/pom.xml b/public/bam2wig/pom.xml index fab54a97890e332f3ac2e72833ac9d55d42e9d41..5afb29d42f96f6bda92b4d8c6a35d2b09308f558 100644 --- a/public/bam2wig/pom.xml +++ b/public/bam2wig/pom.xml @@ -27,7 +27,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/public/bam2wig/src/main/scala/nl/lumc/sasc/biopet/pipelines/bamtobigwig/Bam2Wig.scala b/public/bam2wig/src/main/scala/nl/lumc/sasc/biopet/pipelines/bamtobigwig/Bam2Wig.scala index fb9e39611fd861f9158503581f17d4bd12e6cb30..4e65a7f59f765d0c4c00cc2bbff980ce8bb1b5ce 100644 --- 
a/public/bam2wig/src/main/scala/nl/lumc/sasc/biopet/pipelines/bamtobigwig/Bam2Wig.scala +++ b/public/bam2wig/src/main/scala/nl/lumc/sasc/biopet/pipelines/bamtobigwig/Bam2Wig.scala @@ -38,6 +38,10 @@ class Bam2Wig(val root: Configurable) extends QScript with BiopetQScript { inputFiles :+= new InputFile(bamFile) } + def outputWigleFile = new File(outputDir, bamFile.getName + ".wig") + def outputTdfFile = new File(outputDir, bamFile.getName + ".tdf") + def outputBwFile = new File(outputDir, bamFile.getName + ".bw") + def biopetScript(): Unit = { val bs = new BamToChromSizes(this) bs.bamFile = bamFile @@ -48,14 +52,14 @@ class Bam2Wig(val root: Configurable) extends QScript with BiopetQScript { val igvCount = new IGVToolsCount(this) igvCount.input = bamFile igvCount.genomeChromSizes = bs.chromSizesFile - igvCount.wig = Some(new File(outputDir, bamFile.getName + ".wig")) - igvCount.tdf = Some(new File(outputDir, bamFile.getName + ".tdf")) + igvCount.wig = Some(outputWigleFile) + igvCount.tdf = Some(outputTdfFile) add(igvCount) val wigToBigWig = new WigToBigWig(this) wigToBigWig.inputWigFile = igvCount.wig.get wigToBigWig.inputChromSizesFile = bs.chromSizesFile - wigToBigWig.outputBigWig = new File(outputDir, bamFile.getName + ".bw") + wigToBigWig.outputBigWig = outputBwFile add(wigToBigWig) } } diff --git a/public/bammetrics/pom.xml b/public/bammetrics/pom.xml index 411086864943b1d61a84b9a132bbddce325cff95..b31dce4f1a8bdfd7750a7044ef6ea04aff8239b1 100644 --- a/public/bammetrics/pom.xml +++ b/public/bammetrics/pom.xml @@ -25,7 +25,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/alignmentSummary.ssp b/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/alignmentSummary.ssp index 
e2ad934d400e15283a8e94c428171e3e777e2172..2ac9b5098e1176cc56f097700928eddc1b6d8f88 100644 --- a/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/alignmentSummary.ssp +++ b/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/alignmentSummary.ssp @@ -28,7 +28,7 @@ #if (sampleId.isDefined && libId.isDefined) Here we show basic <a href="https://en.wikibooks.org/wiki/Next_Generation_Sequencing_%28NGS%29/Alignment">alignment</a> statistics for this run for sample ${sampleId} with library ${libId}. Total number of reads, number of alignments reads and number of duplicate reads are given, and the percentages thereof as a percentage of total. #elseif(sampleId.isDefined && showPlot) - The following plot shows basic <a href="https://en.wikibooks.org/wiki/Next_Generation_Sequencing_%28NGS%29/Alignment">alignment</a> statistics for this run for sample ${sampleId}. Every library is represented by a multi-color bar. Red represents the total number of properly mapped reads for this sample. Green represents the total number of duplicates reads, which is usually caused by <a href="http://www.cureffi.org/2012/12/11/how-pcr-duplicates-arise-in-next-generation-sequencing/">PCR duplicates</a>. Blue denotes the number of unmapped reads, and purple denotes reads flagged <em>secondary</em> (this is dependent on the aligner used). A table showing similar statistics, including values represented as percent of total, can be downloaded as a tab-delimited file. + The following plot shows basic <a href="https://en.wikibooks.org/wiki/Next_Generation_Sequencing_%28NGS%29/Alignment">alignment</a> statistics for this run for sample ${sampleId}. Every library is represented by a multi-color bar. Red represents the total number of properly mapped reads for this sample. 
Green represents the total number of duplicate reads, which is usually caused by <a href="http://www.cureffi.org/2012/12/11/how-pcr-duplicates-arise-in-next-generation-sequencing/">PCR duplicates</a>. Blue denotes the number of unmapped reads, and purple denotes reads flagged <em>secondary</em> (this depends on the aligner used). A table showing similar statistics, including values represented as percent of total, can be downloaded as a tab-delimited file. #elseif(sampleId.isDefined && !showPlot) Here we show basic <a href="https://en.wikibooks.org/wiki/Next_Generation_Sequencing_%28NGS%29/Alignment">alignment</a> statistics for this run for every library of sample ${sampleId}. Total number of reads, number of alignments reads and number of duplicate reads are given, and the percentages thereof as a percentage of total. #else @@ -46,16 +46,18 @@ </div> <div class="panel-footer"> #if (showTable) - <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#alignmentSummaryTable">Hide table</button> + <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#alignmentSummaryTable"> + <i class="glyphicon glyphicon-eye-close"></i> Hide table</button> #else - <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#alignmentSummaryTable">Show table</button> + <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#alignmentSummaryTable"> + <i class="glyphicon glyphicon-eye-open"></i> Show table</button> #end - <i class="glyphicon glyphicon-file"></i> <a href="alignmentSummary.tsv">tsv file</a> + <a href="alignmentSummary.tsv"><button type="button" class="btn btn-info"><i class="glyphicon glyphicon-cloud-download"></i> TSV file</button></a> </div> #end <div class="panel-body collapse #if (showTable)in#end" id="alignmentSummaryTable"> <!-- Table --> -<table class="table sortable-theme-bootstrap" data-sortable> +<table class="table"> <thead><tr> <th data-sorted="true" 
data-sorted-direction="ascending">Sample</th> #if (!sampleLevel) <th>Library</th> #end @@ -70,8 +72,8 @@ #{ val libs = (libId, sampleLevel) match { case (_, true) => List("") - case (Some(libId), _) => List(libId.toString) - case _ => summary.libraries(sample).toList + case (Some(libId), _) => List(libId.toString).sorted + case _ => summary.libraries(sample).toList.sorted } }# <tr><td rowspan="${libs.size}"><a href="${rootPath}Samples/${sample}/index.html">${sample}</a></td> diff --git a/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/insertSize.ssp b/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/insertSize.ssp index ba6692dd7af137c57e51d023438995f136d2e477..233cc8843c52ae095f7e418b3eb6de07c9a1b229 100644 --- a/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/insertSize.ssp +++ b/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/insertSize.ssp @@ -30,12 +30,13 @@ <div class="col-md-6"> <p> #if (sampleId.isDefined && libId.isDefined) - something + This plot shows the insert size distribution for all libraries combined in sample <b>${sampleId}</b>. #elseif(sampleId.isDefined) - This plot shows the insert size distribution for the libraries of sample ${sampleId}. <a href="http://thegenomefactory.blogspot.nl/2013/08/paired-end-read-confusion-library.html">Insert size</a> denotes the size of the so-called <em>insert</em> between two read pairs in a paired-end sequencing run. This should correspond to the length of the sequence between the sequencing adaptors. The provided table shows mean and median insert size for each sample, together with the standard deviation. + This plot shows the insert size distribution for the libraries of sample <b>${sampleId}</b>. #else - This plot shows the insert size distribution for each of the ${samples.size} samples. 
<a href="http://thegenomefactory.blogspot.nl/2013/08/paired-end-read-confusion-library.html">Insert size</a> denotes the size of the so-called <em>insert</em> between two read pairs in a paired-end sequencing run. This should correspond to the length of the sequence between the sequencing adaptors. The provided table shows mean and median insert size for each sample, together with the standard deviation. + This plot shows the insert size distribution for each of the <b>${samples.size}</b> samples. #end + <a href="http://thegenomefactory.blogspot.nl/2013/08/paired-end-read-confusion-library.html">Insert size</a> denotes the size of the so-called <em>insert</em> between two read pairs in a paired-end sequencing run. This should correspond to the length of the sequence between the sequencing adaptors. The provided table shows mean and median insert size for each sample, together with the standard deviation. </p> </div> </div> @@ -49,17 +50,20 @@ </div> <div class="panel-footer"> #if (showTable) - <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#insertsizeTable">Hide table</button> + <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#insertsizeTable"> + <i class="glyphicon glyphicon-eye-close"></i> Hide table</button> #else - <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#insertsizeTable">Show table</button> + <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#insertsizeTable"> + <i class="glyphicon glyphicon-eye-open"></i> Show table</button> #end - <i class="glyphicon glyphicon-file"></i> <a href="insertsize.tsv">tsv file</a> + <a href="insertsize.tsv"><button type="button" class="btn btn-info"><i class="glyphicon glyphicon-cloud-download"></i> TSV file</button></a> + </div> #end <div class="panel-body collapse #if (showTable)in#end" id="insertsizeTable"> <!-- Table --> -<table class="table sortable-theme-bootstrap" data-sortable> +<table 
class="table"> <thead><tr> <th data-sorted="true" data-sorted-direction="ascending">Sample</th> #if (!sampleLevel) <th>Library</th> #end @@ -84,7 +88,7 @@ val prefixPath = List("samples", sample) ::: (if (libId.isEmpty) Nil else List("libraries", libId)) ::: List("bammetrics", "stats") val fieldValues = for (field <- fields) yield { - summary.getValue((prefixPath ::: List("CollectInsertSizeMetrics", "metrics", field.toUpperCase)):_*).getOrElse(prefixPath ::: metricsTag :: Nil) + summary.getValue((prefixPath ::: List("CollectInsertSizeMetrics", "metrics", field.toUpperCase)):_*).getOrElse("N/A") } }# #for (value <- fieldValues) diff --git a/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/rnaHistogram.ssp b/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/rnaHistogram.ssp index 9c055ad97492ba782453472a4d7528f92dc28ade..94b33a15505bb024be01d0debdd637431ec769d5 100644 --- a/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/rnaHistogram.ssp +++ b/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/rnaHistogram.ssp @@ -29,7 +29,7 @@ <div class="col-md-1"></div> <div class="col-md-6"> <p> - This Show the relative coverage for all transcripts. 
De data here is generated by picard CollectRnaMetrics + This plot shows the relative coverage for all transcripts, as generated by Picard CollectRnaMetrics. </p> </div> </div> @@ -39,7 +39,7 @@ #{ BammetricsReport.rnaHistogramPlot(outputDir, "rna", summary, !sampleLevel, sampleId = sampleId, libId = libId) }# <div class="panel-body"> - <img src="rna.png" class="img-responsive" /> + <img src="rna.png" class="img-responsive" /> </div> <div class="panel-footer"> #if (showTable) @@ -47,7 +47,7 @@ #else <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#rnaTable">Show table</button> #end - <i class="glyphicon glyphicon-file"></i> <a href="rna.tsv">tsv file</a> + <a href="rna.tsv"><button type="button" class="btn btn-info"><i class="glyphicon glyphicon-cloud-download"></i> TSV file</button></a> </div> #end diff --git a/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/wgsHistogram.ssp b/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/wgsHistogram.ssp index e900774e0788d03fc5180444fbbb6bd875ad253e..85349b15dccb78663c1930fe67ce5da98aff0436 100644 --- a/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/wgsHistogram.ssp +++ b/public/bammetrics/src/main/resources/nl/lumc/sasc/biopet/pipelines/bammetrics/wgsHistogram.ssp @@ -47,7 +47,7 @@ #else <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#wgsTable">Show table</button> #end - <i class="glyphicon glyphicon-file"></i> <a href="wgs.tsv">tsv file</a> + <a href="wgs.tsv"><button type="button" class="btn btn-info"><i class="glyphicon glyphicon-cloud-download"></i> TSV file</button></a> </div> #end diff --git a/public/bammetrics/src/main/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/BammetricsReport.scala b/public/bammetrics/src/main/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/BammetricsReport.scala index d214860778f986461206fa8d757b2b6c95cdfe0a..5e34a2e0ca29120f9a92f2db3f16f96b80f95972 100644 ---
a/public/bammetrics/src/main/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/BammetricsReport.scala +++ b/public/bammetrics/src/main/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/BammetricsReport.scala @@ -419,7 +419,7 @@ object BammetricsReport extends ReportBuilder { val plot = new LinePlot(null) plot.input = tsvFile plot.output = pngFile - plot.xlabel = Some("Reletive position") + plot.xlabel = Some("Relative position") plot.ylabel = Some("Coverage") plot.width = Some(1200) plot.removeZero = true diff --git a/public/bammetrics/src/main/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/TargetRegions.scala b/public/bammetrics/src/main/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/TargetRegions.scala index 01c6568c848479dc429b68c771b7f987cc4c9bd6..15368155979fc4f4026a0fbf98220ed1adb9606f 100644 --- a/public/bammetrics/src/main/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/TargetRegions.scala +++ b/public/bammetrics/src/main/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/TargetRegions.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.bammetrics import java.io.File diff --git a/public/bammetrics/src/test/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/BamMetricsTest.scala b/public/bammetrics/src/test/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/BamMetricsTest.scala index 8865b9b2bb2bf4a3e3faa620d72280dd6e4e70a1..4a75d99e00d3a21324c8d5459d891e6435691185 100644 --- a/public/bammetrics/src/test/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/BamMetricsTest.scala +++ b/public/bammetrics/src/test/scala/nl/lumc/sasc/biopet/pipelines/bammetrics/BamMetricsTest.scala @@ -18,13 +18,10 @@ package nl.lumc.sasc.biopet.pipelines.bammetrics import java.io.{ File, FileOutputStream } import com.google.common.io.Files -import nl.lumc.sasc.biopet.utils.config.Config -import nl.lumc.sasc.biopet.extensions.bedtools.{ BedtoolsCoverage, BedtoolsIntersect } import nl.lumc.sasc.biopet.extensions.picard._ -import nl.lumc.sasc.biopet.extensions.samtools.SamtoolsFlagstat -import nl.lumc.sasc.biopet.pipelines.bammetrics.scripts.CoverageStats import nl.lumc.sasc.biopet.extensions.tools.BiopetFlagstat import nl.lumc.sasc.biopet.utils.ConfigUtils +import nl.lumc.sasc.biopet.utils.config.Config import org.apache.commons.io.FileUtils import org.broadinstitute.gatk.queue.QSettings import org.scalatest.Matchers diff --git a/public/basty/pom.xml b/public/basty/pom.xml index e7a4ca01c2aea65350049cbb3b104684cb44a4b4..7352555700e733369e9206a6f5d9ef550618fae7 100644 --- a/public/basty/pom.xml +++ b/public/basty/pom.xml @@ -32,7 +32,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/public/biopet-core/pom.xml b/public/biopet-core/pom.xml index c2064007a268ffa2a822eb4565a95acb52448a40..39f94b8b3704868e1398fdb3667f2fee621ffe29 100644 --- a/public/biopet-core/pom.xml +++ b/public/biopet-core/pom.xml @@ -22,7 +22,7 @@ <parent> 
<artifactId>Biopet</artifactId> <groupId>nl.lumc.sasc</groupId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> <modelVersion>4.0.0</modelVersion> diff --git a/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/ext/js/d3.v3.5.5.min.js b/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/ext/js/d3.v3.5.5.min.js index 34d5513ebfe117f8b47ee12c7b108610999039b5..dc9a619b830c41e00f3922535bd4661f774e695a 100644 --- a/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/ext/js/d3.v3.5.5.min.js +++ b/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/ext/js/d3.v3.5.5.min.js @@ -1,3 +1,18 @@ +/* + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ !function(){function n(n){return n&&(n.ownerDocument||n.document||n).documentElement}function t(n){return n&&(n.ownerDocument&&n.ownerDocument.defaultView||n.document&&n||n.defaultView)}function e(n,t){return t>n?-1:n>t?1:n>=t?0:0/0}function r(n){return null===n?0/0:+n}function u(n){return!isNaN(n)}function i(n){return{left:function(t,e,r,u){for(arguments.length<3&&(r=0),arguments.length<4&&(u=t.length);u>r;){var i=r+u>>>1;n(t[i],e)<0?r=i+1:u=i}return r},right:function(t,e,r,u){for(arguments.length<3&&(r=0),arguments.length<4&&(u=t.length);u>r;){var i=r+u>>>1;n(t[i],e)>0?u=i:r=i+1}return r}}}function o(n){return n.length}function a(n){for(var t=1;n*t%1;)t*=10;return t}function c(n,t){for(var e in t)Object.defineProperty(n.prototype,e,{value:t[e],enumerable:!1})}function l(){this._=Object.create(null)}function s(n){return(n+="")===pa||n[0]===va?va+n:n}function f(n){return(n+="")[0]===va?n.slice(1):n}function h(n){return s(n)in this._}function g(n){return(n=s(n))in this._&&delete this._[n]}function p(){var n=[];for(var t in this._)n.push(f(t));return n}function v(){var n=0;for(var t in this._)++n;return n}function d(){for(var n in this._)return!1;return!0}function m(){this._=Object.create(null)}function y(n){return n}function M(n,t,e){return function(){var r=e.apply(t,arguments);return r===t?n:r}}function x(n,t){if(t in n)return t;t=t.charAt(0).toUpperCase()+t.slice(1);for(var e=0,r=da.length;r>e;++e){var u=da[e]+t;if(u in n)return u}}function b(){}function _(){}function w(n){function t(){for(var t,r=e,u=-1,i=r.length;++u<i;)(t=r[u].on)&&t.apply(this,arguments);return n}var e=[],r=new l;return t.on=function(t,u){var i,o=r.get(t);return arguments.length<2?o&&o.on:(o&&(o.on=null,e=e.slice(0,i=e.indexOf(o)).concat(e.slice(i+1)),r.remove(t)),u&&e.push(r.set(t,{on:u})),n)},t}function S(){ta.event.preventDefault()}function k(){for(var n,t=ta.event;n=t.sourceEvent;)t=n;return t}function E(n){for(var t=new _,e=0,r=arguments.length;++e<r;)t[arguments[e]]=w(t);return 
t.of=function(e,r){return function(u){try{var i=u.sourceEvent=ta.event;u.target=n,ta.event=u,t[u.type].apply(e,r)}finally{ta.event=i}}},t}function A(n){return ya(n,_a),n}function N(n){return"function"==typeof n?n:function(){return Ma(n,this)}}function C(n){return"function"==typeof n?n:function(){return xa(n,this)}}function z(n,t){function e(){this.removeAttribute(n)}function r(){this.removeAttributeNS(n.space,n.local)}function u(){this.setAttribute(n,t)}function i(){this.setAttributeNS(n.space,n.local,t)}function o(){var e=t.apply(this,arguments);null==e?this.removeAttribute(n):this.setAttribute(n,e)}function a(){var e=t.apply(this,arguments);null==e?this.removeAttributeNS(n.space,n.local):this.setAttributeNS(n.space,n.local,e)}return n=ta.ns.qualify(n),null==t?n.local?r:e:"function"==typeof t?n.local?a:o:n.local?i:u}function q(n){return n.trim().replace(/\s+/g," ")}function L(n){return new RegExp("(?:^|\\s+)"+ta.requote(n)+"(?:\\s+|$)","g")}function T(n){return(n+"").trim().split(/^|\s+/)}function R(n,t){function e(){for(var e=-1;++e<u;)n[e](this,t)}function r(){for(var e=-1,r=t.apply(this,arguments);++e<u;)n[e](this,r)}n=T(n).map(D);var u=n.length;return"function"==typeof t?r:e}function D(n){var t=L(n);return function(e,r){if(u=e.classList)return r?u.add(n):u.remove(n);var u=e.getAttribute("class")||"";r?(t.lastIndex=0,t.test(u)||e.setAttribute("class",q(u+" "+n))):e.setAttribute("class",q(u.replace(t," ")))}}function P(n,t,e){function r(){this.style.removeProperty(n)}function u(){this.style.setProperty(n,t,e)}function i(){var r=t.apply(this,arguments);null==r?this.style.removeProperty(n):this.style.setProperty(n,r,e)}return null==t?r:"function"==typeof t?i:u}function U(n,t){function e(){delete this[n]}function r(){this[n]=t}function u(){var e=t.apply(this,arguments);null==e?delete this[n]:this[n]=e}return null==t?e:"function"==typeof t?u:r}function j(n){function t(){var t=this.ownerDocument,e=this.namespaceURI;return 
e?t.createElementNS(e,n):t.createElement(n)}function e(){return this.ownerDocument.createElementNS(n.space,n.local)}return"function"==typeof n?n:(n=ta.ns.qualify(n)).local?e:t}function F(){var n=this.parentNode;n&&n.removeChild(this)}function H(n){return{__data__:n}}function O(n){return function(){return ba(this,n)}}function I(n){return arguments.length||(n=e),function(t,e){return t&&e?n(t.__data__,e.__data__):!t-!e}}function Y(n,t){for(var e=0,r=n.length;r>e;e++)for(var u,i=n[e],o=0,a=i.length;a>o;o++)(u=i[o])&&t(u,o,e);return n}function Z(n){return ya(n,Sa),n}function V(n){var t,e;return function(r,u,i){var o,a=n[i].update,c=a.length;for(i!=e&&(e=i,t=0),u>=t&&(t=u+1);!(o=a[t])&&++t<c;);return o}}function X(n,t,e){function r(){var t=this[o];t&&(this.removeEventListener(n,t,t.$),delete this[o])}function u(){var u=c(t,ra(arguments));r.call(this),this.addEventListener(n,this[o]=u,u.$=e),u._=t}function i(){var t,e=new RegExp("^__on([^.]+)"+ta.requote(n)+"$");for(var r in this)if(t=r.match(e)){var u=this[r];this.removeEventListener(t[1],u,u.$),delete this[r]}}var o="__on"+n,a=n.indexOf("."),c=$;a>0&&(n=n.slice(0,a));var l=ka.get(n);return l&&(n=l,c=B),a?t?u:r:t?b:i}function $(n,t){return function(e){var r=ta.event;ta.event=e,t[0]=this.__data__;try{n.apply(this,t)}finally{ta.event=r}}}function B(n,t){var e=$(n,t);return function(n){var t=this,r=n.relatedTarget;r&&(r===t||8&r.compareDocumentPosition(t))||e.call(t,n)}}function W(e){var r=".dragsuppress-"+ ++Aa,u="click"+r,i=ta.select(t(e)).on("touchmove"+r,S).on("dragstart"+r,S).on("selectstart"+r,S);if(null==Ea&&(Ea="onselectstart"in e?!1:x(e.style,"userSelect")),Ea){var o=n(e).style,a=o[Ea];o[Ea]="none"}return function(n){if(i.on(r,null),Ea&&(o[Ea]=a),n){var t=function(){i.on(u,null)};i.on(u,function(){S(),t()},!0),setTimeout(t,0)}}}function J(n,e){e.changedTouches&&(e=e.changedTouches[0]);var r=n.ownerSVGElement||n;if(r.createSVGPoint){var u=r.createSVGPoint();if(0>Na){var 
i=t(n);if(i.scrollX||i.scrollY){r=ta.select("body").append("svg").style({position:"absolute",top:0,left:0,margin:0,padding:0,border:"none"},"important");var o=r[0][0].getScreenCTM();Na=!(o.f||o.e),r.remove()}}return Na?(u.x=e.pageX,u.y=e.pageY):(u.x=e.clientX,u.y=e.clientY),u=u.matrixTransform(n.getScreenCTM().inverse()),[u.x,u.y]}var a=n.getBoundingClientRect();return[e.clientX-a.left-n.clientLeft,e.clientY-a.top-n.clientTop]}function G(){return ta.event.changedTouches[0].identifier}function K(n){return n>0?1:0>n?-1:0}function Q(n,t,e){return(t[0]-n[0])*(e[1]-n[1])-(t[1]-n[1])*(e[0]-n[0])}function nt(n){return n>1?0:-1>n?qa:Math.acos(n)}function tt(n){return n>1?Ra:-1>n?-Ra:Math.asin(n)}function et(n){return((n=Math.exp(n))-1/n)/2}function rt(n){return((n=Math.exp(n))+1/n)/2}function ut(n){return((n=Math.exp(2*n))-1)/(n+1)}function it(n){return(n=Math.sin(n/2))*n}function ot(){}function at(n,t,e){return this instanceof at?(this.h=+n,this.s=+t,void(this.l=+e)):arguments.length<2?n instanceof at?new at(n.h,n.s,n.l):bt(""+n,_t,at):new at(n,t,e)}function ct(n,t,e){function r(n){return n>360?n-=360:0>n&&(n+=360),60>n?i+(o-i)*n/60:180>n?o:240>n?i+(o-i)*(240-n)/60:i}function u(n){return Math.round(255*r(n))}var i,o;return n=isNaN(n)?0:(n%=360)<0?n+360:n,t=isNaN(t)?0:0>t?0:t>1?1:t,e=0>e?0:e>1?1:e,o=.5>=e?e*(1+t):e+t-e*t,i=2*e-o,new mt(u(n+120),u(n),u(n-120))}function lt(n,t,e){return this instanceof lt?(this.h=+n,this.c=+t,void(this.l=+e)):arguments.length<2?n instanceof lt?new lt(n.h,n.c,n.l):n instanceof ft?gt(n.l,n.a,n.b):gt((n=wt((n=ta.rgb(n)).r,n.g,n.b)).l,n.a,n.b):new lt(n,t,e)}function st(n,t,e){return isNaN(n)&&(n=0),isNaN(t)&&(t=0),new ft(e,Math.cos(n*=Da)*t,Math.sin(n)*t)}function ft(n,t,e){return this instanceof ft?(this.l=+n,this.a=+t,void(this.b=+e)):arguments.length<2?n instanceof ft?new ft(n.l,n.a,n.b):n instanceof lt?st(n.h,n.c,n.l):wt((n=mt(n)).r,n.g,n.b):new ft(n,t,e)}function ht(n,t,e){var r=(n+16)/116,u=r+t/500,i=r-e/200;return 
u=pt(u)*Xa,r=pt(r)*$a,i=pt(i)*Ba,new mt(dt(3.2404542*u-1.5371385*r-.4985314*i),dt(-.969266*u+1.8760108*r+.041556*i),dt(.0556434*u-.2040259*r+1.0572252*i))}function gt(n,t,e){return n>0?new lt(Math.atan2(e,t)*Pa,Math.sqrt(t*t+e*e),n):new lt(0/0,0/0,n)}function pt(n){return n>.206893034?n*n*n:(n-4/29)/7.787037}function vt(n){return n>.008856?Math.pow(n,1/3):7.787037*n+4/29}function dt(n){return Math.round(255*(.00304>=n?12.92*n:1.055*Math.pow(n,1/2.4)-.055))}function mt(n,t,e){return this instanceof mt?(this.r=~~n,this.g=~~t,void(this.b=~~e)):arguments.length<2?n instanceof mt?new mt(n.r,n.g,n.b):bt(""+n,mt,ct):new mt(n,t,e)}function yt(n){return new mt(n>>16,n>>8&255,255&n)}function Mt(n){return yt(n)+""}function xt(n){return 16>n?"0"+Math.max(0,n).toString(16):Math.min(255,n).toString(16)}function bt(n,t,e){var r,u,i,o=0,a=0,c=0;if(r=/([a-z]+)\((.*)\)/i.exec(n))switch(u=r[2].split(","),r[1]){case"hsl":return e(parseFloat(u[0]),parseFloat(u[1])/100,parseFloat(u[2])/100);case"rgb":return t(kt(u[0]),kt(u[1]),kt(u[2]))}return(i=Ga.get(n.toLowerCase()))?t(i.r,i.g,i.b):(null==n||"#"!==n.charAt(0)||isNaN(i=parseInt(n.slice(1),16))||(4===n.length?(o=(3840&i)>>4,o=o>>4|o,a=240&i,a=a>>4|a,c=15&i,c=c<<4|c):7===n.length&&(o=(16711680&i)>>16,a=(65280&i)>>8,c=255&i)),t(o,a,c))}function _t(n,t,e){var r,u,i=Math.min(n/=255,t/=255,e/=255),o=Math.max(n,t,e),a=o-i,c=(o+i)/2;return a?(u=.5>c?a/(o+i):a/(2-o-i),r=n==o?(t-e)/a+(e>t?6:0):t==o?(e-n)/a+2:(n-t)/a+4,r*=60):(r=0/0,u=c>0&&1>c?0:r),new at(r,u,c)}function wt(n,t,e){n=St(n),t=St(t),e=St(e);var r=vt((.4124564*n+.3575761*t+.1804375*e)/Xa),u=vt((.2126729*n+.7151522*t+.072175*e)/$a),i=vt((.0193339*n+.119192*t+.9503041*e)/Ba);return ft(116*u-16,500*(r-u),200*(u-i))}function St(n){return(n/=255)<=.04045?n/12.92:Math.pow((n+.055)/1.055,2.4)}function kt(n){var t=parseFloat(n);return"%"===n.charAt(n.length-1)?Math.round(2.55*t):t}function Et(n){return"function"==typeof n?n:function(){return n}}function At(n){return function(t,e,r){return 
2===arguments.length&&"function"==typeof e&&(r=e,e=null),Nt(t,e,n,r)}}function Nt(n,t,e,r){function u(){var n,t=c.status;if(!t&&zt(c)||t>=200&&300>t||304===t){try{n=e.call(i,c)}catch(r){return void o.error.call(i,r)}o.load.call(i,n)}else o.error.call(i,c)}var i={},o=ta.dispatch("beforesend","progress","load","error"),a={},c=new XMLHttpRequest,l=null;return!this.XDomainRequest||"withCredentials"in c||!/^(http(s)?:)?\/\//.test(n)||(c=new XDomainRequest),"onload"in c?c.onload=c.onerror=u:c.onreadystatechange=function(){c.readyState>3&&u()},c.onprogress=function(n){var t=ta.event;ta.event=n;try{o.progress.call(i,c)}finally{ta.event=t}},i.header=function(n,t){return n=(n+"").toLowerCase(),arguments.length<2?a[n]:(null==t?delete a[n]:a[n]=t+"",i)},i.mimeType=function(n){return arguments.length?(t=null==n?null:n+"",i):t},i.responseType=function(n){return arguments.length?(l=n,i):l},i.response=function(n){return e=n,i},["get","post"].forEach(function(n){i[n]=function(){return i.send.apply(i,[n].concat(ra(arguments)))}}),i.send=function(e,r,u){if(2===arguments.length&&"function"==typeof r&&(u=r,r=null),c.open(e,n,!0),null==t||"accept"in a||(a.accept=t+",*/*"),c.setRequestHeader)for(var s in a)c.setRequestHeader(s,a[s]);return null!=t&&c.overrideMimeType&&c.overrideMimeType(t),null!=l&&(c.responseType=l),null!=u&&i.on("error",u).on("load",function(n){u(null,n)}),o.beforesend.call(i,c),c.send(null==r?null:r),i},i.abort=function(){return c.abort(),i},ta.rebind(i,o,"on"),null==r?i:i.get(Ct(r))}function Ct(n){return 1===n.length?function(t,e){n(null==t?e:null)}:n}function zt(n){var t=n.responseType;return t&&"text"!==t?n.response:n.responseText}function qt(){var n=Lt(),t=Tt()-n;t>24?(isFinite(t)&&(clearTimeout(tc),tc=setTimeout(qt,t)),nc=0):(nc=1,rc(qt))}function Lt(){var n=Date.now();for(ec=Ka;ec;)n>=ec.t&&(ec.f=ec.c(n-ec.t)),ec=ec.n;return n}function Tt(){for(var n,t=Ka,e=1/0;t;)t.f?t=n?n.n=t.n:Ka=t.n:(t.t<e&&(e=t.t),t=(n=t).n);return Qa=n,e}function Rt(n,t){return 
t-(n?Math.ceil(Math.log(n)/Math.LN10):1)}function Dt(n,t){var e=Math.pow(10,3*ga(8-t));return{scale:t>8?function(n){return n/e}:function(n){return n*e},symbol:n}}function Pt(n){var t=n.decimal,e=n.thousands,r=n.grouping,u=n.currency,i=r&&e?function(n,t){for(var u=n.length,i=[],o=0,a=r[0],c=0;u>0&&a>0&&(c+a+1>t&&(a=Math.max(1,t-c)),i.push(n.substring(u-=a,u+a)),!((c+=a+1)>t));)a=r[o=(o+1)%r.length];return i.reverse().join(e)}:y;return function(n){var e=ic.exec(n),r=e[1]||" ",o=e[2]||">",a=e[3]||"-",c=e[4]||"",l=e[5],s=+e[6],f=e[7],h=e[8],g=e[9],p=1,v="",d="",m=!1,y=!0;switch(h&&(h=+h.substring(1)),(l||"0"===r&&"="===o)&&(l=r="0",o="="),g){case"n":f=!0,g="g";break;case"%":p=100,d="%",g="f";break;case"p":p=100,d="%",g="r";break;case"b":case"o":case"x":case"X":"#"===c&&(v="0"+g.toLowerCase());case"c":y=!1;case"d":m=!0,h=0;break;case"s":p=-1,g="r"}"$"===c&&(v=u[0],d=u[1]),"r"!=g||h||(g="g"),null!=h&&("g"==g?h=Math.max(1,Math.min(21,h)):("e"==g||"f"==g)&&(h=Math.max(0,Math.min(20,h)))),g=oc.get(g)||Ut;var M=l&&f;return function(n){var e=d;if(m&&n%1)return"";var u=0>n||0===n&&0>1/n?(n=-n,"-"):"-"===a?"":a;if(0>p){var c=ta.formatPrefix(n,h);n=c.scale(n),e=c.symbol+d}else n*=p;n=g(n,h);var x,b,_=n.lastIndexOf(".");if(0>_){var w=y?n.lastIndexOf("e"):-1;0>w?(x=n,b=""):(x=n.substring(0,w),b=n.substring(w))}else x=n.substring(0,_),b=t+n.substring(_+1);!l&&f&&(x=i(x,1/0));var S=v.length+x.length+b.length+(M?0:u.length),k=s>S?new Array(S=s-S+1).join(r):"";return M&&(x=i(k+x,k.length?s-b.length:1/0)),u+=v,n=x+b,("<"===o?u+n+k:">"===o?k+u+n:"^"===o?k.substring(0,S>>=1)+u+n+k.substring(S):u+(M?n:k+n))+e}}}function Ut(n){return n+""}function jt(){this._=new Date(arguments.length>1?Date.UTC.apply(this,arguments):arguments[0])}function Ft(n,t,e){function r(t){var e=n(t),r=i(e,1);return r-t>t-e?e:r}function u(e){return t(e=n(new cc(e-1)),1),e}function i(n,e){return t(n=new cc(+n),e),n}function o(n,r,i){var o=u(n),a=[];if(i>1)for(;r>o;)e(o)%i||a.push(new Date(+o)),t(o,1);else 
for(;r>o;)a.push(new Date(+o)),t(o,1);return a}function a(n,t,e){try{cc=jt;var r=new jt;return r._=n,o(r,t,e)}finally{cc=Date}}n.floor=n,n.round=r,n.ceil=u,n.offset=i,n.range=o;var c=n.utc=Ht(n);return c.floor=c,c.round=Ht(r),c.ceil=Ht(u),c.offset=Ht(i),c.range=a,n}function Ht(n){return function(t,e){try{cc=jt;var r=new jt;return r._=t,n(r,e)._}finally{cc=Date}}}function Ot(n){function t(n){function t(t){for(var e,u,i,o=[],a=-1,c=0;++a<r;)37===n.charCodeAt(a)&&(o.push(n.slice(c,a)),null!=(u=sc[e=n.charAt(++a)])&&(e=n.charAt(++a)),(i=N[e])&&(e=i(t,null==u?"e"===e?" ":"0":u)),o.push(e),c=a+1);return o.push(n.slice(c,a)),o.join("")}var r=n.length;return t.parse=function(t){var r={y:1900,m:0,d:1,H:0,M:0,S:0,L:0,Z:null},u=e(r,n,t,0);if(u!=t.length)return null;"p"in r&&(r.H=r.H%12+12*r.p);var i=null!=r.Z&&cc!==jt,o=new(i?jt:cc);return"j"in r?o.setFullYear(r.y,0,r.j):"w"in r&&("W"in r||"U"in r)?(o.setFullYear(r.y,0,1),o.setFullYear(r.y,0,"W"in r?(r.w+6)%7+7*r.W-(o.getDay()+5)%7:r.w+7*r.U-(o.getDay()+6)%7)):o.setFullYear(r.y,r.m,r.d),o.setHours(r.H+(r.Z/100|0),r.M+r.Z%100,r.S,r.L),i?o._:o},t.toString=function(){return n},t}function e(n,t,e,r){for(var u,i,o,a=0,c=t.length,l=e.length;c>a;){if(r>=l)return-1;if(u=t.charCodeAt(a++),37===u){if(o=t.charAt(a++),i=C[o in sc?t.charAt(a++):o],!i||(r=i(n,e,r))<0)return-1}else if(u!=e.charCodeAt(r++))return-1}return r}function r(n,t,e){_.lastIndex=0;var r=_.exec(t.slice(e));return r?(n.w=w.get(r[0].toLowerCase()),e+r[0].length):-1}function u(n,t,e){x.lastIndex=0;var r=x.exec(t.slice(e));return r?(n.w=b.get(r[0].toLowerCase()),e+r[0].length):-1}function i(n,t,e){E.lastIndex=0;var r=E.exec(t.slice(e));return r?(n.m=A.get(r[0].toLowerCase()),e+r[0].length):-1}function o(n,t,e){S.lastIndex=0;var r=S.exec(t.slice(e));return r?(n.m=k.get(r[0].toLowerCase()),e+r[0].length):-1}function a(n,t,r){return e(n,N.c.toString(),t,r)}function c(n,t,r){return e(n,N.x.toString(),t,r)}function l(n,t,r){return e(n,N.X.toString(),t,r)}function s(n,t,e){var 
r=M.get(t.slice(e,e+=2).toLowerCase());return null==r?-1:(n.p=r,e)}var f=n.dateTime,h=n.date,g=n.time,p=n.periods,v=n.days,d=n.shortDays,m=n.months,y=n.shortMonths;t.utc=function(n){function e(n){try{cc=jt;var t=new cc;return t._=n,r(t)}finally{cc=Date}}var r=t(n);return e.parse=function(n){try{cc=jt;var t=r.parse(n);return t&&t._}finally{cc=Date}},e.toString=r.toString,e},t.multi=t.utc.multi=ae;var M=ta.map(),x=Yt(v),b=Zt(v),_=Yt(d),w=Zt(d),S=Yt(m),k=Zt(m),E=Yt(y),A=Zt(y);p.forEach(function(n,t){M.set(n.toLowerCase(),t)});var N={a:function(n){return d[n.getDay()]},A:function(n){return v[n.getDay()]},b:function(n){return y[n.getMonth()]},B:function(n){return m[n.getMonth()]},c:t(f),d:function(n,t){return It(n.getDate(),t,2)},e:function(n,t){return It(n.getDate(),t,2)},H:function(n,t){return It(n.getHours(),t,2)},I:function(n,t){return It(n.getHours()%12||12,t,2)},j:function(n,t){return It(1+ac.dayOfYear(n),t,3)},L:function(n,t){return It(n.getMilliseconds(),t,3)},m:function(n,t){return It(n.getMonth()+1,t,2)},M:function(n,t){return It(n.getMinutes(),t,2)},p:function(n){return p[+(n.getHours()>=12)]},S:function(n,t){return It(n.getSeconds(),t,2)},U:function(n,t){return It(ac.sundayOfYear(n),t,2)},w:function(n){return n.getDay()},W:function(n,t){return It(ac.mondayOfYear(n),t,2)},x:t(h),X:t(g),y:function(n,t){return It(n.getFullYear()%100,t,2)},Y:function(n,t){return It(n.getFullYear()%1e4,t,4)},Z:ie,"%":function(){return"%"}},C={a:r,A:u,b:i,B:o,c:a,d:Qt,e:Qt,H:te,I:te,j:ne,L:ue,m:Kt,M:ee,p:s,S:re,U:Xt,w:Vt,W:$t,x:c,X:l,y:Wt,Y:Bt,Z:Jt,"%":oe};return t}function It(n,t,e){var r=0>n?"-":"",u=(r?-n:n)+"",i=u.length;return r+(e>i?new Array(e-i+1).join(t)+u:u)}function Yt(n){return new RegExp("^(?:"+n.map(ta.requote).join("|")+")","i")}function Zt(n){for(var t=new l,e=-1,r=n.length;++e<r;)t.set(n[e].toLowerCase(),e);return t}function Vt(n,t,e){fc.lastIndex=0;var r=fc.exec(t.slice(e,e+1));return r?(n.w=+r[0],e+r[0].length):-1}function Xt(n,t,e){fc.lastIndex=0;var 
r=fc.exec(t.slice(e));return r?(n.U=+r[0],e+r[0].length):-1}function $t(n,t,e){fc.lastIndex=0;var r=fc.exec(t.slice(e));return r?(n.W=+r[0],e+r[0].length):-1}function Bt(n,t,e){fc.lastIndex=0;var r=fc.exec(t.slice(e,e+4));return r?(n.y=+r[0],e+r[0].length):-1}function Wt(n,t,e){fc.lastIndex=0;var r=fc.exec(t.slice(e,e+2));return r?(n.y=Gt(+r[0]),e+r[0].length):-1}function Jt(n,t,e){return/^[+-]\d{4}$/.test(t=t.slice(e,e+5))?(n.Z=-t,e+5):-1}function Gt(n){return n+(n>68?1900:2e3)}function Kt(n,t,e){fc.lastIndex=0;var r=fc.exec(t.slice(e,e+2));return r?(n.m=r[0]-1,e+r[0].length):-1}function Qt(n,t,e){fc.lastIndex=0;var r=fc.exec(t.slice(e,e+2));return r?(n.d=+r[0],e+r[0].length):-1}function ne(n,t,e){fc.lastIndex=0;var r=fc.exec(t.slice(e,e+3));return r?(n.j=+r[0],e+r[0].length):-1}function te(n,t,e){fc.lastIndex=0;var r=fc.exec(t.slice(e,e+2));return r?(n.H=+r[0],e+r[0].length):-1}function ee(n,t,e){fc.lastIndex=0;var r=fc.exec(t.slice(e,e+2));return r?(n.M=+r[0],e+r[0].length):-1}function re(n,t,e){fc.lastIndex=0;var r=fc.exec(t.slice(e,e+2));return r?(n.S=+r[0],e+r[0].length):-1}function ue(n,t,e){fc.lastIndex=0;var r=fc.exec(t.slice(e,e+3));return r?(n.L=+r[0],e+r[0].length):-1}function ie(n){var t=n.getTimezoneOffset(),e=t>0?"-":"+",r=ga(t)/60|0,u=ga(t)%60;return e+It(r,"0",2)+It(u,"0",2)}function oe(n,t,e){hc.lastIndex=0;var r=hc.exec(t.slice(e,e+1));return r?e+r[0].length:-1}function ae(n){for(var t=n.length,e=-1;++e<t;)n[e][0]=this(n[e][0]);return function(t){for(var e=0,r=n[e];!r[1](t);)r=n[++e];return r[0](t)}}function ce(){}function le(n,t,e){var r=e.s=n+t,u=r-n,i=r-u;e.t=n-i+(t-u)}function se(n,t){n&&dc.hasOwnProperty(n.type)&&dc[n.type](n,t)}function fe(n,t,e){var r,u=-1,i=n.length-e;for(t.lineStart();++u<i;)r=n[u],t.point(r[0],r[1],r[2]);t.lineEnd()}function he(n,t){var e=-1,r=n.length;for(t.polygonStart();++e<r;)fe(n[e],t,1);t.polygonEnd()}function ge(){function n(n,t){n*=Da,t=t*Da/2+qa/4;var 
e=n-r,o=e>=0?1:-1,a=o*e,c=Math.cos(t),l=Math.sin(t),s=i*l,f=u*c+s*Math.cos(a),h=s*o*Math.sin(a);yc.add(Math.atan2(h,f)),r=n,u=c,i=l}var t,e,r,u,i;Mc.point=function(o,a){Mc.point=n,r=(t=o)*Da,u=Math.cos(a=(e=a)*Da/2+qa/4),i=Math.sin(a)},Mc.lineEnd=function(){n(t,e)}}function pe(n){var t=n[0],e=n[1],r=Math.cos(e);return[r*Math.cos(t),r*Math.sin(t),Math.sin(e)]}function ve(n,t){return n[0]*t[0]+n[1]*t[1]+n[2]*t[2]}function de(n,t){return[n[1]*t[2]-n[2]*t[1],n[2]*t[0]-n[0]*t[2],n[0]*t[1]-n[1]*t[0]]}function me(n,t){n[0]+=t[0],n[1]+=t[1],n[2]+=t[2]}function ye(n,t){return[n[0]*t,n[1]*t,n[2]*t]}function Me(n){var t=Math.sqrt(n[0]*n[0]+n[1]*n[1]+n[2]*n[2]);n[0]/=t,n[1]/=t,n[2]/=t}function xe(n){return[Math.atan2(n[1],n[0]),tt(n[2])]}function be(n,t){return ga(n[0]-t[0])<Ca&&ga(n[1]-t[1])<Ca}function _e(n,t){n*=Da;var e=Math.cos(t*=Da);we(e*Math.cos(n),e*Math.sin(n),Math.sin(t))}function we(n,t,e){++xc,_c+=(n-_c)/xc,wc+=(t-wc)/xc,Sc+=(e-Sc)/xc}function Se(){function n(n,u){n*=Da;var i=Math.cos(u*=Da),o=i*Math.cos(n),a=i*Math.sin(n),c=Math.sin(u),l=Math.atan2(Math.sqrt((l=e*c-r*a)*l+(l=r*o-t*c)*l+(l=t*a-e*o)*l),t*o+e*a+r*c);bc+=l,kc+=l*(t+(t=o)),Ec+=l*(e+(e=a)),Ac+=l*(r+(r=c)),we(t,e,r)}var t,e,r;qc.point=function(u,i){u*=Da;var o=Math.cos(i*=Da);t=o*Math.cos(u),e=o*Math.sin(u),r=Math.sin(i),qc.point=n,we(t,e,r)}}function ke(){qc.point=_e}function Ee(){function n(n,t){n*=Da;var e=Math.cos(t*=Da),o=e*Math.cos(n),a=e*Math.sin(n),c=Math.sin(t),l=u*c-i*a,s=i*o-r*c,f=r*a-u*o,h=Math.sqrt(l*l+s*s+f*f),g=r*o+u*a+i*c,p=h&&-nt(g)/h,v=Math.atan2(h,g);Nc+=p*l,Cc+=p*s,zc+=p*f,bc+=v,kc+=v*(r+(r=o)),Ec+=v*(u+(u=a)),Ac+=v*(i+(i=c)),we(r,u,i)}var t,e,r,u,i;qc.point=function(o,a){t=o,e=a,qc.point=n,o*=Da;var c=Math.cos(a*=Da);r=c*Math.cos(o),u=c*Math.sin(o),i=Math.sin(a),we(r,u,i)},qc.lineEnd=function(){n(t,e),qc.lineEnd=ke,qc.point=_e}}function Ae(n,t){function e(e,r){return e=n(e,r),t(e[0],e[1])}return n.invert&&t.invert&&(e.invert=function(e,r){return 
e=t.invert(e,r),e&&n.invert(e[0],e[1])}),e}function Ne(){return!0}function Ce(n,t,e,r,u){var i=[],o=[];if(n.forEach(function(n){if(!((t=n.length-1)<=0)){var t,e=n[0],r=n[t];if(be(e,r)){u.lineStart();for(var a=0;t>a;++a)u.point((e=n[a])[0],e[1]);return void u.lineEnd()}var c=new qe(e,n,null,!0),l=new qe(e,null,c,!1);c.o=l,i.push(c),o.push(l),c=new qe(r,n,null,!1),l=new qe(r,null,c,!0),c.o=l,i.push(c),o.push(l)}}),o.sort(t),ze(i),ze(o),i.length){for(var a=0,c=e,l=o.length;l>a;++a)o[a].e=c=!c;for(var s,f,h=i[0];;){for(var g=h,p=!0;g.v;)if((g=g.n)===h)return;s=g.z,u.lineStart();do{if(g.v=g.o.v=!0,g.e){if(p)for(var a=0,l=s.length;l>a;++a)u.point((f=s[a])[0],f[1]);else r(g.x,g.n.x,1,u);g=g.n}else{if(p){s=g.p.z;for(var a=s.length-1;a>=0;--a)u.point((f=s[a])[0],f[1])}else r(g.x,g.p.x,-1,u);g=g.p}g=g.o,s=g.z,p=!p}while(!g.v);u.lineEnd()}}}function ze(n){if(t=n.length){for(var t,e,r=0,u=n[0];++r<t;)u.n=e=n[r],e.p=u,u=e;u.n=e=n[0],e.p=u}}function qe(n,t,e,r){this.x=n,this.z=t,this.o=e,this.e=r,this.v=!1,this.n=this.p=null}function Le(n,t,e,r){return function(u,i){function o(t,e){var r=u(t,e);n(t=r[0],e=r[1])&&i.point(t,e)}function a(n,t){var e=u(n,t);d.point(e[0],e[1])}function c(){y.point=a,d.lineStart()}function l(){y.point=o,d.lineEnd()}function s(n,t){v.push([n,t]);var e=u(n,t);x.point(e[0],e[1])}function f(){x.lineStart(),v=[]}function h(){s(v[0][0],v[0][1]),x.lineEnd();var n,t=x.clean(),e=M.buffer(),r=e.length;if(v.pop(),p.push(v),v=null,r)if(1&t){n=e[0];var u,r=n.length-1,o=-1;if(r>0){for(b||(i.polygonStart(),b=!0),i.lineStart();++o<r;)i.point((u=n[o])[0],u[1]);i.lineEnd()}}else r>1&&2&t&&e.push(e.pop().concat(e.shift())),g.push(e.filter(Te))}var g,p,v,d=t(i),m=u.invert(r[0],r[1]),y={point:o,lineStart:c,lineEnd:l,polygonStart:function(){y.point=s,y.lineStart=f,y.lineEnd=h,g=[],p=[]},polygonEnd:function(){y.point=o,y.lineStart=c,y.lineEnd=l,g=ta.merge(g);var 
n=Fe(m,p);g.length?(b||(i.polygonStart(),b=!0),Ce(g,De,n,e,i)):n&&(b||(i.polygonStart(),b=!0),i.lineStart(),e(null,null,1,i),i.lineEnd()),b&&(i.polygonEnd(),b=!1),g=p=null},sphere:function(){i.polygonStart(),i.lineStart(),e(null,null,1,i),i.lineEnd(),i.polygonEnd()}},M=Re(),x=t(M),b=!1;return y}}function Te(n){return n.length>1}function Re(){var n,t=[];return{lineStart:function(){t.push(n=[])},point:function(t,e){n.push([t,e])},lineEnd:b,buffer:function(){var e=t;return t=[],n=null,e},rejoin:function(){t.length>1&&t.push(t.pop().concat(t.shift()))}}}function De(n,t){return((n=n.x)[0]<0?n[1]-Ra-Ca:Ra-n[1])-((t=t.x)[0]<0?t[1]-Ra-Ca:Ra-t[1])}function Pe(n){var t,e=0/0,r=0/0,u=0/0;return{lineStart:function(){n.lineStart(),t=1},point:function(i,o){var a=i>0?qa:-qa,c=ga(i-e);ga(c-qa)<Ca?(n.point(e,r=(r+o)/2>0?Ra:-Ra),n.point(u,r),n.lineEnd(),n.lineStart(),n.point(a,r),n.point(i,r),t=0):u!==a&&c>=qa&&(ga(e-u)<Ca&&(e-=u*Ca),ga(i-a)<Ca&&(i-=a*Ca),r=Ue(e,r,i,o),n.point(u,r),n.lineEnd(),n.lineStart(),n.point(a,r),t=0),n.point(e=i,r=o),u=a},lineEnd:function(){n.lineEnd(),e=r=0/0},clean:function(){return 2-t}}}function Ue(n,t,e,r){var u,i,o=Math.sin(n-e);return ga(o)>Ca?Math.atan((Math.sin(t)*(i=Math.cos(r))*Math.sin(e)-Math.sin(r)*(u=Math.cos(t))*Math.sin(n))/(u*i*o)):(t+r)/2}function je(n,t,e,r){var u;if(null==n)u=e*Ra,r.point(-qa,u),r.point(0,u),r.point(qa,u),r.point(qa,0),r.point(qa,-u),r.point(0,-u),r.point(-qa,-u),r.point(-qa,0),r.point(-qa,u);else if(ga(n[0]-t[0])>Ca){var i=n[0]<t[0]?qa:-qa;u=e*i/2,r.point(-i,u),r.point(0,u),r.point(i,u)}else r.point(t[0],t[1])}function Fe(n,t){var e=n[0],r=n[1],u=[Math.sin(e),-Math.cos(e),0],i=0,o=0;yc.reset();for(var a=0,c=t.length;c>a;++a){var l=t[a],s=l.length;if(s)for(var f=l[0],h=f[0],g=f[1]/2+qa/4,p=Math.sin(g),v=Math.cos(g),d=1;;){d===s&&(d=0),n=l[d];var m=n[0],y=n[1]/2+qa/4,M=Math.sin(y),x=Math.cos(y),b=m-h,_=b>=0?1:-1,w=_*b,S=w>qa,k=p*M;if(yc.add(Math.atan2(k*_*Math.sin(w),v*x+k*Math.cos(w))),i+=S?b+_*La:b,S^h>=e^m>=e){var 
E=de(pe(f),pe(n));Me(E);var A=de(u,E);Me(A);var N=(S^b>=0?-1:1)*tt(A[2]);(r>N||r===N&&(E[0]||E[1]))&&(o+=S^b>=0?1:-1)}if(!d++)break;h=m,p=M,v=x,f=n}}return(-Ca>i||Ca>i&&0>yc)^1&o}function He(n){function t(n,t){return Math.cos(n)*Math.cos(t)>i}function e(n){var e,i,c,l,s;return{lineStart:function(){l=c=!1,s=1},point:function(f,h){var g,p=[f,h],v=t(f,h),d=o?v?0:u(f,h):v?u(f+(0>f?qa:-qa),h):0;if(!e&&(l=c=v)&&n.lineStart(),v!==c&&(g=r(e,p),(be(e,g)||be(p,g))&&(p[0]+=Ca,p[1]+=Ca,v=t(p[0],p[1]))),v!==c)s=0,v?(n.lineStart(),g=r(p,e),n.point(g[0],g[1])):(g=r(e,p),n.point(g[0],g[1]),n.lineEnd()),e=g;else if(a&&e&&o^v){var m;d&i||!(m=r(p,e,!0))||(s=0,o?(n.lineStart(),n.point(m[0][0],m[0][1]),n.point(m[1][0],m[1][1]),n.lineEnd()):(n.point(m[1][0],m[1][1]),n.lineEnd(),n.lineStart(),n.point(m[0][0],m[0][1])))}!v||e&&be(e,p)||n.point(p[0],p[1]),e=p,c=v,i=d},lineEnd:function(){c&&n.lineEnd(),e=null},clean:function(){return s|(l&&c)<<1}}}function r(n,t,e){var r=pe(n),u=pe(t),o=[1,0,0],a=de(r,u),c=ve(a,a),l=a[0],s=c-l*l;if(!s)return!e&&n;var f=i*c/s,h=-i*l/s,g=de(o,a),p=ye(o,f),v=ye(a,h);me(p,v);var d=g,m=ve(p,d),y=ve(d,d),M=m*m-y*(ve(p,p)-1);if(!(0>M)){var x=Math.sqrt(M),b=ye(d,(-m-x)/y);if(me(b,p),b=xe(b),!e)return b;var _,w=n[0],S=t[0],k=n[1],E=t[1];w>S&&(_=w,w=S,S=_);var A=S-w,N=ga(A-qa)<Ca,C=N||Ca>A;if(!N&&k>E&&(_=k,k=E,E=_),C?N?k+E>0^b[1]<(ga(b[0]-w)<Ca?k:E):k<=b[1]&&b[1]<=E:A>qa^(w<=b[0]&&b[0]<=S)){var z=ye(d,(-m+x)/y);return me(z,p),[b,xe(z)]}}}function u(t,e){var r=o?n:qa-n,u=0;return-r>t?u|=1:t>r&&(u|=2),-r>e?u|=4:e>r&&(u|=8),u}var i=Math.cos(n),o=i>0,a=ga(i)>Ca,c=gr(n,6*Da);return Le(t,e,c,o?[0,-n]:[-qa,n-qa])}function Oe(n,t,e,r){return function(u){var i,o=u.a,a=u.b,c=o.x,l=o.y,s=a.x,f=a.y,h=0,g=1,p=s-c,v=f-l;if(i=n-c,p||!(i>0)){if(i/=p,0>p){if(h>i)return;g>i&&(g=i)}else if(p>0){if(i>g)return;i>h&&(h=i)}if(i=e-c,p||!(0>i)){if(i/=p,0>p){if(i>g)return;i>h&&(h=i)}else if(p>0){if(h>i)return;g>i&&(g=i)}if(i=t-l,v||!(i>0)){if(i/=v,0>v){if(h>i)return;g>i&&(g=i)}else 
if(v>0){if(i>g)return;i>h&&(h=i)}if(i=r-l,v||!(0>i)){if(i/=v,0>v){if(i>g)return;i>h&&(h=i)}else if(v>0){if(h>i)return;g>i&&(g=i)}return h>0&&(u.a={x:c+h*p,y:l+h*v}),1>g&&(u.b={x:c+g*p,y:l+g*v}),u}}}}}}function Ie(n,t,e,r){function u(r,u){return ga(r[0]-n)<Ca?u>0?0:3:ga(r[0]-e)<Ca?u>0?2:1:ga(r[1]-t)<Ca?u>0?1:0:u>0?3:2}function i(n,t){return o(n.x,t.x)}function o(n,t){var e=u(n,1),r=u(t,1);return e!==r?e-r:0===e?t[1]-n[1]:1===e?n[0]-t[0]:2===e?n[1]-t[1]:t[0]-n[0]}return function(a){function c(n){for(var t=0,e=d.length,r=n[1],u=0;e>u;++u)for(var i,o=1,a=d[u],c=a.length,l=a[0];c>o;++o)i=a[o],l[1]<=r?i[1]>r&&Q(l,i,n)>0&&++t:i[1]<=r&&Q(l,i,n)<0&&--t,l=i;return 0!==t}function l(i,a,c,l){var s=0,f=0;if(null==i||(s=u(i,c))!==(f=u(a,c))||o(i,a)<0^c>0){do l.point(0===s||3===s?n:e,s>1?r:t);while((s=(s+c+4)%4)!==f)}else l.point(a[0],a[1])}function s(u,i){return u>=n&&e>=u&&i>=t&&r>=i}function f(n,t){s(n,t)&&a.point(n,t)}function h(){C.point=p,d&&d.push(m=[]),S=!0,w=!1,b=_=0/0}function g(){v&&(p(y,M),x&&w&&A.rejoin(),v.push(A.buffer())),C.point=f,w&&a.lineEnd()}function p(n,t){n=Math.max(-Tc,Math.min(Tc,n)),t=Math.max(-Tc,Math.min(Tc,t));var e=s(n,t);if(d&&m.push([n,t]),S)y=n,M=t,x=e,S=!1,e&&(a.lineStart(),a.point(n,t));else if(e&&w)a.point(n,t);else{var r={a:{x:b,y:_},b:{x:n,y:t}};N(r)?(w||(a.lineStart(),a.point(r.a.x,r.a.y)),a.point(r.b.x,r.b.y),e||a.lineEnd(),k=!1):e&&(a.lineStart(),a.point(n,t),k=!1)}b=n,_=t,w=e}var v,d,m,y,M,x,b,_,w,S,k,E=a,A=Re(),N=Oe(n,t,e,r),C={point:f,lineStart:h,lineEnd:g,polygonStart:function(){a=A,v=[],d=[],k=!0},polygonEnd:function(){a=E,v=ta.merge(v);var t=c([n,r]),e=k&&t,u=v.length;(e||u)&&(a.polygonStart(),e&&(a.lineStart(),l(null,null,1,a),a.lineEnd()),u&&Ce(v,i,t,l,a),a.polygonEnd()),v=d=m=null}};return C}}function Ye(n){var t=0,e=qa/3,r=ir(n),u=r(t,e);return u.parallels=function(n){return arguments.length?r(t=n[0]*qa/180,e=n[1]*qa/180):[t/qa*180,e/qa*180]},u}function Ze(n,t){function e(n,t){var 
e=Math.sqrt(i-2*u*Math.sin(t))/u;return[e*Math.sin(n*=u),o-e*Math.cos(n)]}var r=Math.sin(n),u=(r+Math.sin(t))/2,i=1+r*(2*u-r),o=Math.sqrt(i)/u;return e.invert=function(n,t){var e=o-t;return[Math.atan2(n,e)/u,tt((i-(n*n+e*e)*u*u)/(2*u))]},e}function Ve(){function n(n,t){Dc+=u*n-r*t,r=n,u=t}var t,e,r,u;Hc.point=function(i,o){Hc.point=n,t=r=i,e=u=o},Hc.lineEnd=function(){n(t,e)}}function Xe(n,t){Pc>n&&(Pc=n),n>jc&&(jc=n),Uc>t&&(Uc=t),t>Fc&&(Fc=t)}function $e(){function n(n,t){o.push("M",n,",",t,i)}function t(n,t){o.push("M",n,",",t),a.point=e}function e(n,t){o.push("L",n,",",t)}function r(){a.point=n}function u(){o.push("Z")}var i=Be(4.5),o=[],a={point:n,lineStart:function(){a.point=t},lineEnd:r,polygonStart:function(){a.lineEnd=u},polygonEnd:function(){a.lineEnd=r,a.point=n},pointRadius:function(n){return i=Be(n),a},result:function(){if(o.length){var n=o.join("");return o=[],n}}};return a}function Be(n){return"m0,"+n+"a"+n+","+n+" 0 1,1 0,"+-2*n+"a"+n+","+n+" 0 1,1 0,"+2*n+"z"}function We(n,t){_c+=n,wc+=t,++Sc}function Je(){function n(n,r){var u=n-t,i=r-e,o=Math.sqrt(u*u+i*i);kc+=o*(t+n)/2,Ec+=o*(e+r)/2,Ac+=o,We(t=n,e=r)}var t,e;Ic.point=function(r,u){Ic.point=n,We(t=r,e=u)}}function Ge(){Ic.point=We}function Ke(){function n(n,t){var e=n-r,i=t-u,o=Math.sqrt(e*e+i*i);kc+=o*(r+n)/2,Ec+=o*(u+t)/2,Ac+=o,o=u*n-r*t,Nc+=o*(r+n),Cc+=o*(u+t),zc+=3*o,We(r=n,u=t)}var t,e,r,u;Ic.point=function(i,o){Ic.point=n,We(t=r=i,e=u=o)},Ic.lineEnd=function(){n(t,e)}}function Qe(n){function t(t,e){n.moveTo(t+o,e),n.arc(t,e,o,0,La)}function e(t,e){n.moveTo(t,e),a.point=r}function r(t,e){n.lineTo(t,e)}function u(){a.point=t}function i(){n.closePath()}var o=4.5,a={point:t,lineStart:function(){a.point=e},lineEnd:u,polygonStart:function(){a.lineEnd=i},polygonEnd:function(){a.lineEnd=u,a.point=t},pointRadius:function(n){return o=n,a},result:b};return a}function nr(n){function t(n){return(a?r:e)(n)}function e(t){return rr(t,function(e,r){e=n(e,r),t.point(e[0],e[1])})}function r(t){function 
e(e,r){e=n(e,r),t.point(e[0],e[1])}function r(){M=0/0,S.point=i,t.lineStart()}function i(e,r){var i=pe([e,r]),o=n(e,r);u(M,x,y,b,_,w,M=o[0],x=o[1],y=e,b=i[0],_=i[1],w=i[2],a,t),t.point(M,x)}function o(){S.point=e,t.lineEnd()}function c(){r(),S.point=l,S.lineEnd=s}function l(n,t){i(f=n,h=t),g=M,p=x,v=b,d=_,m=w,S.point=i}function s(){u(M,x,y,b,_,w,g,p,f,v,d,m,a,t),S.lineEnd=o,o()}var f,h,g,p,v,d,m,y,M,x,b,_,w,S={point:e,lineStart:r,lineEnd:o,polygonStart:function(){t.polygonStart(),S.lineStart=c },polygonEnd:function(){t.polygonEnd(),S.lineStart=r}};return S}function u(t,e,r,a,c,l,s,f,h,g,p,v,d,m){var y=s-t,M=f-e,x=y*y+M*M;if(x>4*i&&d--){var b=a+g,_=c+p,w=l+v,S=Math.sqrt(b*b+_*_+w*w),k=Math.asin(w/=S),E=ga(ga(w)-1)<Ca||ga(r-h)<Ca?(r+h)/2:Math.atan2(_,b),A=n(E,k),N=A[0],C=A[1],z=N-t,q=C-e,L=M*z-y*q;(L*L/x>i||ga((y*z+M*q)/x-.5)>.3||o>a*g+c*p+l*v)&&(u(t,e,r,a,c,l,N,C,E,b/=S,_/=S,w,d,m),m.point(N,C),u(N,C,E,b,_,w,s,f,h,g,p,v,d,m))}}var i=.5,o=Math.cos(30*Da),a=16;return t.precision=function(n){return arguments.length?(a=(i=n*n)>0&&16,t):Math.sqrt(i)},t}function tr(n){var t=nr(function(t,e){return n([t*Pa,e*Pa])});return function(n){return or(t(n))}}function er(n){this.stream=n}function rr(n,t){return{point:t,sphere:function(){n.sphere()},lineStart:function(){n.lineStart()},lineEnd:function(){n.lineEnd()},polygonStart:function(){n.polygonStart()},polygonEnd:function(){n.polygonEnd()}}}function ur(n){return ir(function(){return n})()}function ir(n){function t(n){return n=a(n[0]*Da,n[1]*Da),[n[0]*h+c,l-n[1]*h]}function e(n){return n=a.invert((n[0]-c)/h,(l-n[1])/h),n&&[n[0]*Pa,n[1]*Pa]}function r(){a=Ae(o=lr(m,M,x),i);var n=i(v,d);return c=g-n[0]*h,l=p+n[1]*h,u()}function u(){return s&&(s.valid=!1,s=null),t}var i,o,a,c,l,s,f=nr(function(n,t){return n=i(n,t),[n[0]*h+c,l-n[1]*h]}),h=150,g=480,p=250,v=0,d=0,m=0,M=0,x=0,b=Lc,_=y,w=null,S=null;return t.stream=function(n){return s&&(s.valid=!1),s=or(b(o,f(_(n)))),s.valid=!0,s},t.clipAngle=function(n){return 
arguments.length?(b=null==n?(w=n,Lc):He((w=+n)*Da),u()):w},t.clipExtent=function(n){return arguments.length?(S=n,_=n?Ie(n[0][0],n[0][1],n[1][0],n[1][1]):y,u()):S},t.scale=function(n){return arguments.length?(h=+n,r()):h},t.translate=function(n){return arguments.length?(g=+n[0],p=+n[1],r()):[g,p]},t.center=function(n){return arguments.length?(v=n[0]%360*Da,d=n[1]%360*Da,r()):[v*Pa,d*Pa]},t.rotate=function(n){return arguments.length?(m=n[0]%360*Da,M=n[1]%360*Da,x=n.length>2?n[2]%360*Da:0,r()):[m*Pa,M*Pa,x*Pa]},ta.rebind(t,f,"precision"),function(){return i=n.apply(this,arguments),t.invert=i.invert&&e,r()}}function or(n){return rr(n,function(t,e){n.point(t*Da,e*Da)})}function ar(n,t){return[n,t]}function cr(n,t){return[n>qa?n-La:-qa>n?n+La:n,t]}function lr(n,t,e){return n?t||e?Ae(fr(n),hr(t,e)):fr(n):t||e?hr(t,e):cr}function sr(n){return function(t,e){return t+=n,[t>qa?t-La:-qa>t?t+La:t,e]}}function fr(n){var t=sr(n);return t.invert=sr(-n),t}function hr(n,t){function e(n,t){var e=Math.cos(t),a=Math.cos(n)*e,c=Math.sin(n)*e,l=Math.sin(t),s=l*r+a*u;return[Math.atan2(c*i-s*o,a*r-l*u),tt(s*i+c*o)]}var r=Math.cos(n),u=Math.sin(n),i=Math.cos(t),o=Math.sin(t);return e.invert=function(n,t){var e=Math.cos(t),a=Math.cos(n)*e,c=Math.sin(n)*e,l=Math.sin(t),s=l*i-c*o;return[Math.atan2(c*i+l*o,a*r+s*u),tt(s*r-a*u)]},e}function gr(n,t){var e=Math.cos(n),r=Math.sin(n);return function(u,i,o,a){var c=o*t;null!=u?(u=pr(e,u),i=pr(e,i),(o>0?i>u:u>i)&&(u+=o*La)):(u=n+o*La,i=n-.5*c);for(var l,s=u;o>0?s>i:i>s;s-=c)a.point((l=xe([e,-r*Math.cos(s),-r*Math.sin(s)]))[0],l[1])}}function pr(n,t){var e=pe(t);e[0]-=n,Me(e);var r=nt(-e[1]);return((-e[2]<0?-r:r)+2*Math.PI-Ca)%(2*Math.PI)}function vr(n,t,e){var r=ta.range(n,t-Ca,e).concat(t);return function(n){return r.map(function(t){return[n,t]})}}function dr(n,t,e){var r=ta.range(n,t-Ca,e).concat(t);return function(n){return r.map(function(t){return[t,n]})}}function mr(n){return n.source}function yr(n){return n.target}function Mr(n,t,e,r){var 
u=Math.cos(t),i=Math.sin(t),o=Math.cos(r),a=Math.sin(r),c=u*Math.cos(n),l=u*Math.sin(n),s=o*Math.cos(e),f=o*Math.sin(e),h=2*Math.asin(Math.sqrt(it(r-t)+u*o*it(e-n))),g=1/Math.sin(h),p=h?function(n){var t=Math.sin(n*=h)*g,e=Math.sin(h-n)*g,r=e*c+t*s,u=e*l+t*f,o=e*i+t*a;return[Math.atan2(u,r)*Pa,Math.atan2(o,Math.sqrt(r*r+u*u))*Pa]}:function(){return[n*Pa,t*Pa]};return p.distance=h,p}function xr(){function n(n,u){var i=Math.sin(u*=Da),o=Math.cos(u),a=ga((n*=Da)-t),c=Math.cos(a);Yc+=Math.atan2(Math.sqrt((a=o*Math.sin(a))*a+(a=r*i-e*o*c)*a),e*i+r*o*c),t=n,e=i,r=o}var t,e,r;Zc.point=function(u,i){t=u*Da,e=Math.sin(i*=Da),r=Math.cos(i),Zc.point=n},Zc.lineEnd=function(){Zc.point=Zc.lineEnd=b}}function br(n,t){function e(t,e){var r=Math.cos(t),u=Math.cos(e),i=n(r*u);return[i*u*Math.sin(t),i*Math.sin(e)]}return e.invert=function(n,e){var r=Math.sqrt(n*n+e*e),u=t(r),i=Math.sin(u),o=Math.cos(u);return[Math.atan2(n*i,r*o),Math.asin(r&&e*i/r)]},e}function _r(n,t){function e(n,t){o>0?-Ra+Ca>t&&(t=-Ra+Ca):t>Ra-Ca&&(t=Ra-Ca);var e=o/Math.pow(u(t),i);return[e*Math.sin(i*n),o-e*Math.cos(i*n)]}var r=Math.cos(n),u=function(n){return Math.tan(qa/4+n/2)},i=n===t?Math.sin(n):Math.log(r/Math.cos(t))/Math.log(u(t)/u(n)),o=r*Math.pow(u(n),i)/i;return i?(e.invert=function(n,t){var e=o-t,r=K(i)*Math.sqrt(n*n+e*e);return[Math.atan2(n,e)/i,2*Math.atan(Math.pow(o/r,1/i))-Ra]},e):Sr}function wr(n,t){function e(n,t){var e=i-t;return[e*Math.sin(u*n),i-e*Math.cos(u*n)]}var r=Math.cos(n),u=n===t?Math.sin(n):(r-Math.cos(t))/(t-n),i=r/u+n;return ga(u)<Ca?ar:(e.invert=function(n,t){var e=i-t;return[Math.atan2(n,e)/u,i-K(u)*Math.sqrt(n*n+e*e)]},e)}function Sr(n,t){return[n,Math.log(Math.tan(qa/4+t/2))]}function kr(n){var t,e=ur(n),r=e.scale,u=e.translate,i=e.clipExtent;return e.scale=function(){var n=r.apply(e,arguments);return n===e?t?e.clipExtent(null):e:n},e.translate=function(){var n=u.apply(e,arguments);return n===e?t?e.clipExtent(null):e:n},e.clipExtent=function(n){var 
o=i.apply(e,arguments);if(o===e){if(t=null==n){var a=qa*r(),c=u();i([[c[0]-a,c[1]-a],[c[0]+a,c[1]+a]])}}else t&&(o=null);return o},e.clipExtent(null)}function Er(n,t){return[Math.log(Math.tan(qa/4+t/2)),-n]}function Ar(n){return n[0]}function Nr(n){return n[1]}function Cr(n){for(var t=n.length,e=[0,1],r=2,u=2;t>u;u++){for(;r>1&&Q(n[e[r-2]],n[e[r-1]],n[u])<=0;)--r;e[r++]=u}return e.slice(0,r)}function zr(n,t){return n[0]-t[0]||n[1]-t[1]}function qr(n,t,e){return(e[0]-t[0])*(n[1]-t[1])<(e[1]-t[1])*(n[0]-t[0])}function Lr(n,t,e,r){var u=n[0],i=e[0],o=t[0]-u,a=r[0]-i,c=n[1],l=e[1],s=t[1]-c,f=r[1]-l,h=(a*(c-l)-f*(u-i))/(f*o-a*s);return[u+h*o,c+h*s]}function Tr(n){var t=n[0],e=n[n.length-1];return!(t[0]-e[0]||t[1]-e[1])}function Rr(){tu(this),this.edge=this.site=this.circle=null}function Dr(n){var t=el.pop()||new Rr;return t.site=n,t}function Pr(n){Xr(n),Qc.remove(n),el.push(n),tu(n)}function Ur(n){var t=n.circle,e=t.x,r=t.cy,u={x:e,y:r},i=n.P,o=n.N,a=[n];Pr(n);for(var c=i;c.circle&&ga(e-c.circle.x)<Ca&&ga(r-c.circle.cy)<Ca;)i=c.P,a.unshift(c),Pr(c),c=i;a.unshift(c),Xr(c);for(var l=o;l.circle&&ga(e-l.circle.x)<Ca&&ga(r-l.circle.cy)<Ca;)o=l.N,a.push(l),Pr(l),l=o;a.push(l),Xr(l);var s,f=a.length;for(s=1;f>s;++s)l=a[s],c=a[s-1],Kr(l.edge,c.site,l.site,u);c=a[0],l=a[f-1],l.edge=Jr(c.site,l.site,null,u),Vr(c),Vr(l)}function jr(n){for(var t,e,r,u,i=n.x,o=n.y,a=Qc._;a;)if(r=Fr(a,o)-i,r>Ca)a=a.L;else{if(u=i-Hr(a,o),!(u>Ca)){r>-Ca?(t=a.P,e=a):u>-Ca?(t=a,e=a.N):t=e=a;break}if(!a.R){t=a;break}a=a.R}var c=Dr(n);if(Qc.insert(t,c),t||e){if(t===e)return Xr(t),e=Dr(t.site),Qc.insert(c,e),c.edge=e.edge=Jr(t.site,c.site),Vr(t),void Vr(e);if(!e)return void(c.edge=Jr(t.site,c.site));Xr(t),Xr(e);var l=t.site,s=l.x,f=l.y,h=n.x-s,g=n.y-f,p=e.site,v=p.x-s,d=p.y-f,m=2*(h*d-g*v),y=h*h+g*g,M=v*v+d*d,x={x:(d*y-g*M)/m+s,y:(h*M-v*y)/m+f};Kr(e.edge,l,p,x),c.edge=Jr(l,n,null,x),e.edge=Jr(n,p,null,x),Vr(t),Vr(e)}}function Fr(n,t){var e=n.site,r=e.x,u=e.y,i=u-t;if(!i)return r;var 
o=n.P;if(!o)return-1/0;e=o.site;var a=e.x,c=e.y,l=c-t;if(!l)return a;var s=a-r,f=1/i-1/l,h=s/l;return f?(-h+Math.sqrt(h*h-2*f*(s*s/(-2*l)-c+l/2+u-i/2)))/f+r:(r+a)/2}function Hr(n,t){var e=n.N;if(e)return Fr(e,t);var r=n.site;return r.y===t?r.x:1/0}function Or(n){this.site=n,this.edges=[]}function Ir(n){for(var t,e,r,u,i,o,a,c,l,s,f=n[0][0],h=n[1][0],g=n[0][1],p=n[1][1],v=Kc,d=v.length;d--;)if(i=v[d],i&&i.prepare())for(a=i.edges,c=a.length,o=0;c>o;)s=a[o].end(),r=s.x,u=s.y,l=a[++o%c].start(),t=l.x,e=l.y,(ga(r-t)>Ca||ga(u-e)>Ca)&&(a.splice(o,0,new Qr(Gr(i.site,s,ga(r-f)<Ca&&p-u>Ca?{x:f,y:ga(t-f)<Ca?e:p}:ga(u-p)<Ca&&h-r>Ca?{x:ga(e-p)<Ca?t:h,y:p}:ga(r-h)<Ca&&u-g>Ca?{x:h,y:ga(t-h)<Ca?e:g}:ga(u-g)<Ca&&r-f>Ca?{x:ga(e-g)<Ca?t:f,y:g}:null),i.site,null)),++c)}function Yr(n,t){return t.angle-n.angle}function Zr(){tu(this),this.x=this.y=this.arc=this.site=this.cy=null}function Vr(n){var t=n.P,e=n.N;if(t&&e){var r=t.site,u=n.site,i=e.site;if(r!==i){var o=u.x,a=u.y,c=r.x-o,l=r.y-a,s=i.x-o,f=i.y-a,h=2*(c*f-l*s);if(!(h>=-za)){var g=c*c+l*l,p=s*s+f*f,v=(f*g-l*p)/h,d=(c*p-s*g)/h,f=d+a,m=rl.pop()||new Zr;m.arc=n,m.site=u,m.x=v+o,m.y=f+Math.sqrt(v*v+d*d),m.cy=f,n.circle=m;for(var y=null,M=tl._;M;)if(m.y<M.y||m.y===M.y&&m.x<=M.x){if(!M.L){y=M.P;break}M=M.L}else{if(!M.R){y=M;break}M=M.R}tl.insert(y,m),y||(nl=m)}}}}function Xr(n){var t=n.circle;t&&(t.P||(nl=t.N),tl.remove(t),rl.push(t),tu(t),n.circle=null)}function $r(n){for(var t,e=Gc,r=Oe(n[0][0],n[0][1],n[1][0],n[1][1]),u=e.length;u--;)t=e[u],(!Br(t,n)||!r(t)||ga(t.a.x-t.b.x)<Ca&&ga(t.a.y-t.b.y)<Ca)&&(t.a=t.b=null,e.splice(u,1))}function Br(n,t){var e=n.b;if(e)return!0;var r,u,i=n.a,o=t[0][0],a=t[1][0],c=t[0][1],l=t[1][1],s=n.l,f=n.r,h=s.x,g=s.y,p=f.x,v=f.y,d=(h+p)/2,m=(g+v)/2;if(v===g){if(o>d||d>=a)return;if(h>p){if(i){if(i.y>=l)return}else i={x:d,y:c};e={x:d,y:l}}else{if(i){if(i.y<c)return}else i={x:d,y:l};e={x:d,y:c}}}else if(r=(h-p)/(v-g),u=m-r*d,-1>r||r>1)if(h>p){if(i){if(i.y>=l)return}else 
i={x:(c-u)/r,y:c};e={x:(l-u)/r,y:l}}else{if(i){if(i.y<c)return}else i={x:(l-u)/r,y:l};e={x:(c-u)/r,y:c}}else if(v>g){if(i){if(i.x>=a)return}else i={x:o,y:r*o+u};e={x:a,y:r*a+u}}else{if(i){if(i.x<o)return}else i={x:a,y:r*a+u};e={x:o,y:r*o+u}}return n.a=i,n.b=e,!0}function Wr(n,t){this.l=n,this.r=t,this.a=this.b=null}function Jr(n,t,e,r){var u=new Wr(n,t);return Gc.push(u),e&&Kr(u,n,t,e),r&&Kr(u,t,n,r),Kc[n.i].edges.push(new Qr(u,n,t)),Kc[t.i].edges.push(new Qr(u,t,n)),u}function Gr(n,t,e){var r=new Wr(n,null);return r.a=t,r.b=e,Gc.push(r),r}function Kr(n,t,e,r){n.a||n.b?n.l===e?n.b=r:n.a=r:(n.a=r,n.l=t,n.r=e)}function Qr(n,t,e){var r=n.a,u=n.b;this.edge=n,this.site=t,this.angle=e?Math.atan2(e.y-t.y,e.x-t.x):n.l===t?Math.atan2(u.x-r.x,r.y-u.y):Math.atan2(r.x-u.x,u.y-r.y)}function nu(){this._=null}function tu(n){n.U=n.C=n.L=n.R=n.P=n.N=null}function eu(n,t){var e=t,r=t.R,u=e.U;u?u.L===e?u.L=r:u.R=r:n._=r,r.U=u,e.U=r,e.R=r.L,e.R&&(e.R.U=e),r.L=e}function ru(n,t){var e=t,r=t.L,u=e.U;u?u.L===e?u.L=r:u.R=r:n._=r,r.U=u,e.U=r,e.L=r.R,e.L&&(e.L.U=e),r.R=e}function uu(n){for(;n.L;)n=n.L;return n}function iu(n,t){var e,r,u,i=n.sort(ou).pop();for(Gc=[],Kc=new Array(n.length),Qc=new nu,tl=new nu;;)if(u=nl,i&&(!u||i.y<u.y||i.y===u.y&&i.x<u.x))(i.x!==e||i.y!==r)&&(Kc[i.i]=new Or(i),jr(i),e=i.x,r=i.y),i=n.pop();else{if(!u)break;Ur(u.arc)}t&&($r(t),Ir(t));var o={cells:Kc,edges:Gc};return Qc=tl=Gc=Kc=null,o}function ou(n,t){return t.y-n.y||t.x-n.x}function au(n,t,e){return(n.x-e.x)*(t.y-n.y)-(n.x-t.x)*(e.y-n.y)}function cu(n){return n.x}function lu(n){return n.y}function su(){return{leaf:!0,nodes:[],point:null,x:null,y:null}}function fu(n,t,e,r,u,i){if(!n(t,e,r,u,i)){var o=.5*(e+u),a=.5*(r+i),c=t.nodes;c[0]&&fu(n,c[0],e,r,o,a),c[1]&&fu(n,c[1],o,r,u,a),c[2]&&fu(n,c[2],e,a,o,i),c[3]&&fu(n,c[3],o,a,u,i)}}function hu(n,t,e,r,u,i,o){var a,c=1/0;return function l(n,s,f,h,g){if(!(s>i||f>o||r>h||u>g)){if(p=n.point){var p,v=t-n.x,d=e-n.y,m=v*v+d*d;if(c>m){var 
y=Math.sqrt(c=m);r=t-y,u=e-y,i=t+y,o=e+y,a=p}}for(var M=n.nodes,x=.5*(s+h),b=.5*(f+g),_=t>=x,w=e>=b,S=w<<1|_,k=S+4;k>S;++S)if(n=M[3&S])switch(3&S){case 0:l(n,s,f,x,b);break;case 1:l(n,x,f,h,b);break;case 2:l(n,s,b,x,g);break;case 3:l(n,x,b,h,g)}}}(n,r,u,i,o),a}function gu(n,t){n=ta.rgb(n),t=ta.rgb(t);var e=n.r,r=n.g,u=n.b,i=t.r-e,o=t.g-r,a=t.b-u;return function(n){return"#"+xt(Math.round(e+i*n))+xt(Math.round(r+o*n))+xt(Math.round(u+a*n))}}function pu(n,t){var e,r={},u={};for(e in n)e in t?r[e]=mu(n[e],t[e]):u[e]=n[e];for(e in t)e in n||(u[e]=t[e]);return function(n){for(e in r)u[e]=r[e](n);return u}}function vu(n,t){return n=+n,t=+t,function(e){return n*(1-e)+t*e}}function du(n,t){var e,r,u,i=il.lastIndex=ol.lastIndex=0,o=-1,a=[],c=[];for(n+="",t+="";(e=il.exec(n))&&(r=ol.exec(t));)(u=r.index)>i&&(u=t.slice(i,u),a[o]?a[o]+=u:a[++o]=u),(e=e[0])===(r=r[0])?a[o]?a[o]+=r:a[++o]=r:(a[++o]=null,c.push({i:o,x:vu(e,r)})),i=ol.lastIndex;return i<t.length&&(u=t.slice(i),a[o]?a[o]+=u:a[++o]=u),a.length<2?c[0]?(t=c[0].x,function(n){return t(n)+""}):function(){return t}:(t=c.length,function(n){for(var e,r=0;t>r;++r)a[(e=c[r]).i]=e.x(n);return a.join("")})}function mu(n,t){for(var e,r=ta.interpolators.length;--r>=0&&!(e=ta.interpolators[r](n,t)););return e}function yu(n,t){var e,r=[],u=[],i=n.length,o=t.length,a=Math.min(n.length,t.length);for(e=0;a>e;++e)r.push(mu(n[e],t[e]));for(;i>e;++e)u[e]=n[e];for(;o>e;++e)u[e]=t[e];return function(n){for(e=0;a>e;++e)u[e]=r[e](n);return u}}function Mu(n){return function(t){return 0>=t?0:t>=1?1:n(t)}}function xu(n){return function(t){return 1-n(1-t)}}function bu(n){return function(t){return.5*(.5>t?n(2*t):2-n(2-2*t))}}function _u(n){return n*n}function wu(n){return n*n*n}function Su(n){if(0>=n)return 0;if(n>=1)return 1;var t=n*n,e=t*n;return 4*(.5>n?e:3*(n-t)+e-.75)}function ku(n){return function(t){return Math.pow(t,n)}}function Eu(n){return 1-Math.cos(n*Ra)}function Au(n){return Math.pow(2,10*(n-1))}function Nu(n){return 
1-Math.sqrt(1-n*n)}function Cu(n,t){var e;return arguments.length<2&&(t=.45),arguments.length?e=t/La*Math.asin(1/n):(n=1,e=t/4),function(r){return 1+n*Math.pow(2,-10*r)*Math.sin((r-e)*La/t)}}function zu(n){return n||(n=1.70158),function(t){return t*t*((n+1)*t-n)}}function qu(n){return 1/2.75>n?7.5625*n*n:2/2.75>n?7.5625*(n-=1.5/2.75)*n+.75:2.5/2.75>n?7.5625*(n-=2.25/2.75)*n+.9375:7.5625*(n-=2.625/2.75)*n+.984375}function Lu(n,t){n=ta.hcl(n),t=ta.hcl(t);var e=n.h,r=n.c,u=n.l,i=t.h-e,o=t.c-r,a=t.l-u;return isNaN(o)&&(o=0,r=isNaN(r)?t.c:r),isNaN(i)?(i=0,e=isNaN(e)?t.h:e):i>180?i-=360:-180>i&&(i+=360),function(n){return st(e+i*n,r+o*n,u+a*n)+""}}function Tu(n,t){n=ta.hsl(n),t=ta.hsl(t);var e=n.h,r=n.s,u=n.l,i=t.h-e,o=t.s-r,a=t.l-u;return isNaN(o)&&(o=0,r=isNaN(r)?t.s:r),isNaN(i)?(i=0,e=isNaN(e)?t.h:e):i>180?i-=360:-180>i&&(i+=360),function(n){return ct(e+i*n,r+o*n,u+a*n)+""}}function Ru(n,t){n=ta.lab(n),t=ta.lab(t);var e=n.l,r=n.a,u=n.b,i=t.l-e,o=t.a-r,a=t.b-u;return function(n){return ht(e+i*n,r+o*n,u+a*n)+""}}function Du(n,t){return t-=n,function(e){return Math.round(n+t*e)}}function Pu(n){var t=[n.a,n.b],e=[n.c,n.d],r=ju(t),u=Uu(t,e),i=ju(Fu(e,t,-u))||0;t[0]*e[1]<e[0]*t[1]&&(t[0]*=-1,t[1]*=-1,r*=-1,u*=-1),this.rotate=(r?Math.atan2(t[1],t[0]):Math.atan2(-e[0],e[1]))*Pa,this.translate=[n.e,n.f],this.scale=[r,i],this.skew=i?Math.atan2(u,i)*Pa:0}function Uu(n,t){return n[0]*t[0]+n[1]*t[1]}function ju(n){var t=Math.sqrt(Uu(n,n));return t&&(n[0]/=t,n[1]/=t),t}function Fu(n,t,e){return n[0]+=e*t[0],n[1]+=e*t[1],n}function Hu(n,t){var e,r=[],u=[],i=ta.transform(n),o=ta.transform(t),a=i.translate,c=o.translate,l=i.rotate,s=o.rotate,f=i.skew,h=o.skew,g=i.scale,p=o.scale;return 
a[0]!=c[0]||a[1]!=c[1]?(r.push("translate(",null,",",null,")"),u.push({i:1,x:vu(a[0],c[0])},{i:3,x:vu(a[1],c[1])})):r.push(c[0]||c[1]?"translate("+c+")":""),l!=s?(l-s>180?s+=360:s-l>180&&(l+=360),u.push({i:r.push(r.pop()+"rotate(",null,")")-2,x:vu(l,s)})):s&&r.push(r.pop()+"rotate("+s+")"),f!=h?u.push({i:r.push(r.pop()+"skewX(",null,")")-2,x:vu(f,h)}):h&&r.push(r.pop()+"skewX("+h+")"),g[0]!=p[0]||g[1]!=p[1]?(e=r.push(r.pop()+"scale(",null,",",null,")"),u.push({i:e-4,x:vu(g[0],p[0])},{i:e-2,x:vu(g[1],p[1])})):(1!=p[0]||1!=p[1])&&r.push(r.pop()+"scale("+p+")"),e=u.length,function(n){for(var t,i=-1;++i<e;)r[(t=u[i]).i]=t.x(n);return r.join("")}}function Ou(n,t){return t=(t-=n=+n)||1/t,function(e){return(e-n)/t}}function Iu(n,t){return t=(t-=n=+n)||1/t,function(e){return Math.max(0,Math.min(1,(e-n)/t))}}function Yu(n){for(var t=n.source,e=n.target,r=Vu(t,e),u=[t];t!==r;)t=t.parent,u.push(t);for(var i=u.length;e!==r;)u.splice(i,0,e),e=e.parent;return u}function Zu(n){for(var t=[],e=n.parent;null!=e;)t.push(n),n=e,e=e.parent;return t.push(n),t}function Vu(n,t){if(n===t)return n;for(var e=Zu(n),r=Zu(t),u=e.pop(),i=r.pop(),o=null;u===i;)o=u,u=e.pop(),i=r.pop();return o}function Xu(n){n.fixed|=2}function $u(n){n.fixed&=-7}function Bu(n){n.fixed|=4,n.px=n.x,n.py=n.y}function Wu(n){n.fixed&=-5}function Ju(n,t,e){var r=0,u=0;if(n.charge=0,!n.leaf)for(var i,o=n.nodes,a=o.length,c=-1;++c<a;)i=o[c],null!=i&&(Ju(i,t,e),n.charge+=i.charge,r+=i.charge*i.cx,u+=i.charge*i.cy);if(n.point){n.leaf||(n.point.x+=Math.random()-.5,n.point.y+=Math.random()-.5);var l=t*e[n.point.index];n.charge+=n.pointCharge=l,r+=l*n.point.x,u+=l*n.point.y}n.cx=r/n.charge,n.cy=u/n.charge}function Gu(n,t){return ta.rebind(n,t,"sort","children","value"),n.nodes=n,n.links=ri,n}function Ku(n,t){for(var e=[n];null!=(n=e.pop());)if(t(n),(u=n.children)&&(r=u.length))for(var r,u;--r>=0;)e.push(u[r])}function Qu(n,t){for(var e=[n],r=[];null!=(n=e.pop());)if(r.push(n),(i=n.children)&&(u=i.length))for(var 
u,i,o=-1;++o<u;)e.push(i[o]);for(;null!=(n=r.pop());)t(n)}function ni(n){return n.children}function ti(n){return n.value}function ei(n,t){return t.value-n.value}function ri(n){return ta.merge(n.map(function(n){return(n.children||[]).map(function(t){return{source:n,target:t}})}))}function ui(n){return n.x}function ii(n){return n.y}function oi(n,t,e){n.y0=t,n.y=e}function ai(n){return ta.range(n.length)}function ci(n){for(var t=-1,e=n[0].length,r=[];++t<e;)r[t]=0;return r}function li(n){for(var t,e=1,r=0,u=n[0][1],i=n.length;i>e;++e)(t=n[e][1])>u&&(r=e,u=t);return r}function si(n){return n.reduce(fi,0)}function fi(n,t){return n+t[1]}function hi(n,t){return gi(n,Math.ceil(Math.log(t.length)/Math.LN2+1))}function gi(n,t){for(var e=-1,r=+n[0],u=(n[1]-r)/t,i=[];++e<=t;)i[e]=u*e+r;return i}function pi(n){return[ta.min(n),ta.max(n)]}function vi(n,t){return n.value-t.value}function di(n,t){var e=n._pack_next;n._pack_next=t,t._pack_prev=n,t._pack_next=e,e._pack_prev=t}function mi(n,t){n._pack_next=t,t._pack_prev=n}function yi(n,t){var e=t.x-n.x,r=t.y-n.y,u=n.r+t.r;return.999*u*u>e*e+r*r}function Mi(n){function t(n){s=Math.min(n.x-n.r,s),f=Math.max(n.x+n.r,f),h=Math.min(n.y-n.r,h),g=Math.max(n.y+n.r,g)}if((e=n.children)&&(l=e.length)){var e,r,u,i,o,a,c,l,s=1/0,f=-1/0,h=1/0,g=-1/0;if(e.forEach(xi),r=e[0],r.x=-r.r,r.y=0,t(r),l>1&&(u=e[1],u.x=u.r,u.y=0,t(u),l>2))for(i=e[2],wi(r,u,i),t(i),di(r,i),r._pack_prev=i,di(i,u),u=r._pack_next,o=3;l>o;o++){wi(r,u,i=e[o]);var p=0,v=1,d=1;for(a=u._pack_next;a!==u;a=a._pack_next,v++)if(yi(a,i)){p=1;break}if(1==p)for(c=r._pack_prev;c!==a._pack_prev&&!yi(c,i);c=c._pack_prev,d++);p?(d>v||v==d&&u.r<r.r?mi(r,u=a):mi(r=c,u),o--):(di(r,i),u=i,t(i))}var m=(s+f)/2,y=(h+g)/2,M=0;for(o=0;l>o;o++)i=e[o],i.x-=m,i.y-=y,M=Math.max(M,i.r+Math.sqrt(i.x*i.x+i.y*i.y));n.r=M,e.forEach(bi)}}function xi(n){n._pack_next=n._pack_prev=n}function bi(n){delete n._pack_next,delete n._pack_prev}function _i(n,t,e,r){var 
u=n.children;if(n.x=t+=r*n.x,n.y=e+=r*n.y,n.r*=r,u)for(var i=-1,o=u.length;++i<o;)_i(u[i],t,e,r)}function wi(n,t,e){var r=n.r+e.r,u=t.x-n.x,i=t.y-n.y;if(r&&(u||i)){var o=t.r+e.r,a=u*u+i*i;o*=o,r*=r;var c=.5+(r-o)/(2*a),l=Math.sqrt(Math.max(0,2*o*(r+a)-(r-=a)*r-o*o))/(2*a);e.x=n.x+c*u+l*i,e.y=n.y+c*i-l*u}else e.x=n.x+r,e.y=n.y}function Si(n,t){return n.parent==t.parent?1:2}function ki(n){var t=n.children;return t.length?t[0]:n.t}function Ei(n){var t,e=n.children;return(t=e.length)?e[t-1]:n.t}function Ai(n,t,e){var r=e/(t.i-n.i);t.c-=r,t.s+=e,n.c+=r,t.z+=e,t.m+=e}function Ni(n){for(var t,e=0,r=0,u=n.children,i=u.length;--i>=0;)t=u[i],t.z+=e,t.m+=e,e+=t.s+(r+=t.c)}function Ci(n,t,e){return n.a.parent===t.parent?n.a:e}function zi(n){return 1+ta.max(n,function(n){return n.y})}function qi(n){return n.reduce(function(n,t){return n+t.x},0)/n.length}function Li(n){var t=n.children;return t&&t.length?Li(t[0]):n}function Ti(n){var t,e=n.children;return e&&(t=e.length)?Ti(e[t-1]):n}function Ri(n){return{x:n.x,y:n.y,dx:n.dx,dy:n.dy}}function Di(n,t){var e=n.x+t[3],r=n.y+t[0],u=n.dx-t[1]-t[3],i=n.dy-t[0]-t[2];return 0>u&&(e+=u/2,u=0),0>i&&(r+=i/2,i=0),{x:e,y:r,dx:u,dy:i}}function Pi(n){var t=n[0],e=n[n.length-1];return e>t?[t,e]:[e,t]}function Ui(n){return n.rangeExtent?n.rangeExtent():Pi(n.range())}function ji(n,t,e,r){var u=e(n[0],n[1]),i=r(t[0],t[1]);return function(n){return i(u(n))}}function Fi(n,t){var e,r=0,u=n.length-1,i=n[r],o=n[u];return i>o&&(e=r,r=u,u=e,e=i,i=o,o=e),n[r]=t.floor(i),n[u]=t.ceil(o),n}function Hi(n){return n?{floor:function(t){return Math.floor(t/n)*n},ceil:function(t){return Math.ceil(t/n)*n}}:ml}function Oi(n,t,e,r){var u=[],i=[],o=0,a=Math.min(n.length,t.length)-1;for(n[a]<n[0]&&(n=n.slice().reverse(),t=t.slice().reverse());++o<=a;)u.push(e(n[o-1],n[o])),i.push(r(t[o-1],t[o]));return function(t){var e=ta.bisect(n,t,1,a)-1;return i[e](u[e](t))}}function Ii(n,t,e,r){function u(){var u=Math.min(n.length,t.length)>2?Oi:ji,c=r?Iu:Ou;return 
o=u(n,t,c,e),a=u(t,n,c,mu),i}function i(n){return o(n)}var o,a;return i.invert=function(n){return a(n)},i.domain=function(t){return arguments.length?(n=t.map(Number),u()):n},i.range=function(n){return arguments.length?(t=n,u()):t},i.rangeRound=function(n){return i.range(n).interpolate(Du)},i.clamp=function(n){return arguments.length?(r=n,u()):r},i.interpolate=function(n){return arguments.length?(e=n,u()):e},i.ticks=function(t){return Xi(n,t)},i.tickFormat=function(t,e){return $i(n,t,e)},i.nice=function(t){return Zi(n,t),u()},i.copy=function(){return Ii(n,t,e,r)},u()}function Yi(n,t){return ta.rebind(n,t,"range","rangeRound","interpolate","clamp")}function Zi(n,t){return Fi(n,Hi(Vi(n,t)[2]))}function Vi(n,t){null==t&&(t=10);var e=Pi(n),r=e[1]-e[0],u=Math.pow(10,Math.floor(Math.log(r/t)/Math.LN10)),i=t/r*u;return.15>=i?u*=10:.35>=i?u*=5:.75>=i&&(u*=2),e[0]=Math.ceil(e[0]/u)*u,e[1]=Math.floor(e[1]/u)*u+.5*u,e[2]=u,e}function Xi(n,t){return ta.range.apply(ta,Vi(n,t))}function $i(n,t,e){var r=Vi(n,t);if(e){var u=ic.exec(e);if(u.shift(),"s"===u[8]){var i=ta.formatPrefix(Math.max(ga(r[0]),ga(r[1])));return u[7]||(u[7]="."+Bi(i.scale(r[2]))),u[8]="f",e=ta.format(u.join("")),function(n){return e(i.scale(n))+i.symbol}}u[7]||(u[7]="."+Wi(u[8],r)),e=u.join("")}else e=",."+Bi(r[2])+"f";return ta.format(e)}function Bi(n){return-Math.floor(Math.log(n)/Math.LN10+.01)}function Wi(n,t){var e=Bi(t[2]);return n in yl?Math.abs(e-Bi(Math.max(ga(t[0]),ga(t[1]))))+ +("e"!==n):e-2*("%"===n)}function Ji(n,t,e,r){function u(n){return(e?Math.log(0>n?0:n):-Math.log(n>0?0:-n))/Math.log(t)}function i(n){return e?Math.pow(t,n):-Math.pow(t,-n)}function o(t){return n(u(t))}return o.invert=function(t){return i(n.invert(t))},o.domain=function(t){return arguments.length?(e=t[0]>=0,n.domain((r=t.map(Number)).map(u)),o):r},o.base=function(e){return arguments.length?(t=+e,n.domain(r.map(u)),o):t},o.nice=function(){var t=Fi(r.map(u),e?Math:xl);return n.domain(t),r=t.map(i),o},o.ticks=function(){var 
n=Pi(r),o=[],a=n[0],c=n[1],l=Math.floor(u(a)),s=Math.ceil(u(c)),f=t%1?2:t;if(isFinite(s-l)){if(e){for(;s>l;l++)for(var h=1;f>h;h++)o.push(i(l)*h);o.push(i(l))}else for(o.push(i(l));l++<s;)for(var h=f-1;h>0;h--)o.push(i(l)*h);for(l=0;o[l]<a;l++);for(s=o.length;o[s-1]>c;s--);o=o.slice(l,s)}return o},o.tickFormat=function(n,t){if(!arguments.length)return Ml;arguments.length<2?t=Ml:"function"!=typeof t&&(t=ta.format(t));var r,a=Math.max(.1,n/o.ticks().length),c=e?(r=1e-12,Math.ceil):(r=-1e-12,Math.floor);return function(n){return n/i(c(u(n)+r))<=a?t(n):""}},o.copy=function(){return Ji(n.copy(),t,e,r)},Yi(o,n)}function Gi(n,t,e){function r(t){return n(u(t))}var u=Ki(t),i=Ki(1/t);return r.invert=function(t){return i(n.invert(t))},r.domain=function(t){return arguments.length?(n.domain((e=t.map(Number)).map(u)),r):e},r.ticks=function(n){return Xi(e,n)},r.tickFormat=function(n,t){return $i(e,n,t)},r.nice=function(n){return r.domain(Zi(e,n))},r.exponent=function(o){return arguments.length?(u=Ki(t=o),i=Ki(1/t),n.domain(e.map(u)),r):t},r.copy=function(){return Gi(n.copy(),t,e)},Yi(r,n)}function Ki(n){return function(t){return 0>t?-Math.pow(-t,n):Math.pow(t,n)}}function Qi(n,t){function e(e){return i[((u.get(e)||("range"===t.t?u.set(e,n.push(e)):0/0))-1)%i.length]}function r(t,e){return ta.range(n.length).map(function(n){return t+e*n})}var u,i,o;return e.domain=function(r){if(!arguments.length)return n;n=[],u=new l;for(var i,o=-1,a=r.length;++o<a;)u.has(i=r[o])||u.set(i,n.push(i));return e[t.t].apply(e,t.a)},e.range=function(n){return arguments.length?(i=n,o=0,t={t:"range",a:arguments},e):i},e.rangePoints=function(u,a){arguments.length<2&&(a=0);var c=u[0],l=u[1],s=n.length<2?(c=(c+l)/2,0):(l-c)/(n.length-1+a);return i=r(c+s*a/2,s),o=0,t={t:"rangePoints",a:arguments},e},e.rangeRoundPoints=function(u,a){arguments.length<2&&(a=0);var c=u[0],l=u[1],s=n.length<2?(c=l=Math.round((c+l)/2),0):(l-c)/(n.length-1+a)|0;return 
i=r(c+Math.round(s*a/2+(l-c-(n.length-1+a)*s)/2),s),o=0,t={t:"rangeRoundPoints",a:arguments},e},e.rangeBands=function(u,a,c){arguments.length<2&&(a=0),arguments.length<3&&(c=a);var l=u[1]<u[0],s=u[l-0],f=u[1-l],h=(f-s)/(n.length-a+2*c);return i=r(s+h*c,h),l&&i.reverse(),o=h*(1-a),t={t:"rangeBands",a:arguments},e},e.rangeRoundBands=function(u,a,c){arguments.length<2&&(a=0),arguments.length<3&&(c=a);var l=u[1]<u[0],s=u[l-0],f=u[1-l],h=Math.floor((f-s)/(n.length-a+2*c));return i=r(s+Math.round((f-s-(n.length-a)*h)/2),h),l&&i.reverse(),o=Math.round(h*(1-a)),t={t:"rangeRoundBands",a:arguments},e},e.rangeBand=function(){return o},e.rangeExtent=function(){return Pi(t.a[0])},e.copy=function(){return Qi(n,t)},e.domain(n)}function no(n,t){function i(){var e=0,r=t.length;for(a=[];++e<r;)a[e-1]=ta.quantile(n,e/r);return o}function o(n){return isNaN(n=+n)?void 0:t[ta.bisect(a,n)]}var a;return o.domain=function(t){return arguments.length?(n=t.map(r).filter(u).sort(e),i()):n},o.range=function(n){return arguments.length?(t=n,i()):t},o.quantiles=function(){return a},o.invertExtent=function(e){return e=t.indexOf(e),0>e?[0/0,0/0]:[e>0?a[e-1]:n[0],e<a.length?a[e]:n[n.length-1]]},o.copy=function(){return no(n,t)},i()}function to(n,t,e){function r(t){return e[Math.max(0,Math.min(o,Math.floor(i*(t-n))))]}function u(){return i=e.length/(t-n),o=e.length-1,r}var i,o;return r.domain=function(e){return arguments.length?(n=+e[0],t=+e[e.length-1],u()):[n,t]},r.range=function(n){return arguments.length?(e=n,u()):e},r.invertExtent=function(t){return t=e.indexOf(t),t=0>t?0/0:t/i+n,[t,t+1/i]},r.copy=function(){return to(n,t,e)},u()}function eo(n,t){function e(e){return e>=e?t[ta.bisect(n,e)]:void 0}return e.domain=function(t){return arguments.length?(n=t,e):n},e.range=function(n){return arguments.length?(t=n,e):t},e.invertExtent=function(e){return e=t.indexOf(e),[n[e-1],n[e]]},e.copy=function(){return eo(n,t)},e}function ro(n){function t(n){return+n}return 
t.invert=t,t.domain=t.range=function(e){return arguments.length?(n=e.map(t),t):n},t.ticks=function(t){return Xi(n,t)},t.tickFormat=function(t,e){return $i(n,t,e)},t.copy=function(){return ro(n)},t}function uo(){return 0}function io(n){return n.innerRadius}function oo(n){return n.outerRadius}function ao(n){return n.startAngle}function co(n){return n.endAngle}function lo(n){return n&&n.padAngle}function so(n,t,e,r){return(n-e)*t-(t-r)*n>0?0:1}function fo(n,t,e,r,u){var i=n[0]-t[0],o=n[1]-t[1],a=(u?r:-r)/Math.sqrt(i*i+o*o),c=a*o,l=-a*i,s=n[0]+c,f=n[1]+l,h=t[0]+c,g=t[1]+l,p=(s+h)/2,v=(f+g)/2,d=h-s,m=g-f,y=d*d+m*m,M=e-r,x=s*g-h*f,b=(0>m?-1:1)*Math.sqrt(M*M*y-x*x),_=(x*m-d*b)/y,w=(-x*d-m*b)/y,S=(x*m+d*b)/y,k=(-x*d+m*b)/y,E=_-p,A=w-v,N=S-p,C=k-v;return E*E+A*A>N*N+C*C&&(_=S,w=k),[[_-c,w-l],[_*e/M,w*e/M]]}function ho(n){function t(t){function o(){l.push("M",i(n(s),a))}for(var c,l=[],s=[],f=-1,h=t.length,g=Et(e),p=Et(r);++f<h;)u.call(this,c=t[f],f)?s.push([+g.call(this,c,f),+p.call(this,c,f)]):s.length&&(o(),s=[]);return s.length&&o(),l.length?l.join(""):null}var e=Ar,r=Nr,u=Ne,i=go,o=i.key,a=.7;return t.x=function(n){return arguments.length?(e=n,t):e},t.y=function(n){return arguments.length?(r=n,t):r},t.defined=function(n){return arguments.length?(u=n,t):u},t.interpolate=function(n){return arguments.length?(o="function"==typeof n?i=n:(i=El.get(n)||go).key,t):o},t.tension=function(n){return arguments.length?(a=n,t):a},t}function go(n){return n.join("L")}function po(n){return go(n)+"Z"}function vo(n){for(var t=0,e=n.length,r=n[0],u=[r[0],",",r[1]];++t<e;)u.push("H",(r[0]+(r=n[t])[0])/2,"V",r[1]);return e>1&&u.push("H",r[0]),u.join("")}function mo(n){for(var t=0,e=n.length,r=n[0],u=[r[0],",",r[1]];++t<e;)u.push("V",(r=n[t])[1],"H",r[0]);return u.join("")}function yo(n){for(var t=0,e=n.length,r=n[0],u=[r[0],",",r[1]];++t<e;)u.push("H",(r=n[t])[0],"V",r[1]);return u.join("")}function Mo(n,t){return n.length<4?go(n):n[1]+_o(n.slice(1,-1),wo(n,t))}function xo(n,t){return 
n.length<3?go(n):n[0]+_o((n.push(n[0]),n),wo([n[n.length-2]].concat(n,[n[1]]),t))}function bo(n,t){return n.length<3?go(n):n[0]+_o(n,wo(n,t))}function _o(n,t){if(t.length<1||n.length!=t.length&&n.length!=t.length+2)return go(n);var e=n.length!=t.length,r="",u=n[0],i=n[1],o=t[0],a=o,c=1;if(e&&(r+="Q"+(i[0]-2*o[0]/3)+","+(i[1]-2*o[1]/3)+","+i[0]+","+i[1],u=n[1],c=2),t.length>1){a=t[1],i=n[c],c++,r+="C"+(u[0]+o[0])+","+(u[1]+o[1])+","+(i[0]-a[0])+","+(i[1]-a[1])+","+i[0]+","+i[1];for(var l=2;l<t.length;l++,c++)i=n[c],a=t[l],r+="S"+(i[0]-a[0])+","+(i[1]-a[1])+","+i[0]+","+i[1]}if(e){var s=n[c];r+="Q"+(i[0]+2*a[0]/3)+","+(i[1]+2*a[1]/3)+","+s[0]+","+s[1]}return r}function wo(n,t){for(var e,r=[],u=(1-t)/2,i=n[0],o=n[1],a=1,c=n.length;++a<c;)e=i,i=o,o=n[a],r.push([u*(o[0]-e[0]),u*(o[1]-e[1])]);return r}function So(n){if(n.length<3)return go(n);var t=1,e=n.length,r=n[0],u=r[0],i=r[1],o=[u,u,u,(r=n[1])[0]],a=[i,i,i,r[1]],c=[u,",",i,"L",No(Cl,o),",",No(Cl,a)];for(n.push(n[e-1]);++t<=e;)r=n[t],o.shift(),o.push(r[0]),a.shift(),a.push(r[1]),Co(c,o,a);return n.pop(),c.push("L",r),c.join("")}function ko(n){if(n.length<4)return go(n);for(var t,e=[],r=-1,u=n.length,i=[0],o=[0];++r<3;)t=n[r],i.push(t[0]),o.push(t[1]);for(e.push(No(Cl,i)+","+No(Cl,o)),--r;++r<u;)t=n[r],i.shift(),i.push(t[0]),o.shift(),o.push(t[1]),Co(e,i,o);return e.join("")}function Eo(n){for(var t,e,r=-1,u=n.length,i=u+4,o=[],a=[];++r<4;)e=n[r%u],o.push(e[0]),a.push(e[1]);for(t=[No(Cl,o),",",No(Cl,a)],--r;++r<i;)e=n[r%u],o.shift(),o.push(e[0]),a.shift(),a.push(e[1]),Co(t,o,a);return t.join("")}function Ao(n,t){var e=n.length-1;if(e)for(var r,u,i=n[0][0],o=n[0][1],a=n[e][0]-i,c=n[e][1]-o,l=-1;++l<=e;)r=n[l],u=l/e,r[0]=t*r[0]+(1-t)*(i+u*a),r[1]=t*r[1]+(1-t)*(o+u*c);return So(n)}function No(n,t){return n[0]*t[0]+n[1]*t[1]+n[2]*t[2]+n[3]*t[3]}function Co(n,t,e){n.push("C",No(Al,t),",",No(Al,e),",",No(Nl,t),",",No(Nl,e),",",No(Cl,t),",",No(Cl,e))}function zo(n,t){return(t[1]-n[1])/(t[0]-n[0])}function qo(n){for(var 
t=0,e=n.length-1,r=[],u=n[0],i=n[1],o=r[0]=zo(u,i);++t<e;)r[t]=(o+(o=zo(u=i,i=n[t+1])))/2;return r[t]=o,r}function Lo(n){for(var t,e,r,u,i=[],o=qo(n),a=-1,c=n.length-1;++a<c;)t=zo(n[a],n[a+1]),ga(t)<Ca?o[a]=o[a+1]=0:(e=o[a]/t,r=o[a+1]/t,u=e*e+r*r,u>9&&(u=3*t/Math.sqrt(u),o[a]=u*e,o[a+1]=u*r));for(a=-1;++a<=c;)u=(n[Math.min(c,a+1)][0]-n[Math.max(0,a-1)][0])/(6*(1+o[a]*o[a])),i.push([u||0,o[a]*u||0]);return i}function To(n){return n.length<3?go(n):n[0]+_o(n,Lo(n))}function Ro(n){for(var t,e,r,u=-1,i=n.length;++u<i;)t=n[u],e=t[0],r=t[1]-Ra,t[0]=e*Math.cos(r),t[1]=e*Math.sin(r);return n}function Do(n){function t(t){function c(){v.push("M",a(n(m),f),s,l(n(d.reverse()),f),"Z")}for(var h,g,p,v=[],d=[],m=[],y=-1,M=t.length,x=Et(e),b=Et(u),_=e===r?function(){return g}:Et(r),w=u===i?function(){return p}:Et(i);++y<M;)o.call(this,h=t[y],y)?(d.push([g=+x.call(this,h,y),p=+b.call(this,h,y)]),m.push([+_.call(this,h,y),+w.call(this,h,y)])):d.length&&(c(),d=[],m=[]);return d.length&&c(),v.length?v.join(""):null}var e=Ar,r=Ar,u=0,i=Nr,o=Ne,a=go,c=a.key,l=a,s="L",f=.7;return t.x=function(n){return arguments.length?(e=r=n,t):r},t.x0=function(n){return arguments.length?(e=n,t):e},t.x1=function(n){return arguments.length?(r=n,t):r },t.y=function(n){return arguments.length?(u=i=n,t):i},t.y0=function(n){return arguments.length?(u=n,t):u},t.y1=function(n){return arguments.length?(i=n,t):i},t.defined=function(n){return arguments.length?(o=n,t):o},t.interpolate=function(n){return arguments.length?(c="function"==typeof n?a=n:(a=El.get(n)||go).key,l=a.reverse||a,s=a.closed?"M":"L",t):c},t.tension=function(n){return arguments.length?(f=n,t):f},t}function Po(n){return n.radius}function Uo(n){return[n.x,n.y]}function jo(n){return function(){var t=n.apply(this,arguments),e=t[0],r=t[1]-Ra;return[e*Math.cos(r),e*Math.sin(r)]}}function Fo(){return 64}function Ho(){return"circle"}function Oo(n){var t=Math.sqrt(n/qa);return"M0,"+t+"A"+t+","+t+" 0 1,1 0,"+-t+"A"+t+","+t+" 0 1,1 0,"+t+"Z"}function 
Io(n){return function(){var t,e;(t=this[n])&&(e=t[t.active])&&(--t.count?delete t[t.active]:delete this[n],t.active+=.5,e.event&&e.event.interrupt.call(this,this.__data__,e.index))}}function Yo(n,t,e){return ya(n,Pl),n.namespace=t,n.id=e,n}function Zo(n,t,e,r){var u=n.id,i=n.namespace;return Y(n,"function"==typeof e?function(n,o,a){n[i][u].tween.set(t,r(e.call(n,n.__data__,o,a)))}:(e=r(e),function(n){n[i][u].tween.set(t,e)}))}function Vo(n){return null==n&&(n=""),function(){this.textContent=n}}function Xo(n){return null==n?"__transition__":"__transition_"+n+"__"}function $o(n,t,e,r,u){var i=n[e]||(n[e]={active:0,count:0}),o=i[r];if(!o){var a=u.time;o=i[r]={tween:new l,time:a,delay:u.delay,duration:u.duration,ease:u.ease,index:t},u=null,++i.count,ta.timer(function(u){function c(e){if(i.active>r)return s();var u=i[i.active];u&&(--i.count,delete i[i.active],u.event&&u.event.interrupt.call(n,n.__data__,u.index)),i.active=r,o.event&&o.event.start.call(n,n.__data__,t),o.tween.forEach(function(e,r){(r=r.call(n,n.__data__,t))&&v.push(r)}),h=o.ease,f=o.duration,ta.timer(function(){return p.c=l(e||1)?Ne:l,1},0,a)}function l(e){if(i.active!==r)return 1;for(var u=e/f,a=h(u),c=v.length;c>0;)v[--c].call(n,a);return u>=1?(o.event&&o.event.end.call(n,n.__data__,t),s()):void 0}function s(){return--i.count?delete i[r]:delete n[e],1}var f,h,g=o.delay,p=ec,v=[];return p.t=g+a,u>=g?c(u-g):void(p.c=c)},0,a)}}function Bo(n,t,e){n.attr("transform",function(n){var r=t(n);return"translate("+(isFinite(r)?r:e(n))+",0)"})}function Wo(n,t,e){n.attr("transform",function(n){var r=t(n);return"translate(0,"+(isFinite(r)?r:e(n))+")"})}function Jo(n){return n.toISOString()}function Go(n,t,e){function r(t){return n(t)}function u(n,e){var r=n[1]-n[0],u=r/e,i=ta.bisect(Vl,u);return i==Vl.length?[t.year,Vi(n.map(function(n){return n/31536e6}),e)[2]]:i?t[u/Vl[i-1]<Vl[i]/u?i-1:i]:[Bl,Vi(n,e)[2]]}return r.invert=function(t){return Ko(n.invert(t))},r.domain=function(t){return 
arguments.length?(n.domain(t),r):n.domain().map(Ko)},r.nice=function(n,t){function e(e){return!isNaN(e)&&!n.range(e,Ko(+e+1),t).length}var i=r.domain(),o=Pi(i),a=null==n?u(o,10):"number"==typeof n&&u(o,n);return a&&(n=a[0],t=a[1]),r.domain(Fi(i,t>1?{floor:function(t){for(;e(t=n.floor(t));)t=Ko(t-1);return t},ceil:function(t){for(;e(t=n.ceil(t));)t=Ko(+t+1);return t}}:n))},r.ticks=function(n,t){var e=Pi(r.domain()),i=null==n?u(e,10):"number"==typeof n?u(e,n):!n.range&&[{range:n},t];return i&&(n=i[0],t=i[1]),n.range(e[0],Ko(+e[1]+1),1>t?1:t)},r.tickFormat=function(){return e},r.copy=function(){return Go(n.copy(),t,e)},Yi(r,n)}function Ko(n){return new Date(n)}function Qo(n){return JSON.parse(n.responseText)}function na(n){var t=ua.createRange();return t.selectNode(ua.body),t.createContextualFragment(n.responseText)}var ta={version:"3.5.5"},ea=[].slice,ra=function(n){return ea.call(n)},ua=this.document;if(ua)try{ra(ua.documentElement.childNodes)[0].nodeType}catch(ia){ra=function(n){for(var t=n.length,e=new Array(t);t--;)e[t]=n[t];return e}}if(Date.now||(Date.now=function(){return+new Date}),ua)try{ua.createElement("DIV").style.setProperty("opacity",0,"")}catch(oa){var aa=this.Element.prototype,ca=aa.setAttribute,la=aa.setAttributeNS,sa=this.CSSStyleDeclaration.prototype,fa=sa.setProperty;aa.setAttribute=function(n,t){ca.call(this,n,t+"")},aa.setAttributeNS=function(n,t,e){la.call(this,n,t,e+"")},sa.setProperty=function(n,t,e){fa.call(this,n,t+"",e)}}ta.ascending=e,ta.descending=function(n,t){return n>t?-1:t>n?1:t>=n?0:0/0},ta.min=function(n,t){var e,r,u=-1,i=n.length;if(1===arguments.length){for(;++u<i;)if(null!=(r=n[u])&&r>=r){e=r;break}for(;++u<i;)null!=(r=n[u])&&e>r&&(e=r)}else{for(;++u<i;)if(null!=(r=t.call(n,n[u],u))&&r>=r){e=r;break}for(;++u<i;)null!=(r=t.call(n,n[u],u))&&e>r&&(e=r)}return e},ta.max=function(n,t){var 
e,r,u=-1,i=n.length;if(1===arguments.length){for(;++u<i;)if(null!=(r=n[u])&&r>=r){e=r;break}for(;++u<i;)null!=(r=n[u])&&r>e&&(e=r)}else{for(;++u<i;)if(null!=(r=t.call(n,n[u],u))&&r>=r){e=r;break}for(;++u<i;)null!=(r=t.call(n,n[u],u))&&r>e&&(e=r)}return e},ta.extent=function(n,t){var e,r,u,i=-1,o=n.length;if(1===arguments.length){for(;++i<o;)if(null!=(r=n[i])&&r>=r){e=u=r;break}for(;++i<o;)null!=(r=n[i])&&(e>r&&(e=r),r>u&&(u=r))}else{for(;++i<o;)if(null!=(r=t.call(n,n[i],i))&&r>=r){e=u=r;break}for(;++i<o;)null!=(r=t.call(n,n[i],i))&&(e>r&&(e=r),r>u&&(u=r))}return[e,u]},ta.sum=function(n,t){var e,r=0,i=n.length,o=-1;if(1===arguments.length)for(;++o<i;)u(e=+n[o])&&(r+=e);else for(;++o<i;)u(e=+t.call(n,n[o],o))&&(r+=e);return r},ta.mean=function(n,t){var e,i=0,o=n.length,a=-1,c=o;if(1===arguments.length)for(;++a<o;)u(e=r(n[a]))?i+=e:--c;else for(;++a<o;)u(e=r(t.call(n,n[a],a)))?i+=e:--c;return c?i/c:void 0},ta.quantile=function(n,t){var e=(n.length-1)*t+1,r=Math.floor(e),u=+n[r-1],i=e-r;return i?u+i*(n[r]-u):u},ta.median=function(n,t){var i,o=[],a=n.length,c=-1;if(1===arguments.length)for(;++c<a;)u(i=r(n[c]))&&o.push(i);else for(;++c<a;)u(i=r(t.call(n,n[c],c)))&&o.push(i);return o.length?ta.quantile(o.sort(e),.5):void 0},ta.variance=function(n,t){var e,i,o=n.length,a=0,c=0,l=-1,s=0;if(1===arguments.length)for(;++l<o;)u(e=r(n[l]))&&(i=e-a,a+=i/++s,c+=i*(e-a));else for(;++l<o;)u(e=r(t.call(n,n[l],l)))&&(i=e-a,a+=i/++s,c+=i*(e-a));return s>1?c/(s-1):void 0},ta.deviation=function(){var n=ta.variance.apply(this,arguments);return n?Math.sqrt(n):n};var ha=i(e);ta.bisectLeft=ha.left,ta.bisect=ta.bisectRight=ha.right,ta.bisector=function(n){return i(1===n.length?function(t,r){return e(n(t),r)}:n)},ta.shuffle=function(n,t,e){(i=arguments.length)<3&&(e=n.length,2>i&&(t=0));for(var r,u,i=e-t;i;)u=Math.random()*i--|0,r=n[i+t],n[i+t]=n[u+t],n[u+t]=r;return n},ta.permute=function(n,t){for(var e=t.length,r=new Array(e);e--;)r[e]=n[t[e]];return r},ta.pairs=function(n){for(var 
t,e=0,r=n.length-1,u=n[0],i=new Array(0>r?0:r);r>e;)i[e]=[t=u,u=n[++e]];return i},ta.zip=function(){if(!(r=arguments.length))return[];for(var n=-1,t=ta.min(arguments,o),e=new Array(t);++n<t;)for(var r,u=-1,i=e[n]=new Array(r);++u<r;)i[u]=arguments[u][n];return e},ta.transpose=function(n){return ta.zip.apply(ta,n)},ta.keys=function(n){var t=[];for(var e in n)t.push(e);return t},ta.values=function(n){var t=[];for(var e in n)t.push(n[e]);return t},ta.entries=function(n){var t=[];for(var e in n)t.push({key:e,value:n[e]});return t},ta.merge=function(n){for(var t,e,r,u=n.length,i=-1,o=0;++i<u;)o+=n[i].length;for(e=new Array(o);--u>=0;)for(r=n[u],t=r.length;--t>=0;)e[--o]=r[t];return e};var ga=Math.abs;ta.range=function(n,t,e){if(arguments.length<3&&(e=1,arguments.length<2&&(t=n,n=0)),(t-n)/e===1/0)throw new Error("infinite range");var r,u=[],i=a(ga(e)),o=-1;if(n*=i,t*=i,e*=i,0>e)for(;(r=n+e*++o)>t;)u.push(r/i);else for(;(r=n+e*++o)<t;)u.push(r/i);return u},ta.map=function(n,t){var e=new l;if(n instanceof l)n.forEach(function(n,t){e.set(n,t)});else if(Array.isArray(n)){var r,u=-1,i=n.length;if(1===arguments.length)for(;++u<i;)e.set(u,n[u]);else for(;++u<i;)e.set(t.call(n,r=n[u],u),r)}else for(var o in n)e.set(o,n[o]);return e};var pa="__proto__",va="\x00";c(l,{has:h,get:function(n){return this._[s(n)]},set:function(n,t){return this._[s(n)]=t},remove:g,keys:p,values:function(){var n=[];for(var t in this._)n.push(this._[t]);return n},entries:function(){var n=[];for(var t in this._)n.push({key:f(t),value:this._[t]});return n},size:v,empty:d,forEach:function(n){for(var t in this._)n.call(this,f(t),this._[t])}}),ta.nest=function(){function n(t,o,a){if(a>=i.length)return r?r.call(u,o):e?o.sort(e):o;for(var c,s,f,h,g=-1,p=o.length,v=i[a++],d=new l;++g<p;)(h=d.get(c=v(s=o[g])))?h.push(s):d.set(c,[s]);return t?(s=t(),f=function(e,r){s.set(e,n(t,r,a))}):(s={},f=function(e,r){s[e]=n(t,r,a)}),d.forEach(f),s}function t(n,e){if(e>=i.length)return n;var r=[],u=o[e++];return 
n.forEach(function(n,u){r.push({key:n,values:t(u,e)})}),u?r.sort(function(n,t){return u(n.key,t.key)}):r}var e,r,u={},i=[],o=[];return u.map=function(t,e){return n(e,t,0)},u.entries=function(e){return t(n(ta.map,e,0),0)},u.key=function(n){return i.push(n),u},u.sortKeys=function(n){return o[i.length-1]=n,u},u.sortValues=function(n){return e=n,u},u.rollup=function(n){return r=n,u},u},ta.set=function(n){var t=new m;if(n)for(var e=0,r=n.length;r>e;++e)t.add(n[e]);return t},c(m,{has:h,add:function(n){return this._[s(n+="")]=!0,n},remove:g,values:p,size:v,empty:d,forEach:function(n){for(var t in this._)n.call(this,f(t))}}),ta.behavior={},ta.rebind=function(n,t){for(var e,r=1,u=arguments.length;++r<u;)n[e=arguments[r]]=M(n,t,t[e]);return n};var da=["webkit","ms","moz","Moz","o","O"];ta.dispatch=function(){for(var n=new _,t=-1,e=arguments.length;++t<e;)n[arguments[t]]=w(n);return n},_.prototype.on=function(n,t){var e=n.indexOf("."),r="";if(e>=0&&(r=n.slice(e+1),n=n.slice(0,e)),n)return arguments.length<2?this[n].on(r):this[n].on(r,t);if(2===arguments.length){if(null==t)for(n in this)this.hasOwnProperty(n)&&this[n].on(r,null);return this}},ta.event=null,ta.requote=function(n){return n.replace(ma,"\\$&")};var ma=/[\\\^\$\*\+\?\|\[\]\(\)\.\{\}]/g,ya={}.__proto__?function(n,t){n.__proto__=t}:function(n,t){for(var e in t)n[e]=t[e]},Ma=function(n,t){return t.querySelector(n)},xa=function(n,t){return t.querySelectorAll(n)},ba=function(n,t){var e=n.matches||n[x(n,"matchesSelector")];return(ba=function(n,t){return e.call(n,t)})(n,t)};"function"==typeof Sizzle&&(Ma=function(n,t){return Sizzle(n,t)[0]||null},xa=Sizzle,ba=Sizzle.matchesSelector),ta.selection=function(){return ta.select(ua.documentElement)};var _a=ta.selection.prototype=[];_a.select=function(n){var t,e,r,u,i=[];n=N(n);for(var o=-1,a=this.length;++o<a;){i.push(t=[]),t.parentNode=(r=this[o]).parentNode;for(var c=-1,l=r.length;++c<l;)(u=r[c])?(t.push(e=n.call(u,u.__data__,c,o)),e&&"__data__"in 
u&&(e.__data__=u.__data__)):t.push(null)}return A(i)},_a.selectAll=function(n){var t,e,r=[];n=C(n);for(var u=-1,i=this.length;++u<i;)for(var o=this[u],a=-1,c=o.length;++a<c;)(e=o[a])&&(r.push(t=ra(n.call(e,e.__data__,a,u))),t.parentNode=e);return A(r)};var wa={svg:"http://www.w3.org/2000/svg",xhtml:"http://www.w3.org/1999/xhtml",xlink:"http://www.w3.org/1999/xlink",xml:"http://www.w3.org/XML/1998/namespace",xmlns:"http://www.w3.org/2000/xmlns/"};ta.ns={prefix:wa,qualify:function(n){var t=n.indexOf(":"),e=n;return t>=0&&(e=n.slice(0,t),n=n.slice(t+1)),wa.hasOwnProperty(e)?{space:wa[e],local:n}:n}},_a.attr=function(n,t){if(arguments.length<2){if("string"==typeof n){var e=this.node();return n=ta.ns.qualify(n),n.local?e.getAttributeNS(n.space,n.local):e.getAttribute(n)}for(t in n)this.each(z(t,n[t]));return this}return this.each(z(n,t))},_a.classed=function(n,t){if(arguments.length<2){if("string"==typeof n){var e=this.node(),r=(n=T(n)).length,u=-1;if(t=e.classList){for(;++u<r;)if(!t.contains(n[u]))return!1}else for(t=e.getAttribute("class");++u<r;)if(!L(n[u]).test(t))return!1;return!0}for(t in n)this.each(R(t,n[t]));return this}return this.each(R(n,t))},_a.style=function(n,e,r){var u=arguments.length;if(3>u){if("string"!=typeof n){2>u&&(e="");for(r in n)this.each(P(r,n[r],e));return this}if(2>u){var i=this.node();return t(i).getComputedStyle(i,null).getPropertyValue(n)}r=""}return this.each(P(n,e,r))},_a.property=function(n,t){if(arguments.length<2){if("string"==typeof n)return this.node()[n];for(t in n)this.each(U(t,n[t]));return this}return this.each(U(n,t))},_a.text=function(n){return arguments.length?this.each("function"==typeof n?function(){var t=n.apply(this,arguments);this.textContent=null==t?"":t}:null==n?function(){this.textContent=""}:function(){this.textContent=n}):this.node().textContent},_a.html=function(n){return arguments.length?this.each("function"==typeof n?function(){var 
t=n.apply(this,arguments);this.innerHTML=null==t?"":t}:null==n?function(){this.innerHTML=""}:function(){this.innerHTML=n}):this.node().innerHTML},_a.append=function(n){return n=j(n),this.select(function(){return this.appendChild(n.apply(this,arguments))})},_a.insert=function(n,t){return n=j(n),t=N(t),this.select(function(){return this.insertBefore(n.apply(this,arguments),t.apply(this,arguments)||null)})},_a.remove=function(){return this.each(F)},_a.data=function(n,t){function e(n,e){var r,u,i,o=n.length,f=e.length,h=Math.min(o,f),g=new Array(f),p=new Array(f),v=new Array(o);if(t){var d,m=new l,y=new Array(o);for(r=-1;++r<o;)m.has(d=t.call(u=n[r],u.__data__,r))?v[r]=u:m.set(d,u),y[r]=d;for(r=-1;++r<f;)(u=m.get(d=t.call(e,i=e[r],r)))?u!==!0&&(g[r]=u,u.__data__=i):p[r]=H(i),m.set(d,!0);for(r=-1;++r<o;)m.get(y[r])!==!0&&(v[r]=n[r])}else{for(r=-1;++r<h;)u=n[r],i=e[r],u?(u.__data__=i,g[r]=u):p[r]=H(i);for(;f>r;++r)p[r]=H(e[r]);for(;o>r;++r)v[r]=n[r]}p.update=g,p.parentNode=g.parentNode=v.parentNode=n.parentNode,a.push(p),c.push(g),s.push(v)}var r,u,i=-1,o=this.length;if(!arguments.length){for(n=new Array(o=(r=this[0]).length);++i<o;)(u=r[i])&&(n[i]=u.__data__);return n}var a=Z([]),c=A([]),s=A([]);if("function"==typeof n)for(;++i<o;)e(r=this[i],n.call(r,r.parentNode.__data__,i));else for(;++i<o;)e(r=this[i],n);return c.enter=function(){return a},c.exit=function(){return s},c},_a.datum=function(n){return arguments.length?this.property("__data__",n):this.property("__data__")},_a.filter=function(n){var t,e,r,u=[];"function"!=typeof n&&(n=O(n));for(var i=0,o=this.length;o>i;i++){u.push(t=[]),t.parentNode=(e=this[i]).parentNode;for(var a=0,c=e.length;c>a;a++)(r=e[a])&&n.call(r,r.__data__,a,i)&&t.push(r)}return A(u)},_a.order=function(){for(var n=-1,t=this.length;++n<t;)for(var e,r=this[n],u=r.length-1,i=r[u];--u>=0;)(e=r[u])&&(i&&i!==e.nextSibling&&i.parentNode.insertBefore(e,i),i=e);return this},_a.sort=function(n){n=I.apply(this,arguments);for(var 
t=-1,e=this.length;++t<e;)this[t].sort(n);return this.order()},_a.each=function(n){return Y(this,function(t,e,r){n.call(t,t.__data__,e,r)})},_a.call=function(n){var t=ra(arguments);return n.apply(t[0]=this,t),this},_a.empty=function(){return!this.node()},_a.node=function(){for(var n=0,t=this.length;t>n;n++)for(var e=this[n],r=0,u=e.length;u>r;r++){var i=e[r];if(i)return i}return null},_a.size=function(){var n=0;return Y(this,function(){++n}),n};var Sa=[];ta.selection.enter=Z,ta.selection.enter.prototype=Sa,Sa.append=_a.append,Sa.empty=_a.empty,Sa.node=_a.node,Sa.call=_a.call,Sa.size=_a.size,Sa.select=function(n){for(var t,e,r,u,i,o=[],a=-1,c=this.length;++a<c;){r=(u=this[a]).update,o.push(t=[]),t.parentNode=u.parentNode;for(var l=-1,s=u.length;++l<s;)(i=u[l])?(t.push(r[l]=e=n.call(u.parentNode,i.__data__,l,a)),e.__data__=i.__data__):t.push(null)}return A(o)},Sa.insert=function(n,t){return arguments.length<2&&(t=V(this)),_a.insert.call(this,n,t)},ta.select=function(t){var e;return"string"==typeof t?(e=[Ma(t,ua)],e.parentNode=ua.documentElement):(e=[t],e.parentNode=n(t)),A([e])},ta.selectAll=function(n){var t;return"string"==typeof n?(t=ra(xa(n,ua)),t.parentNode=ua.documentElement):(t=n,t.parentNode=null),A([t])},_a.on=function(n,t,e){var r=arguments.length;if(3>r){if("string"!=typeof n){2>r&&(t=!1);for(e in n)this.each(X(e,n[e],t));return this}if(2>r)return(r=this.node()["__on"+n])&&r._;e=!1}return this.each(X(n,t,e))};var ka=ta.map({mouseenter:"mouseover",mouseleave:"mouseout"});ua&&ka.forEach(function(n){"on"+n in ua&&ka.remove(n)});var Ea,Aa=0;ta.mouse=function(n){return J(n,k())};var Na=this.navigator&&/WebKit/.test(this.navigator.userAgent)?-1:0;ta.touch=function(n,t,e){if(arguments.length<3&&(e=t,t=k().changedTouches),t)for(var r,u=0,i=t.length;i>u;++u)if((r=t[u]).identifier===e)return J(n,r)},ta.behavior.drag=function(){function n(){this.on("mousedown.drag",i).on("touchstart.drag",o)}function e(n,t,e,i,o){return function(){function a(){var 
n,e,r=t(h,v);r&&(n=r[0]-M[0],e=r[1]-M[1],p|=n|e,M=r,g({type:"drag",x:r[0]+l[0],y:r[1]+l[1],dx:n,dy:e}))}function c(){t(h,v)&&(m.on(i+d,null).on(o+d,null),y(p&&ta.event.target===f),g({type:"dragend"}))}var l,s=this,f=ta.event.target,h=s.parentNode,g=r.of(s,arguments),p=0,v=n(),d=".drag"+(null==v?"":"-"+v),m=ta.select(e(f)).on(i+d,a).on(o+d,c),y=W(f),M=t(h,v);u?(l=u.apply(s,arguments),l=[l.x-M[0],l.y-M[1]]):l=[0,0],g({type:"dragstart"})}}var r=E(n,"drag","dragstart","dragend"),u=null,i=e(b,ta.mouse,t,"mousemove","mouseup"),o=e(G,ta.touch,y,"touchmove","touchend");return n.origin=function(t){return arguments.length?(u=t,n):u},ta.rebind(n,r,"on")},ta.touches=function(n,t){return arguments.length<2&&(t=k().touches),t?ra(t).map(function(t){var e=J(n,t);return e.identifier=t.identifier,e}):[]};var Ca=1e-6,za=Ca*Ca,qa=Math.PI,La=2*qa,Ta=La-Ca,Ra=qa/2,Da=qa/180,Pa=180/qa,Ua=Math.SQRT2,ja=2,Fa=4;ta.interpolateZoom=function(n,t){function e(n){var t=n*y;if(m){var e=rt(v),o=i/(ja*h)*(e*ut(Ua*t+v)-et(v));return[r+o*l,u+o*s,i*e/rt(Ua*t+v)]}return[r+n*l,u+n*s,i*Math.exp(Ua*t)]}var r=n[0],u=n[1],i=n[2],o=t[0],a=t[1],c=t[2],l=o-r,s=a-u,f=l*l+s*s,h=Math.sqrt(f),g=(c*c-i*i+Fa*f)/(2*i*ja*h),p=(c*c-i*i-Fa*f)/(2*c*ja*h),v=Math.log(Math.sqrt(g*g+1)-g),d=Math.log(Math.sqrt(p*p+1)-p),m=d-v,y=(m||Math.log(c/i))/Ua;return e.duration=1e3*y,e},ta.behavior.zoom=function(){function n(n){n.on(q,f).on(Oa+".zoom",g).on("dblclick.zoom",p).on(R,h)}function e(n){return[(n[0]-k.x)/k.k,(n[1]-k.y)/k.k]}function r(n){return[n[0]*k.k+k.x,n[1]*k.k+k.y]}function u(n){k.k=Math.max(N[0],Math.min(N[1],n))}function i(n,t){t=r(t),k.x+=n[0]-t[0],k.y+=n[1]-t[1]}function o(t,e,r,o){t.__chart__={x:k.x,y:k.y,k:k.k},u(Math.pow(2,o)),i(d=e,r),t=ta.select(t),C>0&&(t=t.transition().duration(C)),t.call(n.event)}function a(){b&&b.domain(x.range().map(function(n){return(n-k.x)/k.k}).map(x.invert)),w&&w.domain(_.range().map(function(n){return(n-k.y)/k.k}).map(_.invert))}function c(n){z++||n({type:"zoomstart"})}function 
l(n){a(),n({type:"zoom",scale:k.k,translate:[k.x,k.y]})}function s(n){--z||n({type:"zoomend"}),d=null}function f(){function n(){f=1,i(ta.mouse(u),g),l(a)}function r(){h.on(L,null).on(T,null),p(f&&ta.event.target===o),s(a)}var u=this,o=ta.event.target,a=D.of(u,arguments),f=0,h=ta.select(t(u)).on(L,n).on(T,r),g=e(ta.mouse(u)),p=W(u);Dl.call(u),c(a)}function h(){function n(){var n=ta.touches(p);return g=k.k,n.forEach(function(n){n.identifier in d&&(d[n.identifier]=e(n))}),n}function t(){var t=ta.event.target;ta.select(t).on(x,r).on(b,a),_.push(t);for(var e=ta.event.changedTouches,u=0,i=e.length;i>u;++u)d[e[u].identifier]=null;var c=n(),l=Date.now();if(1===c.length){if(500>l-M){var s=c[0];o(p,s,d[s.identifier],Math.floor(Math.log(k.k)/Math.LN2)+1),S()}M=l}else if(c.length>1){var s=c[0],f=c[1],h=s[0]-f[0],g=s[1]-f[1];m=h*h+g*g}}function r(){var n,t,e,r,o=ta.touches(p);Dl.call(p);for(var a=0,c=o.length;c>a;++a,r=null)if(e=o[a],r=d[e.identifier]){if(t)break;n=e,t=r}if(r){var s=(s=e[0]-n[0])*s+(s=e[1]-n[1])*s,f=m&&Math.sqrt(s/m);n=[(n[0]+e[0])/2,(n[1]+e[1])/2],t=[(t[0]+r[0])/2,(t[1]+r[1])/2],u(f*g)}M=null,i(n,t),l(v)}function a(){if(ta.event.touches.length){for(var t=ta.event.changedTouches,e=0,r=t.length;r>e;++e)delete d[t[e].identifier];for(var u in d)return void n()}ta.selectAll(_).on(y,null),w.on(q,f).on(R,h),E(),s(v)}var g,p=this,v=D.of(p,arguments),d={},m=0,y=".zoom-"+ta.event.changedTouches[0].identifier,x="touchmove"+y,b="touchend"+y,_=[],w=ta.select(p),E=W(p);t(),c(v),w.on(q,null).on(R,t)}function g(){var n=D.of(this,arguments);y?clearTimeout(y):(v=e(d=m||ta.mouse(this)),Dl.call(this),c(n)),y=setTimeout(function(){y=null,s(n)},50),S(),u(Math.pow(2,.002*Ha())*k.k),i(d,v),l(n)}function p(){var n=ta.mouse(this),t=Math.log(k.k)/Math.LN2;o(this,n,e(n),ta.event.shiftKey?Math.ceil(t)-1:Math.floor(t)+1)}var 
v,d,m,y,M,x,b,_,w,k={x:0,y:0,k:1},A=[960,500],N=Ia,C=250,z=0,q="mousedown.zoom",L="mousemove.zoom",T="mouseup.zoom",R="touchstart.zoom",D=E(n,"zoomstart","zoom","zoomend");return Oa||(Oa="onwheel"in ua?(Ha=function(){return-ta.event.deltaY*(ta.event.deltaMode?120:1)},"wheel"):"onmousewheel"in ua?(Ha=function(){return ta.event.wheelDelta},"mousewheel"):(Ha=function(){return-ta.event.detail},"MozMousePixelScroll")),n.event=function(n){n.each(function(){var n=D.of(this,arguments),t=k;Tl?ta.select(this).transition().each("start.zoom",function(){k=this.__chart__||{x:0,y:0,k:1},c(n)}).tween("zoom:zoom",function(){var e=A[0],r=A[1],u=d?d[0]:e/2,i=d?d[1]:r/2,o=ta.interpolateZoom([(u-k.x)/k.k,(i-k.y)/k.k,e/k.k],[(u-t.x)/t.k,(i-t.y)/t.k,e/t.k]);return function(t){var r=o(t),a=e/r[2];this.__chart__=k={x:u-r[0]*a,y:i-r[1]*a,k:a},l(n)}}).each("interrupt.zoom",function(){s(n)}).each("end.zoom",function(){s(n)}):(this.__chart__=k,c(n),l(n),s(n))})},n.translate=function(t){return arguments.length?(k={x:+t[0],y:+t[1],k:k.k},a(),n):[k.x,k.y]},n.scale=function(t){return arguments.length?(k={x:k.x,y:k.y,k:+t},a(),n):k.k},n.scaleExtent=function(t){return arguments.length?(N=null==t?Ia:[+t[0],+t[1]],n):N},n.center=function(t){return arguments.length?(m=t&&[+t[0],+t[1]],n):m},n.size=function(t){return arguments.length?(A=t&&[+t[0],+t[1]],n):A},n.duration=function(t){return arguments.length?(C=+t,n):C},n.x=function(t){return arguments.length?(b=t,x=t.copy(),k={x:0,y:0,k:1},n):b},n.y=function(t){return arguments.length?(w=t,_=t.copy(),k={x:0,y:0,k:1},n):w},ta.rebind(n,D,"on")};var Ha,Oa,Ia=[0,1/0];ta.color=ot,ot.prototype.toString=function(){return this.rgb()+""},ta.hsl=at;var Ya=at.prototype=new ot;Ya.brighter=function(n){return n=Math.pow(.7,arguments.length?n:1),new at(this.h,this.s,this.l/n)},Ya.darker=function(n){return n=Math.pow(.7,arguments.length?n:1),new at(this.h,this.s,n*this.l)},Ya.rgb=function(){return ct(this.h,this.s,this.l)},ta.hcl=lt;var Za=lt.prototype=new 
ot;Za.brighter=function(n){return new lt(this.h,this.c,Math.min(100,this.l+Va*(arguments.length?n:1)))},Za.darker=function(n){return new lt(this.h,this.c,Math.max(0,this.l-Va*(arguments.length?n:1)))},Za.rgb=function(){return st(this.h,this.c,this.l).rgb()},ta.lab=ft;var Va=18,Xa=.95047,$a=1,Ba=1.08883,Wa=ft.prototype=new ot;Wa.brighter=function(n){return new ft(Math.min(100,this.l+Va*(arguments.length?n:1)),this.a,this.b)},Wa.darker=function(n){return new ft(Math.max(0,this.l-Va*(arguments.length?n:1)),this.a,this.b)},Wa.rgb=function(){return ht(this.l,this.a,this.b)},ta.rgb=mt;var Ja=mt.prototype=new ot;Ja.brighter=function(n){n=Math.pow(.7,arguments.length?n:1);var t=this.r,e=this.g,r=this.b,u=30;return t||e||r?(t&&u>t&&(t=u),e&&u>e&&(e=u),r&&u>r&&(r=u),new mt(Math.min(255,t/n),Math.min(255,e/n),Math.min(255,r/n))):new mt(u,u,u)},Ja.darker=function(n){return n=Math.pow(.7,arguments.length?n:1),new mt(n*this.r,n*this.g,n*this.b)},Ja.hsl=function(){return _t(this.r,this.g,this.b)},Ja.toString=function(){return"#"+xt(this.r)+xt(this.g)+xt(this.b)};var 
diff --git a/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/librariesList.ssp b/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/librariesList.ssp
index 442958b2c5ca531b87d40dedde06b112e7f7bec4..b3cd1060ff9ae22ea31496e1aab329d07cbeb6ae 100644
--- a/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/librariesList.ssp
+++ b/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/librariesList.ssp
@@ -6,7 +6,7 @@
 <table class="table">
 <thead><tr><th>Libraries</th></tr></thead>
 <tbody>
-#for (lib <- summary.libraries(sampleId.get))
+#for (lib <- summary.libraries(sampleId.get).toList.sorted)
 <tr><td><a href="${rootPath}Samples/${sampleId}/Libraries/${lib}/index.html">${lib}</a></td></tr>
 #end
 </tbody>
diff --git a/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/main.ssp b/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/main.ssp
index 356ddc00ecd606bdf3753d078d9eb58f36bad4fe..0544a334088d0e9bd0970d518860295014ee1891 100644
--- a/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/main.ssp
+++ b/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/main.ssp
@@ -12,22 +12,37 @@
 val buffer: StringBuffer = new StringBuffer()
 
 if (page.subPages.nonEmpty){
-  buffer.append("<ul class=\"dropdown-menu\">")
+  buffer.append("<ul class=\"dropdown-menu list-group\">")
 }
-for (subPage <- page.subPages) {
+for (subPage <- page.subPages.sortBy(_._1)) {
+
 val href: String = {
   if (path.isEmpty) rootPath + subPage._1 + "/index.html"
   else rootPath + path.mkString("","/","/") + subPage._1 + "/index.html"
 }
-  buffer.append("<li")
-  if(subPage._2.subPages.nonEmpty) buffer.append(" class=\"dropdown-submenu\"")
-  buffer.append("><a href=\"" + href + "\"")
-  if (first) buffer.append(" tabindex=\"-1\"")
-  buffer.append(">" + subPage._1 + "</a>")
-  buffer.append(createMenu(subPage._2, path ::: subPage._1 :: Nil, first = false))
-  buffer.append("</li>")
+//  buffer.append("<li")
+//  if(subPage._2.subPages.nonEmpty) buffer.append(" class=\"dropdown-submenu list-group-item\"")
+//  buffer.append("><span class=\"badge\">%d</span><a href=\"%s\"".format(subPage._2.subPages.size, href))
+//  if (first) buffer.append(" tabindex=\"-1\"")
+//  buffer.append(">%s</a>".format(subPage._1))
+//  buffer.append(createMenu(subPage._2, path ::: subPage._1 :: Nil, first = false))
+//  buffer.append("</li>")
+
+  val listSubmenu = if(subPage._2.subPages.nonEmpty) "dropdown-submenu" else ""
+//  val subMenuBadgeCount = if(subPage._2.subPages.nonEmpty && first) "<span class='badge'>%d</span>".format(subPage._2.subPages.size) else ""
+  val tabIndex = if (first) " tabindex='-1'" else ""
+//  val listGroupA = if(subPage._2.subPages.nonEmpty) "list-group-item" else ""
+
+  var menuItem: String = "<li class='%s'>".format(listSubmenu) +
+    "<a href='%s' class='%s'%s>".format(href, "", tabIndex) +
+    "%s".format(subPage._1) +
+    "</a>" +
+    createMenu(subPage._2, path ::: subPage._1 :: Nil, first = false) +
+    "</li>"
+
+  buffer.append(menuItem)
+
 }
 if(page.subPages.nonEmpty) {
   buffer.append("</ul>\n")
@@ -74,7 +89,7 @@
 Sortable.init()
 
 $('body').scrollspy({
-  target: '.bs-sidebar',
+  target: '.bs-sidebar'
 });
 });
@@ -113,14 +128,14 @@
 <a href="${rootPath}index.html">Home #if (indexPage.subPages.nonEmpty) <b class="caret"></b> #end </a>
-${unescape(createMenu(indexPage))}
+${unescape(createMenu(indexPage, Nil, false))}
 </li>
 #else
 <li class="root #if (t == path.size) active #end">
 <a href="${rootPath}${path.slice(0,t).mkString("", "/", "/")}index.html">${path( t - 1 )} #if (getSubPage(path.slice(0, t)).subPages.nonEmpty) <b class="caret"></b> #end </a>
-${unescape(createMenu(getSubPage(path.slice(0, t)), path.slice(0, t)))}
+${unescape(createMenu(getSubPage(path.slice(0, t)), path.slice(0, t), false))}
 </li>
 #end
 #end
diff --git a/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/reference.ssp b/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/reference.ssp
index b12db503c8217bacf150c52a0e8c32b9949a4f6b..e42e85991c98ef05c97b0fcdcc14c19791c2468a 100644
--- a/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/reference.ssp
+++ b/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/reference.ssp
@@ -5,7 +5,7 @@
 <%@ var pipeline: String %>
 
 #{
-  val contigs = summary.getValue(pipeline, "settings", "reference", "contigs").get.asInstanceOf[Map[String, Map[String, Any]]]
+  val contigs = summary.getValue(pipeline, "settings", "reference", "contigs").getOrElse(Map.empty).asInstanceOf[Map[String, Map[String, Any]]]
 }#
 
 <table class="table">
diff --git a/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/samplesList.ssp b/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/samplesList.ssp
index 20f6945618a17055380a9289d082c1655939f872..a289560919beea4ed2fd53a376a356ae6f158f65 100644
--- a/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/samplesList.ssp
+++ b/public/biopet-core/src/main/resources/nl/lumc/sasc/biopet/core/report/samplesList.ssp
@@ -5,7 +5,7 @@
 <table class="table sortable-theme-bootstrap" data-sortable>
 <thead><tr><th data-sorted="true" data-sorted-direction="ascending">Sample</th></tr></thead>
 <tbody>
-#for (sample <- summary.samples)
+#for (sample <- summary.samples.toList.sorted)
 <tr><td><a href="${rootPath}Samples/${sample}/index.html">${sample}</a></td></tr>
 #end
 </tbody>
diff --git a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetCommandLineFunction.scala b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetCommandLineFunction.scala
index 71c6f7f466b65b37bd808c9706eb6841ea7b8b9b..4a41248e5653ee9dd6a1e32e64845447b4198b06 100644
--- a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetCommandLineFunction.scala
+++ b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetCommandLineFunction.scala
@@ -83,13 +83,14 @@ trait BiopetCommandLineFunction extends CommandLineResources { biopetFunction =>
 
   /** Set default output file, threads and vmem for current job */
   final def internalBeforeGraph(): Unit = {
-    pipesJobs.foreach(_.beforeGraph())
-    pipesJobs.foreach(_.internalBeforeGraph())
+    _pipesJobs.foreach(_.beforeGraph())
+    _pipesJobs.foreach(_.internalBeforeGraph())
   }
 
   /**
    * Can override this value is executable may not be converted to CanonicalPath
+   *
    * @deprecated
    */
   val executableToCanonicalPath = true
@@ -121,6 +122,7 @@ trait BiopetCommandLineFunction extends CommandLineResources { biopetFunction =>
 
   /**
    * This operator sends stdout to `that` and combine this into 1 command line function
+   *
    * @param that Function that will read from stdin
    * @return BiopetPipe function
    */
@@ -141,6 +143,7 @@ trait BiopetCommandLineFunction extends CommandLineResources { biopetFunction =>
 
   /**
    * This operator can be used to give a program a file as stdin
+   *
    * @param file File that will become stdin for this program
    * @return It's own class
    */
@@ -152,6 +155,7 @@ trait BiopetCommandLineFunction extends CommandLineResources { biopetFunction =>
 
   /**
    * This operator can be used to give a program a file write it's atdout
+   *
    * @param file File that will become stdout for this program
    * @return It's own class
    */
@@ -169,6 +173,7 @@ trait BiopetCommandLineFunction extends CommandLineResources { biopetFunction =>
 
   /**
    * This function needs to be implemented to define the command that is executed
+   *
    * @return Command to run
    */
   protected[core] def cmdLine: String
@@ -176,6 +181,7 @@ trait BiopetCommandLineFunction extends CommandLineResources { biopetFunction =>
   /**
    * implementing a final version of the commandLine from org.broadinstitute.gatk.queue.function.CommandLineFunction
    * User needs to implement cmdLine instead
+   *
    * @return Command to run
    */
   override final def commandLine: String = {
@@ -187,10 +193,11 @@ trait BiopetCommandLineFunction extends CommandLineResources { biopetFunction =>
     cmd
   }
 
-  private[core] var pipesJobs: List[BiopetCommandLineFunction] = Nil
+  private[core] var _pipesJobs: List[BiopetCommandLineFunction] = Nil
+  def pipesJobs = _pipesJobs
 
   def addPipeJob(job: BiopetCommandLineFunction) {
-    pipesJobs :+= job
-    pipesJobs = pipesJobs.distinct
+    _pipesJobs :+= job
+    _pipesJobs = _pipesJobs.distinct
   }
 }
diff --git a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetFifoPipe.scala b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetFifoPipe.scala
index 8b4f1f801681f8ead4a58e2e02264acf8680c57d..287064130a8a055b7457a2e583517c8e60a1b5df 100644
--- a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetFifoPipe.scala
+++ b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetFifoPipe.scala
@@ -67,8 +67,8 @@ class BiopetFifoPipe(val root: Configurable,
     deps :::= inputs.values.toList.flatten.filter(!fifoFiles.contains(_))
     deps = deps.distinct
 
-    pipesJobs :::= commands
-    pipesJobs = pipesJobs.distinct
+    _pipesJobs :::= commands
+    _pipesJobs = _pipesJobs.distinct
   }
 
   override def beforeCmd(): Unit = {
diff --git a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetPipe.scala b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetPipe.scala
index 9cba222297d84187f197bfd444c5edc3a57a869b..936096971632109dea09bfac739ae5e3f8c1e5e1 100644
--- a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetPipe.scala
+++ b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetPipe.scala
@@ -41,7 +41,7 @@ class BiopetPipe(val commands: List[BiopetCommandLineFunction]) extends BiopetCo
     case e: Exception => Nil
   }
 
-  pipesJobs :::= commands
+  _pipesJobs :::= commands
 
   override def beforeGraph() {
     super.beforeGraph()
@@ -61,7 +61,7 @@ class BiopetPipe(val commands: List[BiopetCommandLineFunction]) extends BiopetCo
   }
 
   override def setResources(): Unit = {
-    combineResources(pipesJobs)
+    combineResources(_pipesJobs)
   }
 
   override def setupRetry(): Unit = {
diff --git a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetQScript.scala b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetQScript.scala
index e6faba53829911ec48a9b43add4c176daa4cecc0..5d53e77e51e0d70deb3480919cbc0dbd6ed15874 100644
--- a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetQScript.scala
+++ b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/BiopetQScript.scala
@@ -93,12 +93,12 @@ trait BiopetQScript extends Configurable with GatkLogging { qscript: QScript =>
 
     if (outputDir.getParentFile.canWrite || (outputDir.exists && outputDir.canWrite))
       globalConfig.writeReport(qSettings.runName, new File(outputDir, ".log/" + qSettings.runName))
-    else Logging.addError("Parent of output dir: '" + outputDir.getParent + "' is not writeable, outputdir can not be created")
+    else Logging.addError("Parent of output dir: '" + outputDir.getParent + "' is not writeable, output directory cannot be created")
 
     inputFiles.foreach { i =>
       if (!i.file.exists()) Logging.addError(s"Input file does not exist: ${i.file}")
       if (!i.file.canRead) Logging.addError(s"Input file can not be read: ${i.file}")
-      if (!i.file.isAbsolute) Logging.addError(s"Input file should be an absulute path: ${i.file}")
+      if (!i.file.isAbsolute) Logging.addError(s"Input file should be an absolute path: ${i.file}")
     }
 
     functions.filter(_.jobOutputFile == null).foreach(f => {
diff --git a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/PipelineCommand.scala b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/PipelineCommand.scala
index f7ecc8bf1e612bb104d3eb567978c30b63183455..6e5ae4e47f9364e1024dea10ddcdf8218a4ad878 100644
--- a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/PipelineCommand.scala
+++ b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/PipelineCommand.scala
@@ -84,7 +84,7 @@ trait PipelineCommand extends MainCommand with GatkLogging with ImplicitConversi
       }
 
       val a = new WriterAppender(new PatternLayout("%-5p [%d] [%C{1}] - %m%n"), new PrintWriter(logFile))
-      logger.addAppender(a)
+      Logging.logger.addAppender(a)
 
     var argv: Array[String] = Array()
     argv ++= Array("-S", pipeline)
diff --git a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/Reference.scala b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/Reference.scala
index 94f511f7a8207c043b5e306149a3abe92e172a07..6a39ca3eb9c68205cf1b5354829b4bb87944bfa9 100644
--- a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/Reference.scala
+++ b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/Reference.scala
@@ -83,8 +83,8 @@ trait Reference extends Configurable {
     val defaults = ConfigUtils.mergeMaps(this.defaults, this.internalDefaults)
 
     def getReferences(map: Map[String, Any]): Set[(String, String)] = (for (
-      (species, species_content: Map[String, Any]) <- map.getOrElse("references", Map[String, Any]()).asInstanceOf[Map[String, Any]].toList;
-      (reference_name, _) <- species_content.toList
+      (species, species_content) <- map.getOrElse("references", Map[String, Any]()).asInstanceOf[Map[String, Any]].toList;
+      (reference_name, _) <- species_content.asInstanceOf[Map[String, Any]].toList
     ) yield (species, reference_name)).toSet
 
     val references = getReferences(defaults) ++ getReferences(Config.global.map)
diff --git a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/annotations/Annotations.scala b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/annotations/Annotations.scala
index c2f688804cd7917bbe73cae1fd05d282fa61b30a..ab31690cd27cfc1578e7a1fd5a1393af82b63461 100644
--- a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/annotations/Annotations.scala
+++ b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/annotations/Annotations.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.core.annotations
 
 import nl.lumc.sasc.biopet.core.BiopetQScript
@@ -16,6 +31,14 @@ trait AnnotationGtf extends BiopetQScript { qscript: QScript =>
     file
   }
 }
+trait AnnotationGff extends BiopetQScript { qscript: QScript =>
+  /** GFF reference file in GFF3 format */
+  lazy val annotationGff: File = {
+    val file: File = config("annotation_gff", freeVar = true)
+    inputFiles :+ InputFile(file, config("annotation_gff_md5", freeVar = true))
+    file
+  }
+}
 
 trait AnnotationRefFlat extends BiopetQScript { qscript: QScript =>
   /** GTF reference file */
diff --git a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/extensions/PythonCommandLineFunction.scala b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/extensions/PythonCommandLineFunction.scala
index 4ca0eedf03ed763530e19b1967a0189c09d85497..873aa6f994d9040b7c6450eebc692253c4a9e394 100644
--- a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/extensions/PythonCommandLineFunction.scala
+++ b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/extensions/PythonCommandLineFunction.scala
@@ -22,22 +22,22 @@ import org.broadinstitute.gatk.utils.commandline.Input
 
 trait PythonCommandLineFunction extends BiopetCommandLineFunction {
   @Input(doc = "Python script", required = false)
-  var python_script: File = _
+  var pythonScript: File = _
 
   executable = config("exe", default = "python", configNamespace = "python", freeVar = false)
 
-  protected var python_script_name: String = _
+  protected var pythonScriptName: String = _
 
   /**
    * checks if script already exist in jar otherwise try to fetch from the jar
    * @param script name / location of script
    */
   def setPythonScript(script: String) {
-    python_script = new File(script)
-    if (!python_script.exists()) {
+    pythonScript = new File(script)
+    if (!pythonScript.exists()) {
       setPythonScript(script, "")
     } else {
-      python_script_name = script
+      pythonScriptName = script
     }
   }
 
@@ -47,17 +47,17 @@ trait PythonCommandLineFunction extends BiopetCommandLineFunction {
    * @param subpackage location of script in jar
    */
   def setPythonScript(script: String, subpackage: String) {
-    python_script_name = script
-    python_script = new File(".queue/tmp/" + subpackage + python_script_name)
-    if (!python_script.getParentFile.exists) python_script.getParentFile.mkdirs
-    val is = getClass.getResourceAsStream(subpackage + python_script_name)
-    val os = new FileOutputStream(python_script)
+    pythonScriptName = script
+    pythonScript = new File(".queue/tmp/" + subpackage + pythonScriptName)
+    if (!pythonScript.getParentFile.exists) pythonScript.getParentFile.mkdirs
+    val is = getClass.getResourceAsStream(subpackage + pythonScriptName)
+    val os = new FileOutputStream(pythonScript)
     org.apache.commons.io.IOUtils.copy(is, os)
     os.close()
   }
 
   /** return basic command to prefix the complete command with */
   def getPythonCommand: String = {
-    required(executable) + required(python_script)
+    required(executable) + required(pythonScript)
   }
 }
diff --git a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/report/MultisampleReportBuilder.scala b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/report/MultisampleReportBuilder.scala
index d4ad550d3407b59a1183406af701e9b69c9d9ecc..0f76d03c956ff5168f095ce4fc98f37959f0e893 100644
--- a/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/report/MultisampleReportBuilder.scala
+++ b/public/biopet-core/src/main/scala/nl/lumc/sasc/biopet/core/report/MultisampleReportBuilder.scala
@@ -36,7 +36,7 @@ trait MultisampleReportBuilder extends ReportBuilder {
   def libraryPage(sampleId: String, libraryId: String, args: Map[String, Any]): ReportPage
 
   /** Default list of libraries, can be override */
-  def libririesSections: List[(String, ReportSection)] = {
+  def librariesSections: List[(String, ReportSection)] = {
     List(
       ("Libraries", ReportSection("/nl/lumc/sasc/biopet/core/report/librariesList.ssp"))
     )
@@ -60,6 +60,6 @@ trait MultisampleReportBuilder extends ReportBuilder {
    val libPages = summary.libraries(sampleId)
      .map(libId => libId -> libraryPage(sampleId, libId, args ++ Map("libId" -> Some(libId))))
      .toList
-    ReportPage(libPages, libririesSections, args)
+    ReportPage(libPages, librariesSections, args)
   }
 }
diff --git a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/MultiSampleQScriptTest.scala b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/MultiSampleQScriptTest.scala
index 9bb7bec2657a571c9b98fc08f62d427386c9097c..d16ef4848eb1ff71527e0c2451ba541348ffedcc 100644
--- a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/MultiSampleQScriptTest.scala
+++ b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/MultiSampleQScriptTest.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.core
 
 import java.io.File
diff --git a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/ReferenceTest.scala b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/ReferenceTest.scala
index 79741a2c2eb73a39cce67d8ad48b5680ba939163..af47d735d4c8bb13bf3c44023224191d796d6132 100644
--- a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/ReferenceTest.scala
+++ b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/ReferenceTest.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.core
 
 import java.nio.file.Paths
diff --git a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/MultisampleReportBuilderTest.scala b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/MultisampleReportBuilderTest.scala
index 20fab25d85bf438872e3c09c9c62c45d56b807b9..69ce7b26f30f922eb3fbcb63435927cf37b6f8df 100644
--- a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/MultisampleReportBuilderTest.scala
+++ b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/MultisampleReportBuilderTest.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.core.report
 
 import java.io.File
diff --git a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/ReportBuilderTest.scala b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/ReportBuilderTest.scala
index fd321d65ff8b1625f7b4a4dd25fff3bcf7754105..876d39f547e6d3f82850298317a116ee72327ab2 100644
--- a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/ReportBuilderTest.scala
+++ b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/ReportBuilderTest.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.core.report
 
 import java.io.File
diff --git a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/ReportSectionTest.scala b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/ReportSectionTest.scala
index f6ffd79e59c2d66b07f1b30d4d9c767734a75b1b..334974f00b73659e67b5ca19ef139bf699596568 100644
--- a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/ReportSectionTest.scala
+++ b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/report/ReportSectionTest.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.core.report
 
 import org.scalatest.Matchers
diff --git a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/SummarizableTest.scala b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/SummarizableTest.scala
index 1be487ff6d1309685d004d4b2f8f8b5523a93d75..847f407807db930b1df087146dbd8ca43357112d 100644
--- a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/SummarizableTest.scala
+++ b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/SummarizableTest.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.core.summary
 
 import java.io.File
diff --git a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/SummaryQScriptTest.scala b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/SummaryQScriptTest.scala
index a53fe068f9537cdce91a75f45ef9393631377d0d..85a8a282a92770323ee063be838c65a08c050445 100644
--- a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/SummaryQScriptTest.scala
+++ b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/SummaryQScriptTest.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.core.summary
 
 import java.io.File
diff --git a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/WriteSummaryTest.scala b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/WriteSummaryTest.scala
index 52f08f005c703e4e008b1e71d70f723d71803a18..e4c31c51aa07581ba0bd6a3781af205e5b15238c 100644
--- a/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/WriteSummaryTest.scala
+++ b/public/biopet-core/src/test/scala/nl/lumc/sasc/biopet/core/summary/WriteSummaryTest.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */ package nl.lumc.sasc.biopet.core.summary import java.io.{ PrintWriter, File } diff --git a/public/biopet-extensions/pom.xml b/public/biopet-extensions/pom.xml index 26cc102b5df164ac6fc4df51fe89eb33b87ffe27..10e0d2c4705256070f1e6480f34c14c0951edc19 100644 --- a/public/biopet-extensions/pom.xml +++ b/public/biopet-extensions/pom.xml @@ -22,7 +22,7 @@ <parent> <artifactId>Biopet</artifactId> <groupId>nl.lumc.sasc</groupId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> </parent> <modelVersion>4.0.0</modelVersion> diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cufflinks.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cufflinks.scala index 9177b7a6b8a7c6828c574d3d88b2d4610137e998..4eb907f3c56744bba48e4f8e5d6b3551877f38fe 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cufflinks.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cufflinks.scala @@ -44,29 +44,29 @@ class Cufflinks(val root: Configurable) extends BiopetCommandLineFunction with V @Output(doc = "Output GTF file") lazy val outputGtf: File = { - require(input != null && output_dir != null, + require(input != null && outputDir != null, "Can not set Cufflinks GTF output while input file and/or output directory is not defined") // cufflinks always outputs a transcripts.gtf file in the output directory - new File(output_dir, "transcripts.gtf") + new File(outputDir, "transcripts.gtf") } @Output(doc = "Output isoform FPKM file") lazy val outputIsoformsFpkm: File = { - require(input != null && output_dir != null, + require(input != null && outputDir != null, "Can not set Cufflinks isoforms.fpkm_tracking output while input file and/or output directory is not defined") - new File(output_dir, "isoforms.fpkm_tracking") + new File(outputDir, "isoforms.fpkm_tracking") } @Output(doc = "Output GTF file") lazy val outputGenesFpkm: File = { - require(input != null 
&& output_dir != null, + require(input != null && outputDir != null, "Can not set Cufflinks genes.fpkm_tracking output while input file and/or output directory is not defined") // cufflinks always outputs a genes.fpkm_tracking file in the output directory - new File(output_dir, "genes.fpkm_tracking") + new File(outputDir, "genes.fpkm_tracking") } /** write all output files to this directory [./] */ - var output_dir: File = config("output_dir", default = new File(".")) + var outputDir: File = config("output_dir", default = new File(".")) /** value of random number generator seed [0] */ var seed: Option[Int] = config("seed") @@ -75,106 +75,106 @@ class Cufflinks(val root: Configurable) extends BiopetCommandLineFunction with V var GTF: Option[File] = config("GTF") /** use reference transcript annotation to guide assembly */ - var GTF_guide: Option[File] = config("GTF_guide") + var gtfGuide: Option[File] = config("GTF_guide") /** ignore all alignment within transcripts in this file */ - var mask_file: Option[File] = config("mask_file") + var maskFile: Option[File] = config("mask_file") /** use bias correction - reference fasta required [NULL] */ - var frag_bias_correct: Option[String] = config("frag_bias_correct") + var fragBiasCorrect: Option[String] = config("frag_bias_correct") /** use 'rescue method' for multi-reads (more accurate) [FALSE] */ - var multi_read_correct: Boolean = config("multi_read_correct", default = false) + var multiReadCorrect: Boolean = config("multi_read_correct", default = false) /** library prep used for input reads [below] */ - var library_type: Option[String] = config("library_type") + var libraryType: Option[String] = config("library_type") /** Method used to normalize library sizes [below] */ - var library_norm_method: Option[String] = config("library_norm_method") + var libraryNormMethod: Option[String] = config("library_norm_method") /** average fragment length (unpaired reads only) [200] */ - var frag_len_mean: Option[Int] = 
config("frag_len_mean") + var fragLenMean: Option[Int] = config("frag_len_mean") /** fragment length std deviation (unpaired reads only) [80] */ - var frag_len_std_dev: Option[Int] = config("frag_len_std_dev") + var fragLenStdDev: Option[Int] = config("frag_len_std_dev") /** maximum iterations allowed for MLE calculation [5000] */ - var max_mle_iterations: Option[Int] = config("max_mle_iterations") + var maxMleIterations: Option[Int] = config("max_mle_iterations") /** count hits compatible with reference RNAs only [FALSE] */ - var compatible_hits_norm: Boolean = config("compatible_hits_norm", default = false) + var compatibleHitsNorm: Boolean = config("compatible_hits_norm", default = false) /** count all hits for normalization [TRUE] */ - var total_hits_norm: Boolean = config("total_hits_norm", default = true) + var totalHitsNorm: Boolean = config("total_hits_norm", default = true) /** Number of fragment generation samples [100] */ - var num_frag_count_draws: Option[Int] = config("num_frag_count_draws") + var numFragCountDraws: Option[Int] = config("num_frag_count_draws") /** Number of fragment assignment samples per generation [50] */ - var num_frag_assign_draws: Option[Int] = config("num_frag_assign_draws") + var numFragAssignDraws: Option[Int] = config("num_frag_assign_draws") /** Maximum number of alignments allowed per fragment [unlim] */ - var max_frag_multihits: Option[Int] = config("max_frag_multihits") + var maxFragMultihits: Option[Int] = config("max_frag_multihits") /** No effective length correction [FALSE] */ - var no_effective_length_correction: Boolean = config("no_effective_length_correction", default = false) + var noEffectiveLengthCorrection: Boolean = config("no_effective_length_correction", default = false) /** No length correction [FALSE] */ - var no_length_correction: Boolean = config("no_length_correction", default = false) + var noLengthCorrection: Boolean = config("no_length_correction", default = false) /** assembled transcripts have this 
ID prefix [CUFF] */ var label: Option[String] = config("label") /** suppress transcripts below this abundance level [0.10] */ - var min_isoform_fraction: Option[Float] = config("min_isoform_fraction") + var minIsoformFraction: Option[Float] = config("min_isoform_fraction") /** suppress intra-intronic transcripts below this level [0.15] */ - var pre_mrna_fraction: Option[Float] = config("pre_mrna_fraction") + var preMrnaFraction: Option[Float] = config("pre_mrna_fraction") /** ignore alignments with gaps longer than this [300000] */ - var max_intron_length: Option[Int] = config("max_intron_length") + var maxIntronLength: Option[Int] = config("max_intron_length") /** alpha for junction binomial test filter [0.001] */ - var junc_alpha: Option[Float] = config("junc_alpha") + var juncAlpha: Option[Float] = config("junc_alpha") /** percent read overhang taken as 'suspiciously small' [0.09] */ - var small_anchor_fraction: Option[Float] = config("small_anchor_fraction") + var smallAnchorFraction: Option[Float] = config("small_anchor_fraction") /** minimum number of fragments needed for new transfrags [10] */ - var min_frags_per_transfrag: Option[Int] = config("min_frags_per_transfrag") + var minFragsPerTransfrag: Option[Int] = config("min_frags_per_transfrag") /** number of terminal exon bp to tolerate in introns [8] */ - var overhang_tolerance: Option[Int] = config("overhang_tolerance") + var overhangTolerance: Option[Int] = config("overhang_tolerance") /** maximum genomic length allowed for a given bundle [3500000] */ - var max_bundle_length: Option[Int] = config("max_bundle_length") + var maxBundleLength: Option[Int] = config("max_bundle_length") /** maximum fragments allowed in a bundle before skipping [500000] */ - var max_bundle_frags: Option[Int] = config("max_bundle_frags") + var maxBundleFrags: Option[Int] = config("max_bundle_frags") /** minimum intron size allowed in genome [50] */ - var min_intron_length: Option[Int] = config("min_intron_length") + var 
minIntronLength: Option[Int] = config("min_intron_length") /** minimum avg coverage required to attempt 3' trimming [10] */ - var trim_3_avgcov_thresh: Option[Int] = config("trim_3_avgcov_thresh") + var trim3AvgCovThresh: Option[Int] = config("trim_3_avgcov_thresh") /** fraction of avg coverage below which to trim 3' end [0.1] */ - var trim_3_dropoff_frac: Option[Float] = config("trim_3_dropoff_frac") + var trim3DropOffFrac: Option[Float] = config("trim_3_dropoff_frac") /** maximum fraction of allowed multireads per transcript [0.75] */ - var max_multiread_fraction: Option[Float] = config("max_multiread_fraction") + var maxMultireadFraction: Option[Float] = config("max_multiread_fraction") /** maximum gap size to fill between transfrags (in bp) [50] */ - var overlap_radius: Option[Int] = config("overlap_radius") + var overlapRadius: Option[Int] = config("overlap_radius") /** disable tiling by faux reads [FALSE] */ - var no_faux_reads: Boolean = config("no_faux_reads", default = false) + var noFauxReads: Boolean = config("no_faux_reads", default = false) /** overhang allowed on 3' end when merging with reference [600] */ - var flag_3_overhang_tolerance: Option[Int] = config("flag_3_overhang_tolerance") + var flag3OverhangTolerance: Option[Int] = config("flag_3_overhang_tolerance") /** overhang allowed inside reference intron when merging [30] */ - var intron_overhang_tolerance: Option[Int] = config("intron_overhang_tolerance") + var intronOverhangTolerance: Option[Int] = config("intron_overhang_tolerance") /** log-friendly verbose processing (no progress bar) [FALSE] */ var verbose: Boolean = config("verbose", default = false) @@ -183,7 +183,7 @@ class Cufflinks(val root: Configurable) extends BiopetCommandLineFunction with V var quiet: Boolean = config("quiet", default = false) /** do not contact server to check for update availability [FALSE] */ - var no_update_check: Boolean = config("no_update_check", default = false) + var noUpdateCheck: Boolean = 
config("no_update_check", default = false) def versionRegex = """cufflinks v(.*)""".r def versionCommand = executable @@ -191,46 +191,46 @@ class Cufflinks(val root: Configurable) extends BiopetCommandLineFunction with V def cmdLine = required(executable) + - required("--output-dir", output_dir) + + required("--output-dir", outputDir) + optional("--num-threads", threads) + optional("--seed", seed) + optional("--GTF", GTF) + - optional("--GTF-guide", GTF_guide) + - optional("--mask-file", mask_file) + - optional("--frag-bias-correct", frag_bias_correct) + - conditional(multi_read_correct, "--multi-read-correct") + - optional("--library-type", library_type) + - optional("--library-norm-method", library_norm_method) + - optional("--frag-len-mean", frag_len_mean) + - optional("--frag-len-std-dev", frag_len_std_dev) + - optional("--max-mle-iterations", max_mle_iterations) + - conditional(compatible_hits_norm, "--compatible-hits-norm") + - conditional(total_hits_norm, "--total-hits-norm") + - optional("--num-frag-count-draws", num_frag_count_draws) + - optional("--num-frag-assign-draws", num_frag_assign_draws) + - optional("--max-frag-multihits", max_frag_multihits) + - conditional(no_effective_length_correction, "--no-effective-length-correction") + - conditional(no_length_correction, "--no-length-correction") + + optional("--GTF-guide", gtfGuide) + + optional("--mask-file", maskFile) + + optional("--frag-bias-correct", fragBiasCorrect) + + conditional(multiReadCorrect, "--multi-read-correct") + + optional("--library-type", libraryType) + + optional("--library-norm-method", libraryNormMethod) + + optional("--frag-len-mean", fragLenMean) + + optional("--frag-len-std-dev", fragLenStdDev) + + optional("--max-mle-iterations", maxMleIterations) + + conditional(compatibleHitsNorm, "--compatible-hits-norm") + + conditional(totalHitsNorm, "--total-hits-norm") + + optional("--num-frag-count-draws", numFragCountDraws) + + optional("--num-frag-assign-draws", numFragAssignDraws) + 
+ optional("--max-frag-multihits", maxFragMultihits) + + conditional(noEffectiveLengthCorrection, "--no-effective-length-correction") + + conditional(noLengthCorrection, "--no-length-correction") + optional("--label", label) + - optional("--min-isoform-fraction", min_isoform_fraction) + - optional("--pre-mrna-fraction", pre_mrna_fraction) + - optional("--max-intron-length", max_intron_length) + - optional("--junc-alpha", junc_alpha) + - optional("--small-anchor-fraction", small_anchor_fraction) + - optional("--min-frags-per-transfrag", min_frags_per_transfrag) + - optional("--overhang-tolerance", overhang_tolerance) + - optional("--max-bundle-length", max_bundle_length) + - optional("--max-bundle-frags", max_bundle_frags) + - optional("--min-intron-length", min_intron_length) + - optional("--trim-3-avgcov-thresh", trim_3_avgcov_thresh) + - optional("--trim-3-dropoff-frac", trim_3_dropoff_frac) + - optional("--max-multiread-fraction", max_multiread_fraction) + - optional("--overlap-radius", overlap_radius) + - conditional(no_faux_reads, "--no-faux-reads") + - optional("--flag-3-overhang-tolerance", flag_3_overhang_tolerance) + - optional("--intron-overhang-tolerance", intron_overhang_tolerance) + + optional("--min-isoform-fraction", minIsoformFraction) + + optional("--pre-mrna-fraction", preMrnaFraction) + + optional("--max-intron-length", maxIntronLength) + + optional("--junc-alpha", juncAlpha) + + optional("--small-anchor-fraction", smallAnchorFraction) + + optional("--min-frags-per-transfrag", minFragsPerTransfrag) + + optional("--overhang-tolerance", overhangTolerance) + + optional("--max-bundle-length", maxBundleLength) + + optional("--max-bundle-frags", maxBundleFrags) + + optional("--min-intron-length", minIntronLength) + + optional("--trim-3-avgcov-thresh", trim3AvgCovThresh) + + optional("--trim-3-dropoff-frac", trim3DropOffFrac) + + optional("--max-multiread-fraction", maxMultireadFraction) + + optional("--overlap-radius", overlapRadius) + + 
conditional(noFauxReads, "--no-faux-reads") + + optional("--flag-3-overhang-tolerance", flag3OverhangTolerance) + + optional("--intron-overhang-tolerance", intronOverhangTolerance) + conditional(verbose, "--verbose") + conditional(quiet, "--quiet") + - conditional(no_update_check, "--no-update-check") + + conditional(noUpdateCheck, "--no-update-check") + required(input) } diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cuffquant.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cuffquant.scala index d7ead689a92b19dbb38dccee1eb8b96312736923..6ac11bc43f9ff0250669d1a9fbcd1a6e04c9bc3f 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cuffquant.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cuffquant.scala @@ -40,46 +40,46 @@ class Cuffquant(val root: Configurable) extends BiopetCommandLineFunction with V /** input GTF file */ @Input(doc = "Input GTF file", required = true) - var transcripts_gtf: File = null + var transcriptsGtf: File = null /** output file, computed automatically from output directory */ @Output(doc = "Output CXB file") lazy val outputCxb: File = { - require(output_dir != null, + require(outputDir != null, "Can not set Cuffquant CXB output while input file and/or output directory is not defined") // cufflinks always outputs a transcripts.gtf file in the output directory - new File(output_dir, "abundances.cxb") + new File(outputDir, "abundances.cxb") } /** write all output files to this directory [./] */ - var output_dir: File = config("output_dir", default = new File(".")) + var outputDir: File = config("output_dir", default = new File(".")) /** ignore all alignment within transcripts in this file */ - var mask_file: Option[File] = config("mask_file") + var maskFile: Option[File] = config("mask_file") /** use bias correction - reference fasta required [NULL] */ - var frag_bias_correct: Option[String] = 
config("frag_bias_correct") + var fragBiasCorrect: Option[String] = config("frag_bias_correct") /** use 'rescue method' for multi-reads (more accurate) [FALSE] */ - var multi_read_correct: Boolean = config("multi_read_correct", default = false) + var multiReadCorrect: Boolean = config("multi_read_correct", default = false) /** number of threads used during analysis [1] */ - var num_threads: Option[Int] = config("num_threads") + var numThreads: Option[Int] = config("num_threads") /** library prep used for input reads [below] */ - var library_type: Option[String] = config("library_type") + var libraryType: Option[String] = config("library_type") /** average fragment length (unpaired reads only) [200] */ - var frag_len_mean: Option[Int] = config("frag_len_mean") + var fragLenMean: Option[Int] = config("frag_len_mean") /** fragment length std deviation (unpaired reads only) [80] */ - var frag_len_std_dev: Option[Int] = config("frag_len_std_dev") + var fragLenStdDev: Option[Int] = config("frag_len_std_dev") /** minimum number of alignments in a locus for testing [10] */ - var min_alignment_count: Option[Int] = config("min_alignment_count") + var minAlignmentCount: Option[Int] = config("min_alignment_count") /** maximum iterations allowed for MLE calculation [5000] */ - var max_mle_iterations: Option[Int] = config("max_mle_iterations") + var maxMleIterations: Option[Int] = config("max_mle_iterations") /** log-friendly verbose processing (no progress bar) [FALSE] */ var verbose: Boolean = config("verbose", default = false) @@ -91,31 +91,31 @@ class Cuffquant(val root: Configurable) extends BiopetCommandLineFunction with V var seed: Option[Int] = config("seed") /** do not contact server to check for update availability [FALSE] */ - var no_update_check: Boolean = config("no_update_check", default = false) + var noUpdateCheck: Boolean = config("no_update_check", default = false) /** maximum fragments allowed in a bundle before skipping [500000] */ - var max_bundle_frags: 
Option[Int] = config("max_bundle_frags") + var maxBundleFrags: Option[Int] = config("max_bundle_frags") /** Maximum number of alignments allowed per fragment [unlim] */ - var max_frag_multihits: Option[Int] = config("max_frag_multihits") + var maxFragMultihits: Option[Int] = config("max_frag_multihits") /** No effective length correction [FALSE] */ - var no_effective_length_correction: Boolean = config("no_effective_length_correction", default = false) + var noEffectiveLengthCorrection: Boolean = config("no_effective_length_correction", default = false) /** No length correction [FALSE] */ - var no_length_correction: Boolean = config("no_length_correction", default = false) + var noLengthCorrection: Boolean = config("no_length_correction", default = false) /** Skip a random subset of reads this size [0.0] */ - var read_skip_fraction: Option[Double] = config("read_skip_fraction") + var readSkipFraction: Option[Double] = config("read_skip_fraction") /** Break all read pairs [FALSE] */ - var no_read_pairs: Boolean = config("no_read_pairs", default = false) + var noReadPairs: Boolean = config("no_read_pairs", default = false) /** Trim reads to be this long (keep 5' end) [none] */ - var trim_read_length: Option[Int] = config("trim_read_length") + var trimReadLength: Option[Int] = config("trim_read_length") /** Disable SCV correction */ - var no_scv_correction: Boolean = config("no_scv_correction", default = false) + var noScvCorrection: Boolean = config("no_scv_correction", default = false) def versionRegex = """cuffquant v(.*)""".r def versionCommand = executable @@ -123,28 +123,28 @@ class Cuffquant(val root: Configurable) extends BiopetCommandLineFunction with V def cmdLine = required(executable) + - required("--output-dir", output_dir) + - optional("--mask-file", mask_file) + - optional("--frag-bias-correct", frag_bias_correct) + - conditional(multi_read_correct, "--multi-read-correct") + - optional("--num-threads", num_threads) + - optional("--library-type", 
library_type) + - optional("--frag-len-mean", frag_len_mean) + - optional("--frag-len-std-dev", frag_len_std_dev) + - optional("--min-alignment-count", min_alignment_count) + - optional("--max-mle-iterations", max_mle_iterations) + + required("--output-dir", outputDir) + + optional("--mask-file", maskFile) + + optional("--frag-bias-correct", fragBiasCorrect) + + conditional(multiReadCorrect, "--multi-read-correct") + + optional("--num-threads", numThreads) + + optional("--library-type", libraryType) + + optional("--frag-len-mean", fragLenMean) + + optional("--frag-len-std-dev", fragLenStdDev) + + optional("--min-alignment-count", minAlignmentCount) + + optional("--max-mle-iterations", maxMleIterations) + conditional(verbose, "--verbose") + conditional(quiet, "--quiet") + optional("--seed", seed) + - conditional(no_update_check, "--no-update-check") + - optional("--max-bundle-frags", max_bundle_frags) + - optional("--max-frag-multihits", max_frag_multihits) + - conditional(no_effective_length_correction, "--no-effective-length-correction") + - conditional(no_length_correction, "--no-length-correction") + - optional("--read-skip-fraction", read_skip_fraction) + - conditional(no_read_pairs, "--no-read-pairs") + - optional("--trim-read-length", trim_read_length) + - conditional(no_scv_correction, "--no-scv-correction") + - required(transcripts_gtf) + + conditional(noUpdateCheck, "--no-update-check") + + optional("--max-bundle-frags", maxBundleFrags) + + optional("--max-frag-multihits", maxFragMultihits) + + conditional(noEffectiveLengthCorrection, "--no-effective-length-correction") + + conditional(noLengthCorrection, "--no-length-correction") + + optional("--read-skip-fraction", readSkipFraction) + + conditional(noReadPairs, "--no-read-pairs") + + optional("--trim-read-length", trimReadLength) + + conditional(noScvCorrection, "--no-scv-correction") + + required(transcriptsGtf) + required(input.map(_.mkString(";").mkString(" "))) } diff --git 
a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Curl.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Curl.scala index 62d87a672dd8dc89b0c9adcb78fc7a20aa002982..b0ccb54876d696795698f2497b3b9b88ba335ad2 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Curl.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Curl.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.extensions import java.io.File diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cutadapt.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cutadapt.scala index 23d1311113c24a617c0d9d6974a6edffaadddcaa..60c25a5a69820b72a7e5bbd0f17cc8b5f0dac3fe 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cutadapt.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Cutadapt.scala @@ -17,9 +17,9 @@ package nl.lumc.sasc.biopet.extensions import java.io.File -import nl.lumc.sasc.biopet.core.{ Version, BiopetCommandLineFunction } -import nl.lumc.sasc.biopet.utils.config.Configurable import nl.lumc.sasc.biopet.core.summary.Summarizable +import nl.lumc.sasc.biopet.core.{ BiopetCommandLineFunction, Version } +import nl.lumc.sasc.biopet.utils.config.Configurable import org.broadinstitute.gatk.utils.commandline.{ Input, Output } import scala.collection.mutable @@ -165,28 +165,66 @@ class Cutadapt(val root: Configurable) extends BiopetCommandLineFunction with Su /** Output summary stats */ def summaryStats: Map[String, Any] = { - val trimR = """.*Trimmed reads: *(\d*) .*""".r - val tooShortR = """.*Too short reads: *(\d*) .*""".r - val tooLongR = """.*Too long reads: *(\d*) .*""".r - val adapterR = """Adapter '([C|T|A|G]*)'.*trimmed (\d*) times.""".r - - val stats: mutable.Map[String, Int] = mutable.Map("trimmed" -> 0, "tooshort" -> 0, "toolong" -> 0) - val adapter_stats: mutable.Map[String, Int] = mutable.Map() - - if (statsOutput.exists) for (line <- Source.fromFile(statsOutput).getLines()) { - line match { - case trimR(m) => stats += ("trimmed" -> m.toInt) - case tooShortR(m) => stats += ("tooshort" -> m.toInt) - case tooLongR(m) => stats += ("toolong" -> m.toInt) - case adapterR(adapter, count) => adapter_stats += (adapter -> count.toInt) - case _ => + /** + * The following regex is specific for Cutadapt 1.7+ + */ + + val 
processedReads = """Total reads processed: *([,\d]+).*""".r + val withAdapters = """.* with adapters: *([,\d]+) .*""".r + val readsPassingFilters = """.* written \(passing filters\): *([,\d]+) .*""".r + + val tooShortR = """.* that were too short: *([,\d]+) .*""".r + val tooLongR = """.* that were too long: *([,\d]+) .*""".r + + val tooManyN = """.* with too many N: *([,\d]+) .*""".r + val adapterR = """Sequence ([C|T|A|G]*);.*Trimmed: ([,\d]+) times.""".r + + val basePairsProcessed = """Total basepairs processed: *([,\d]+) bp""".r + val basePairsWritten = """Total written \(filtered\): *([,\d]+) bp .*""".r + + val stats: mutable.Map[String, Long] = mutable.Map( + "processed" -> 0, + "withadapters" -> 0, + "passingfilters" -> 0, + "tooshort" -> 0, + "toolong" -> 0, + "bpinput" -> 0, + "bpoutput" -> 0, + "toomanyn" -> 0 + ) + val adapterStats: mutable.Map[String, Long] = mutable.Map() + + if (statsOutput.exists) { + val statsFile = Source.fromFile(statsOutput) + for (line <- statsFile.getLines()) { + line match { + case processedReads(m) => stats("processed") = m.replaceAll(",", "").toLong + case withAdapters(m) => stats("withadapters") = m.replaceAll(",", "").toLong + case readsPassingFilters(m) => stats("passingfilters") = m.replaceAll(",", "").toLong + case tooShortR(m) => stats("tooshort") = m.replaceAll(",", "").toLong + case tooLongR(m) => stats("toolong") = m.replaceAll(",", "").toLong + case tooManyN(m) => stats("toomanyn") = m.replaceAll(",", "").toLong + case basePairsProcessed(m) => stats("bpinput") = m.replaceAll(",", "").toLong + case basePairsWritten(m) => stats("bpoutput") = m.replaceAll(",", "").toLong + case adapterR(adapter, count) => adapterStats += (adapter -> count.toLong) + case _ => + } } } - Map("num_reads_affected" -> stats("trimmed"), + val cleanReads = stats("processed") - stats("withadapters") + val trimmed = stats("passingfilters") - cleanReads + + Map("num_reads_affected" -> trimmed, + "num_reads_input" -> stats("processed"), + 
"num_reads_with_adapters" -> stats("withadapters"), + "num_reads_output" -> stats("passingfilters"), "num_reads_discarded_too_short" -> stats("tooshort"), "num_reads_discarded_too_long" -> stats("toolong"), - adaptersStatsName -> adapter_stats.toMap + "num_reads_discarded_many_n" -> stats("toomanyn"), + "num_bases_input" -> stats("bpinput"), + "num_based_output" -> stats("bpoutput"), + adaptersStatsName -> adapterStats.toMap ) } diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Fastqc.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Fastqc.scala index 9beb0377c325c13ffc6c1ae79391dc74c41c5cc7..8f32e947b395a60234a4fd629711e4f1db63bd2d 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Fastqc.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Fastqc.scala @@ -40,7 +40,7 @@ class Fastqc(val root: Configurable) extends BiopetCommandLineFunction with Vers var output: File = null executable = config("exe", default = "fastqc") - var java_exe: String = config("exe", default = "java", configNamespace = "java", freeVar = false) + var javaExe: String = config("exe", default = "java", configNamespace = "java", freeVar = false) var kmers: Option[Int] = config("kmers") var quiet: Boolean = config("quiet", default = false) var noextract: Boolean = config("noextract", default = false) diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Flash.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Flash.scala index 4013b8aeada8962ebe5893bed017e95056715317..7e4ef448859682383759a46083e66e12515106e1 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Flash.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Flash.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. 
It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.extensions import java.io.File diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Freebayes.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Freebayes.scala index f2a387a08b50e37f44dcb21911acff65bab808c3..e61b4c1c18ae907681f60ef6567629959babbf3d 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Freebayes.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Freebayes.scala @@ -37,7 +37,92 @@ class Freebayes(val root: Configurable) extends BiopetCommandLineFunction with R @Output(required = true) var outputVcf: File = null - var ploidy: Option[Int] = config("ploidy") + @Input(required = false) + var bamList: Option[File] = config("bam_list") + + @Input(required = false) + var targets: Option[File] = config("targets", freeVar = false) + + @Input(required = false) + var samples: Option[File] = config("samples", freeVar = false) + + @Input(required = false) + var populations: Option[File] = config("populations", freeVar = false) + + @Input(required = false) + var cnvMap: Option[File] = config("cnv_map", freeVar = false) + + @Input(required = false) + var trace: Option[File] = config("trace", freeVar = false) + + @Input(required = false) + var failedAlleles: Option[File] = 
config("failed_alleles", freeVar = false) + + @Input(required = false) + var observationBias: Option[File] = config("observation_bias") + + @Input(required = false) + var contaminationEstimates: Option[File] = config("contamination_estimates") + + @Input(required = false) + var variantInput: Option[File] = config("variant_input", freeVar = false) + + @Input(required = false) + var haplotypeBasisAlleles: Option[File] = config("haplotype_basis_alleles", freeVar = false) + + var pvar: Option[Int] = config("pvar", freeVar = false) + var theta: Option[Int] = config("theta", freeVar = false) + var ploidy: Option[Int] = config("ploidy", freeVar = false) + var useBestNAlleles: Option[Int] = config("use_best_n_alleles") + var maxComplexGap: Option[Int] = config("max_complex_gap") + var minRepeatSize: Option[Int] = config("min_repeat_size") + var minRepeatEntropy: Option[Int] = config("min_repeat_entropy") + var readMismatchLimit: Option[Int] = config("read_mismatch_limit") + var readMaxMismatchFraction: Option[Int] = config("read_max_mismatch_fraction") + var readSnpLimit: Option[Int] = config("read_snp_limit") + var readIndelLimit: Option[Int] = config("read_indel_limit") + var minAlternateFraction: Option[Double] = config("min_alternate_fraction") + var minAlternateCount: Option[Int] = config("min_alternate_count") + var minAlternateQsum: Option[Int] = config("min_alternate_qsum") + var minAlternateTotal: Option[Int] = config("min_alternate_total") + var minCoverage: Option[Int] = config("min_coverage") + var genotypingMaxIterations: Option[Int] = config("genotyping_max_iterations") + var genotypingMaxBanddepth: Option[Int] = config("genotyping_max_banddepth") + var genotypeVariantThreshold: Option[Int] = config("genotype_variant_threshold") + var readDependenceFactor: Option[Int] = config("read_dependence_factor") + var minMappingQuality: Option[Double] = config("min_mapping_quality") + var minBaseQuality: Option[Double] = config("min_base_quality") + var 
minSupportingAlleleQsum: Option[Double] = config("min_supporting_allele_qsum") + var minSupportingMappingQsum: Option[Double] = config("min_supporting_mapping_qsum") + var mismatchBaseQualityThreshold: Option[Double] = config("mismatch_base_quality_threshold") + var baseQualityCap: Option[Double] = config("base_quality_cap") + var probContamination: Option[Double] = config("prob_contamination") + var onlyUseInputAlleles: Boolean = config("only_use_input_alleles", default = false) + var reportAllHaplotypeAlleles: Boolean = config("report_all_haplotype_alleles", default = false) + var reportMonomorphic: Boolean = config("report_monomorphic", default = false) + var pooledDiscrete: Boolean = config("pooled_discrete", default = false) + var pooledContinuous: Boolean = config("pooled_continuous", default = false) + var useReferenceAllele: Boolean = config("use_reference_allele", default = false) + var noSnps: Boolean = config("no_snps", default = false) + var noIndels: Boolean = config("no_indels", default = false) + var noMnps: Boolean = config("no_mnps", default = false) + var noComplex: Boolean = config("no_complex", default = false) + var noPartialObservations: Boolean = config("no_partial_observations", default = false) + var dontLeftAlignIndels: Boolean = config("dont_left_align_indels", default = false) + var useDuplicateReads: Boolean = config("use_duplicate_reads", default = false) + var standardFilters: Boolean = config("standard_filters", default = false) + var noPopulationPriors: Boolean = config("no_population_priors", default = false) + var hwePriorsOff: Boolean = config("hwe_priors_off", default = false) + var binomialObsPriorsOff: Boolean = config("binomial_obs_priors_off", default = false) + var alleleBalancePriorsOff: Boolean = config("allele_balance_priors_off", default = false) + var legacyGls: Boolean = config("legacy_gls", default = false) + var reportGenotypeLikelihoodMax: Boolean = config("report_genotype_likelihood_max", default = false) + var 
excludeUnobservedGenotypes: Boolean = config("exclude_unobserved_genotypes", default = false) + var useMappingQuality: Boolean = config("use_mapping_quality", default = false) + var harmonicIndelQuality: Boolean = config("harmonic_indel_quality", default = false) + var genotypeQualities: Boolean = config("genotype_qualities", default = false) + var debug: Boolean = config("debug", default = logger.isDebugEnabled) + var haplotypeLength: Option[Int] = config("haplotype_length") executable = config("exe", default = "freebayes") @@ -52,7 +137,70 @@ class Freebayes(val root: Configurable) extends BiopetCommandLineFunction with R def cmdLine = executable + required("--fasta-reference", reference) + repeat("--bam", bamfiles) + - optional("--vcf", outputVcf) + + optional("--bam-list", bamList) + + optional("--targets", targets) + + optional("--samples", samples) + + optional("--populations", populations) + + optional("--cnv-map", cnvMap) + + optional("--trace", trace) + + optional("--failed-alleles", failedAlleles) + + optional("--observation-bias", observationBias) + + optional("--contamination-estimates", contaminationEstimates) + + optional("--variant-input", variantInput) + + optional("--haplotype-basis-alleles", haplotypeBasisAlleles) + + optional("--pvar", pvar) + + optional("--theta", theta) + optional("--ploidy", ploidy) + - optional("--haplotype-length", haplotypeLength) + optional("--use-best-n-alleles", useBestNAlleles) + + optional("--max-complex-gap", maxComplexGap) + + optional("--min-repeat-size", minRepeatSize) + + optional("--min-repeat-entropy", minRepeatEntropy) + + optional("--read-mismatch-limit", readMismatchLimit) + + optional("--read-max-mismatch-fraction", readMaxMismatchFraction) + + optional("--read-snp-limit", readSnpLimit) + + optional("--read-indel-limit", readIndelLimit) + + optional("--min-alternate-fraction", minAlternateFraction) + + optional("--min-alternate-count", minAlternateCount) + + optional("--min-alternate-qsum", minAlternateQsum) 
+ + optional("--min-alternate-total", minAlternateTotal) + + optional("--min-coverage", minCoverage) + + optional("--genotyping-max-iterations", genotypingMaxIterations) + + optional("--genotyping-max-banddepth", genotypingMaxBanddepth) + + optional("--genotype-variant-threshold", genotypeVariantThreshold) + + optional("--read-dependence-factor", readDependenceFactor) + + optional("--min-mapping-quality", minMappingQuality) + + optional("--min-base-quality", minBaseQuality) + + optional("--min-supporting-allele-qsum", minSupportingAlleleQsum) + + optional("--min-supporting-mapping-qsum", minSupportingMappingQsum) + + optional("--mismatch-base-quality-threshold", mismatchBaseQualityThreshold) + + optional("--base-quality-cap", baseQualityCap) + + optional("--prob-contamination", probContamination) + + conditional(onlyUseInputAlleles, "--only-use-input-alleles") + + conditional(reportAllHaplotypeAlleles, "--report-all-haplotype-alleles") + + conditional(reportMonomorphic, "--report-monomorphic") + + conditional(pooledDiscrete, "--pooled-discrete") + + conditional(pooledContinuous, "--pooled-continuous") + + conditional(useReferenceAllele, "--use-reference-allele") + + conditional(noSnps, "--no-snps") + + conditional(noIndels, "--no-indels") + + conditional(noMnps, "--no-mnps") + + conditional(noComplex, "--no-complex") + + conditional(noPartialObservations, "--no-partial-observations") + + conditional(dontLeftAlignIndels, "--dont-left-align-indels") + + conditional(useDuplicateReads, "--use-duplicate-reads") + + conditional(standardFilters, "--standard-filters") + + conditional(noPopulationPriors, "--no-population-priors") + + conditional(hwePriorsOff, "--hwe-priors-off") + + conditional(binomialObsPriorsOff, "--binomial-obs-priors-off") + + conditional(alleleBalancePriorsOff, "--allele-balance-priors-off") + + conditional(legacyGls, "--legacy-gls") + + conditional(reportGenotypeLikelihoodMax, "--report-genotype-likelihood-max") + + 
conditional(excludeUnobservedGenotypes, "--exclude-unobserved-genotypes") + + conditional(useMappingQuality, "--use-mapping-quality") + + conditional(harmonicIndelQuality, "--harmonic-indel-quality") + + conditional(genotypeQualities, "--genotype-qualities") + + conditional(debug, "--debug") + + optional("--haplotype-length", haplotypeLength) + + (if (inputAsStdin) required("--stdin") else "") + + (if (outputAsStsout) "" else optional("--vcf", outputVcf)) } diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Pysvtools.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Pysvtools.scala new file mode 100644 index 0000000000000000000000000000000000000000..d9ee8e68a5e9237341a3838718523741b9c90465 --- /dev/null +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Pysvtools.scala @@ -0,0 +1,74 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ +package nl.lumc.sasc.biopet.extensions + +import java.io.File + +import nl.lumc.sasc.biopet.core.BiopetCommandLineFunction +import nl.lumc.sasc.biopet.utils.Logging +import nl.lumc.sasc.biopet.utils.config.Configurable +import org.broadinstitute.gatk.utils.commandline._ + +/** + * Created by wyleung on 8-1-16. 
+ */ +class Pysvtools(val root: Configurable) extends BiopetCommandLineFunction { + + @Input(doc = "Input file", required = true) + var input: List[File] = Nil + + @Argument(doc = "Set flanking amount") + var flanking: Option[Int] = config("flanking") + + var exclusionRegions: List[File] = config("exclusion_regions") + var translocationsOnly: Boolean = config("translocations_only", default = false) + + @Output(doc = "Unzipped file", required = true) + var output: File = _ + + var tsvoutput: File = _ + var bedoutput: File = _ + var regionsoutput: File = _ + + executable = config("exe", default = "vcf_merge_sv_events") + + def versionRegex = """PySVtools (.*)""".r + def versionCommand = executable + " --version" + override def defaultThreads = 2 + + override def beforeGraph(): Unit = { + // TODO: we might want to validate the VCF before we start to tool? or is this a responsibility of the tool itself? + if (input.isEmpty) { + Logging.addError("No input VCF is given") + } + + // redefine the tsv, bed and regions output + val outputNamePrefix = output.getAbsolutePath.stripSuffix(".vcf") + tsvoutput = new File(outputNamePrefix + ".tsv") + bedoutput = new File(outputNamePrefix + ".bed") + regionsoutput = new File(outputNamePrefix + ".regions.bed") + } + + /** return commandline to execute */ + def cmdLine = required(executable) + + repeat("-c", input) + + optional("-f", flanking) + + "-i " + repeat(input) + + "-o " + required(tsvoutput) + + "-b " + required(bedoutput) + + "-v " + required(output) + + "-r " + required(regionsoutput) +} \ No newline at end of file diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Sickle.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Sickle.scala index f6571f22065809adacdee46f345997b344897018..9ed1b053822259090a0793045339de270af72f03 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Sickle.scala +++ 
b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Sickle.scala @@ -31,22 +31,22 @@ import scala.io.Source */ class Sickle(val root: Configurable) extends BiopetCommandLineFunction with Summarizable with Version { @Input(doc = "R1 input") - var input_R1: File = _ + var inputR1: File = _ @Input(doc = "R2 input", required = false) - var input_R2: File = _ + var inputR2: File = _ @Output(doc = "R1 output", required = false) - var output_R1: File = _ + var outputR1: File = _ @Output(doc = "R2 output", required = false) - var output_R2: File = _ + var outputR2: File = _ @Output(doc = "singles output", required = false) - var output_singles: File = _ + var outputSingles: File = _ @Output(doc = "stats output") - var output_stats: File = _ + var outputStats: File = _ executable = config("exe", default = "sickle", freeVar = false) var qualityType: Option[String] = config("qualitytype") @@ -67,22 +67,22 @@ class Sickle(val root: Configurable) extends BiopetCommandLineFunction with Summ /** Return command to execute */ def cmdLine = { var cmd: String = required(executable) - if (input_R2 != null) { + if (inputR2 != null) { cmd += required("pe") + - required("-r", input_R2) + - required("-p", output_R2) + - required("-s", output_singles) + required("-r", inputR2) + + required("-p", outputR2) + + required("-s", outputSingles) } else cmd += required("se") cmd + - (if (inputAsStdin) required("-f", new File("/dev/stdin")) else required("-f", input_R1)) + + (if (inputAsStdin) required("-f", new File("/dev/stdin")) else required("-f", inputR1)) + required("-t", qualityType) + - (if (outputAsStsout) required("-o", new File("/dev/stdout")) else required("-o", output_R1)) + + (if (outputAsStsout) required("-o", new File("/dev/stdout")) else required("-o", outputR1)) + optional("-q", qualityThreshold) + optional("-l", lengthThreshold) + conditional(noFiveprime, "-x") + conditional(discardN, "-n") + conditional(quiet || outputAsStsout, "--quiet") + - (if 
(outputAsStsout) "" else " > " + required(output_stats)) + (if (outputAsStsout) "" else " > " + required(outputStats)) } /** returns stats map for summary */ @@ -98,7 +98,7 @@ class Sickle(val root: Configurable) extends BiopetCommandLineFunction with Summ var stats: mutable.Map[String, Int] = mutable.Map() - if (output_stats.exists) for (line <- Source.fromFile(output_stats).getLines()) { + if (outputStats.exists) for (line <- Source.fromFile(outputStats).getLines()) { line match { // single run case sKept(num) => stats += ("num_reads_kept" -> num.toInt) diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Star.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Star.scala index 916e2b4c9d8465426fe693512ae739d2af393979..913c95b7b2a8a40e51682b891816e167128f7e21 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Star.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Star.scala @@ -146,24 +146,24 @@ object Star { outputDir: File, isIntermediate: Boolean = false, deps: List[File] = Nil): (File, List[Star]) = { - val starCommand_pass1 = Star(configurable, R1, R2, new File(outputDir, "aln-pass1")) - starCommand_pass1.isIntermediate = isIntermediate - starCommand_pass1.deps = deps - starCommand_pass1.beforeGraph() - - val starCommand_reindex = new Star(configurable) - starCommand_reindex.sjdbFileChrStartEnd = starCommand_pass1.outputTab - starCommand_reindex.outputDir = new File(outputDir, "re-index") - starCommand_reindex.runmode = "genomeGenerate" - starCommand_reindex.isIntermediate = isIntermediate - starCommand_reindex.beforeGraph() - - val starCommand_pass2 = Star(configurable, R1, R2, new File(outputDir, "aln-pass2")) - starCommand_pass2.genomeDir = starCommand_reindex.outputDir - starCommand_pass2.isIntermediate = isIntermediate - starCommand_pass2.deps = deps - starCommand_pass2.beforeGraph() - - (starCommand_pass2.outputSam, 
List(starCommand_pass1, starCommand_reindex, starCommand_pass2)) + val starCommandPass1 = Star(configurable, R1, R2, new File(outputDir, "aln-pass1")) + starCommandPass1.isIntermediate = isIntermediate + starCommandPass1.deps = deps + starCommandPass1.beforeGraph() + + val starCommandReindex = new Star(configurable) + starCommandReindex.sjdbFileChrStartEnd = starCommandPass1.outputTab + starCommandReindex.outputDir = new File(outputDir, "re-index") + starCommandReindex.runmode = "genomeGenerate" + starCommandReindex.isIntermediate = isIntermediate + starCommandReindex.beforeGraph() + + val starCommandPass2 = Star(configurable, R1, R2, new File(outputDir, "aln-pass2")) + starCommandPass2.genomeDir = starCommandReindex.outputDir + starCommandPass2.isIntermediate = isIntermediate + starCommandPass2.deps = deps + starCommandPass2.beforeGraph() + + (starCommandPass2.outputSam, List(starCommandPass1, starCommandReindex, starCommandPass2)) } } \ No newline at end of file diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/TarExtract.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/TarExtract.scala index 3a0d36c0ff7615b89f632fa51218dd1b215e09af..48d8b6280685a38f16f74788562dd6edf5608f9b 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/TarExtract.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/TarExtract.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. 
The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.extensions import java.io.File diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Tophat.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Tophat.scala index 495049da762dee0509e4f8ecb4e208ac0c3dfe3a..83a2b386fc5e9de8568e51804b3d2c1ad51ea8e3 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Tophat.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/Tophat.scala @@ -42,330 +42,330 @@ class Tophat(val root: Configurable) extends BiopetCommandLineFunction with Refe var R2: List[File] = List.empty[File] private def checkInputsOk(): Unit = - require(R1.nonEmpty && output_dir != null, "Read 1 input(s) are defined and output directory is defined") + require(R1.nonEmpty && outputDir != null, "Read 1 input(s) are defined and output directory is defined") /** output files, computed automatically from output directory */ @Output(doc = "Output SAM/BAM file") lazy val outputAcceptedHits: File = { checkInputsOk() - new File(output_dir, if (no_convert_bam) "accepted_hits.sam" else "accepted_hits.bam") + new File(outputDir, if (noConvertBam) "accepted_hits.sam" else "accepted_hits.bam") } @Output(doc = "Unmapped SAM/BAM file") lazy val outputUnmapped: File = { checkInputsOk() - new File(output_dir, if (no_convert_bam) "unmapped.sam" else "unmapped.bam") + new File(outputDir, if (noConvertBam) "unmapped.sam" else "unmapped.bam") } @Output(doc = "Deletions BED file") lazy val outputDeletions: File = { checkInputsOk() - new File(output_dir, "deletions.bed") + new File(outputDir, "deletions.bed") } @Output(doc = "Insertions BED file") lazy val outputInsertions: File 
= { checkInputsOk() - new File(output_dir, "insertions.bed") + new File(outputDir, "insertions.bed") } @Output(doc = "Junctions BED file") lazy val outputJunctions: File = { checkInputsOk() - new File(output_dir, "junctions.bed") + new File(outputDir, "junctions.bed") } @Output(doc = "Alignment summary file") lazy val outputAlignSummary: File = { checkInputsOk() - new File(output_dir, "align_summary.txt") + new File(outputDir, "align_summary.txt") } @Argument(doc = "Bowtie index", shortName = "bti", required = true) - var bowtie_index: String = config("bowtie_index") + var bowtieIndex: String = config("bowtie_index") /** write all output files to this directory [./] */ - var output_dir: File = config("output_dir", default = new File("tophat_out")) + var outputDir: File = config("output_dir", default = new File("tophat_out")) var bowtie1: Boolean = config("bowtie1", default = false) - var read_mismatches: Option[Int] = config("read_mismatches") + var readMismatches: Option[Int] = config("read_mismatches") - var read_gap_length: Option[Int] = config("read_gap_length") + var readGapLength: Option[Int] = config("read_gap_length") - var read_edit_dist: Option[Int] = config("read_edit_dist") + var readEditDist: Option[Int] = config("read_edit_dist") - var read_realign_edit_dist: Option[Int] = config("read_realign_edit_dist") + var readRealignEditDist: Option[Int] = config("read_realign_edit_dist") - var min_anchor: Option[Int] = config("min_anchor") + var minAnchor: Option[Int] = config("min_anchor") - var splice_mismatches: Option[String] = config("splice_mismatches") + var spliceMismatches: Option[String] = config("splice_mismatches") - var min_intron_length: Option[Int] = config("min_intron_length") + var minIntronLength: Option[Int] = config("min_intron_length") - var max_intron_length: Option[Int] = config("max_intron_length") + var maxIntronLength: Option[Int] = config("max_intron_length") - var max_multihits: Option[Int] = config("max_multihits") + var 
maxMultihits: Option[Int] = config("max_multihits") - var suppress_hits: Boolean = config("suppress_hits", default = false) + var suppressHits: Boolean = config("suppress_hits", default = false) - var transcriptome_max_hits: Option[Int] = config("transcriptome_max_hits") + var transcriptomeMaxHits: Option[Int] = config("transcriptome_max_hits") - var prefilter_multihits: Boolean = config("prefilter_multihits", default = false) + var preFilterMultihits: Boolean = config("prefilter_multihits", default = false) - var max_insertion_length: Option[Int] = config("max_insertion_length") + var maxInsertionLength: Option[Int] = config("max_insertion_length") - var max_deletion_length: Option[Int] = config("max_deletion_length") + var maxDeletionLength: Option[Int] = config("max_deletion_length") - var solexa_quals: Boolean = config("solexa_quals", default = false) + var solexaQuals: Boolean = config("solexa_quals", default = false) - var solexa1_3_quals: Boolean = config("solexa1.3_quals", default = false) + var solexa13Quals: Boolean = config("solexa1.3_quals", default = false) - var phred64_quals: Boolean = config("phred64_quals", default = false) + var phred64Quals: Boolean = config("phred64_quals", default = false) var quals: Boolean = config("quals", default = false) - var integer_quals: Boolean = config("integer_quals", default = false) + var integerQuals: Boolean = config("integer_quals", default = false) var color: Boolean = config("color", default = false) - var color_out: Boolean = config("color_out", default = false) + var colorOut: Boolean = config("color_out", default = false) - var library_type: Option[String] = config("library_type") + var libraryType: Option[String] = config("library_type") var resume: Option[String] = config("resume") var GTF: Option[String] = config("GTF") - var transcriptome_index: Option[String] = config("transcriptome_index") + var transcriptomeIndex: Option[String] = config("transcriptome_index") - var transcriptome_only: Boolean = 
config("transcriptome_only", default = false) + var transcriptomeOnly: Boolean = config("transcriptome_only", default = false) - var raw_juncs: Option[String] = config("raw_juncs") + var rawJuncs: Option[String] = config("raw_juncs") var insertions: Option[String] = config("insertions") var deletions: Option[String] = config("deletions") - var mate_inner_dist: Option[Int] = config("mate_inner_dist") + var mateInnerDist: Option[Int] = config("mate_inner_dist") - var mate_std_dev: Option[Int] = config("mate_std_dev") + var mateStdDev: Option[Int] = config("mate_std_dev") - var no_novel_juncs: Boolean = config("no_novel_juncs", default = false) + var noNovelJuncs: Boolean = config("no_novel_juncs", default = false) - var no_novel_indels: Boolean = config("no_novel_indels", default = false) + var noNovelIndels: Boolean = config("no_novel_indels", default = false) - var no_gtf_juncs: Boolean = config("no_gtf_juncs", default = false) + var noGtfJuncs: Boolean = config("no_gtf_juncs", default = false) - var no_coverage_search: Boolean = config("no_coverage_search", default = false) + var noCoverageSearch: Boolean = config("no_coverage_search", default = false) - var coverage_search: Boolean = config("coverage_search", default = false) + var coverageSearch: Boolean = config("coverage_search", default = false) - var microexon_search: Boolean = config("microexon_search", default = false) + var microexonSearch: Boolean = config("microexon_search", default = false) - var keep_tmp: Boolean = config("keep_tmp", default = false) + var keepTmp: Boolean = config("keep_tmp", default = false) - var tmp_dir: Option[String] = config("tmp_dir") + var tmpDir: Option[String] = config("tmp_dir") var zpacker: Option[String] = config("zpacker") - var unmapped_fifo: Boolean = config("unmapped_fifo", default = false) + var unmappedFifo: Boolean = config("unmapped_fifo", default = false) - var report_secondary_alignments: Boolean = config("report_secondary_alignments", default = false) + var 
reportSecondaryAlignments: Boolean = config("report_secondary_alignments", default = false) - var no_discordant: Boolean = config("no_discordant", default = false) + var noDiscordant: Boolean = config("no_discordant", default = false) - var no_mixed: Boolean = config("no_mixed", default = false) + var noMixed: Boolean = config("no_mixed", default = false) - var segment_mismatches: Option[Int] = config("segment_mismatches") + var segmentMismatches: Option[Int] = config("segment_mismatches") - var segment_length: Option[Int] = config("segment_length") + var segmentLength: Option[Int] = config("segment_length") - var bowtie_n: Boolean = config("bowtie_n", default = false) + var bowtieN: Boolean = config("bowtie_n", default = false) - var min_coverage_intron: Option[Int] = config("min_coverage_intron") + var minCoverageIntron: Option[Int] = config("min_coverage_intron") - var max_coverage_intron: Option[Int] = config("max_coverage_intron") + var maxCoverageIntron: Option[Int] = config("max_coverage_intron") - var min_segment_intron: Option[Int] = config("min_segment_intron") + var minSegmentIntron: Option[Int] = config("min_segment_intron") - var max_segment_intron: Option[Int] = config("max_segment_intron") + var maxSegmentIntron: Option[Int] = config("max_segment_intron") - var no_sort_bam: Boolean = config("no_sort_bam", default = false) + var noSortBam: Boolean = config("no_sort_bam", default = false) - var no_convert_bam: Boolean = config("no_convert_bam", default = false) + var noConvertBam: Boolean = config("no_convert_bam", default = false) - var keep_fasta_order: Boolean = config("keep_fasta_order", default = false) + var keepFastaOrder: Boolean = config("keep_fasta_order", default = false) - var allow_partial_mapping: Boolean = config("allow_partial_mapping", default = false) + var allowPartialMapping: Boolean = config("allow_partial_mapping", default = false) - var b2_very_fast: Boolean = config("b2_very_fast", default = false) + var b2VeryFast: Boolean = 
config("b2_very_fast", default = false)
-  var b2_fast: Boolean = config("b2_fast", default = false)
+  var b2Fast: Boolean = config("b2_fast", default = false)
-  var b2_sensitive: Boolean = config("b2_sensitive", default = false)
+  var b2Sensitive: Boolean = config("b2_sensitive", default = false)
-  var b2_very_sensitive: Boolean = config("b2_very_sensitive", default = false)
+  var b2VerySensitive: Boolean = config("b2_very_sensitive", default = false)
-  var b2_N: Option[Int] = config("b2_N")
+  var b2N: Option[Int] = config("b2_N")
-  var b2_L: Option[Int] = config("b2_L")
+  var b2L: Option[Int] = config("b2_L")
-  var b2_i: Option[String] = config("b2_i")
+  var b2I: Option[String] = config("b2_i")
-  var b2_n_ceil: Option[String] = config("b2_n_ceil")
+  var b2NCeil: Option[String] = config("b2_n_ceil")
-  var b2_gbar: Option[Int] = config("b2_gbar")
+  var b2Gbar: Option[Int] = config("b2_gbar")
-  var b2_mp: Option[String] = config("b2_mp")
+  var b2Mp: Option[String] = config("b2_mp")
-  var b2_np: Option[Int] = config("b2_np")
+  var b2Np: Option[Int] = config("b2_np")
-  var b2_rdg: Option[String] = config("b2_rdg")
+  var b2Rdg: Option[String] = config("b2_rdg")
-  var b2_rfg: Option[String] = config("b2_rfg")
+  var b2Rfg: Option[String] = config("b2_rfg")
-  var b2_score_min: Option[String] = config("b2_score_min")
+  var b2ScoreMin: Option[String] = config("b2_score_min")
-  var b2_D: Option[Int] = config("b2_D")
+  var b2D: Option[Int] = config("b2_D")
-  var b2_R: Option[Int] = config("b2_R")
+  var b2R: Option[Int] = config("b2_R")
-  var fusion_search: Boolean = config("fusion_search", default = false)
+  var fusionSearch: Boolean = config("fusion_search", default = false)
-  var fusion_anchor_length: Option[Int] = config("fusion_anchor_length")
+  var fusionAnchorLength: Option[Int] = config("fusion_anchor_length")
-  var fusion_min_dist: Option[Int] = config("fusion_min_dist")
+  var fusionMinDist: Option[Int] = config("fusion_min_dist")
-  var fusion_read_mismatches: Option[Int] = config("fusion_read_mismatches")
+  var fusionReadMismatches: Option[Int] = config("fusion_read_mismatches")
-  var fusion_multireads: Option[Int] = config("fusion_multireads")
+  var fusionMultireads: Option[Int] = config("fusion_multireads")
-  var fusion_multipairs: Option[Int] = config("fusion_multipairs")
+  var fusionMultipairs: Option[Int] = config("fusion_multipairs")
-  var fusion_ignore_chromosomes: Option[String] = config("fusion_ignore_chromosomes")
+  var fusionIgnoreChromosomes: Option[String] = config("fusion_ignore_chromosomes")
-  var fusion_do_not_resolve_conflicts: Boolean = config("fusion_do_not_resolve_conflicts", default = false)
+  var fusionDoNotResolveConflicts: Boolean = config("fusion_do_not_resolve_conflicts", default = false)
-  var rg_id: Option[String] = config("rg_id")
+  var rgId: Option[String] = config("rg_id")
-  var rg_sample: Option[String] = config("rg_sample")
+  var rgSample: Option[String] = config("rg_sample")
-  var rg_library: Option[String] = config("rg_library")
+  var rgLibrary: Option[String] = config("rg_library")
-  var rg_description: Option[String] = config("rg_description")
+  var rgDescription: Option[String] = config("rg_description")
-  var rg_platform_unit: Option[String] = config("rg_platform_unit")
+  var rgPlatformUnit: Option[String] = config("rg_platform_unit")
-  var rg_center: Option[String] = config("rg_center")
+  var rgCenter: Option[String] = config("rg_center")
-  var rg_date: Option[String] = config("rg_date")
+  var rgDate: Option[String] = config("rg_date")
-  var rg_platform: Option[String] = config("rg_platform")
+  var rgPlatform: Option[String] = config("rg_platform")

   override def beforeGraph: Unit = {
     super.beforeGraph
-    if (bowtie1 && !new File(bowtie_index).getParentFile.list().toList
-      .filter(_.startsWith(new File(bowtie_index).getName)).exists(_.endsWith(".ebwt")))
+    if (bowtie1 && !new File(bowtieIndex).getParentFile.list().toList
+      .filter(_.startsWith(new File(bowtieIndex).getName)).exists(_.endsWith(".ebwt")))
       throw new IllegalArgumentException("No bowtie1 index found for tophat")
-    else if (!new File(bowtie_index).getParentFile.list().toList
-      .filter(_.startsWith(new File(bowtie_index).getName)).exists(_.endsWith(".bt2")))
+    else if (!new File(bowtieIndex).getParentFile.list().toList
+      .filter(_.startsWith(new File(bowtieIndex).getName)).exists(_.endsWith(".bt2")))
       throw new IllegalArgumentException("No bowtie2 index found for tophat")
   }

   def cmdLine: String = required(executable) +
-    optional("-o", output_dir) +
+    optional("-o", outputDir) +
     conditional(bowtie1, "--bowtie1") +
-    optional("--read-mismatches", read_mismatches) +
-    optional("--read-gap-length", read_gap_length) +
-    optional("--read-edit-dist", read_edit_dist) +
-    optional("--read-realign-edit-dist", read_realign_edit_dist) +
-    optional("--min-anchor", min_anchor) +
-    optional("--splice-mismatches", splice_mismatches) +
-    optional("--min-intron-length", min_intron_length) +
-    optional("--max-intron-length", max_intron_length) +
-    optional("--max-multihits", max_multihits) +
-    conditional(suppress_hits, "--suppress-hits") +
-    optional("--transcriptome-max-hits", transcriptome_max_hits) +
-    conditional(prefilter_multihits, "--prefilter-multihits") +
-    optional("--max-insertion-length", max_insertion_length) +
-    optional("--max-deletion-length", max_deletion_length) +
-    conditional(solexa_quals, "--solexa-quals") +
-    conditional(solexa1_3_quals, "--solexa1.3-quals") +
-    conditional(phred64_quals, "--phred64-quals") +
+    optional("--read-mismatches", readMismatches) +
+    optional("--read-gap-length", readGapLength) +
+    optional("--read-edit-dist", readEditDist) +
+    optional("--read-realign-edit-dist", readRealignEditDist) +
+    optional("--min-anchor", minAnchor) +
+    optional("--splice-mismatches", spliceMismatches) +
+    optional("--min-intron-length", minIntronLength) +
+    optional("--max-intron-length", maxIntronLength) +
+    optional("--max-multihits", maxMultihits) +
+    conditional(suppressHits, "--suppress-hits") +
+    optional("--transcriptome-max-hits", transcriptomeMaxHits) +
+    conditional(preFilterMultihits, "--prefilter-multihits") +
+    optional("--max-insertion-length", maxInsertionLength) +
+    optional("--max-deletion-length", maxDeletionLength) +
+    conditional(solexaQuals, "--solexa-quals") +
+    conditional(solexa13Quals, "--solexa1.3-quals") +
+    conditional(phred64Quals, "--phred64-quals") +
     conditional(quals, "--quals") +
-    conditional(integer_quals, "--integer-quals") +
+    conditional(integerQuals, "--integer-quals") +
     conditional(color, "--color") +
-    conditional(color_out, "--color-out") +
-    optional("--library-type", library_type) +
+    conditional(colorOut, "--color-out") +
+    optional("--library-type", libraryType) +
     optional("--num-threads", threads) +
     optional("--resume", resume) +
     optional("--GTF", GTF) +
-    optional("--transcriptome-index", transcriptome_index) +
-    conditional(transcriptome_only, "--transcriptome-only") +
-    optional("--raw-juncs", raw_juncs) +
+    optional("--transcriptome-index", transcriptomeIndex) +
+    conditional(transcriptomeOnly, "--transcriptome-only") +
+    optional("--raw-juncs", rawJuncs) +
     optional("--insertions", insertions) +
     optional("--deletions", deletions) +
-    optional("--mate-inner-dist", mate_inner_dist) +
-    optional("--mate-std-dev", mate_std_dev) +
-    conditional(no_novel_juncs, "--no-novel-juncs") +
-    conditional(no_novel_indels, "--no-novel-indels") +
-    conditional(no_gtf_juncs, "--no-gtf-juncs") +
-    conditional(no_coverage_search, "--no-coverage-search") +
-    conditional(coverage_search, "--coverage-search") +
-    conditional(microexon_search, "--microexon-search") +
-    conditional(keep_tmp, "--keep-tmp") +
-    optional("--tmp-dir", tmp_dir) +
+    optional("--mate-inner-dist", mateInnerDist) +
+    optional("--mate-std-dev", mateStdDev) +
+    conditional(noNovelJuncs, "--no-novel-juncs") +
+    conditional(noNovelIndels, "--no-novel-indels") +
+    conditional(noGtfJuncs, "--no-gtf-juncs") +
+    conditional(noCoverageSearch, "--no-coverage-search") +
+    conditional(coverageSearch, "--coverage-search") +
+    conditional(microexonSearch, "--microexon-search") +
+    conditional(keepTmp, "--keep-tmp") +
+    optional("--tmp-dir", tmpDir) +
     optional("--zpacker", zpacker) +
-    conditional(unmapped_fifo, "--unmapped-fifo") +
-    conditional(report_secondary_alignments, "--report-secondary-alignments") +
-    conditional(no_discordant, "--no-discordant") +
-    conditional(no_mixed, "--no-mixed") +
-    optional("--segment-mismatches", segment_mismatches) +
-    optional("--segment-length", segment_length) +
-    conditional(bowtie_n, "--bowtie-n") +
-    optional("--min-coverage-intron", min_coverage_intron) +
-    optional("--max-coverage-intron", max_coverage_intron) +
-    optional("--min-segment-intron", min_segment_intron) +
-    optional("--max-segment-intron", max_segment_intron) +
-    conditional(no_sort_bam, "--no-sort-bam") +
-    conditional(no_convert_bam, "--no-convert-bam") +
-    conditional(keep_fasta_order, "--keep-fasta-order") +
-    conditional(allow_partial_mapping, "--allow-partial-mapping") +
-    conditional(b2_very_fast, "--b2-very-fast") +
-    conditional(b2_fast, "--b2-fast") +
-    conditional(b2_sensitive, "--b2-sensitive") +
-    conditional(b2_very_sensitive, "--b2-very-sensitive") +
-    optional("--b2-N", b2_N) +
-    optional("--b2-L", b2_L) +
-    optional("--b2-i", b2_i) +
-    optional("--b2-n-ceil", b2_n_ceil) +
-    optional("--b2-gbar", b2_gbar) +
-    optional("--b2-mp", b2_mp) +
-    optional("--b2-np", b2_np) +
-    optional("--b2-rdg", b2_rdg) +
-    optional("--b2-rfg", b2_rfg) +
-    optional("--b2-score-min", b2_score_min) +
-    optional("--b2-D", b2_D) +
-    optional("--b2-R", b2_R) +
-    conditional(fusion_search, "--fusion-search") +
-    optional("--fusion-anchor-length", fusion_anchor_length) +
-    optional("--fusion-min-dist", fusion_min_dist) +
-    optional("--fusion-read-mismatches", fusion_read_mismatches) +
-    optional("--fusion-multireads", fusion_multireads) +
-    optional("--fusion-multipairs", fusion_multipairs) +
-    optional("--fusion-ignore-chromosomes", fusion_ignore_chromosomes) +
-    conditional(fusion_do_not_resolve_conflicts, "--fusion-do-not-resolve-conflicts") +
-    optional("--rg-id", rg_id) +
-    optional("--rg-sample", rg_sample) +
-    optional("--rg-library", rg_library) +
-    optional("--rg-description", rg_description) +
-    optional("--rg-platform-unit", rg_platform_unit) +
-    optional("--rg-center", rg_center) +
-    optional("--rg-date", rg_date) +
-    optional("--rg-platform", rg_platform) +
-    required(bowtie_index) +
+    conditional(unmappedFifo, "--unmapped-fifo") +
+    conditional(reportSecondaryAlignments, "--report-secondary-alignments") +
+    conditional(noDiscordant, "--no-discordant") +
+    conditional(noMixed, "--no-mixed") +
+    optional("--segment-mismatches", segmentMismatches) +
+    optional("--segment-length", segmentLength) +
+    conditional(bowtieN, "--bowtie-n") +
+    optional("--min-coverage-intron", minCoverageIntron) +
+    optional("--max-coverage-intron", maxCoverageIntron) +
+    optional("--min-segment-intron", minSegmentIntron) +
+    optional("--max-segment-intron", maxSegmentIntron) +
+    conditional(noSortBam, "--no-sort-bam") +
+    conditional(noConvertBam, "--no-convert-bam") +
+    conditional(keepFastaOrder, "--keep-fasta-order") +
+    conditional(allowPartialMapping, "--allow-partial-mapping") +
+    conditional(b2VeryFast, "--b2-very-fast") +
+    conditional(b2Fast, "--b2-fast") +
+    conditional(b2Sensitive, "--b2-sensitive") +
+    conditional(b2VerySensitive, "--b2-very-sensitive") +
+    optional("--b2-N", b2N) +
+    optional("--b2-L", b2L) +
+    optional("--b2-i", b2I) +
+    optional("--b2-n-ceil", b2NCeil) +
+    optional("--b2-gbar", b2Gbar) +
+    optional("--b2-mp", b2Mp) +
+    optional("--b2-np", b2Np) +
+    optional("--b2-rdg", b2Rdg) +
+    optional("--b2-rfg", b2Rfg) +
+    optional("--b2-score-min", b2ScoreMin) +
+    optional("--b2-D", b2D) +
+    optional("--b2-R", b2R) +
+    conditional(fusionSearch, "--fusion-search") +
+    optional("--fusion-anchor-length", fusionAnchorLength) +
+    optional("--fusion-min-dist", fusionMinDist) +
+    optional("--fusion-read-mismatches", fusionReadMismatches) +
+    optional("--fusion-multireads", fusionMultireads) +
+    optional("--fusion-multipairs", fusionMultipairs) +
+    optional("--fusion-ignore-chromosomes", fusionIgnoreChromosomes) +
+    conditional(fusionDoNotResolveConflicts, "--fusion-do-not-resolve-conflicts") +
+    optional("--rg-id", rgId) +
+    optional("--rg-sample", rgSample) +
+    optional("--rg-library", rgLibrary) +
+    optional("--rg-description", rgDescription) +
+    optional("--rg-platform-unit", rgPlatformUnit) +
+    optional("--rg-center", rgCenter) +
+    optional("--rg-date", rgDate) +
+    optional("--rg-platform", rgPlatform) +
+    required(bowtieIndex) +
     required(R1.mkString(",")) +
     optional(R2.mkString(","))
 }
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/VariantEffectPredictor.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/VariantEffectPredictor.scala
index fb77c9138d8fd851b08c48d11395c9062270cb0f..7ea274534d66532340866be327a934bf1276b6f2 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/VariantEffectPredictor.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/VariantEffectPredictor.scala
@@ -17,16 +17,20 @@
 package nl.lumc.sasc.biopet.extensions

 import java.io.File

+import nl.lumc.sasc.biopet.core.summary.Summarizable
 import nl.lumc.sasc.biopet.utils.Logging
 import nl.lumc.sasc.biopet.utils.config.Configurable
 import nl.lumc.sasc.biopet.core.{ Version, BiopetCommandLineFunction, Reference }
+import nl.lumc.sasc.biopet.utils.tryToParseNumber
 import org.broadinstitute.gatk.utils.commandline.{ Input, Output }

+import scala.io.Source
+
 /**
  * Extension for VariantEffectPredictor
  * Created by ahbbollen on 15-1-15.
  */
-class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFunction with Reference with Version {
+class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFunction with Reference with Version with Summarizable {

   executable = config("exe", configNamespace = "perl", default = "perl")
   var vepScript: String = config("vep_script")
@@ -44,21 +48,21 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
   var v: Boolean = config("v", default = true, freeVar = false)
   var q: Boolean = config("q", default = false, freeVar = false)
   var offline: Boolean = config("offline", default = false)
-  var no_progress: Boolean = config("no_progress", default = false)
+  var noProgress: Boolean = config("no_progress", default = false)
   var everything: Boolean = config("everything", default = false)
   var force: Boolean = config("force", default = false)
-  var no_stats: Boolean = config("no_stats", default = false)
-  var stats_text: Boolean = config("stats_text", default = false)
+  var noStats: Boolean = config("no_stats", default = false)
+  var statsText: Boolean = config("stats_text", default = true)
   var html: Boolean = config("html", default = false)
   var cache: Boolean = config("cache", default = false)
   var humdiv: Boolean = config("humdiv", default = false)
   var regulatory: Boolean = config("regulatory", default = false)
-  var cell_type: Boolean = config("cell_type", default = false)
+  var cellType: Boolean = config("cell_type", default = false)
   var phased: Boolean = config("phased", default = false)
-  var allele_number: Boolean = config("allele_number", default = false)
+  var alleleNumber: Boolean = config("allele_number", default = false)
   var numbers: Boolean = config("numbers", default = false)
   var domains: Boolean = config("domains", default = false)
-  var no_escape: Boolean = config("no_escape", default = false)
+  var noEscape: Boolean = config("no_escape", default = false)
   var hgvs: Boolean = config("hgvs", default = false)
   var protein: Boolean = config("protein", default = false)
   var symbol: Boolean = config("symbol", default = false)
@@ -67,50 +71,50 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
   var tsl: Boolean = config("tsl", default = false)
   var canonical: Boolean = config("canonical", default = false)
   var biotype: Boolean = config("biotype", default = false)
-  var xref_refseq: Boolean = config("xref_refseq", default = false)
-  var check_existing: Boolean = config("check_existing", default = false)
-  var check_alleles: Boolean = config("check_alleles", default = false)
-  var check_svs: Boolean = config("svs", default = false)
+  var xrefRefseq: Boolean = config("xref_refseq", default = false)
+  var checkExisting: Boolean = config("check_existing", default = false)
+  var checkAlleles: Boolean = config("check_alleles", default = false)
+  var checkSvs: Boolean = config("svs", default = false)
   var gmaf: Boolean = config("gmaf", default = false)
-  var maf_1kg: Boolean = config("maf_1kg", default = false)
-  var maf_esp: Boolean = config("maf_esp", default = false)
-  var old_map: Boolean = config("old_maf", default = false)
+  var maf1kg: Boolean = config("maf_1kg", default = false)
+  var mafEsp: Boolean = config("maf_esp", default = false)
+  var oldMaf: Boolean = config("old_maf", default = false)
   var pubmed: Boolean = config("pubmed", default = false)
   var vcf: Boolean = config("vcf", default = true, freeVar = false)
   var json: Boolean = config("json", default = false, freeVar = false)
   var gvf: Boolean = config("gvf", default = false)
-  var check_ref: Boolean = config("check_ref", default = false)
-  var coding_only: Boolean = config("coding_only", default = false)
-  var no_intergenic: Boolean = config("no_intergenic", default = false)
+  var checkRef: Boolean = config("check_ref", default = false)
+  var codingOnly: Boolean = config("coding_only", default = false)
+  var noIntergenic: Boolean = config("no_intergenic", default = false)
   var pick: Boolean = config("pick", default = false)
-  var pick_allele: Boolean = config("pick_allele", default = false)
-  var flag_pick: Boolean = config("flag_pick", default = false)
-  var flag_pick_allele: Boolean = config("flag_pick_allele", default = false)
-  var per_gene: Boolean = config("per_gene", default = false)
-  var most_severe: Boolean = config("most_severe", default = false)
+  var pickAllele: Boolean = config("pick_allele", default = false)
+  var flagPick: Boolean = config("flag_pick", default = false)
+  var flagPickAllele: Boolean = config("flag_pick_allele", default = false)
+  var perGene: Boolean = config("per_gene", default = false)
+  var mostSevere: Boolean = config("most_severe", default = false)
   var summary: Boolean = config("summary", default = false)
-  var filter_common: Boolean = config("filter_common", default = false)
-  var check_frequency: Boolean = config("check_frequency", default = false)
-  var allow_non_variant: Boolean = config("allow_non_variant", default = false)
+  var filterCommon: Boolean = config("filter_common", default = false)
+  var checkFrequency: Boolean = config("check_frequency", default = false)
+  var allowNonVariant: Boolean = config("allow_non_variant", default = false)
   var database: Boolean = config("database", default = false)
   var genomes: Boolean = config("genomes", default = false)
-  var gencode_basic: Boolean = config("gencode_basic", default = false)
+  var gencodeBasic: Boolean = config("gencode_basic", default = false)
   var refseq: Boolean = config("refseq", default = false)
   var merged: Boolean = config("merged", default = false)
-  var all_refseq: Boolean = config("all_refseq", default = false)
+  var allRefseq: Boolean = config("all_refseq", default = false)
   var lrg: Boolean = config("lrg", default = false)
-  var no_whole_genome: Boolean = config("no_whole_genome", default = false)
-  var skip_db_check: Boolean = config("skip_db_check", default = false)
+  var noWholeGenome: Boolean = config("no_whole_genome", default = false)
+  var skipDbCheck: Boolean = config("skip_db_check", default = false)

   // Textual args
-  var vep_config: Option[String] = config("config", freeVar = false)
+  var vepConfig: Option[String] = config("config", freeVar = false)
   var species: Option[String] = config("species", freeVar = false)
   var assembly: Option[String] = config("assembly")
   var format: Option[String] = config("format")
   var dir: Option[String] = config("dir")
-  var dir_cache: Option[String] = config("dir_cache")
-  var dir_plugins: Option[String] = config("dir_plugins")
+  var dirCache: Option[String] = config("dir_cache")
+  var dirPlugins: Option[String] = config("dir_plugins")
   var fasta: Option[String] = config("fasta")
   var sift: Option[String] = config("sift")
   var polyphen: Option[String] = config("polyphen")
@@ -121,10 +125,10 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
   var convert: Option[String] = config("convert")
   var terms: Option[String] = config("terms")
   var chr: Option[String] = config("chr")
-  var pick_order: Option[String] = config("pick_order")
-  var freq_pop: Option[String] = config("check_pop")
-  var freq_gt_lt: Option[String] = config("freq_gt_lt")
-  var freq_filter: Option[String] = config("freq_filter")
+  var pickOrder: Option[String] = config("pick_order")
+  var freqPop: Option[String] = config("check_pop")
+  var freqGtLt: Option[String] = config("freq_gt_lt")
+  var freqFilter: Option[String] = config("freq_filter")
   var filter: Option[String] = config("filter")
   var host: Option[String] = config("host")
   var user: Option[String] = config("user")
@@ -132,18 +136,20 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
   var registry: Option[String] = config("registry")
   var build: Option[String] = config("build")
   var compress: Option[String] = config("compress")
-  var cache_region_size: Option[String] = config("cache_region_size")
+  var cacheRegionSize: Option[String] = config("cache_region_size")

   // Numeric args
   override def defaultThreads: Int = config("fork", default = 2)
-  var cache_version: Option[Int] = config("cache_version")
-  var freq_freq: Option[Float] = config("freq_freq")
+  var cacheVersion: Option[Int] = config("cache_version")
+  var freqFreq: Option[Float] = config("freq_freq")
   var port: Option[Int] = config("port")
-  var db_version: Option[Int] = config("db_version")
-  var buffer_size: Option[Int] = config("buffer_size")
+  var dbVersion: Option[Int] = config("db_version")
+  var bufferSize: Option[Int] = config("buffer_size")
   // ought to be a flag, but is a BUG in VEP; becomes numeric ("1" is true)
   var failed: Option[Int] = config("failed")

+  override def defaultCoreMemory = 4.0
+
   override def beforeGraph(): Unit = {
     super.beforeGraph()
     if (!cache && !database) {
@@ -151,6 +157,7 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
     } else if (cache && dir.isEmpty) {
       Logging.addError("Must supply dir to cache for VariantEffectPredictor")
     }
+    if (statsText) outputFiles :+= new File(output.getAbsolutePath + "_summary.txt")
   }

   /** Returns command to execute */
@@ -161,21 +168,21 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
     conditional(v, "-v") +
     conditional(q, "-q") +
     conditional(offline, "--offline") +
-    conditional(no_progress, "--no_progress") +
+    conditional(noProgress, "--no_progress") +
     conditional(everything, "--everything") +
     conditional(force, "--force_overwrite") +
-    conditional(no_stats, "--no_stats") +
-    conditional(stats_text, "--stats_text") +
+    conditional(noStats, "--no_stats") +
+    conditional(statsText, "--stats_text") +
     conditional(html, "--html") +
     conditional(cache, "--cache") +
     conditional(humdiv, "--humdiv") +
     conditional(regulatory, "--regulatory") +
-    conditional(cell_type, "--cel_type") +
+    conditional(cellType, "--cell_type") +
     conditional(phased, "--phased") +
-    conditional(allele_number, "--allele_number") +
+    conditional(alleleNumber, "--allele_number") +
     conditional(numbers, "--numbers") +
     conditional(domains, "--domains") +
-    conditional(no_escape, "--no_escape") +
+    conditional(noEscape, "--no_escape") +
     conditional(hgvs, "--hgvs") +
     conditional(protein, "--protein") +
     conditional(symbol, "--symbol") +
@@ -184,46 +191,46 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
     conditional(tsl, "--tsl") +
     conditional(canonical, "--canonical") +
     conditional(biotype, "--biotype") +
-    conditional(xref_refseq, "--xref_refseq") +
-    conditional(check_existing, "--check_existing") +
-    conditional(check_alleles, "--check_alleles") +
-    conditional(check_svs, "--check_svs") +
+    conditional(xrefRefseq, "--xref_refseq") +
+    conditional(checkExisting, "--check_existing") +
+    conditional(checkAlleles, "--check_alleles") +
+    conditional(checkSvs, "--check_svs") +
     conditional(gmaf, "--gmaf") +
-    conditional(maf_1kg, "--maf_1kg") +
-    conditional(maf_esp, "--maf_esp") +
+    conditional(maf1kg, "--maf_1kg") +
+    conditional(mafEsp, "--maf_esp") +
     conditional(pubmed, "--pubmed") +
     conditional(vcf, "--vcf") +
     conditional(json, "--json") +
     conditional(gvf, "--gvf") +
-    conditional(check_ref, "--check_ref") +
-    conditional(coding_only, "--coding_only") +
-    conditional(no_intergenic, "--no_intergenic") +
+    conditional(checkRef, "--check_ref") +
+    conditional(codingOnly, "--coding_only") +
+    conditional(noIntergenic, "--no_intergenic") +
     conditional(pick, "--pick") +
-    conditional(pick_allele, "--pick_allele") +
-    conditional(flag_pick, "--flag_pick") +
-    conditional(flag_pick_allele, "--flag_pick_allele") +
-    conditional(per_gene, "--per_gene") +
-    conditional(most_severe, "--most_severe") +
+    conditional(pickAllele, "--pick_allele") +
+    conditional(flagPick, "--flag_pick") +
+    conditional(flagPickAllele, "--flag_pick_allele") +
+    conditional(perGene, "--per_gene") +
+    conditional(mostSevere, "--most_severe") +
     conditional(summary, "--summary") +
-    conditional(filter_common, "--filter_common") +
-    conditional(check_frequency, "--check_frequency") +
-    conditional(allow_non_variant, "--allow_non_variant") +
+    conditional(filterCommon, "--filter_common") +
+    conditional(checkFrequency, "--check_frequency") +
+    conditional(allowNonVariant, "--allow_non_variant") +
     conditional(database, "--database") +
     conditional(genomes, "--genomes") +
-    conditional(gencode_basic, "--gencode_basic") +
+    conditional(gencodeBasic, "--gencode_basic") +
     conditional(refseq, "--refseq") +
     conditional(merged, "--merged") +
-    conditional(all_refseq, "--all_refseq") +
+    conditional(allRefseq, "--all_refseq") +
     conditional(lrg, "--lrg") +
-    conditional(no_whole_genome, "--no_whole_genome") +
-    conditional(skip_db_check, "--skip_db_check") +
-    optional("--config", vep_config) +
+    conditional(noWholeGenome, "--no_whole_genome") +
+    conditional(skipDbCheck, "--skip_db_check") +
+    optional("--config", vepConfig) +
     optional("--species", species) +
     optional("--assembly", assembly) +
     optional("--format", format) +
     optional("--dir", dir) +
-    optional("--dir_cache", dir_cache) +
-    optional("--dir_plugins", dir_plugins) +
+    optional("--dir_cache", dirCache) +
+    optional("--dir_plugins", dirPlugins) +
     optional("--fasta", fasta) +
     optional("--sift", sift) +
     optional("--polyphen", polyphen) +
@@ -234,10 +241,10 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
     optional("--convert", convert) +
     optional("--terms", terms) +
     optional("--chr", chr) +
-    optional("--pick_order", pick_order) +
-    optional("--freq_pop", freq_pop) +
-    optional("--freq_gt_lt", freq_gt_lt) +
-    optional("--freq_filter", freq_filter) +
+    optional("--pick_order", pickOrder) +
+    optional("--freq_pop", freqPop) +
+    optional("--freq_gt_lt", freqGtLt) +
+    optional("--freq_filter", freqFilter) +
     optional("--filter", filter) +
     optional("--host", host) +
     optional("--user", user) +
@@ -245,13 +252,56 @@ class VariantEffectPredictor(val root: Configurable) extends BiopetCommandLineFu
     optional("--registry", registry) +
     optional("--build", build) +
     optional("--compress", compress) +
-    optional("--cache_region_size", cache_region_size) +
+    optional("--cache_region_size", cacheRegionSize) +
     optional("--fork", threads) +
-    optional("--cache_version", cache_version) +
-    optional("--freq_freq", freq_freq) +
+    optional("--cache_version", cacheVersion) +
+    optional("--freq_freq", freqFreq) +
     optional("--port", port) +
-    optional("--db_version", db_version) +
-    optional("--buffer_size", buffer_size) +
+    optional("--db_version", dbVersion) +
+    optional("--buffer_size", bufferSize) +
     optional("--failed", failed)
+  def summaryFiles: Map[String, File] = Map()
+
+  def summaryStats: Map[String, Any] = {
+    if (statsText) {
+      val statsFile: File = new File(output.getAbsolutePath + "_summary.txt")
+      parseStatsFile(statsFile)
+    } else {
+      Map()
+    }
+  }
+
+  def parseStatsFile(file: File): Map[String, Any] = {
+    val contents = Source.fromFile(file).getLines().toList
+    val headers = getHeadersFromStatsFile(contents)
+    headers.foldLeft(Map.empty[String, Any])((acc, x) => acc + (x.replace(" ", "_") -> getBlockFromStatsFile(contents, x)))
+  }
+
+  def getBlockFromStatsFile(contents: List[String], header: String): Map[String, Any] = {
+    var inBlock = false
+    var theMap: Map[String, Any] = Map()
+    for (x <- contents) {
+      val stripped = x.stripPrefix("[").stripSuffix("]")
+      if (stripped == header) {
+        inBlock = true
+      } else {
+        if (inBlock) {
+          val key = stripped.split('\t').head.replace(" ", "_")
+          val value = stripped.split('\t').last
+          theMap ++= Map(key -> tryToParseNumber(value, fallBack = true).getOrElse(value))
+        }
+      }
+      if (stripped == "") {
+        inBlock = false
+      }
+    }
+    theMap
+  }
+
+  def getHeadersFromStatsFile(contents: List[String]): List[String] = {
+    // block headers are of format '[block]'
+    contents.filter(_.startsWith("[")).filter(_.endsWith("]")).map(_.stripPrefix("[")).map(_.stripSuffix("]"))
+  }
+
 }
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bcftools/BcftoolsView.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bcftools/BcftoolsView.scala
index 4bb87d332e15abc0e93f71f43e2329eb574e95ca..bde8bbfd11bf7cdbb364ea4ad2394e39b18a90b9 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bcftools/BcftoolsView.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bcftools/BcftoolsView.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.bcftools

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bedtools/BedtoolsMerge.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bedtools/BedtoolsMerge.scala
index f72411a6ab7b83c7661d4f36ec38d12bca3136c1..262772c6d5c4723b46ed6fef2339982034f8bbab 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bedtools/BedtoolsMerge.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bedtools/BedtoolsMerge.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.bedtools

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie.scala
index 429c65b985980f2874dc53ff4a97c546d4c29f13..92abaded51b2b28c5fbad52329e19df724c9ea0d 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie.scala
@@ -46,7 +46,7 @@ class Bowtie(val root: Configurable) extends BiopetCommandLineFunction with Refe
   override def defaultThreads = 8

   var sam: Boolean = config("sam", default = false)
-  var sam_RG: Option[String] = config("sam-RG")
+  var samRg: Option[String] = config("sam-RG")
   var seedlen: Option[Int] = config("seedlen")
   var seedmms: Option[Int] = config("seedmms")
   var k: Option[Int] = config("k")
@@ -80,7 +80,7 @@ class Bowtie(val root: Configurable) extends BiopetCommandLineFunction with Refe
     conditional(largeIndex, "--large-index") +
     conditional(best, "--best") +
     conditional(strata, "--strata") +
-    optional("--sam-RG", sam_RG) +
+    optional("--sam-RG", samRg) +
     optional("--seedlen", seedlen) +
     optional("--seedmms", seedmms) +
     optional("-k", k) +
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie2.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie2.scala
index 793fd7d159a64c5f432afd575d5a04b7d4fa7c09..71424372dc48bf650676a905b0ebb9ac43ec1c1a 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie2.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie2.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */ package nl.lumc.sasc.biopet.extensions.bowtie import java.io.File @@ -45,20 +60,20 @@ class Bowtie2(val root: Configurable) extends BiopetCommandLineFunction with Ref var trim3: Option[Int] = config("trim3") var phred33: Boolean = config("phred33", default = false) var phred64: Boolean = config("phred64", default = false) - var int_quals: Boolean = config("int_quals", default = false) + var intQuals: Boolean = config("int_quals", default = false) /* Alignment options */ var N: Option[Int] = config("N") var L: Option[Int] = config("L") var i: Option[String] = config("i") - var n_ceil: Option[String] = config("n_ceil") + var nCeil: Option[String] = config("n_ceil") var dpad: Option[Int] = config("dpad") var gbar: Option[Int] = config("gbar") - var ignore_quals: Boolean = config("ignore_quals", default = false) + var ignoreQuals: Boolean = config("ignore_quals", default = false) var nofw: Boolean = config("nofw", default = false) var norc: Boolean = config("norc", default = false) - var no_1mm_upfront: Boolean = config("no_1mm_upfront", default = false) - var end_to_end: Boolean = config("end_to_end", default = false) + var no1MmUpfront: Boolean = config("no_1mm_upfront", default = false) + var endToEnd: Boolean = config("end_to_end", default = false) var local: Boolean = config("local", default = false) /* Scoring */ @@ -67,7 +82,7 @@ class Bowtie2(val root: Configurable) extends BiopetCommandLineFunction with Ref var np: Option[Int] = config("np") var rdg: Option[String] = config("rdg") var rfg: Option[String] = config("rfg") - var score_min: Option[String] = config("score_min") + var scoreMin: Option[String] = config("score_min") /* Reporting */ var k: Option[Int] = config("k") @@ -83,59 +98,64 @@ class Bowtie2(val root: Configurable) extends BiopetCommandLineFunction with Ref var fr: Boolean = config("fr", default = false) var rf: Boolean = config("rf", default = false) var ff: Boolean = config("ff", default = false) - var no_mixed: Boolean = 
config("no_mixed", default = false) - var no_discordant: Boolean = config("no_discordant", default = false) - var no_dovetail: Boolean = config("no_dovetail", default = false) - var no_contain: Boolean = config("no_contain", default = false) - var no_overlap: Boolean = config("no_overlap", default = false) + var noMixed: Boolean = config("no_mixed", default = false) + var noDiscordant: Boolean = config("no_discordant", default = false) + var noDovetail: Boolean = config("no_dovetail", default = false) + var noContain: Boolean = config("no_contain", default = false) + var noOverlap: Boolean = config("no_overlap", default = false) /* Output */ - var time: Boolean = config("no_overlap", default = false) + var time: Boolean = config("time", default = false) var un: Option[String] = config("un") var al: Option[String] = config("al") - var un_conc: Option[String] = config("un_conc") - var al_conc: Option[String] = config("al_conc") + var unConc: Option[String] = config("un_conc") + var alConc: Option[String] = config("al_conc") - var un_gz: Option[String] = config("un_gz") - var al_gz: Option[String] = config("al_gz") - var un_conc_gz: Option[String] = config("un_conc_gz") - var al_conc_gz: Option[String] = config("al_conc_gz") + var unGz: Option[String] = config("un_gz") + var alGz: Option[String] = config("al_gz") + var unConcGz: Option[String] = config("un_conc_gz") + var alConcGz: Option[String] = config("al_conc_gz") - var un_bz2: Option[String] = config("un_bz2") - var al_bz2: Option[String] = config("al_bz2") - var un_conc_bz2: Option[String] = config("un_conc_bz2") - var al_conc_bz2: Option[String] = config("al_conc_bz2") + var unBz2: Option[String] = config("un_bz2") + var alBz2: Option[String] = config("al_bz2") + var unConcBz2: Option[String] = config("un_conc_bz2") + var alConcBz2: Option[String] = config("al_conc_bz2") var quiet: Boolean = config("quiet", default = false) - var met_file: Option[String] = config("met_file") - var met_stderr: Boolean = config("met_stderr", default = false) + var metFile:
Option[String] = config("met_file") + var metStderr: Boolean = config("met_stderr", default = false) var met: Option[Int] = config("met") - var no_unal: Boolean = config("no_unal", default = false) - var no_head: Boolean = config("no_head", default = false) - var no_sq: Boolean = config("no_sq", default = false) + var noUnal: Boolean = config("no_unal", default = false) + var noHead: Boolean = config("no_head", default = false) + var noSq: Boolean = config("no_sq", default = false) - var rg_id: Option[String] = config("rg_id") + var rgId: Option[String] = config("rg_id") var rg: List[String] = config("rg", default = Nil) - var omit_sec_seq: Boolean = config("omit_sec_seq", default = false) + var omitSecSeq: Boolean = config("omit_sec_seq", default = false) /* Performance */ var reorder: Boolean = config("reorder", default = false) var mm: Boolean = config("mm", default = false) /* Other */ - var qc_filter: Boolean = config("qc_filter", default = false) + var qcFilter: Boolean = config("qc_filter", default = false) var seed: Option[Int] = config("seed") - var non_deterministic: Boolean = config("non_deterministic", default = false) + var nonDeterministic: Boolean = config("non_deterministic", default = false) override def beforeGraph() { super.beforeGraph() val indexDir = new File(bowtieIndex).getParentFile val basename = bowtieIndex.stripPrefix(indexDir.getPath + File.separator) if (indexDir.exists()) { - if (!indexDir.list().toList.filter(_.startsWith(basename)).exists(_.endsWith(".bt2"))) + if (!indexDir.list() + .toList + .filter(_.startsWith(basename)) + .exists({ p => + p.endsWith(".bt2") || p.endsWith(".bt2l") + })) Logging.addError(s"No index files found for bowtie2 in: $indexDir with basename: $basename") } } @@ -153,19 +173,19 @@ class Bowtie2(val root: Configurable) extends BiopetCommandLineFunction with Ref optional("--trim5", trim5) + conditional(phred33, "--phred33") + conditional(phred64, "--phred64") + - conditional(int_quals, "--int-quals") + + 
conditional(intQuals, "--int-quals") + /* Alignment options */ optional("-N", N) + optional("-L", L) + optional("-i", i) + - optional("--n-ceil", n_ceil) + + optional("--n-ceil", nCeil) + optional("--dpad", dpad) + optional("--gbar", gbar) + - conditional(ignore_quals, "--ignore-quals") + + conditional(ignoreQuals, "--ignore-quals") + conditional(nofw, "--nofw") + conditional(norc, "--norc") + - conditional(no_1mm_upfront, "--no-1mm-upfront") + - conditional(end_to_end, "--end-to-end") + + conditional(no1MmUpfront, "--no-1mm-upfront") + + conditional(endToEnd, "--end-to-end") + conditional(local, "--local") + /* Scoring */ optional("--ma", ma) + @@ -173,7 +193,7 @@ class Bowtie2(val root: Configurable) extends BiopetCommandLineFunction with Ref optional("--np", np) + optional("--rdg", rdg) + optional("--rfg", rfg) + - optional("--score-min", score_min) + + optional("--score-min", scoreMin) + /* Reporting */ optional("-k", k) + optional("--all", all) + @@ -186,43 +206,43 @@ class Bowtie2(val root: Configurable) extends BiopetCommandLineFunction with Ref conditional(fr, "--fr") + conditional(rf, "--rf") + conditional(ff, "--ff") + - conditional(no_mixed, "--no-mixed") + - conditional(no_discordant, "--no-discordant") + - conditional(no_dovetail, "--no-dovetail") + - conditional(no_contain, "--no-contain") + - conditional(no_overlap, "--no-overlap") + + conditional(noMixed, "--no-mixed") + + conditional(noDiscordant, "--no-discordant") + + conditional(noDovetail, "--no-dovetail") + + conditional(noContain, "--no-contain") + + conditional(noOverlap, "--no-overlap") + /* Output */ conditional(time, "--time") + optional("--un", un) + optional("--al", al) + - optional("--un-conc", un_conc) + - optional("--al-conc", al_conc) + - optional("--un-gz", un_gz) + - optional("--al-gz", al_gz) + - optional("--un-conc-gz", un_conc_gz) + - optional("--al-conc-gz", al_conc_gz) + - optional("--un-bz2", un_bz2) + - optional("--al-bz2", al_bz2) + - optional("--un-conc-bz2", un_conc_bz2) 
+ - optional("--al-conc-bz2", al_conc_bz2) + + optional("--un-conc", unConc) + + optional("--al-conc", alConc) + + optional("--un-gz", unGz) + + optional("--al-gz", alGz) + + optional("--un-conc-gz", unConcGz) + + optional("--al-conc-gz", alConcGz) + + optional("--un-bz2", unBz2) + + optional("--al-bz2", alBz2) + + optional("--un-conc-bz2", unConcBz2) + + optional("--al-conc-bz2", alConcBz2) + conditional(quiet, "--quiet") + - optional("--met-file", met_file) + - conditional(met_stderr, "--met-stderr") + + optional("--met-file", metFile) + + conditional(metStderr, "--met-stderr") + optional("--met", met) + - conditional(no_unal, "--no-unal") + - conditional(no_head, "--no-head") + - conditional(no_sq, "--no-sq") + - optional("--rg-id", rg_id) + + conditional(noUnal, "--no-unal") + + conditional(noHead, "--no-head") + + conditional(noSq, "--no-sq") + + optional("--rg-id", rgId) + repeat("--rg", rg) + - conditional(omit_sec_seq, "--omit-sec-seq") + + conditional(omitSecSeq, "--omit-sec-seq") + /* Performance */ optional("--threads", threads) + conditional(reorder, "--reorder") + conditional(mm, "--mm") + /* Other */ - conditional(qc_filter, "--qc-filter") + + conditional(qcFilter, "--qc-filter") + optional("--seed", seed) + - conditional(non_deterministic, "--non-deterministic") + + conditional(nonDeterministic, "--non-deterministic") + /* Required */ required("-x", bowtieIndex) + (R2 match { diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie2Build.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie2Build.scala index 0e8ce7b9faeaf3f9c4ed0ec4a6342ed3ebb82f93..8d2b9ecaea5c8caef2182f52472de1a1a8e936da 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie2Build.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/Bowtie2Build.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for 
building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.extensions.bowtie import java.io.File diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/BowtieBuild.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/BowtieBuild.scala index ed4c8b830aafb07a1a0ebfb83d88f6331a061b1f..6e589c589800ccfcf821080eee85f1c917c0eee0 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/BowtieBuild.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/bowtie/BowtieBuild.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. 
The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.extensions.bowtie import java.io.File diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/breakdancer/BreakdancerConfig.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/breakdancer/BreakdancerConfig.scala index 2b310aaf8c6b38933f4c11badedfbf7d57084bef..7986c837428b80132581438ae33636914e2b1c99 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/breakdancer/BreakdancerConfig.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/breakdancer/BreakdancerConfig.scala @@ -30,26 +30,26 @@ class BreakdancerConfig(val root: Configurable) extends BiopetCommandLineFunctio @Output(doc = "Output File") var output: File = _ - var min_mq: Option[Int] = config("min_mq", default = 20) // minimum of MQ to consider for taking read into histogram - var use_mq: Boolean = config("use_mq", default = false) - var min_insertsize: Option[Int] = config("min_insertsize") - var solid_data: Boolean = config("solid", default = false) - var sd_cutoff: Option[Int] = config("sd_cutoff") // Cutoff in unit of standard deviation [4] + var minMq: Option[Int] = config("min_mq", default = 20) // minimum of MQ to consider for taking read into histogram + var useMq: Boolean = config("use_mq", default = false) + var minInsertsize: Option[Int] = config("min_insertsize") + var solidData: Boolean = config("solid", default = false) + var sdCutoff: Option[Int] = config("sd_cutoff") // Cutoff in unit of standard deviation [4] // we set this to a higher number to avoid biases in small numbers in sorted bams - var min_observations: Option[Int] = config("min_observations") // Number of observation 
required to estimate mean and s.d. insert size [10_000] - var coefvar_cutoff: Option[Int] = config("coef_cutoff") // Cutoff on coefficients of variation [1] - var histogram_bins: Option[Int] = config("histogram_bins") // Number of bins in the histogram [50] + var minObservations: Option[Int] = config("min_observations") // Number of observation required to estimate mean and s.d. insert size [10_000] + var coefvarCutoff: Option[Int] = config("coef_cutoff") // Cutoff on coefficients of variation [1] + var histogramBins: Option[Int] = config("histogram_bins") // Number of bins in the histogram [50] def cmdLine = required(executable) + - optional("-q", min_mq) + - conditional(use_mq, "-m") + - optional("-s", min_insertsize) + - conditional(solid_data, "-s") + - optional("-c", sd_cutoff) + - optional("-n", min_observations) + - optional("-v", coefvar_cutoff) + - optional("-b", histogram_bins) + + optional("-q", minMq) + + conditional(useMq, "-m") + + optional("-s", minInsertsize) + + conditional(solidData, "-s") + + optional("-c", sdCutoff) + + optional("-n", minObservations) + + optional("-v", coefvarCutoff) + + optional("-b", histogramBins) + required(input) + " 1> " + required(output) } diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/conifer/Conifer.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/conifer/Conifer.scala index 5f517d7b338f0236fe051edd320ea9e4d4775fdb..f241c576ee1fa2ff7b7169f5e7cee7abc265045e 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/conifer/Conifer.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/conifer/Conifer.scala @@ -24,7 +24,7 @@ abstract class Conifer extends PythonCommandLineFunction with Version { setPythonScript(config("script", default = "conifer")) def versionRegex = """(.*)""".r override def versionExitcode = List(0) - def versionCommand = executable + " " + python_script + " --version" + def 
versionCommand = executable + " " + pythonScript + " --version" override def defaultCoreMemory = 5.0 override def defaultThreads = 1 diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/conifer/ConiferAnalyze.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/conifer/ConiferAnalyze.scala index 284d0e059dea542f2550db22b086b4a4db9837da..40790f0ac3cad8ae4f4c5af309b5b078bbd99002 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/conifer/ConiferAnalyze.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/conifer/ConiferAnalyze.scala @@ -36,7 +36,7 @@ class ConiferAnalyze(val root: Configurable) extends Conifer { var svd: Option[Int] = config("svd", default = 1) @Argument(doc = "Minimum population median RPKM per probe", required = false) - var min_rpkm: Option[Double] = config("min_rpkm") + var minRpkm: Option[Double] = config("min_rpkm") override def cmdLine = super.cmdLine + " analyze " + @@ -44,5 +44,5 @@ class ConiferAnalyze(val root: Configurable) extends Conifer { " --rpkm_dir" + required(rpkmDir) + " --output" + required(output) + optional("--svd", svd) + - optional("--min_rpkm", min_rpkm) + optional("--min_rpkm", minRpkm) } diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/gatk/broad/GatkGeneral.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/gatk/broad/GatkGeneral.scala index d2a0c86160e498f0cba8bc1f4a4681946c6fa0de..d44ba4957f4c716539607d36e8e76a5312f2d264 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/gatk/broad/GatkGeneral.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/gatk/broad/GatkGeneral.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. 
But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ /** * Due to the license issue with GATK, this part of Biopet can only be used inside the * LUMC. Please refer to https://git.lumc.nl/biopet/biopet/wikis/home for instructions diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/gmap/Gsnap.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/gmap/Gsnap.scala index 7b92be3b16946a6ae74dbe1537031a0ac864e534..d8059551cfd9504ce778150384fda67151500397 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/gmap/Gsnap.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/gmap/Gsnap.scala @@ -53,7 +53,7 @@ class Gsnap(val root: Configurable) extends BiopetCommandLineFunction with Refer var db: String = config("db") /** whether to use a suffix array, which will give increased speed */ - var use_sarray: Option[Int] = config("use_sarray") + var useSarray: Option[Int] = config("use_sarray") /** kmer size to use in genome database (allowed values: 16 or less) */ var kmer: Option[Int] = config("kmer") @@ -65,28 +65,28 @@ class Gsnap(val root: Configurable) extends BiopetCommandLineFunction with Refer var part: Option[String] = config("part") /** size of input buffer (program reads this many sequences at a time)*/ - var input_buffer_size: Option[Int] = config("input_buffer_size") + var inputBufferSize: Option[Int] = config("input_buffer_size") /** amount 
of barcode to remove from start of read */ - var barcode_length: Option[Int] = config("barcode_length") + var barcodeLength: Option[Int] = config("barcode_length") /** orientation of paired-end reads */ var orientation: Option[String] = config("orientation") /** starting position of identifier in fastq header, space-delimited (>= 1) */ - var fastq_id_start: Option[Int] = config("fastq_id_start") + var fastqIdStart: Option[Int] = config("fastq_id_start") /** ending position of identifier in fastq header, space-delimited (>= 1) */ - var fastq_id_end: Option[Int] = config("fastq_id_end") + var fastqIdEnd: Option[Int] = config("fastq_id_end") /** when multiple fastq files are provided on the command line, gsnap assumes */ - var force_single_end: Boolean = config("force_single_end", default = false) + var forceSingleEnd: Boolean = config("force_single_end", default = false) /** skips reads marked by the illumina chastity program. expecting a string */ - var filter_chastity: Option[String] = config("filter_chastity") + var filterChastity: Option[String] = config("filter_chastity") /** allows accession names of reads to mismatch in paired-end files */ - var allow_pe_name_mismatch: Boolean = config("allow_pe_name_mismatch", default = false) + var allowPeNameMismatch: Boolean = config("allow_pe_name_mismatch", default = false) /** uncompress gzipped input files */ var gunzip: Boolean = config("gunzip", default = false) @@ -98,61 +98,61 @@ class Gsnap(val root: Configurable) extends BiopetCommandLineFunction with Refer var batch: Option[Int] = config("batch") /** whether to expand the genomic offsets index */ - var expand_offsets: Option[Int] = config("expand_offsets") + var expandOffsets: Option[Int] = config("expand_offsets") /** maximum number of mismatches allowed (if not specified, then */ - var max_mismatches: Option[Float] = config("max_mismatches") + var maxMismatches: Option[Float] = config("max_mismatches") /** whether to count unknown (n) characters in the query 
as a mismatch */ - var query_unk_mismatch: Option[Int] = config("query_unk_mismatch") + var queryUnkMismatch: Option[Int] = config("query_unk_mismatch") /** whether to count unknown (n) characters in the genome as a mismatch */ - var genome_unk_mismatch: Option[Int] = config("genome_unk_mismatch") + var genomeUnkMismatch: Option[Int] = config("genome_unk_mismatch") /** maximum number of alignments to find (default 1000) */ var maxsearch: Option[Int] = config("maxsearch") /** threshold for computing a terminal alignment (from one end of the */ - var terminal_threshold: Option[Int] = config("terminal_threshold") + var terminalThreshold: Option[Int] = config("terminal_threshold") /** threshold alignment length in bp for a terminal alignment result to be printed (in bp) */ - var terminal_output_minlength: Option[Int] = config("terminal_output_minlength") + var terminalOutputMinlength: Option[Int] = config("terminal_output_minlength") /** penalty for an indel (default 2) */ - var indel_penalty: Option[Int] = config("indel_penalty") + var indelPenalty: Option[Int] = config("indel_penalty") /** minimum length at end required for indel alignments (default 4) */ - var indel_endlength: Option[Int] = config("indel_endlength") + var indelEndlength: Option[Int] = config("indel_endlength") /** maximum number of middle insertions allowed (default 9) */ - var max_middle_insertions: Option[Int] = config("max_middle_insertions") + var maxMiddleInsertions: Option[Int] = config("max_middle_insertions") /** maximum number of middle deletions allowed (default 30) */ - var max_middle_deletions: Option[Int] = config("max_middle_deletions") + var maxMiddleDeletions: Option[Int] = config("max_middle_deletions") /** maximum number of end insertions allowed (default 3) */ - var max_end_insertions: Option[Int] = config("max_end_insertions") + var maxEndInsertions: Option[Int] = config("max_end_insertions") /** maximum number of end deletions allowed (default 6) */ - var max_end_deletions: 
Option[Int] = config("max_end_deletions") + var maxEndDeletions: Option[Int] = config("max_end_deletions") /** report suboptimal hits beyond best hit (default 0) */ - var suboptimal_levels: Option[Int] = config("suboptimal_levels") + var suboptimalLevels: Option[Int] = config("suboptimal_levels") /** method for removing adapters from reads. currently allowed values: off, paired */ - var adapter_strip: Option[String] = config("adapter_strip") + var adapterStrip: Option[String] = config("adapter_strip") /** score to use for mismatches when trimming at ends (default is -3; */ - var trim_mismatch_score: Option[Int] = config("trim_mismatch_score") + var trimMismatchScore: Option[Int] = config("trim_mismatch_score") /** score to use for indels when trimming at ends (default is -4; */ - var trim_indel_score: Option[Int] = config("trim_indel_score") + var trimIndelScore: Option[Int] = config("trim_indel_score") /** directory for snps index files (created using snpindex) (default is */ var snpsdir: Option[String] = config("snpsdir") /** use database containing known snps (in <string>.iit, built */ - var use_snps: Option[String] = config("use_snps") + var useSnps: Option[String] = config("use_snps") /** directory for methylcytosine index files (created using cmetindex) */ var cmetdir: Option[String] = config("cmetdir") @@ -167,166 +167,166 @@ class Gsnap(val root: Configurable) extends BiopetCommandLineFunction with Refer var tallydir: Option[String] = config("tallydir") /** use this tally iit file to resolve concordant multiple results */ - var use_tally: Option[String] = config("use_tally") + var useTally: Option[String] = config("use_tally") /** directory for runlength iit file to resolve concordant multiple results (default is */ - var runlengthdir: Option[String] = config("runlengthdir") + var runLengthDir: Option[String] = config("runlengthdir") /** use this runlength iit file to resolve concordant multiple results */ - var use_runlength: Option[String] = 
config("use_runlength") + var useRunlength: Option[String] = config("use_runlength") /** cases to use gmap for complex alignments containing multiple splices or indels */ - var gmap_mode: Option[String] = config("gmap_mode") + var gmapMode: Option[String] = config("gmap_mode") /** try gmap pairsearch on nearby genomic regions if best score (the total */ - var trigger_score_for_gmap: Option[Int] = config("trigger_score_for_gmap") + var triggerScoreForGmap: Option[Int] = config("trigger_score_for_gmap") /** keep gmap hit only if it has this many consecutive matches (default 20) */ - var gmap_min_match_length: Option[Int] = config("gmap_min_match_length") + var gmapMinMatchLength: Option[Int] = config("gmap_min_match_length") /** extra mismatch/indel score allowed for gmap alignments (default 3) */ - var gmap_allowance: Option[Int] = config("gmap_allowance") + var gmapAllowance: Option[Int] = config("gmap_allowance") /** perform gmap pairsearch on nearby genomic regions up to this many */ - var max_gmap_pairsearch: Option[Int] = config("max_gmap_pairsearch") + var maxGmapPairsearch: Option[Int] = config("max_gmap_pairsearch") /** perform gmap terminal on nearby genomic regions up to this many */ - var max_gmap_terminal: Option[Int] = config("max_gmap_terminal") + var maxGmapTerminal: Option[Int] = config("max_gmap_terminal") /** perform gmap improvement on nearby genomic regions up to this many */ - var max_gmap_improvement: Option[Int] = config("max_gmap_improvement") + var maxGmapImprovement: Option[Int] = config("max_gmap_improvement") /** allow microexons only if one of the splice site probabilities is */ - var microexon_spliceprob: Option[Float] = config("microexon_spliceprob") + var microExonSpliceprob: Option[Float] = config("microexon_spliceprob") /** look for novel splicing (0=no (default), 1=yes) */ - var novelsplicing: Option[Int] = config("novelsplicing") + var novelSplicing: Option[Int] = config("novelsplicing") /** directory for splicing involving known 
sites or known introns, */ - var splicingdir: Option[String] = config("splicingdir") + var splicingDir: Option[String] = config("splicingdir") /** look for splicing involving known sites or known introns */ - var use_splicing: Option[String] = config("use_splicing") + var useSplicing: Option[String] = config("use_splicing") /** for ambiguous known splicing at ends of the read, do not clip at the */ - var ambig_splice_noclip: Boolean = config("ambig_splice_noclip", default = false) + var ambigSpliceNoclip: Boolean = config("ambig_splice_noclip", default = false) /** definition of local novel splicing event (default 200000) */ - var localsplicedist: Option[Int] = config("localsplicedist") + var localSpliceDist: Option[Int] = config("localsplicedist") /** distance to look for novel splices at the ends of reads (default 50000) */ - var novelend_splicedist: Option[Int] = config("novelend_splicedist") + var novelEndSplicedist: Option[Int] = config("novelend_splicedist") /** penalty for a local splice (default 0). counts against mismatches allowed */ - var local_splice_penalty: Option[Int] = config("local_splice_penalty") + var localSplicePenalty: Option[Int] = config("local_splice_penalty") /** penalty for a distant splice (default 1). 
a distant splice is one where */ - var distant_splice_penalty: Option[Int] = config("distant_splice_penalty") + var distantSplicePenalty: Option[Int] = config("distant_splice_penalty") /** minimum length at end required for distant spliced alignments (default 20, min */ - var distant_splice_endlength: Option[Int] = config("distant_splice_endlength") + var distantSpliceEndlength: Option[Int] = config("distant_splice_endlength") /** minimum length at end required for short-end spliced alignments (default 2, */ - var shortend_splice_endlength: Option[Int] = config("shortend_splice_endlength") + var shortendSpliceEndlength: Option[Int] = config("shortend_splice_endlength") /** minimum identity at end required for distant spliced alignments (default 0.95) */ - var distant_splice_identity: Option[Float] = config("distant_splice_identity") + var distantSpliceIdentity: Option[Float] = config("distant_splice_identity") /** (not currently implemented) */ - var antistranded_penalty: Option[Int] = config("antistranded_penalty") + var antiStrandedPenalty: Option[Int] = config("antistranded_penalty") /** report distant splices on the same chromosome as a single splice, if possible */ - var merge_distant_samechr: Boolean = config("merge_distant_samechr", default = false) + var mergeDistantSamechr: Boolean = config("merge_distant_samechr", default = false) /** max total genomic length for dna-seq paired reads, or other reads */ - var pairmax_dna: Option[Int] = config("pairmax_dna") + var pairmaxDna: Option[Int] = config("pairmax_dna") /** max total genomic length for rna-seq paired reads, or other reads */ - var pairmax_rna: Option[Int] = config("pairmax_rna") + var pairmaxRna: Option[Int] = config("pairmax_rna") /** expected paired-end length, used for calling splices in medial part of */ - var pairexpect: Option[Int] = config("pairexpect") + var pairExpect: Option[Int] = config("pairexpect") /** allowable deviation from expected paired-end length, used for */ - var pairdev: 
-  var pairdev: Option[Int] = config("pairdev")
+  var pairDev: Option[Int] = config("pairdev")

   /** protocol for input quality scores. allowed values: */
-  var quality_protocol: Option[String] = config("quality_protocol")
+  var qualityProtocol: Option[String] = config("quality_protocol")

   /** fastq quality scores are zero at this ascii value */
-  var quality_zero_score: Option[Int] = config("quality_zero_score")
+  var qualityZeroScore: Option[Int] = config("quality_zero_score")

   /** shift fastq quality scores by this amount in output */
-  var quality_print_shift: Option[Int] = config("quality_print_shift")
+  var qualityPrintShift: Option[Int] = config("quality_print_shift")

   /** maximum number of paths to print (default 100) */
   var npaths: Option[Int] = config("npaths")

   /** if more than maximum number of paths are found, */
-  var quiet_if_excessive: Boolean = config("quiet_if_excessive", default = false)
+  var quietIfExcessive: Boolean = config("quiet_if_excessive", default = false)

   /** print output in same order as input (relevant */
   var ordered: Boolean = config("ordered", default = false)

   /** for gsnap output in snp-tolerant alignment, shows all differences */
-  var show_refdiff: Boolean = config("show_refdiff", default = false)
+  var showRefdiff: Boolean = config("show_refdiff", default = false)

   /** for paired-end reads whose alignments overlap, clip the overlapping region */
-  var clip_overlap: Boolean = config("clip_overlap", default = false)
+  var clipOverlap: Boolean = config("clip_overlap", default = false)

   /** print detailed information about snps in reads (works only if -v also selected) */
-  var print_snps: Boolean = config("print_snps", default = false)
+  var printSnps: Boolean = config("print_snps", default = false)

   /** print only failed alignments, those with no results */
-  var failsonly: Boolean = config("failsonly", default = false)
+  var failsOnly: Boolean = config("failsonly", default = false)

   /** exclude printing of failed alignments */
-  var nofails: Boolean = config("nofails", default = false)
+  var noFails: Boolean = config("nofails", default = false)

   /** print completely failed alignments as input fasta or fastq format */
-  var fails_as_input: Boolean = config("fails_as_input", default = false)
+  var failsAsInput: Boolean = config("fails_as_input", default = false)

   /** another format type, other than default */
   var format: Option[String] = config("format")

   /** basename for multiple-file output, separately for nomapping, */
-  var split_output: Option[String] = config("split_output")
+  var splitOutput: Option[String] = config("split_output")

   /** when --split-output is given, this flag will append output to the */
-  var append_output: Boolean = config("append_output", default = false)
+  var appendOutput: Boolean = config("append_output", default = false)

   /** buffer size, in queries, for output thread (default 1000). when the number */
-  var output_buffer_size: Option[Int] = config("output_buffer_size")
+  var outputBufferSize: Option[Int] = config("output_buffer_size")

   /** do not print headers beginning with '@' */
-  var no_sam_headers: Boolean = config("no_sam_headers", default = false)
+  var noSamHeaders: Boolean = config("no_sam_headers", default = false)

   /** print headers only for this batch, as specified by -q */
-  var sam_headers_batch: Option[Int] = config("sam_headers_batch")
+  var samHeadersBatch: Option[Int] = config("sam_headers_batch")

   /** insert 0m in cigar between adjacent insertions and deletions */
-  var sam_use_0M: Boolean = config("sam_use_0M", default = false)
+  var samUse0M: Boolean = config("sam_use_0M", default = false)

   /** allows multiple alignments to be marked as primary if they */
-  var sam_multiple_primaries: Boolean = config("sam_multiple_primaries", default = false)
+  var samMultiplePrimaries: Boolean = config("sam_multiple_primaries", default = false)

   /** for rna-seq alignments, disallows xs:a:? when the sense direction */
-  var force_xs_dir: Boolean = config("force_xs_dir", default = false)
+  var forceXsDir: Boolean = config("force_xs_dir", default = false)

   /** in md string, when known snps are given by the -v flag, */
-  var md_lowercase_snp: Boolean = config("md_lowercase_snp", default = false)
+  var mdLowercaseSnp: Boolean = config("md_lowercase_snp", default = false)

   /** value to put into read-group id (rg-id) field */
-  var read_group_id: Option[String] = config("read_group_id")
+  var readGroupId: Option[String] = config("read_group_id")

   /** value to put into read-group name (rg-sm) field */
-  var read_group_name: Option[String] = config("read_group_name")
+  var readGroupName: Option[String] = config("read_group_name")

   /** value to put into read-group library (rg-lb) field */
-  var read_group_library: Option[String] = config("read_group_library")
+  var readGroupLibrary: Option[String] = config("read_group_library")

   /** value to put into read-group library (rg-pl) field */
-  var read_group_platform: Option[String] = config("read_group_platform")
+  var readGroupPlatform: Option[String] = config("read_group_platform")

   def versionRegex = """.* version (.*)""".r
   def versionCommand = executable + " --version"
@@ -343,99 +343,99 @@ class Gsnap(val root: Configurable) extends BiopetCommandLineFunction with Refer
     required(executable) +
       optional("--dir", dir) +
       optional("--db", db) +
-      optional("--use-sarray", use_sarray) +
+      optional("--use-sarray", useSarray) +
       optional("--kmer", kmer) +
       optional("--sampling", sampling) +
       optional("--part", part) +
-      optional("--input-buffer-size", input_buffer_size) +
-      optional("--barcode-length", barcode_length) +
+      optional("--input-buffer-size", inputBufferSize) +
+      optional("--barcode-length", barcodeLength) +
       optional("--orientation", orientation) +
-      optional("--fastq-id-start", fastq_id_start) +
-      optional("--fastq-id-end", fastq_id_end) +
-      conditional(force_single_end, "--force-single-end") +
-      optional("--filter-chastity", filter_chastity) +
-      conditional(allow_pe_name_mismatch, "--allow-pe-name-mismatch") +
+      optional("--fastq-id-start", fastqIdStart) +
+      optional("--fastq-id-end", fastqIdEnd) +
+      conditional(forceSingleEnd, "--force-single-end") +
+      optional("--filter-chastity", filterChastity) +
+      conditional(allowPeNameMismatch, "--allow-pe-name-mismatch") +
       conditional(gunzip, "--gunzip") +
       conditional(bunzip2, "--bunzip2") +
       optional("--batch", batch) +
-      optional("--expand-offsets", expand_offsets) +
-      optional("--max-mismatches", max_mismatches) +
-      optional("--query-unk-mismatch", query_unk_mismatch) +
-      optional("--genome-unk-mismatch", genome_unk_mismatch) +
+      optional("--expand-offsets", expandOffsets) +
+      optional("--max-mismatches", maxMismatches) +
+      optional("--query-unk-mismatch", queryUnkMismatch) +
+      optional("--genome-unk-mismatch", genomeUnkMismatch) +
       optional("--maxsearch", maxsearch) +
-      optional("--terminal-threshold", terminal_threshold) +
-      optional("--terminal-output-minlength", terminal_output_minlength) +
-      optional("--indel-penalty", indel_penalty) +
-      optional("--indel-endlength", indel_endlength) +
-      optional("--max-middle-insertions", max_middle_insertions) +
-      optional("--max-middle-deletions", max_middle_deletions) +
-      optional("--max-end-insertions", max_end_insertions) +
-      optional("--max-end-deletions", max_end_deletions) +
-      optional("--suboptimal-levels", suboptimal_levels) +
-      optional("--adapter-strip", adapter_strip) +
-      optional("--trim-mismatch-score", trim_mismatch_score) +
-      optional("--trim-indel-score", trim_indel_score) +
+      optional("--terminal-threshold", terminalThreshold) +
+      optional("--terminal-output-minlength", terminalOutputMinlength) +
+      optional("--indel-penalty", indelPenalty) +
+      optional("--indel-endlength", indelEndlength) +
+      optional("--max-middle-insertions", maxMiddleInsertions) +
+      optional("--max-middle-deletions", maxMiddleDeletions) +
+      optional("--max-end-insertions", maxEndInsertions) +
+      optional("--max-end-deletions", maxEndDeletions) +
+      optional("--suboptimal-levels", suboptimalLevels) +
+      optional("--adapter-strip", adapterStrip) +
+      optional("--trim-mismatch-score", trimMismatchScore) +
+      optional("--trim-indel-score", trimIndelScore) +
       optional("--snpsdir", snpsdir) +
-      optional("--use-snps", use_snps) +
+      optional("--use-snps", useSnps) +
       optional("--cmetdir", cmetdir) +
       optional("--atoidir", atoidir) +
       optional("--mode", mode) +
       optional("--tallydir", tallydir) +
-      optional("--use-tally", use_tally) +
-      optional("--runlengthdir", runlengthdir) +
-      optional("--use-runlength", use_runlength) +
+      optional("--use-tally", useTally) +
+      optional("--runlengthdir", runLengthDir) +
+      optional("--use-runlength", useRunlength) +
       optional("--nthreads", threads) +
-      optional("--gmap-mode", gmap_mode) +
-      optional("--trigger-score-for-gmap", trigger_score_for_gmap) +
-      optional("--gmap-min-match-length", gmap_min_match_length) +
-      optional("--gmap-allowance", gmap_allowance) +
-      optional("--max-gmap-pairsearch", max_gmap_pairsearch) +
-      optional("--max-gmap-terminal", max_gmap_terminal) +
-      optional("--max-gmap-improvement", max_gmap_improvement) +
-      optional("--microexon-spliceprob", microexon_spliceprob) +
-      optional("--novelsplicing", novelsplicing) +
-      optional("--splicingdir", splicingdir) +
-      optional("--use-splicing", use_splicing) +
-      conditional(ambig_splice_noclip, "--ambig-splice-noclip") +
-      optional("--localsplicedist", localsplicedist) +
-      optional("--novelend-splicedist", novelend_splicedist) +
-      optional("--local-splice-penalty", local_splice_penalty) +
-      optional("--distant-splice-penalty", distant_splice_penalty) +
-      optional("--distant-splice-endlength", distant_splice_endlength) +
-      optional("--shortend-splice-endlength", shortend_splice_endlength) +
-      optional("--distant-splice-identity", distant_splice_identity) +
-      optional("--antistranded-penalty", antistranded_penalty) +
-      conditional(merge_distant_samechr, "--merge-distant-samechr") +
-      optional("--pairmax-dna", pairmax_dna) +
-      optional("--pairmax-rna", pairmax_rna) +
-      optional("--pairexpect", pairexpect) +
-      optional("--pairdev", pairdev) +
-      optional("--quality-protocol", quality_protocol) +
-      optional("--quality-zero-score", quality_zero_score) +
-      optional("--quality-print-shift", quality_print_shift) +
+      optional("--gmap-mode", gmapMode) +
+      optional("--trigger-score-for-gmap", triggerScoreForGmap) +
+      optional("--gmap-min-match-length", gmapMinMatchLength) +
+      optional("--gmap-allowance", gmapAllowance) +
+      optional("--max-gmap-pairsearch", maxGmapPairsearch) +
+      optional("--max-gmap-terminal", maxGmapTerminal) +
+      optional("--max-gmap-improvement", maxGmapImprovement) +
+      optional("--microexon-spliceprob", microExonSpliceprob) +
+      optional("--novelsplicing", novelSplicing) +
+      optional("--splicingdir", splicingDir) +
+      optional("--use-splicing", useSplicing) +
+      conditional(ambigSpliceNoclip, "--ambig-splice-noclip") +
+      optional("--localsplicedist", localSpliceDist) +
+      optional("--novelend-splicedist", novelEndSplicedist) +
+      optional("--local-splice-penalty", localSplicePenalty) +
+      optional("--distant-splice-penalty", distantSplicePenalty) +
+      optional("--distant-splice-endlength", distantSpliceEndlength) +
+      optional("--shortend-splice-endlength", shortendSpliceEndlength) +
+      optional("--distant-splice-identity", distantSpliceIdentity) +
+      optional("--antistranded-penalty", antiStrandedPenalty) +
+      conditional(mergeDistantSamechr, "--merge-distant-samechr") +
+      optional("--pairmax-dna", pairmaxDna) +
+      optional("--pairmax-rna", pairmaxRna) +
+      optional("--pairexpect", pairExpect) +
+      optional("--pairdev", pairDev) +
+      optional("--quality-protocol", qualityProtocol) +
+      optional("--quality-zero-score", qualityZeroScore) +
+      optional("--quality-print-shift", qualityPrintShift) +
       optional("--npaths", npaths) +
-      conditional(quiet_if_excessive, "--quiet-if-excessive") +
+      conditional(quietIfExcessive, "--quiet-if-excessive") +
       conditional(ordered, "--ordered") +
-      conditional(show_refdiff, "--show-refdiff") +
-      conditional(clip_overlap, "--clip-overlap") +
-      conditional(print_snps, "--print-snps") +
-      conditional(failsonly, "--failsonly") +
-      conditional(nofails, "--nofails") +
-      conditional(fails_as_input, "--fails-as-input") +
+      conditional(showRefdiff, "--show-refdiff") +
+      conditional(clipOverlap, "--clip-overlap") +
+      conditional(printSnps, "--print-snps") +
+      conditional(failsOnly, "--failsonly") +
+      conditional(noFails, "--nofails") +
+      conditional(failsAsInput, "--fails-as-input") +
       optional("--format", format) +
-      optional("--split-output", split_output) +
-      conditional(append_output, "--append-output") +
-      optional("--output-buffer-size", output_buffer_size) +
-      conditional(no_sam_headers, "--no-sam-headers") +
-      optional("--sam-headers-batch", sam_headers_batch) +
-      conditional(sam_use_0M, "--sam-use-0M") +
-      conditional(sam_multiple_primaries, "--sam-multiple-primaries") +
-      conditional(force_xs_dir, "--force-xs-dir") +
-      conditional(md_lowercase_snp, "--md-lowercase-snp") +
-      optional("--read-group-id", read_group_id) +
-      optional("--read-group-name", read_group_name) +
-      optional("--read-group-library", read_group_library) +
-      optional("--read-group-platform", read_group_platform) +
+      optional("--split-output", splitOutput) +
+      conditional(appendOutput, "--append-output") +
+      optional("--output-buffer-size", outputBufferSize) +
+      conditional(noSamHeaders, "--no-sam-headers") +
+      optional("--sam-headers-batch", samHeadersBatch) +
+      conditional(samUse0M, "--sam-use-0M") +
+      conditional(samMultiplePrimaries, "--sam-multiple-primaries") +
+      conditional(forceXsDir, "--force-xs-dir") +
+      conditional(mdLowercaseSnp, "--md-lowercase-snp") +
+      optional("--read-group-id", readGroupId) +
+      optional("--read-group-name", readGroupName) +
+      optional("--read-group-library", readGroupLibrary) +
+      optional("--read-group-platform", readGroupPlatform) +
       repeat(input) + " > " + required(output)
   }
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/kraken/Kraken.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/kraken/Kraken.scala
index e4470ffec4f37c3b2c71a0b88f095cefb079741e..99f2621ee36e22e208feff99892ac6dfd2fb57f5 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/kraken/Kraken.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/kraken/Kraken.scala
@@ -29,10 +29,10 @@ class Kraken(val root: Configurable) extends BiopetCommandLineFunction with Vers
   var input: List[File] = _

   @Output(doc = "Unidentified reads", required = false)
-  var unclassified_out: Option[File] = None
+  var unclassifiedOut: Option[File] = None

   @Output(doc = "Identified reads", required = false)
-  var classified_out: Option[File] = None
+  var classifiedOut: Option[File] = None

   @Output(doc = "Output with hits per sequence")
   var output: File = _
@@ -53,7 +53,7 @@ class Kraken(val root: Configurable) extends BiopetCommandLineFunction with Vers

   def versionCommand = executable + " --version"

-  override def defaultCoreMemory = 15.0
+  override def defaultCoreMemory = 17.0

   override def defaultThreads = 4
@@ -69,8 +69,8 @@ class Kraken(val root: Configurable) extends BiopetCommandLineFunction with Vers
     optional("--threads", nCoresRequest) +
       conditional(quick, "--quick") +
       optional("--min_hits", minHits) +
-      optional("--unclassified-out ", unclassified_out) +
-      optional("--classified-out ", classified_out) +
+      optional("--unclassified-out ", unclassifiedOut) +
+      optional("--classified-out ", classifiedOut) +
       required("--output", output) +
       conditional(preLoad, "--preload") +
       conditional(paired, "--paired") +
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/kraken/KrakenReport.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/kraken/KrakenReport.scala
index aa6e825bbed68c2724e98849019095d5d9ac71ec..00b29970de84add7b4825841c756e25a1adf2cd8 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/kraken/KrakenReport.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/kraken/KrakenReport.scala
@@ -39,7 +39,7 @@ class KrakenReport(val root: Configurable) extends BiopetCommandLineFunction wit
   }

   var db: File = config("db")
-  var show_zeros: Boolean = config("show_zeros", default = false)
+  var showZeros: Boolean = config("show_zeros", default = false)

   @Input(doc = "Input raw kraken analysis")
   var input: File = _
@@ -49,7 +49,7 @@ class KrakenReport(val root: Configurable) extends BiopetCommandLineFunction wit
   def cmdLine: String =
     required(executable) +
       required("--db", db) +
-      conditional(show_zeros, "--show-zeros") +
+      conditional(showZeros, "--show-zeros") +
       required(input) + " > " + required(output)
 }
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/macs2/Macs2CallPeak.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/macs2/Macs2CallPeak.scala
index 10fb47713bff4ef480159bcfe4fe3eb787751d96..76a0c4ae56641f861f69fcd133b1e8c40f0f1906 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/macs2/Macs2CallPeak.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/macs2/Macs2CallPeak.scala
@@ -29,22 +29,22 @@ class Macs2CallPeak(val root: Configurable) extends Macs2 {
   var control: File = _

   @Output(doc = "Output file NARROWPEAKS")
-  private var output_narrow: File = _
+  private var outputNarrow: File = _

   @Output(doc = "Output file BROADPEAKS")
-  private var output_broad: File = _
+  private var outputBroad: File = _

   @Output(doc = "Output in Excel format")
-  private var output_xls: File = _
+  private var outputXls: File = _

   @Output(doc = "R script with Bimodal model")
-  private var output_r: File = _
+  private var outputR: File = _

   @Output(doc = "Output file Bedgraph")
-  private var output_bdg: File = _
+  private var outputBdg: File = _

   @Output(doc = "Output file gappedPeak")
-  private var output_gapped: File = _
+  private var outputGapped: File = _

   var fileformat: Option[String] = config("fileformat")
   var gsize: Option[Float] = config("gsize")
@@ -77,12 +77,12 @@ class Macs2CallPeak(val root: Configurable) extends Macs2 {
   override def beforeGraph(): Unit = {
     if (name.isEmpty) throw new IllegalArgumentException("Name is not defined")
     if (outputdir == null) throw new IllegalArgumentException("Outputdir is not defined")
-    output_narrow = new File(outputdir + name.get + ".narrowPeak")
-    output_broad = new File(outputdir + name.get + ".broadPeak")
-    output_xls = new File(outputdir + name.get + ".xls")
-    output_bdg = new File(outputdir + name.get + ".bdg")
-    output_r = new File(outputdir + name.get + ".r")
-    output_gapped = new File(outputdir + name.get + ".gappedPeak")
+    outputNarrow = new File(outputdir + name.get + ".narrowPeak")
+    outputBroad = new File(outputdir + name.get + ".broadPeak")
+    outputXls = new File(outputdir + name.get + ".xls")
+    outputBdg = new File(outputdir + name.get + ".bdg")
+    outputR = new File(outputdir + name.get + ".r")
+    outputGapped = new File(outputdir + name.get + ".gappedPeak")
   }

   /** Returns command to execute */
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/Manwe.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/Manwe.scala
index ac60ddb64d2dbf179d8a418a07263b6a083b5ca0..a905d4d57ebfd3a781af91e02ee9109968e20998 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/Manwe.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/Manwe.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.{ PrintWriter, File }
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweAnnotateBed.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweAnnotateBed.scala
index f3c3fbb1e22a019a18b7eb8caa6d262e2f3347f8..5b7099707682892e6b2fc4fe7b846082b23c32ff 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweAnnotateBed.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweAnnotateBed.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweAnnotateVcf.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweAnnotateVcf.scala
index 64a849a536d4695facbfd290379426fd3d646287..fce5457b7cc1b8df36fd99ace9d54785b33f9959 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweAnnotateVcf.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweAnnotateVcf.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesAnnotate.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesAnnotate.scala
index 0980b4c23d4450c95b17baa8c1d2f52bfb6a63d5..36a4b16b0ba792a766c6e976de12c2e42fcfdf7c 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesAnnotate.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesAnnotate.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import nl.lumc.sasc.biopet.utils.config.Configurable
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesDownload.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesDownload.scala
index d55dd6a4f8c8c181bd565c60d085b938a08fe4a0..4b287a56e9dde5c9a0da6e2643e3613136d2fa97 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesDownload.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesDownload.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import nl.lumc.sasc.biopet.utils.config.Configurable
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesList.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesList.scala
index dbcb7b46913e7bb66ba41d74bc141e3ed861a0b0..2573c44be7fc18e31e5e7e414f0d9e8015c34bbe 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesList.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesList.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesShow.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesShow.scala
index 662bb51b79e292a352d524ca2437dd953737d894..23ff3a38c68df35500f91f29d371c34c705c3c53 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesShow.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweDataSourcesShow.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesActivate.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesActivate.scala
index 3815d64a258dd52b354bb2be2be1610d6eb2d4cb..38ab41565538a1c126789e74ba4c5bb348c653f0 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesActivate.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesActivate.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesAdd.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesAdd.scala
index fc5cababd9c141208b2a8ec945895babaac0dc4a..40561115db9aebec5c19dc701adf39d0b75a8811 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesAdd.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesAdd.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesAnnotateVariations.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesAnnotateVariations.scala
index 04092cc14b1cd39fd8d8a006c77dfadfa9163627..d4b5e268bdea849682e97e05af44542cb18fa1a5 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesAnnotateVariations.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesAnnotateVariations.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImport.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImport.scala
index e683f2c2a7d8e31481becf2d9ed4b6567efc744f..c765b0bcedc2f09802dc9faa3bdb8e199b4b8ead 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImport.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImport.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImportBed.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImportBed.scala
index 9eb3f12d0974d95af940b0e34f3b930fd06ba0a1..d0cd1306e5789f19f89279e91a3e98a93105d01d 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImportBed.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImportBed.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImportVcf.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImportVcf.scala
index 61f6a5f223b4af57d6066ae8f80cb0164cb8cd53..17c33d5991d8ef31b3d5fd84d5bf6a58dc924bbf 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImportVcf.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesImportVcf.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesList.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesList.scala
index d79b85fe1048210830d532091e4d588dac9c96d4..a8f688c0ea5ca41a1703b1a19c4f670496156f02 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesList.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesList.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.manwe

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesShow.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesShow.scala
index 6a73f84a44c44505509fe993a6fbca92da393b66..798e8290fdd1d423528d29789edc5e14a2a3763c 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesShow.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/manwe/ManweSamplesShow.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */ package nl.lumc.sasc.biopet.extensions.manwe import java.io.File diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/pindel/PindelCaller.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/pindel/PindelCaller.scala index 5769a9763abe460440ad71c62c458ef451a4f986..a3b3fc84b412eab6e50f6dc2c166fb58c0ab5ec0 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/pindel/PindelCaller.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/pindel/PindelCaller.scala @@ -116,12 +116,12 @@ class PindelCaller(val root: Configurable) extends BiopetCommandLineFunction wit var ploidy: Option[File] = config("ploidy") var detectDD: Boolean = config("detect_DD", default = false) - var MAX_DD_BREAKPOINT_DISTANCE: Option[Int] = config("MAX_DD_BREAKPOINT_DISTANCE") - var MAX_DISTANCE_CLUSTER_READS: Option[Int] = config("MAX_DISTANCE_CLUSTER_READS") - var MIN_DD_CLUSTER_SIZE: Option[Int] = config("MIN_DD_CLUSTER_SIZE") - var MIN_DD_BREAKPOINT_SUPPORT: Option[Int] = config("MIN_DD_BREAKPOINT_SUPPORT") - var MIN_DD_MAP_DISTANCE: Option[Int] = config("MIN_DD_MAP_DISTANCE") - var DD_REPORT_DUPLICATION_READS: Option[Int] = config("DD_REPORT_DUPLICATION_READS") + var maxDdBreakpointDistance: Option[Int] = config("max_dd_breakpoint_distance") + var maxDistanceClusterReads: Option[Int] = config("max_distance_cluster_reads") + var minDdClusterSize: Option[Int] = config("min_dd_cluster_size") + var minDdBreakpointSupport: Option[Int] = config("min_dd_Breakpoint_support") + var minDdMapDistance: Option[Int] = config("min_dd_map_distance") + var ddReportDuplicationReads: Option[Int] = config("dd_report_duplication_reads") override def beforeGraph: Unit = { if (reference == null) reference = referenceFasta() @@ -201,12 +201,12 @@ class PindelCaller(val root: Configurable) extends BiopetCommandLineFunction wit optional("--name_of_logfile", nameOfLogfile) + optional("--Ploidy", 
ploidy) + conditional(detectDD, "detect_DD") + - optional("--MAX_DD_BREAKPOINT_DISTANCE", MAX_DD_BREAKPOINT_DISTANCE) + - optional("--MAX_DISTANCE_CLUSTER_READS", MAX_DISTANCE_CLUSTER_READS) + - optional("--MIN_DD_CLUSTER_SIZE", MIN_DD_CLUSTER_SIZE) + - optional("--MIN_DD_BREAKPOINT_SUPPORT", MIN_DD_BREAKPOINT_SUPPORT) + - optional("--MIN_DD_MAP_DISTANCE", MIN_DD_MAP_DISTANCE) + - optional("--DD_REPORT_DUPLICATION_READS", DD_REPORT_DUPLICATION_READS) + optional("--MAX_DD_BREAKPOINT_DISTANCE", maxDdBreakpointDistance) + + optional("--MAX_DISTANCE_CLUSTER_READS", maxDistanceClusterReads) + + optional("--MIN_DD_CLUSTER_SIZE", minDdClusterSize) + + optional("--MIN_DD_BREAKPOINT_SUPPORT", minDdBreakpointSupport) + + optional("--MIN_DD_MAP_DISTANCE", minDdMapDistance) + + optional("--DD_REPORT_DUPLICATION_READS", ddReportDuplicationReads) } object PindelCaller { diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/pindel/PindelVCF.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/pindel/PindelVCF.scala index ee64a7236628ecbc013d6adc38db890d94863216..386dc75ee4abf49f16c7d832269f0942922adabe 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/pindel/PindelVCF.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/pindel/PindelVCF.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. 
The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.extensions.pindel import java.io.File diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/AssignTaxonomy.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/AssignTaxonomy.scala index b41dbfd0dcc524f4029da66a4b1d8a6724514399..ee84bf041f4e0dcad6af923e09b071c32aa08358 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/AssignTaxonomy.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/AssignTaxonomy.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.extensions.qiime import java.io.File @@ -16,38 +31,38 @@ class AssignTaxonomy(val root: Configurable) extends BiopetCommandLineFunction w var inputFasta: File = _ @Input(required = false) - var read_1_seqs_fp: Option[File] = None + var read1SeqsFp: Option[File] = None @Input(required = false) - var read_2_seqs_fp: Option[File] = None + var read2SeqsFp: Option[File] = None @Input(required = false) - var id_to_taxonomy_fp: Option[File] = config("id_to_taxonomy_fp") + var idToTaxonomyFp: Option[File] = config("id_to_taxonomy_fp") @Input(required = false) - var reference_seqs_fp: Option[File] = config("reference_seqs_fp") + var referenceSeqsFp: Option[File] = config("reference_seqs_fp") @Input(required = false) - var training_data_properties_fp: Option[File] = config("training_data_properties_fp") + var trainingDataPropertiesFp: Option[File] = config("training_data_properties_fp") - var single_ok: Boolean = config("single_ok", default = false) - var no_single_ok_generic: Boolean = config("no_single_ok_generic", default = false) + var singleOk: Boolean = config("single_ok", default = false) + var noSingleOkGeneric: Boolean = config("no_single_ok_generic", default = false) - var amplicon_id_regex: Option[String] = config("amplicon_id_regex") - var header_id_regex: Option[String] = config("header_id_regex") - var assignment_method: Option[String] = config("assignment_method") - var sortmerna_db: Option[String] = config("sortmerna_db") - var sortmerna_e_value: Option[String] = config("sortmerna_e_value") - var sortmerna_coverage: Option[String] = config("sortmerna_coverage") - var sortmerna_best_N_alignments: Option[String] = config("sortmerna_best_N_alignments") - var sortmerna_threads: Option[String] = config("sortmerna_threads") - var blast_db: Option[String] = config("blast_db") + var ampliconIdRegex: Option[String] = config("amplicon_id_regex") + var headerIdRegex: Option[String] = config("header_id_regex") + var assignmentMethod: 
Option[String] = config("assignment_method") + var sortmernaDb: Option[String] = config("sortmerna_db") + var sortmernaEValue: Option[String] = config("sortmerna_e_value") + var sortmernaCoverage: Option[String] = config("sortmerna_coverage") + var sortmernaBestNAlignments: Option[String] = config("sortmerna_best_N_alignments") + var sortmernaThreads: Option[String] = config("sortmerna_threads") + var blastDb: Option[String] = config("blast_db") var confidence: Option[String] = config("confidence") - var min_consensus_fraction: Option[String] = config("min_consensus_fraction") + var minConsensusFraction: Option[String] = config("min_consensus_fraction") var similarity: Option[String] = config("similarity") - var uclust_max_accepts: Option[String] = config("uclust_max_accepts") - var rdp_max_memory: Option[String] = config("rdp_max_memory") - var blast_e_value: Option[String] = config("blast_e_value") + var uclustMaxAccepts: Option[String] = config("uclust_max_accepts") + var rdpMaxMemory: Option[String] = config("rdp_max_memory") + var blastEValue: Option[String] = config("blast_e_value") var outputDir: File = _ def versionCommand = executable + " --version" @@ -61,27 +76,27 @@ class AssignTaxonomy(val root: Configurable) extends BiopetCommandLineFunction w def cmdLine = executable + required("-i", inputFasta) + - optional("--read_1_seqs_fp", read_1_seqs_fp) + - optional("--read_2_seqs_fp", read_2_seqs_fp) + - optional("-t", id_to_taxonomy_fp) + - optional("-r", reference_seqs_fp) + - optional("-p", training_data_properties_fp) + - optional("--amplicon_id_regex", amplicon_id_regex) + - optional("--header_id_regex", header_id_regex) + - optional("--assignment_method", assignment_method) + - optional("--sortmerna_db", sortmerna_db) + - optional("--sortmerna_e_value", sortmerna_e_value) + - optional("--sortmerna_coverage", sortmerna_coverage) + - optional("--sortmerna_best_N_alignments", sortmerna_best_N_alignments) + - optional("--sortmerna_threads", 
sortmerna_threads) + - optional("--blast_db", blast_db) + + optional("--read_1_seqs_fp", read1SeqsFp) + + optional("--read_2_seqs_fp", read2SeqsFp) + + optional("-t", idToTaxonomyFp) + + optional("-r", referenceSeqsFp) + + optional("-p", trainingDataPropertiesFp) + + optional("--amplicon_id_regex", ampliconIdRegex) + + optional("--header_id_regex", headerIdRegex) + + optional("--assignment_method", assignmentMethod) + + optional("--sortmerna_db", sortmernaDb) + + optional("--sortmerna_e_value", sortmernaEValue) + + optional("--sortmerna_coverage", sortmernaCoverage) + + optional("--sortmerna_best_N_alignments", sortmernaBestNAlignments) + + optional("--sortmerna_threads", sortmernaThreads) + + optional("--blast_db", blastDb) + optional("--confidence", confidence) + - optional("--min_consensus_fraction", min_consensus_fraction) + + optional("--min_consensus_fraction", minConsensusFraction) + optional("--similarity", similarity) + - optional("--uclust_max_accepts", uclust_max_accepts) + - optional("--rdp_max_memory", rdp_max_memory) + - optional("--blast_e_value", blast_e_value) + + optional("--uclust_max_accepts", uclustMaxAccepts) + + optional("--rdp_max_memory", rdpMaxMemory) + + optional("--blast_e_value", blastEValue) + required("--output_dir", outputDir) + - conditional(single_ok, "--single_ok") + - conditional(no_single_ok_generic, "--no_single_ok_generic") + conditional(singleOk, "--single_ok") + + conditional(noSingleOkGeneric, "--no_single_ok_generic") } diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/MergeOtuMaps.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/MergeOtuMaps.scala index a538e97a1c281256b007ee37e115da48bfb0e393..db9ed0f7e3c4875afcf94babe19cd686783107de 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/MergeOtuMaps.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/MergeOtuMaps.scala @@ 
-1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.extensions.qiime import java.io.File @@ -21,7 +36,9 @@ class MergeOtuMaps(val root: Configurable) extends BiopetCommandLineFunction wit @Output(required = true) var outputFile: File = _ - var failures_fp: Option[File] = None + var failuresFp: Option[File] = None + + override def defaultCoreMemory = 4.0 override def beforeGraph(): Unit = { super.beforeGraph() @@ -35,5 +52,5 @@ class MergeOtuMaps(val root: Configurable) extends BiopetCommandLineFunction wit case _ => "" }) + required("-o", outputFile) + - optional("--failures_fp", failures_fp) + optional("--failures_fp", failuresFp) } \ No newline at end of file diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/MergeOtuTables.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/MergeOtuTables.scala index ae3d4edd29793327a5621339cf7463c288a8ae08..1f5d5d1e944546568ae2f89ca3dae71567e0871b 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/MergeOtuTables.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/MergeOtuTables.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building 
bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.extensions.qiime import java.io.File @@ -21,6 +36,8 @@ class MergeOtuTables(val root: Configurable) extends BiopetCommandLineFunction w @Output(required = true) var outputFile: File = _ + override def defaultCoreMemory = 4.0 + override def beforeGraph(): Unit = { super.beforeGraph() require(input.nonEmpty) diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickClosedReferenceOtus.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickClosedReferenceOtus.scala index 265a6d21f941bfae7bc3c6b7c742993b9638e6c3..07fef8aef1ee831aa258555b2358660a396dea69 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickClosedReferenceOtus.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickClosedReferenceOtus.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. 
+ * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.extensions.qiime import java.io.File @@ -17,24 +32,24 @@ class PickClosedReferenceOtus(val root: Configurable) extends BiopetCommandLineF var outputDir: File = null - override def defaultThreads = 2 - override def defaultCoreMemory = 10.0 + override def defaultThreads = 3 + override def defaultCoreMemory = 16.0 def versionCommand = executable + " --version" def versionRegex = """Version: (.*)""".r @Input(required = false) - var parameter_fp: Option[File] = config("parameter_fp") + var parameterFp: Option[File] = config("parameter_fp") @Input(required = false) - var reference_fp: Option[File] = config("reference_fp") + var referenceFp: Option[File] = config("reference_fp") @Input(required = false) - var taxonomy_fp: Option[File] = config("taxonomy_fp") + var taxonomyFp: Option[File] = config("taxonomy_fp") - var assign_taxonomy: Boolean = config("assign_taxonomy", default = false) + var assignTaxonomy: Boolean = config("assign_taxonomy", default = false) var force: Boolean = config("force", default = false) - var print_only: Boolean = config("print_only", default = false) - var suppress_taxonomy_assignment: Boolean = config("suppress_taxonomy_assignment", default = false) + var printOnly: Boolean = config("print_only", default = false) + var suppressTaxonomyAssignment: Boolean = config("suppress_taxonomy_assignment", default = false) def otuTable = new File(outputDir, "otu_table.biom") def otuMap = new File(outputDir, "uclust_ref_picked_otus" + File.separator + "seqs_otus.txt") @@ -49,13 +64,13 @@ class 
PickClosedReferenceOtus(val root: Configurable) extends BiopetCommandLineF def cmdLine = executable + required("-f") + required("-i", inputFasta) + required("-o", outputDir) + - optional("--reference_fp", reference_fp) + - optional("--parameter_fp", parameter_fp) + - optional("--taxonomy_fp", taxonomy_fp) + - conditional(assign_taxonomy, "--assign_taxonomy") + + optional("--reference_fp", referenceFp) + + optional("--parameter_fp", parameterFp) + + optional("--taxonomy_fp", taxonomyFp) + + conditional(assignTaxonomy, "--assign_taxonomy") + conditional(force, "--force") + - conditional(print_only, "--print_only") + - conditional(suppress_taxonomy_assignment, "--suppress_taxonomy_assignment") + + conditional(printOnly, "--print_only") + + conditional(suppressTaxonomyAssignment, "--suppress_taxonomy_assignment") + (if (threads > 1) required("-a") + required("-O", threads) else "") } diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickOtus.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickOtus.scala index f83c59aa9dc61ad74297e707f2a1a6452780b9a2..744c52e29c8f1f9a53c166648d09cad6ae25f3a4 100644 --- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickOtus.scala +++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickOtus.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. 
The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.extensions.qiime import java.io.File @@ -22,59 +37,59 @@ class PickOtus(val root: Configurable) extends BiopetCommandLineFunction with Ve def versionCommand = executable + " --version" def versionRegex = """Version: (.*)""".r - var otu_picking_method: Option[String] = config("otu_picking_method") - var clustering_algorithm: Option[String] = config("clustering_algorithm") - var max_cdhit_memory: Option[Int] = config("max_cdhit_memory") - var refseqs_fp: Option[String] = config("refseqs_fp") - var blast_db: Option[String] = config("blast_db") - var max_e_value_blast: Option[String] = config("max_e_value_blast") - var sortmerna_db: Option[String] = config("sortmerna_db") - var sortmerna_e_value: Option[Double] = config("sortmerna_e_value") - var sortmerna_coverage: Option[Double] = config("sortmerna_coverage") - var sortmerna_tabular: Boolean = config("sortmerna_tabular", default = false) - var sortmerna_best_N_alignments: Option[Int] = config("sortmerna_best_N_alignments") - var sortmerna_max_pos: Option[Int] = config("sortmerna_max_pos") - var min_aligned_percent: Option[Double] = config("min_aligned_percent") + var otuPickingMethod: Option[String] = config("otu_picking_method") + var clusteringAlgorithm: Option[String] = config("clustering_algorithm") + var maxCdhitMemory: Option[Int] = config("max_cdhit_memory") + var refseqsFp: Option[String] = config("refseqs_fp") + var blastDb: Option[String] = config("blast_db") + var maxEValueBlast: Option[String] = config("max_e_value_blast") + var sortmernaDb: Option[String] = config("sortmerna_db") + var sortmernaEValue: Option[Double] = config("sortmerna_e_value") + var sortmernaCoverage: Option[Double] = 
config("sortmerna_coverage") + var sortmernaTabular: Boolean = config("sortmerna_tabular", default = false) + var sortmernaBestNAlignments: Option[Int] = config("sortmerna_best_N_alignments") + var sortmernaMaxPos: Option[Int] = config("sortmerna_max_pos") + var minAlignedPercent: Option[Double] = config("min_aligned_percent") var similarity: Option[Double] = config("similarity") - var sumaclust_exact: Option[String] = config("sumaclust_exact") - var sumaclust_l: Boolean = config("sumaclust_l", default = false) - var denovo_otu_id_prefix: Option[String] = config("denovo_otu_id_prefix") - var swarm_resolution: Option[String] = config("swarm_resolution") - var trie_reverse_seqs: Boolean = config("trie_reverse_seqs", default = false) - var prefix_prefilter_length: Option[String] = config("prefix_prefilter_length") - var trie_prefilter: Option[String] = config("trie_prefilter") - var prefix_length: Option[String] = config("prefix_length") - var suffix_length: Option[String] = config("suffix_length") - var enable_rev_strand_match: Boolean = config("enable_rev_strand_match", default = false) - var suppress_presort_by_abundance_uclust: Boolean = config("suppress_presort_by_abundance_uclust", default = false) - var optimal_uclust: Boolean = config("optimal_uclust", default = false) - var exact_uclust: Boolean = config("exact_uclust", default = false) - var user_sort: Boolean = config("user_sort", default = false) - var suppress_new_clusters: Boolean = config("suppress_new_clusters", default = false) - var max_accepts: Option[String] = config("max_accepts") - var max_rejects: Option[String] = config("max_rejects") + var sumaclustExact: Option[String] = config("sumaclust_exact") + var sumaclustL: Boolean = config("sumaclust_l", default = false) + var denovoOtuIdPrefix: Option[String] = config("denovo_otu_id_prefix") + var swarmResolution: Option[String] = config("swarm_resolution") + var trieReverseSeqs: Boolean = config("trie_reverse_seqs", default = false) + var 
prefixPrefilterLength: Option[String] = config("prefix_prefilter_length") + var triePrefilter: Option[String] = config("trie_prefilter") + var prefixLength: Option[String] = config("prefix_length") + var suffixLength: Option[String] = config("suffix_length") + var enableRevStrandMatch: Boolean = config("enable_rev_strand_match", default = false) + var suppressPresortByAbundanceUclust: Boolean = config("suppress_presort_by_abundance_uclust", default = false) + var optimalUclust: Boolean = config("optimal_uclust", default = false) + var exactUclust: Boolean = config("exact_uclust", default = false) + var userSort: Boolean = config("user_sort", default = false) + var suppressNewClusters: Boolean = config("suppress_new_clusters", default = false) + var maxAccepts: Option[String] = config("max_accepts") + var maxRejects: Option[String] = config("max_rejects") var stepwords: Option[String] = config("stepwords") - var word_length: Option[String] = config("word_length") - var suppress_uclust_stable_sort: Boolean = config("suppress_uclust_stable_sort", default = false) - var suppress_prefilter_exact_match: Boolean = config("suppress_prefilter_exact_match", default = false) - var save_uc_files: Boolean = config("save_uc_files", default = false) - var percent_id_err: Option[String] = config("percent_id_err") - var minsize: Option[String] = config("minsize") - var abundance_skew: Option[String] = config("abundance_skew") - var db_filepath: Option[String] = config("db_filepath") - var perc_id_blast: Option[String] = config("perc_id_blast") - var de_novo_chimera_detection: Boolean = config("de_novo_chimera_detection", default = false) - var suppress_de_novo_chimera_detection: Boolean = config("suppress_de_novo_chimera_detection", default = false) - var reference_chimera_detection: Option[String] = config("reference_chimera_detection") - var suppress_reference_chimera_detection: Option[String] = config("suppress_reference_chimera_detection") - var cluster_size_filtering: 
Option[String] = config("cluster_size_filtering") - var suppress_cluster_size_filtering: Option[String] = config("suppress_cluster_size_filtering") - var remove_usearch_logs: Boolean = config("remove_usearch_logs", default = false) - var derep_fullseq: Boolean = config("derep_fullseq", default = false) - var non_chimeras_retention: Option[String] = config("non_chimeras_retention") + var wordLength: Option[String] = config("word_length") + var suppressUclustStableSort: Boolean = config("suppress_uclust_stable_sort", default = false) + var suppressPrefilterExactMatch: Boolean = config("suppress_prefilter_exact_match", default = false) + var saveUcFiles: Boolean = config("save_uc_files", default = false) + var percentIdErr: Option[String] = config("percent_id_err") + var minSize: Option[String] = config("minsize") + var abundanceSkew: Option[String] = config("abundance_skew") + var dbFilepath: Option[String] = config("db_filepath") + var percIdBlast: Option[String] = config("perc_id_blast") + var deNovoChimeraDetection: Boolean = config("de_novo_chimera_detection", default = false) + var suppressDeNovoChimeraDetection: Boolean = config("suppress_de_novo_chimera_detection", default = false) + var referenceChimeraDetection: Option[String] = config("reference_chimera_detection") + var suppressReferenceChimeraDetection: Option[String] = config("suppress_reference_chimera_detection") + var clusterSizeFiltering: Option[String] = config("cluster_size_filtering") + var suppressClusterSizeFiltering: Option[String] = config("suppress_cluster_size_filtering") + var removeUsearchLogs: Boolean = config("remove_usearch_logs", default = false) + var derepFullseq: Boolean = config("derep_fullseq", default = false) + var nonChimerasRetention: Option[String] = config("non_chimeras_retention") var minlen: Option[String] = config("minlen") - var usearch_fast_cluster: Boolean = config("usearch_fast_cluster", default = false) - var usearch61_sort_method: Option[String] = 
config("usearch61_sort_method") + var usearchFastCluster: Boolean = config("usearch_fast_cluster", default = false) + var usearch61SortMethod: Option[String] = config("usearch61_sort_method") var sizeorder: Boolean = config("sizeorder", default = false) private lazy val name = inputFasta.getName.stripSuffix(".fasta").stripSuffix(".fa").stripSuffix(".fna") @@ -93,59 +108,59 @@ class PickOtus(val root: Configurable) extends BiopetCommandLineFunction with Ve def cmdLine = executable + required("-i", inputFasta) + required("-o", outputDir) + - optional("-m", otu_picking_method) + - optional("-c", clustering_algorithm) + - optional("-M", max_cdhit_memory) + - optional("-r", refseqs_fp) + - optional("-b", blast_db) + - optional("-e", max_e_value_blast) + - optional("--sortmerna_db", sortmerna_db) + - optional("--sortmerna_e_value", sortmerna_e_value) + - optional("--sortmerna_coverage", sortmerna_coverage) + - conditional(sortmerna_tabular, "--sortmerna_tabular") + - optional("--sortmerna_best_N_alignments", sortmerna_best_N_alignments) + - optional("--sortmerna_max_pos", sortmerna_max_pos) + - optional("--min_aligned_percent", min_aligned_percent) + + optional("-m", otuPickingMethod) + + optional("-c", clusteringAlgorithm) + + optional("-M", maxCdhitMemory) + + optional("-r", refseqsFp) + + optional("-b", blastDb) + + optional("-e", maxEValueBlast) + + optional("--sortmerna_db", sortmernaDb) + + optional("--sortmerna_e_value", sortmernaEValue) + + optional("--sortmerna_coverage", sortmernaCoverage) + + conditional(sortmernaTabular, "--sortmerna_tabular") + + optional("--sortmerna_best_N_alignments", sortmernaBestNAlignments) + + optional("--sortmerna_max_pos", sortmernaMaxPos) + + optional("--min_aligned_percent", minAlignedPercent) + optional("--similarity", similarity) + - optional("--sumaclust_exact", sumaclust_exact) + - conditional(sumaclust_l, "--sumaclust_l") + - optional("--denovo_otu_id_prefix", denovo_otu_id_prefix) + - optional("--swarm_resolution", 
-    optional("--swarm_resolution", swarm_resolution) +
-    conditional(trie_reverse_seqs, "--trie_reverse_seqs") +
-    optional("--prefix_prefilter_length", prefix_prefilter_length) +
-    optional("--trie_prefilter", trie_prefilter) +
-    optional("--prefix_length", prefix_length) +
-    optional("--suffix_length", suffix_length) +
-    conditional(enable_rev_strand_match, "--enable_rev_strand_match") +
-    conditional(suppress_presort_by_abundance_uclust, "--suppress_presort_by_abundance_uclust") +
-    conditional(optimal_uclust, "--optimal_uclust") +
-    conditional(exact_uclust, "--exact_uclust") +
-    conditional(user_sort, "--user_sort") +
-    conditional(suppress_new_clusters, "--suppress_new_clusters") +
-    optional("--max_accepts", max_accepts) +
-    optional("--max_rejects", max_rejects) +
+    optional("--sumaclust_exact", sumaclustExact) +
+    conditional(sumaclustL, "--sumaclust_l") +
+    optional("--denovo_otu_id_prefix", denovoOtuIdPrefix) +
+    optional("--swarm_resolution", swarmResolution) +
+    conditional(trieReverseSeqs, "--trie_reverse_seqs") +
+    optional("--prefix_prefilter_length", prefixPrefilterLength) +
+    optional("--trie_prefilter", triePrefilter) +
+    optional("--prefix_length", prefixLength) +
+    optional("--suffix_length", suffixLength) +
+    conditional(enableRevStrandMatch, "--enable_rev_strand_match") +
+    conditional(suppressPresortByAbundanceUclust, "--suppress_presort_by_abundance_uclust") +
+    conditional(optimalUclust, "--optimal_uclust") +
+    conditional(exactUclust, "--exact_uclust") +
+    conditional(userSort, "--user_sort") +
+    conditional(suppressNewClusters, "--suppress_new_clusters") +
+    optional("--max_accepts", maxAccepts) +
+    optional("--max_rejects", maxRejects) +
     optional("--stepwords", stepwords) +
-    optional("--word_length", word_length) +
-    conditional(suppress_uclust_stable_sort, "--suppress_uclust_stable_sort") +
-    conditional(suppress_prefilter_exact_match, "--suppress_prefilter_exact_match") +
-    conditional(save_uc_files, "--save_uc_files") +
-    optional("--percent_id_err", percent_id_err) +
-    optional("--minsize", minsize) +
-    optional("--abundance_skew", abundance_skew) +
-    optional("--db_filepath", db_filepath) +
-    optional("--perc_id_blast", perc_id_blast) +
-    conditional(de_novo_chimera_detection, "--de_novo_chimera_detection") +
-    conditional(suppress_de_novo_chimera_detection, "--suppress_de_novo_chimera_detection") +
-    optional("--reference_chimera_detection", reference_chimera_detection) +
-    optional("--suppress_reference_chimera_detection", suppress_reference_chimera_detection) +
-    optional("--cluster_size_filtering", cluster_size_filtering) +
-    optional("--suppress_cluster_size_filtering", suppress_cluster_size_filtering) +
-    conditional(remove_usearch_logs, "--remove_usearch_logs") +
-    conditional(derep_fullseq, "--derep_fullseq") +
-    optional("--non_chimeras_retention", non_chimeras_retention) +
+    optional("--word_length", wordLength) +
+    conditional(suppressUclustStableSort, "--suppress_uclust_stable_sort") +
+    conditional(suppressPrefilterExactMatch, "--suppress_prefilter_exact_match") +
+    conditional(saveUcFiles, "--save_uc_files") +
+    optional("--percent_id_err", percentIdErr) +
+    optional("--minsize", minSize) +
+    optional("--abundance_skew", abundanceSkew) +
+    optional("--db_filepath", dbFilepath) +
+    optional("--perc_id_blast", percIdBlast) +
+    conditional(deNovoChimeraDetection, "--de_novo_chimera_detection") +
+    conditional(suppressDeNovoChimeraDetection, "--suppress_de_novo_chimera_detection") +
+    optional("--reference_chimera_detection", referenceChimeraDetection) +
+    optional("--suppress_reference_chimera_detection", suppressReferenceChimeraDetection) +
+    optional("--cluster_size_filtering", clusterSizeFiltering) +
+    optional("--suppress_cluster_size_filtering", suppressClusterSizeFiltering) +
+    conditional(removeUsearchLogs, "--remove_usearch_logs") +
+    conditional(derepFullseq, "--derep_fullseq") +
+    optional("--non_chimeras_retention", nonChimerasRetention) +
     optional("--minlen", minlen) +
-    conditional(usearch_fast_cluster, "--usearch_fast_cluster") +
-    optional("--usearch61_sort_method", usearch61_sort_method) +
+    conditional(usearchFastCluster, "--usearch_fast_cluster") +
+    optional("--usearch61_sort_method", usearch61SortMethod) +
     conditional(sizeorder, "--sizeorder") +
     optional("--threads", threads)
 }
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickRepSet.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickRepSet.scala
index 5496c673cf515df6d735c406ed46cb145d49f389..d7d3481622581d152ee6c8e9a1cc9cbd70372a57 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickRepSet.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/PickRepSet.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.qiime

 import java.io.File
@@ -22,7 +37,7 @@ class PickRepSet(val root: Configurable) extends BiopetCommandLineFunction with
   var logFile: Option[File] = None
   @Input(required = false)
-  var reference_seqs_fp: Option[File] = config("reference_seqs_fp")
+  var referenceSeqsFp: Option[File] = config("reference_seqs_fp")
   @Input(required = false)
   var fastaInput: Option[File] = None
@@ -32,14 +47,14 @@ class PickRepSet(val root: Configurable) extends BiopetCommandLineFunction with
   def versionCommand = executable + " --version"
   def versionRegex = """Version: (.*)""".r
-  var rep_set_picking_method: Option[String] = config("rep_set_picking_method")
+  var repSetPickingMethod: Option[String] = config("rep_set_picking_method")
   def cmdLine = executable +
     required("-i", inputFile) +
     required("-o", outputFasta) +
-    optional("-m", rep_set_picking_method) +
+    optional("-m", repSetPickingMethod) +
     optional("-f", fastaInput) +
     optional("-l", logFile) +
     optional("-s", sortBy) +
-    optional("-r", reference_seqs_fp)
+    optional("-r", referenceSeqsFp)
 }
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/SplitLibrariesFastq.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/SplitLibrariesFastq.scala
index 25db2dd9ae2347e4439ac482627d547f6a54e310..d9295dee037381a4ff0504f0a7bf465cebd0f6b7 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/SplitLibrariesFastq.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/qiime/SplitLibrariesFastq.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.qiime

 import java.io.File
@@ -22,25 +37,27 @@ class SplitLibrariesFastq(val root: Configurable) extends BiopetCommandLineFunct
   var v: Option[String] = config("v")
   var m: Option[String] = config("m")
   var b: Option[String] = config("b")
-  var store_qual_scores: Boolean = config("store_qual_scores", default = false)
-  var sample_ids: List[String] = Nil
-  var store_demultiplexed_fastq: Boolean = config("store_demultiplexed_fastq", default = false)
-  var retain_unassigned_reads: Boolean = config("retain_unassigned_reads", default = false)
+  var storeQualScores: Boolean = config("store_qual_scores", default = false)
+  var sampleIds: List[String] = Nil
+  var storeDemultiplexedFastq: Boolean = config("store_demultiplexed_fastq", default = false)
+  var retainUnassignedReads: Boolean = config("retain_unassigned_reads", default = false)
   var r: Option[Int] = config("r")
   var p: Option[Double] = config("p")
   var n: Option[Int] = config("n")
   var s: Option[Int] = config("s")
-  var rev_comp_barcode: Boolean = config("rev_comp_barcode", default = false)
-  var rev_comp_mapping_barcodes: Boolean = config("rev_comp_mapping_barcodes", default = false)
-  var rev_comp: Boolean = config("rev_comp", default = false)
+  var revCompBarcode: Boolean = config("rev_comp_barcode", default = false)
+  var revCompMappingBarcodes: Boolean = config("rev_comp_mapping_barcodes", default = false)
+  var revComp: Boolean = config("rev_comp", default = false)
   var q: Option[Int] = config("q")
-  var last_bad_quality_char: Option[String] = config("last_bad_quality_char")
-  var barcode_type: Option[String] = config("barcode_type")
-  var max_barcode_errors: Option[Double] = config("max_barcode_errors")
-  var phred_offset: Option[String] = config("phred_offset")
+  var lastBadQualityChar: Option[String] = config("last_bad_quality_char")
+  var barcodeType: Option[String] = config("barcode_type")
+  var maxBarcodeErrors: Option[Double] = config("max_barcode_errors")
+  var phredOffset: Option[String] = config("phred_offset")
   def outputSeqs = new File(outputDir, "seqs.fna")
+  override def defaultCoreMemory = 4.0
+
   override def beforeGraph(): Unit = {
     super.beforeGraph()
     require(input.nonEmpty)
@@ -52,25 +69,25 @@ class SplitLibrariesFastq(val root: Configurable) extends BiopetCommandLineFunct
     optional("-v", v) +
     optional("-m", m) +
     optional("-b", b) +
-    conditional(store_qual_scores, "--store_qual_scores") +
-    (sample_ids match {
+    conditional(storeQualScores, "--store_qual_scores") +
+    (sampleIds match {
       case l: List[_] if l.nonEmpty => optional("--sample_ids", l.mkString(","))
       case _ => ""
     }) +
-    conditional(store_demultiplexed_fastq, "--store_demultiplexed_fastq") +
-    conditional(retain_unassigned_reads, "--retain_unassigned_reads") +
+    conditional(storeDemultiplexedFastq, "--store_demultiplexed_fastq") +
+    conditional(retainUnassignedReads, "--retain_unassigned_reads") +
     optional("-r", r) +
     optional("-p", p) +
     optional("-n", n) +
     optional("-s", s) +
-    conditional(rev_comp_barcode, "--rev_comp_barcode") +
-    conditional(rev_comp_mapping_barcodes, "--rev_comp_mapping_barcodes") +
-    conditional(rev_comp, "--rev_comp") +
+    conditional(revCompBarcode, "--rev_comp_barcode") +
+    conditional(revCompMappingBarcodes, "--rev_comp_mapping_barcodes") +
+    conditional(revComp, "--rev_comp") +
     optional("-q", q) +
-    optional("--last_bad_quality_char", last_bad_quality_char) +
-    optional("--barcode_type", barcode_type) +
-    optional("--max_barcode_errors", max_barcode_errors) +
-    optional("--phred_offset", phred_offset) +
+    optional("--last_bad_quality_char", lastBadQualityChar) +
+    optional("--barcode_type", barcodeType) +
+    optional("--max_barcode_errors", maxBarcodeErrors) +
+    optional("--phred_offset", phredOffset) +
     (input match {
       case l: List[_] if l.nonEmpty => required("-i", l.mkString(","))
       case _ => ""
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaMarkdup.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaMarkdup.scala
index 2f89774db35e5f9d6f22518aa2ff3588f70d053f..80e999db78762dc44731ca0fb990c6598b074c62 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaMarkdup.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaMarkdup.scala
@@ -30,23 +30,23 @@ class SambambaMarkdup(val root: Configurable) extends Sambamba {
   @Output(doc = "Markdup output bam")
   var output: File = _
-  var remove_duplicates: Boolean = config("remove_duplicates", default = false)
+  var removeDuplicates: Boolean = config("remove_duplicates", default = false)
   // @doc: compression_level 6 is average, 0 = no compression, 9 = best
-  val compression_level: Option[Int] = config("compression_level", default = 6)
-  val hash_table_size: Option[Int] = config("hash-table-size", default = 262144)
-  val overflow_list_size: Option[Int] = config("overflow-list-size", default = 200000)
-  val io_buffer_size: Option[Int] = config("io-buffer-size", default = 128)
+  val compressionLevel: Option[Int] = config("compression_level", default = 6)
+  val hashTableSize: Option[Int] = config("hash-table-size", default = 262144)
+  val overflowListSize: Option[Int] = config("overflow-list-size", default = 200000)
+  val ioBufferSize: Option[Int] = config("io-buffer-size", default = 128)
   /** Returns command to execute */
   def cmdLine = required(executable) +
     required("markdup") +
-    conditional(remove_duplicates, "--remove-duplicates") +
+    conditional(removeDuplicates, "--remove-duplicates") +
     optional("-t", nCoresRequest) +
-    optional("-l", compression_level) +
-    optional("--hash-table-size=", hash_table_size, spaceSeparated = false) +
-    optional("--overflow-list-size=", overflow_list_size, spaceSeparated = false) +
-    optional("--io-buffer-size=", io_buffer_size, spaceSeparated = false) +
+    optional("-l", compressionLevel) +
+    optional("--hash-table-size=", hashTableSize, spaceSeparated = false) +
+    optional("--overflow-list-size=", overflowListSize, spaceSeparated = false) +
+    optional("--io-buffer-size=", ioBufferSize, spaceSeparated = false) +
     required(input) +
     required(output)
 }
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaMerge.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaMerge.scala
index 83464fa4972e6f1aa3b9f74733ff8589985e91d8..4cfb1bc9925ec38c91a9eba8248112fbff145120 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaMerge.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaMerge.scala
@@ -31,13 +31,13 @@ class SambambaMerge(val root: Configurable) extends Sambamba {
   var output: File = _
   // @doc: compression_level 6 is average, 0 = no compression, 9 = best
-  val compression_level: Option[Int] = config("compression_level", default = 6)
+  val compressionLevel: Option[Int] = config("compression_level", default = 6)
   /** Returns command to execute */
   def cmdLine = required(executable) +
     required("merge") +
     optional("-t", nCoresRequest) +
-    optional("-l", compression_level) +
+    optional("-l", compressionLevel) +
     required(output) +
     repeat("", input)
 }
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaView.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaView.scala
index ca2470c6094192afef8da6de7b76b6ca1809c1e3..d5f9245ee3a8863c2aa6ac9469da7bab1691ecbe 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaView.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/sambamba/SambambaView.scala
@@ -33,7 +33,7 @@ class SambambaView(val root: Configurable) extends Sambamba {
   var filter: Option[String] = _
   val format: Option[String] = config("format", default = "bam")
   val regions: Option[File] = config("regions")
-  val compression_level: Option[Int] = config("compression_level", default = 6)
+  val compressionLevel: Option[Int] = config("compression_level", default = 6)
   /** Returns command to execute */
   def cmdLine = required(executable) +
@@ -42,7 +42,7 @@ class SambambaView(val root: Configurable) extends Sambamba {
     optional("--nthreads", nCoresRequest) +
     optional("--format", format.get) +
     optional("--regions", regions) +
-    optional("--compression-level", compression_level) +
+    optional("--compression-level", compressionLevel) +
     required("--output-filename", output) +
     required(input)
 }
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/samtools/FixMpileup.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/samtools/FixMpileup.scala
index 6023f36e707103aa7e2a7fe196c75a83ec7da777..ce6cb19dbbbe4999de0c7dd1bcba6a30544f6eb4 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/samtools/FixMpileup.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/samtools/FixMpileup.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.samtools

 import nl.lumc.sasc.biopet.core.extensions.PythonCommandLineFunction
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/varscan/FixMpileup.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/varscan/FixMpileup.scala
index 827da73e3779003746a8ec16044c7726faa03337..a49ac566cdc051885fe006aef3bee1f040e073da 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/varscan/FixMpileup.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/varscan/FixMpileup.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.varscan

 import nl.lumc.sasc.biopet.core.extensions.PythonCommandLineFunction
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/vt/VtDecompose.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/vt/VtDecompose.scala
index c5acb62ff24dbf189318122453a8f8db92f24c25..3afd442e7726671ea5e47ccc372252e0bd224b12 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/vt/VtDecompose.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/vt/VtDecompose.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.vt

 import java.io.File
diff --git a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/vt/VtNormalize.scala b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/vt/VtNormalize.scala
index 26812e2636a4bd794dc8c74e47c73fb137a3740d..a555f77174e913f9159f2336b1f260b8c29565cf 100644
--- a/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/vt/VtNormalize.scala
+++ b/public/biopet-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/vt/VtNormalize.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.vt

 import java.io.File
diff --git a/public/biopet-extensions/src/test/resources/vep.metrics b/public/biopet-extensions/src/test/resources/vep.metrics
new file mode 100644
index 0000000000000000000000000000000000000000..c036c2b7dae5017c995132e3e2dd35d7d9cbcc13
--- /dev/null
+++ b/public/biopet-extensions/src/test/resources/vep.metrics
@@ -0,0 +1,349 @@
+[VEP run statistics]
+VEP version (API) 77 (77)
+Cache/Database /usr/local/Genomes/H.Sapiens/hg19/vep_cache_75/homo_sapiens/75
+Species homo_sapiens
+Command line options -i test.vcf -o test.vep.vcf -v --everything --stats_text --cache --vcf --allow_non_variant --species homo_sapiens --dir /usr/local/Genomes/H.Sapiens/hg19/vep_cache_75 --fasta /usr/local/Genomes/H.Sapiens/hg19/vep_cache_75/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa --fork 2 --cache_version 75 --db_version 75 --failed 1 --offline
+Start time 2016-03-03 14:37:44
+End time 2016-03-03 14:37:54
+Run time 10 seconds
+Input file (format) test.vcf (VCF)
+Output file test.vep.vcf [text]
+
+[General statistics]
+Lines of input read 1000
+Variants processed 874
+Variants remaining after filtering 874
+Lines of output written 874
+Novel / existing variants 105 (12.0%) / 769 (88.0%)
+Overlapped genes 132
+Overlapped transcripts 661
+Overlapped regulatory features 164
+
+[Variant classes]
+insertion 29
+deletion 51
+SNV 794
+
+[Consequences (most severe)]
+splice_donor_variant 3
+frameshift_variant 1
+inframe_insertion 2
+inframe_deletion 1
+missense_variant 39
+splice_region_variant 11
+synonymous_variant 46
+5_prime_UTR_variant 19
+3_prime_UTR_variant 36
+non_coding_transcript_exon_variant 95
+intron_variant 518
+upstream_gene_variant 48
+downstream_gene_variant 27
+intergenic_variant 28
+
+[Consequences (all)]
+splice_donor_variant 6
+frameshift_variant 2
+inframe_insertion 15
+inframe_deletion 6
+missense_variant 101
+splice_region_variant 60
+synonymous_variant 173
+coding_sequence_variant 1
+5_prime_UTR_variant 33
+3_prime_UTR_variant 108
+non_coding_transcript_exon_variant 279
+intron_variant 3311
+NMD_transcript_variant 437
+non_coding_transcript_variant 1102
+upstream_gene_variant 1486
+downstream_gene_variant 1779
+TF_binding_site_variant 12
+feature_elongation 97
+regulatory_region_variant 253
+feature_truncation 278
+intergenic_variant 28
+
+[Coding consequences]
+frameshift_variant 2
+inframe_insertion 15
+inframe_deletion 6
+missense_variant 101
+synonymous_variant 173
+coding_sequence_variant 1
+
+[SIFT summary]
+deleterious 22
+tolerated 76
+
+[PolyPhen summary]
+possibly damaging 7
+unknown 12
+probably damaging 12
+benign 70
+
+[Variants by chromosome]
+1 874
+
+[Distribution of variants on chromosome 1]
+0 225
+1 407
+2 242
+3 0
+4 0
+5 0
+6 0
+7 0
+8 0
+9 0
+10 0
+11 0
+12 0
+13 0
+14 0
+15 0
+16 0
+17 0
+18 0
+19 0
+20 0
+21 0
+22 0
+23 0
+24 0
+25 0
+26 0
+27 0
+28 0
+29 0
+30 0
+31 0
+32 0
+33 0
+34 0
+35 0
+36 0
+37 0
+38 0
+39 0
+40 0
+41 0
+42 0
+43 0
+44 0
+45 0
+46 0
+47 0
+48 0
+49 0
+50 0
+51 0
+52 0
+53 0
+54 0
+55 0
+56 0
+57 0
+58 0
+59 0
+60 0
+61 0
+62 0
+63 0
+64 0
+65 0
+66 0
+67 0
+68 0
+69 0
+70 0
+71 0
+72 0
+73 0
+74 0
+75 0
+76 0
+77 0
+78 0
+79 0
+80 0
+81 0
+82 0
+83 0
+84 0
+85 0
+86 0
+87 0
+88 0
+89 0
+90 0
+91 0
+92 0
+93 0
+94 0
+95 0
+96 0
+97 0
+98 0
+99 0
+100 0
+101 0
+102 0
+103 0
+104 0
+105 0
+106 0
+107 0
+108 0
+109 0
+110 0
+111 0
+112 0
+113 0
+114 0
+115 0
+116 0
+117 0
+118 0
+119 0
+120 0
+121 0
+122 0
+123 0
+124 0
+125 0
+126 0
+127 0
+128 0
+129 0
+130 0
+131 0
+132 0
+133 0
+134 0
+135 0
+136 0
+137 0
+138 0
+139 0
+140 0
+141 0
+142 0
+143 0
+144 0
+145 0
+146 0
+147 0
+148 0
+149 0
+150 0
+151 0
+152 0
+153 0
+154 0
+155 0
+156 0
+157 0
+158 0
+159 0
+160 0
+161 0
+162 0
+163 0
+164 0
+165 0
+166 0
+167 0
+168 0
+169 0
+170 0
+171 0
+172 0
+173 0
+174 0
+175 0
+176 0
+177 0
+178 0
+179 0
+180 0
+181 0
+182 0
+183 0
+184 0
+185 0
+186 0
+187 0
+188 0
+189 0
+190 0
+191 0
+192 0
+193 0
+194 0
+195 0
+196 0
+197 0
+198 0
+199 0
+200 0
+201 0
+202 0
+203 0
+204 0
+205 0
+206 0
+207 0
+208 0
+209 0
+210 0
+211 0
+212 0
+213 0
+214 0
+215 0
+216 0
+217 0
+218 0
+219 0
+220 0
+221 0
+222 0
+223 0
+224 0
+225 0
+226 0
+227 0
+228 0
+229 0
+230 0
+231 0
+232 0
+233 0
+234 0
+235 0
+236 0
+237 0
+238 0
+239 0
+240 0
+241 0
+242 0
+243 0
+244 0
+245 0
+246 0
+247 0
+248 0
+249 0
+
+[Position in protein]
+00-10% 29
+10-20% 63
+20-30% 34
+30-40% 20
+40-50% 23
+50-60% 25
+60-70% 31
+70-80% 11
+80-90% 32
+90-100% 30
diff --git a/public/biopet-extensions/src/test/scala/nl/lumc/sasc/biopet/extensions/BcfToolsTest.scala b/public/biopet-extensions/src/test/scala/nl/lumc/sasc/biopet/extensions/BcfToolsTest.scala
index ff7ce95d59adcce87590605b5c5f8239f83af50c..444fa970cfa0440101ab65d8ae863cc8f315da5d 100644
--- a/public/biopet-extensions/src/test/scala/nl/lumc/sasc/biopet/extensions/BcfToolsTest.scala
+++ b/public/biopet-extensions/src/test/scala/nl/lumc/sasc/biopet/extensions/BcfToolsTest.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions

 import java.io.File
diff --git a/public/biopet-extensions/src/test/scala/nl/lumc/sasc/biopet/extensions/ManweTest.scala b/public/biopet-extensions/src/test/scala/nl/lumc/sasc/biopet/extensions/ManweTest.scala
index 6d1cf3d60c7f2b0d995b9b82b59a1391d37162db..b4ac69c9fda05c31976b8dea7d04cf6a9963a4c6 100644
--- a/public/biopet-extensions/src/test/scala/nl/lumc/sasc/biopet/extensions/ManweTest.scala
+++ b/public/biopet-extensions/src/test/scala/nl/lumc/sasc/biopet/extensions/ManweTest.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions

 import java.io.File
diff --git a/public/biopet-extensions/src/test/scala/nl/lumc/sasc/biopet/extensions/VariantEffectPredictorTest.scala b/public/biopet-extensions/src/test/scala/nl/lumc/sasc/biopet/extensions/VariantEffectPredictorTest.scala
new file mode 100644
index 0000000000000000000000000000000000000000..eacfcd9a821d13b5b8e2cc3d5b27dfa9620eb8d3
--- /dev/null
+++ b/public/biopet-extensions/src/test/scala/nl/lumc/sasc/biopet/extensions/VariantEffectPredictorTest.scala
@@ -0,0 +1,50 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
+package nl.lumc.sasc.biopet.extensions
+
+import java.io.File
+import java.nio.file.Paths
+
+import org.scalatest.Matchers
+import org.scalatest.testng.TestNGSuite
+import org.testng.annotations.Test
+
+/**
+ * Created by ahbbollen on 3-3-16.
+ */
+class VariantEffectPredictorTest extends TestNGSuite with Matchers {
+
+  @Test
+  def testSummaryStats = {
+    val file = new File(Paths.get(getClass.getResource("/vep.metrics").toURI).toString)
+
+    val vep = new VariantEffectPredictor(null)
+    val stats = vep.parseStatsFile(file)
+
+    stats.contains("VEP_run_statistics") shouldBe true
+    stats.contains("General_statistics") shouldBe true
+    stats.contains("Consequences_(most_severe)") shouldBe true
+    stats.contains("Consequences_(all)") shouldBe true
+    stats.contains("Coding_consequences") shouldBe true
+    stats.contains("SIFT_summary") shouldBe true
+    stats.contains("PolyPhen_summary") shouldBe true
+    stats.contains("Variants_by_chromosome") shouldBe true
+    stats.contains("Distribution_of_variants_on_chromosome_1") shouldBe true
+    stats.contains("Position_in_protein") shouldBe true
+
+  }
+
+}
diff --git a/public/biopet-public-package/pom.xml b/public/biopet-public-package/pom.xml
index 32fb4e48b804c665e8abcafd552079a904a3f971..23f18d9b40a5d323fc8c4a387359d383436007ae 100644
--- a/public/biopet-public-package/pom.xml
+++ b/public/biopet-public-package/pom.xml
@@ -25,7 +25,7 @@
     <parent>
        <groupId>nl.lumc.sasc</groupId>
        <artifactId>Biopet</artifactId>
-       <version>0.6.0-SNAPSHOT</version>
+       <version>0.7.0-SNAPSHOT</version>
        <relativePath>../</relativePath>
     </parent>
@@ -64,6 +64,11 @@
             <artifactId>Gentrap</artifactId>
             <version>${project.version}</version>
         </dependency>
+        <dependency>
+            <groupId>nl.lumc.sasc</groupId>
+            <artifactId>TinyCap</artifactId>
+            <version>${project.version}</version>
+        </dependency>
         <dependency>
             <groupId>nl.lumc.sasc</groupId>
             <artifactId>Sage</artifactId>
diff --git a/public/biopet-public-package/src/main/scala/nl/lumc/sasc/biopet/BiopetExecutablePublic.scala b/public/biopet-public-package/src/main/scala/nl/lumc/sasc/biopet/BiopetExecutablePublic.scala
index 161a818900de16c86d4821fec774f099d8c0f8c3..a6553001fb63493fda08ab069cba37e1e8a2ebe5 100644
--- a/public/biopet-public-package/src/main/scala/nl/lumc/sasc/biopet/BiopetExecutablePublic.scala
+++ b/public/biopet-public-package/src/main/scala/nl/lumc/sasc/biopet/BiopetExecutablePublic.scala
@@ -24,6 +24,7 @@ object BiopetExecutablePublic extends BiopetExecutable {
     nl.lumc.sasc.biopet.pipelines.mapping.Mapping,
     nl.lumc.sasc.biopet.pipelines.mapping.MultisampleMapping,
     nl.lumc.sasc.biopet.pipelines.gentrap.Gentrap,
+    nl.lumc.sasc.biopet.pipelines.tinycap.TinyCap,
     nl.lumc.sasc.biopet.pipelines.bammetrics.BamMetrics,
     nl.lumc.sasc.biopet.pipelines.sage.Sage,
     nl.lumc.sasc.biopet.pipelines.bamtobigwig.Bam2Wig,
diff --git a/public/biopet-tools-extensions/pom.xml b/public/biopet-tools-extensions/pom.xml
index 7e7a1116cd2fb7e7c75b4873449b16a2aa201bfe..aae525f13cda7c5406d7f94b8ccb85f9c65c999c 100644
--- a/public/biopet-tools-extensions/pom.xml
+++ b/public/biopet-tools-extensions/pom.xml
@@ -22,7 +22,7 @@
     <parent>
        <artifactId>Biopet</artifactId>
        <groupId>nl.lumc.sasc</groupId>
-       <version>0.6.0-SNAPSHOT</version>
+       <version>0.7.0-SNAPSHOT</version>
     </parent>
     <modelVersion>4.0.0</modelVersion>
diff --git a/public/biopet-tools-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/tools/BaseCounter.scala b/public/biopet-tools-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/tools/BaseCounter.scala
index d15d26a368f25080e7c0e54f668a2af1b2831699..751053b55bac3e60c09e0b3c0566648cbe997f49 100644
--- a/public/biopet-tools-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/tools/BaseCounter.scala
+++ b/public/biopet-tools-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/tools/BaseCounter.scala
@@ -37,7 +37,7 @@ class BaseCounter(val root: Configurable) extends ToolCommandFunction {
   var prefix: String = "output"
-  override def defaultCoreMemory = 3.0
+  override def defaultCoreMemory = 5.0
   override def defaultThreads = 4
   def transcriptTotalCounts = new File(outputDir, s"$prefix.base.transcript.counts")
@@ -89,6 +89,7 @@ class BaseCounter(val root: Configurable) extends ToolCommandFunction {
       nonStrandedMetaExonCounts, strandedMetaExonCounts, strandedSenseMetaExonCounts, strandedAntiSenseMetaExonCounts)
     jobOutputFile = new File(outputDir, s".$prefix.basecounter.out")
+    if (bamFile != null) deps :+= new File(bamFile.getAbsolutePath.stripSuffix(".bam") + ".bai")
     super.beforeGraph()
   }
diff --git a/public/biopet-tools-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/tools/GvcfToBed.scala b/public/biopet-tools-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/tools/GvcfToBed.scala
index 5c7e551b2d3a4be6fdefeed5db3119e71ba20845..cf5e0fa0a787119e86a0ffff01e56928e7f7e5e3 100644
--- a/public/biopet-tools-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/tools/GvcfToBed.scala
+++ b/public/biopet-tools-extensions/src/main/scala/nl/lumc/sasc/biopet/extensions/tools/GvcfToBed.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */
 package nl.lumc.sasc.biopet.extensions.tools

 import java.io.File
diff --git a/public/biopet-tools-extensions/src/test/scala/VcfFilterTest.scala b/public/biopet-tools-extensions/src/test/scala/VcfFilterTest.scala
index 8fddb1c78be5ddff4be3d948236afe94b164ab72..8be13ec8bb160d9204244acd2aaf6bebefb53418 100644
--- a/public/biopet-tools-extensions/src/test/scala/VcfFilterTest.scala
+++ b/public/biopet-tools-extensions/src/test/scala/VcfFilterTest.scala
@@ -1,3 +1,18 @@
+/**
+ * Biopet is built on top of GATK Queue for building bioinformatic
+ * pipelines. It is mainly intended to support LUMC SHARK cluster which is running
+ * SGE. But other types of HPC that are supported by GATK Queue (such as PBS)
+ * should also be able to execute Biopet tools and pipelines.
+ *
+ * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center
+ *
+ * Contact us at: sasc@lumc.nl
+ *
+ * A dual licensing mode is applied. The source code within this project that are
+ * not part of GATK Queue is freely available for non-commercial use under an AGPL
+ * license; For commercial users or users who do not want to follow the AGPL
+ * license, please contact us to obtain a separate license.
+ */ import java.io.File import nl.lumc.sasc.biopet.extensions.tools.VcfFilter @@ -102,85 +117,45 @@ class VcfFilterTest extends TestNGSuite with Matchers { val command = cmd(vcfFilter.cmdLine) var cmdString: List[String] = Nil - if (minSampleDepth.isDefined) { - cmdString = "--minSampleDepth " + minSampleDepth.getOrElse("") :: cmdString - } + if (minSampleDepth.isDefined) cmdString = "--minSampleDepth " + minSampleDepth.getOrElse("") :: cmdString - if (minTotalDepth.isDefined) { - cmdString = "--minTotalDepth " + minTotalDepth.getOrElse("") :: cmdString - } + if (minTotalDepth.isDefined) cmdString = "--minTotalDepth " + minTotalDepth.getOrElse("") :: cmdString - if (minAlternateDepth.isDefined) { - cmdString = "--minAlternateDepth " + minAlternateDepth.getOrElse("") :: cmdString - } + if (minAlternateDepth.isDefined) cmdString = "--minAlternateDepth " + minAlternateDepth.getOrElse("") :: cmdString - if (minSamplesPass.isDefined) { - cmdString = "--minSamplesPass " + minSamplesPass.getOrElse("") :: cmdString - } + if (minSamplesPass.isDefined) cmdString = "--minSamplesPass " + minSamplesPass.getOrElse("") :: cmdString - if (minGenomeQuality.isDefined) { - cmdString = "--minGenomeQuality " + minGenomeQuality.getOrElse("") :: cmdString - } + if (minGenomeQuality.isDefined) cmdString = "--minGenomeQuality " + minGenomeQuality.getOrElse("") :: cmdString - if (filterRefCalls) { - cmdString = "--filterRefCalls" :: cmdString - } + if (filterRefCalls) cmdString = "--filterRefCalls" :: cmdString - if (invertedOutputVcf.isDefined) { - cmdString = "--invertedOutputVcf " + invertedOutputVcf.getOrElse(new File("")).getAbsolutePath :: cmdString - } + if (invertedOutputVcf.isDefined) cmdString = "--invertedOutputVcf " + invertedOutputVcf.getOrElse(new File("")).getAbsolutePath :: cmdString - if (resToDom.isDefined) { - cmdString = "--resToDom " + resToDom.getOrElse("") :: cmdString - } + if (resToDom.isDefined) cmdString = "--resToDom " + resToDom.getOrElse("") :: cmdString - if 
(trioCompound.isDefined) { - cmdString = "--trioCompound " + trioCompound.getOrElse("") :: cmdString - } + if (trioCompound.isDefined) cmdString = "--trioCompound " + trioCompound.getOrElse("") :: cmdString - if (deNovoInSample.isDefined) { - cmdString = "--deNovoInSample " + deNovoInSample.getOrElse("") :: cmdString - } + if (deNovoInSample.isDefined) cmdString = "--deNovoInSample " + deNovoInSample.getOrElse("") :: cmdString - if (deNovoTrio.isDefined) { - cmdString = "--deNovoTrio " + deNovoTrio.getOrElse("") :: cmdString - } + if (deNovoTrio.isDefined) cmdString = "--deNovoTrio " + deNovoTrio.getOrElse("") :: cmdString - if (trioLossOfHet.isDefined) { - cmdString = "--trioLossOfHet " + trioLossOfHet.getOrElse("") :: cmdString - } + if (trioLossOfHet.isDefined) cmdString = "--trioLossOfHet " + trioLossOfHet.getOrElse("") :: cmdString - if (mustHaveVariant.nonEmpty) { - cmdString = mustHaveVariant.map(x => "--mustHaveVariant " + x) ::: cmdString - } + if (mustHaveVariant.nonEmpty) cmdString = mustHaveVariant.map(x => "--mustHaveVariant " + x) ::: cmdString - if (calledIn.nonEmpty) { - cmdString = calledIn.map(x => "--calledIn " + x) ::: cmdString - } + if (calledIn.nonEmpty) cmdString = calledIn.map(x => "--calledIn " + x) ::: cmdString - if (mustHaveGenotype.nonEmpty) { - cmdString = mustHaveGenotype.map(x => "--mustHaveGenotype " + x) ::: cmdString - } + if (mustHaveGenotype.nonEmpty) cmdString = mustHaveGenotype.map(x => "--mustHaveGenotype " + x) ::: cmdString - if (diffGenotype.nonEmpty) { - cmdString = diffGenotype.map(x => "--diffGenotype " + x) ::: cmdString - } + if (diffGenotype.nonEmpty) cmdString = diffGenotype.map(x => "--diffGenotype " + x) ::: cmdString - if (filterHetVarToHomVar.nonEmpty) { - cmdString = filterHetVarToHomVar.map(x => "--filterHetVarToHomVar " + x) ::: cmdString - } + if (filterHetVarToHomVar.nonEmpty) cmdString = filterHetVarToHomVar.map(x => "--filterHetVarToHomVar " + x) ::: cmdString - if (id.nonEmpty) { - cmdString = id.map(x 
=> "--id " + x) ::: cmdString - } + if (id.nonEmpty) cmdString = id.map(x => "--id " + x) ::: cmdString - if (idFile.isDefined) { - cmdString = "--idFile " + idFile.getOrElse(new File("")).getAbsolutePath :: cmdString - } + if (idFile.isDefined) cmdString = "--idFile " + idFile.getOrElse(new File("")).getAbsolutePath :: cmdString - if (minQualScore.isDefined) { - cmdString = "--minQualScore " + minQualScore.getOrElse("") :: cmdString - } + if (minQualScore.isDefined) cmdString = "--minQualScore " + minQualScore.getOrElse("") :: cmdString cmdString.foreach(x => command.contains(x) shouldBe true) } diff --git a/public/biopet-tools-package/pom.xml b/public/biopet-tools-package/pom.xml index e42db61116c57ffce3057675e94de45ff90ef162..a92dd22f1941febf6f1b9da2c6f8501fa5cfb014 100644 --- a/public/biopet-tools-package/pom.xml +++ b/public/biopet-tools-package/pom.xml @@ -22,7 +22,7 @@ <parent> <artifactId>Biopet</artifactId> <groupId>nl.lumc.sasc</groupId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> </parent> <modelVersion>4.0.0</modelVersion> diff --git a/public/biopet-tools-package/src/main/scala/nl/lumc/sasc/biopet/BiopetToolsExecutable.scala b/public/biopet-tools-package/src/main/scala/nl/lumc/sasc/biopet/BiopetToolsExecutable.scala index b6752961cd9d4412ec0215d57955fd574dae58da..5d110be7155bce2cc999f58f092b47c70da831da 100644 --- a/public/biopet-tools-package/src/main/scala/nl/lumc/sasc/biopet/BiopetToolsExecutable.scala +++ b/public/biopet-tools-package/src/main/scala/nl/lumc/sasc/biopet/BiopetToolsExecutable.scala @@ -22,30 +22,31 @@ object BiopetToolsExecutable extends BiopetExecutable { def pipelines: List[MainCommand] = Nil def tools: List[MainCommand] = List( - nl.lumc.sasc.biopet.tools.MergeTables, - nl.lumc.sasc.biopet.tools.WipeReads, - nl.lumc.sasc.biopet.tools.ExtractAlignedFastq, - nl.lumc.sasc.biopet.tools.FastqSync, + nl.lumc.sasc.biopet.tools.AnnotateVcfWithBed, + nl.lumc.sasc.biopet.tools.BaseCounter, + 
nl.lumc.sasc.biopet.tools.BastyGenerateFasta, + nl.lumc.sasc.biopet.tools.BedtoolsCoverageToCounts, nl.lumc.sasc.biopet.tools.BiopetFlagstat, nl.lumc.sasc.biopet.tools.CheckAllelesVcfInBam, - nl.lumc.sasc.biopet.tools.VcfToTsv, - nl.lumc.sasc.biopet.tools.VcfFilter, - nl.lumc.sasc.biopet.tools.VcfStats, - nl.lumc.sasc.biopet.tools.BaseCounter, + nl.lumc.sasc.biopet.tools.ExtractAlignedFastq, + nl.lumc.sasc.biopet.tools.FastqSplitter, + nl.lumc.sasc.biopet.tools.FastqSync, nl.lumc.sasc.biopet.tools.FindRepeatsPacBio, + nl.lumc.sasc.biopet.tools.GvcfToBed, + nl.lumc.sasc.biopet.tools.MergeAlleles, + nl.lumc.sasc.biopet.tools.MergeTables, nl.lumc.sasc.biopet.tools.MpileupToVcf, - nl.lumc.sasc.biopet.tools.FastqSplitter, - nl.lumc.sasc.biopet.tools.BedtoolsCoverageToCounts, + nl.lumc.sasc.biopet.tools.PrefixFastq, nl.lumc.sasc.biopet.tools.SageCountFastq, - nl.lumc.sasc.biopet.tools.SageCreateLibrary, - nl.lumc.sasc.biopet.tools.SageCreateTagCounts, - nl.lumc.sasc.biopet.tools.BastyGenerateFasta, - nl.lumc.sasc.biopet.tools.MergeAlleles, nl.lumc.sasc.biopet.tools.SamplesTsvToJson, nl.lumc.sasc.biopet.tools.SeqStat, - nl.lumc.sasc.biopet.tools.VepNormalizer, - nl.lumc.sasc.biopet.tools.AnnotateVcfWithBed, - nl.lumc.sasc.biopet.tools.VcfWithVcf, + nl.lumc.sasc.biopet.tools.SquishBed, + nl.lumc.sasc.biopet.tools.SummaryToTsv, nl.lumc.sasc.biopet.tools.ValidateFastq, - nl.lumc.sasc.biopet.tools.KrakenReportToJson) + nl.lumc.sasc.biopet.tools.VcfFilter, + nl.lumc.sasc.biopet.tools.VcfStats, + nl.lumc.sasc.biopet.tools.VcfToTsv, + nl.lumc.sasc.biopet.tools.VcfWithVcf, + nl.lumc.sasc.biopet.tools.VepNormalizer, + nl.lumc.sasc.biopet.tools.WipeReads) } diff --git a/public/biopet-tools/pom.xml b/public/biopet-tools/pom.xml index c161ecff49f337a16954b0908405e66bd9eca1ea..b7e49a3b7c290129cc7bc0082e419c16d8b6aa4b 100644 --- a/public/biopet-tools/pom.xml +++ b/public/biopet-tools/pom.xml @@ -22,7 +22,7 @@ <parent> <artifactId>Biopet</artifactId> <groupId>nl.lumc.sasc</groupId> - 
<version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> <modelVersion>4.0.0</modelVersion> diff --git a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/BaseCounter.scala b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/BaseCounter.scala index 7a2b0907f83cac2c0c65fe070bf3270b4e395775..0add274bfbc477ab771466f9b4b5cdfc2024a1f1 100644 --- a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/BaseCounter.scala +++ b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/BaseCounter.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.tools import java.io.{ PrintWriter, File } @@ -55,7 +70,9 @@ object BaseCounter extends ToolCommand { logger.info("Finding overlapping genes") val overlapGenes = groupGenesOnOverlap(geneReader.getAll) - logger.info("Start reading bamFile") + counter = 0 + + logger.info(s"Start reading bamFile divided over ${overlapGenes.values.flatten.size} chunks") val counts = (for (genes <- overlapGenes.values.flatten.par) yield runThread(cmdArgs.bamFile, genes)).toList logger.info("Done reading bamFile") @@ -277,6 +294,8 @@ object BaseCounter extends ToolCommand { else samRecord.getReadNegativeStrandFlag == strand } + private[tools] var counter = 0 + private[tools] case class ThreadOutput(geneCounts: List[GeneCount], nonStrandedMetaExonCounts: List[(String, RegionCount)], strandedMetaExonCounts: List[(String, RegionCount)]) @@ -300,6 +319,8 @@ object BaseCounter extends ToolCommand { } bamReader.close() + counter += 1 + if (counter % 1000 == 0) logger.info(s"${counter} chunks done") ThreadOutput(counts.values.toList, metaExons, plusMetaExons ::: minMetaExons) } diff --git a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/GvcfToBed.scala b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/GvcfToBed.scala index f96835488102e2a7c7a43d153a5986faa5e158f7..e9b24a378eef5b06a0a1e10832943b07852985c1 100644 --- a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/GvcfToBed.scala +++ b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/GvcfToBed.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. 
+ * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.tools import java.io.{ File, PrintWriter } diff --git a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/MergeOtuMaps.scala b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/MergeOtuMaps.scala index 78bc62cfe7d55e6a3e01b92a4a0bae663e997e56..15b76aa44063bbeeb3f79606225d88b0d4666ed3 100644 --- a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/MergeOtuMaps.scala +++ b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/MergeOtuMaps.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.tools import java.io.{ PrintWriter, File } diff --git a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/SeqStat.scala b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/SeqStat.scala index 2bc5c1a066e8fc7574e6352079a98e4098aebe9d..fe59522374391b9ed5af74df61ed34d58db15339 100644 --- a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/SeqStat.scala +++ b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/SeqStat.scala @@ -84,10 +84,10 @@ object SeqStat extends ToolCommand { */ def detectPhredEncoding(quals: mutable.ArrayBuffer[Long]): Unit = { // substract 1 on high value, because we start from index 0 - val qual_low_boundery = quals.takeWhile(_ == 0).length - val qual_high_boundery = quals.length - 1 + val qualLowBoundery = quals.takeWhile(_ == 0).length + val qualHighBoundery = quals.length - 1 - (qual_low_boundery < 59, qual_high_boundery > 74) match { + (qualLowBoundery < 59, qualHighBoundery > 74) match { case (false, true) => phredEncoding = Solexa // TODO: check this later on // complex case, we cannot tell wheter this is a sanger or solexa @@ -122,7 +122,7 @@ object SeqStat extends ToolCommand { * * @param record FastqRecord */ - def procesRead(record: FastqRecord): Unit = { + def processRead(record: FastqRecord): Unit = { // Adjust/expand the length of baseStat case classes to the size of current // read if the current list is not long enough to store the data @@ -130,9 +130,8 @@ object SeqStat extends ToolCommand { baseStats ++= mutable.ArrayBuffer.fill(record.length - baseStats.length)(BaseStat()) } - if (readStats.lengths.length < record.length) { + if (readStats.lengths.length <= record.length) readStats.lengths ++= mutable.ArrayBuffer.fill(record.length - readStats.lengths.length + 1)(0) - } val readQuality = record.getBaseQualityString val readNucleotides = record.getReadString @@ -166,7 +165,7 @@ object SeqStat extends ToolCommand { def seqStat(fqreader: FastqReader): Long 
= { var numReads: Long = 0 for (read <- fqreader.iterator.asScala) { - procesRead(read) + processRead(read) numReads += 1 } diff --git a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/ValidateFastq.scala b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/ValidateFastq.scala index b4c73ef13dbd848485d064fec8cb7cfbf6fcc743..b2e1aeb9651cc9690ad7926ecb76f4f4a5627d8c 100644 --- a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/ValidateFastq.scala +++ b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/ValidateFastq.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.tools import java.io.File diff --git a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/VcfFilter.scala b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/VcfFilter.scala index d08b05c534896b33324297dde94d386bc2c863c7..bc91cfc044cb0982d14e00b7a15a1b79a2045919 100644 --- a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/VcfFilter.scala +++ b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/VcfFilter.scala @@ -276,8 +276,10 @@ object VcfFilter extends ToolCommand { * @return */ def minGenomeQuality(record: VariantContext, minGQ: Int, minSamplesPass: Int = 1): Boolean = { - record.getGenotypes.count(x => if (!x.hasGQ) false - else if (x.getGQ >= minGQ) true else false) >= minSamplesPass + record.getGenotypes.count(x => + if (minGQ == 0) true + else if (!x.hasGQ) false + else if (x.getGQ >= minGQ) true else false) >= minSamplesPass } /** diff --git a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/VepNormalizer.scala b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/VepNormalizer.scala index f9f0fe472686589f47c52b04d2c3e97a181a026e..50893046b949792ddccedf00752b34c632dc00ee 100644 --- a/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/VepNormalizer.scala +++ b/public/biopet-tools/src/main/scala/nl/lumc/sasc/biopet/tools/VepNormalizer.scala @@ -64,13 +64,13 @@ object VepNormalizer extends ToolCommand { versionCheck(header) logger.debug("VCF version OK") logger.debug("Parsing header") - val new_infos = parseCsq(header) + val newInfos = parseCsq(header) header.setWriteCommandLine(true) val writer = new AsyncVariantContextWriter(new VariantContextWriterBuilder(). 
setOutputFile(output).setReferenceDictionary(header.getSequenceDictionary) build ()) - for (info <- new_infos) { + for (info <- newInfos) { val tmpheaderline = new VCFInfoHeaderLine(info, VCFHeaderLineCount.UNBOUNDED, VCFHeaderLineType.String, "A VEP annotation") header.addMetaDataLine(tmpheaderline) } @@ -81,7 +81,7 @@ object VepNormalizer extends ToolCommand { writer.writeHeader(header) logger.debug("Wrote header to file") - normalize(reader, writer, new_infos, commandArgs.mode, commandArgs.removeCSQ) + normalize(reader, writer, newInfos, commandArgs.mode, commandArgs.removeCSQ) writer.close() logger.debug("Closed writer") reader.close() diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/AnnotateVcfWithBedTest.scala b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/AnnotateVcfWithBedTest.scala index f128a6b42e38b47b0518668a1916cd057a2ea159..0dfa028e21b2f4ce9c8d1af0b27915206f45d783 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/AnnotateVcfWithBedTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/AnnotateVcfWithBedTest.scala @@ -34,25 +34,25 @@ class AnnotateVcfWithBedTest extends TestNGSuite with MockitoSugar with Matchers Paths.get(getClass.getResource(p).toURI).toString } - val vepped_path = resourcePath("/VEP_oneline.vcf") + val veppedPath = resourcePath("/VEP_oneline.vcf") val bed = resourcePath("/rrna01.bed") val rand = new Random() @Test def testOutputTypeVcf() = { - val tmp_path = "/tmp/VcfFilter_" + rand.nextString(10) + ".vcf" - val arguments: Array[String] = Array("-I", vepped_path, "-o", tmp_path, "-B", bed, "-f", "testing") + val tmpPath = "/tmp/VcfFilter_" + rand.nextString(10) + ".vcf" + val arguments: Array[String] = Array("-I", veppedPath, "-o", tmpPath, "-B", bed, "-f", "testing") main(arguments) } @Test def testOutputTypeBcf() = { - val tmp_path = "/tmp/VcfFilter_" + rand.nextString(10) + ".bcf" - val arguments: Array[String] = Array("-I", vepped_path, "-o", 
tmp_path, "-B", bed, "-f", "testing") + val tmpPath = "/tmp/VcfFilter_" + rand.nextString(10) + ".bcf" + val arguments: Array[String] = Array("-I", veppedPath, "-o", tmpPath, "-B", bed, "-f", "testing") main(arguments) } @Test def testOutputTypeVcfGz() = { - val tmp_path = "/tmp/VcfFilter_" + rand.nextString(10) + ".vcf.gz" - val arguments: Array[String] = Array("-I", vepped_path, "-o", tmp_path, "-B", bed, "-f", "testing") + val tmpPath = "/tmp/VcfFilter_" + rand.nextString(10) + ".vcf.gz" + val arguments: Array[String] = Array("-I", veppedPath, "-o", tmpPath, "-B", bed, "-f", "testing") main(arguments) } diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/BaseCounterTest.scala b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/BaseCounterTest.scala index 8d6cdd9f1b61cd7dae69c2f53dd7966c83c6b4b3..c2f906dd9e04777c893b46edcaeff2473864e567 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/BaseCounterTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/BaseCounterTest.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.tools import java.io.File diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/BastyGenerateFastaTest.scala b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/BastyGenerateFastaTest.scala index a88dc561a34fe1b25b39f69e34b25ed30a6f393d..f4455ca6ab7b11a3520a41bf75e9c98c096c9987 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/BastyGenerateFastaTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/BastyGenerateFastaTest.scala @@ -35,11 +35,11 @@ class BastyGenerateFastaTest extends TestNGSuite with MockitoSugar with Matchers Paths.get(getClass.getResource(p).toURI).toString } - val vepped_path = resourcePath("/VEP_oneline.vcf") - val vepped = new File(vepped_path) - val bam_path = resourcePath("/paired01.bam") - val chrQ_path = resourcePath("/chrQ.vcf.gz") - val chrQRef_path = resourcePath("/fake_chrQ.fa") + val veppedPath = resourcePath("/VEP_oneline.vcf") + val vepped = new File(veppedPath) + val bamPath = resourcePath("/paired01.bam") + val chrQPath = resourcePath("/chrQ.vcf.gz") + val chrQRefPath = resourcePath("/fake_chrQ.fa") val bam = new File(resourcePath("/paired01.bam")) val chrQ = new File(resourcePath("/chrQ.vcf.gz")) val chrQRef = new File(resourcePath("/fake_chrQ.fa")) @@ -50,7 +50,7 @@ class BastyGenerateFastaTest extends TestNGSuite with MockitoSugar with Matchers val tmppath = tmp.getAbsolutePath tmp.deleteOnExit() - val arguments = Array("-V", chrQ_path, "--outputVariants", tmppath, "--sampleName", "Sample_101", "--reference", chrQRef_path, "--outputName", "test") + val arguments = Array("-V", chrQPath, "--outputVariants", tmppath, "--sampleName", "Sample_101", "--reference", chrQRefPath, "--outputName", "test") main(arguments) } @@ -60,7 +60,7 @@ class BastyGenerateFastaTest extends TestNGSuite with MockitoSugar with Matchers val tmppath = tmp.getAbsolutePath tmp.deleteOnExit() - val arguments = Array("-V", chrQ_path, 
"--outputVariants", tmppath, "--bamFile", bam_path, "--sampleName", "Sample_101", "--reference", chrQRef_path, "--outputName", "test") + val arguments = Array("-V", chrQPath, "--outputVariants", tmppath, "--bamFile", bamPath, "--sampleName", "Sample_101", "--reference", chrQRefPath, "--outputName", "test") main(arguments) } @@ -70,7 +70,7 @@ class BastyGenerateFastaTest extends TestNGSuite with MockitoSugar with Matchers val tmppath = tmp.getAbsolutePath tmp.deleteOnExit() - val arguments = Array("-V", chrQ_path, "--outputConsensus", tmppath, "--outputConsensusVariants", tmppath, "--bamFile", bam_path, "--sampleName", "Sample_101", "--reference", chrQRef_path, "--outputName", "test") + val arguments = Array("-V", chrQPath, "--outputConsensus", tmppath, "--outputConsensusVariants", tmppath, "--bamFile", bamPath, "--sampleName", "Sample_101", "--reference", chrQRefPath, "--outputName", "test") main(arguments) } diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/CheckAllelesVcfInBamTest.scala b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/CheckAllelesVcfInBamTest.scala index 1ee6f38d316e076084479edbbface36eba73dc92..7281e4deabcc76b479c3ba5ce38cb3ac487cb397 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/CheckAllelesVcfInBamTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/CheckAllelesVcfInBamTest.scala @@ -47,24 +47,24 @@ class CheckAllelesVcfInBamTest extends TestNGSuite with MockitoSugar with Matche @Test def testOutputTypeVcf() = { val tmp = File.createTempFile("CheckAllelesVcfInBam", ".vcf") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - val arguments = Array("-I", vcf, "-b", bam, "-s", "sample01", "-o", tmp_path) + val tmpPath = tmp.getAbsolutePath + val arguments = Array("-I", vcf, "-b", bam, "-s", "sample01", "-o", tmpPath) main(arguments) } @Test def testOutputTypeVcfGz() = { val tmp = File.createTempFile("CheckAllelesVcfInBam", ".vcf.gz") tmp.deleteOnExit() 
- val tmp_path = tmp.getAbsolutePath - val arguments = Array("-I", vcf, "-b", bam, "-s", "sample01", "-o", tmp_path) + val tmpPath = tmp.getAbsolutePath + val arguments = Array("-I", vcf, "-b", bam, "-s", "sample01", "-o", tmpPath) main(arguments) } @Test def testOutputTypeBcf() = { val tmp = File.createTempFile("CheckAllelesVcfInBam", ".bcf") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - val arguments = Array("-I", vcf, "-b", bam, "-s", "sample01", "-o", tmp_path) + val tmpPath = tmp.getAbsolutePath + val arguments = Array("-I", vcf, "-b", bam, "-s", "sample01", "-o", tmpPath) main(arguments) } diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/GvcfToBedTest.scala b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/GvcfToBedTest.scala index aac4cb6c7e0ddaa0f7e9479110758d95e6cb3ac6..b875007a06c1a7c2ccdce3c5bcd7fafd4b2399f8 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/GvcfToBedTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/GvcfToBedTest.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.tools import java.io.File @@ -27,7 +42,7 @@ class GvcfToBedTest extends TestNGSuite with Matchers with MockitoSugar { val vepped = new File(resourcePath("/VEP_oneline.vcf")) val unvepped = new File(resourcePath("/unvepped.vcf")) - val vepped_path = resourcePath("/VEP_oneline.vcf") + val veppedPath = resourcePath("/VEP_oneline.vcf") @Test def testMinQuality = { val reader = new VCFFileReader(vepped, false) @@ -64,23 +79,23 @@ class GvcfToBedTest extends TestNGSuite with Matchers with MockitoSugar { @Test def testGvcfToBedInvertedOutput = { val tmp = File.createTempFile("gvcf2bedtest", ".bed") - val tmp_inv = File.createTempFile("gvcf2bedtest", ".bed") + val tmpInv = File.createTempFile("gvcf2bedtest", ".bed") tmp.deleteOnExit() - tmp_inv.deleteOnExit() + tmpInv.deleteOnExit() val args: Array[String] = Array("-I", unvepped.getAbsolutePath, "-O", tmp.getAbsolutePath, "-S", "Sample_101", - "--minGenomeQuality", "99", "--invertedOutputBed", tmp_inv.getAbsolutePath) + "--minGenomeQuality", "99", "--invertedOutputBed", tmpInv.getAbsolutePath) main(args) - Source.fromFile(tmp_inv).getLines().size shouldBe 1 + Source.fromFile(tmpInv).getLines().size shouldBe 1 val tmp2 = File.createTempFile("gvcf2bedtest", ".bed") - val tmp2_inv = File.createTempFile("gvcf2bedtest", ".bed") + val tmp2Inv = File.createTempFile("gvcf2bedtest", ".bed") tmp2.deleteOnExit() - tmp2_inv.deleteOnExit() + tmp2Inv.deleteOnExit() val args2: Array[String] = Array("-I", unvepped.getAbsolutePath, "-O", tmp.getAbsolutePath, "-S", "Sample_102", - "--minGenomeQuality", "3", "--invertedOutputBed", tmp2_inv.getAbsolutePath) + "--minGenomeQuality", "3", "--invertedOutputBed", tmp2Inv.getAbsolutePath) main(args2) - Source.fromFile(tmp2_inv).getLines().size shouldBe 0 + Source.fromFile(tmp2Inv).getLines().size shouldBe 0 } } diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/MergeAllelesTest.scala 
b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/MergeAllelesTest.scala index b7ea9bb9c9b2d3e76b6d1c312106091a2684f9a2..d9c44abfc1ae29b8c1af7e453fc27d4a66cfc517 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/MergeAllelesTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/MergeAllelesTest.scala @@ -37,7 +37,7 @@ class MergeAllelesTest extends TestNGSuite with MockitoSugar with Matchers { Paths.get(getClass.getResource(p).toURI).toString } - val vepped_path = resourcePath("/chrQ.vcf.gz") + val veppedPath = resourcePath("/chrQ.vcf.gz") val reference = resourcePath("/fake_chrQ.fa") // These two have to created @@ -51,24 +51,24 @@ class MergeAllelesTest extends TestNGSuite with MockitoSugar with Matchers { @Test def testOutputTypeVcf() = { val tmp = File.createTempFile("MergeAlleles", ".vcf") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - val arguments = Array("-I", vepped_path, "-o", tmp_path, "-R", reference) + val tmpPath = tmp.getAbsolutePath + val arguments = Array("-I", veppedPath, "-o", tmpPath, "-R", reference) main(arguments) } @Test def testOutputTypeVcfGz() = { val tmp = File.createTempFile("MergeAlleles", ".vcf.gz") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - val arguments = Array("-I", vepped_path, "-o", tmp_path, "-R", reference) + val tmpPath = tmp.getAbsolutePath + val arguments = Array("-I", veppedPath, "-o", tmpPath, "-R", reference) main(arguments) } @Test def testOutputTypeBcf() = { val tmp = File.createTempFile("MergeAlleles", ".bcf") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - val arguments = Array("-I", vepped_path, "-o", tmp_path, "-R", reference) + val tmpPath = tmp.getAbsolutePath + val arguments = Array("-I", veppedPath, "-o", tmpPath, "-R", reference) main(arguments) } } diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/MpileupToVcfTest.scala 
b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/MpileupToVcfTest.scala index 46e0ffa932eafe3ac20b8642cf52c3033dd5a733..1406cef299c2198ebcce4290dacf238c959ebe2b 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/MpileupToVcfTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/MpileupToVcfTest.scala @@ -81,14 +81,14 @@ class MpileupToVcfTest extends TestNGSuite with MockitoSugar with Matchers { for (record <- vcfReader) { val alleles = record.getAlleles.toSet - var ref_alleles = alleles -- record.getAlternateAlleles.toSet + var refAlleles = alleles -- record.getAlternateAlleles.toSet - ref_alleles.size should be >= 1 + refAlleles.size should be >= 1 val realRef = Allele.create(sequenceFile.getSubsequenceAt(record.getContig, record.getStart, record.getEnd).getBases, true) - for (ref <- ref_alleles) { + for (ref <- refAlleles) { record.extraStrictValidation(ref, realRef, Set("")) } } diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/SeqStatTest.scala b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/SeqStatTest.scala index b60ef10126a30b4ccabb8d8b8595f44e7a3cd889..84c27306168ca6abe67a43b80458c033ade84964 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/SeqStatTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/SeqStatTest.scala @@ -109,6 +109,21 @@ class SeqStatTest extends TestNGSuite with MockitoSugar with Matchers { } + @Test(dataProvider = "mockReaderProvider", groups = Array("check_readstats"), singleThreaded = true, dependsOnGroups = Array("report")) + def testReadStatsObject(fqMock: FastqReader) = { + when(fqMock.getFile) thenReturn new File("/tmp/test.fq") + when(fqMock.iterator) thenReturn recordsOver("1", "2", "3", "4", "5") + val seqstat = SeqStat + + // the histogram also stores the length==0 bin, so for a maximum sequence length of 5 its size is 6.
+ // note that the dataset has already been loaded twice: SeqStat was run in the preceding test groups, so all counts are doubled + seqstat.readStats.lengths(5) shouldBe 10 + seqstat.readStats.lengths.length shouldBe 6 + + seqstat.readStats.nucs.sum shouldBe 50 + seqstat.readStats.withN shouldBe 10 + } + @Test def testArgsMinimum() = { val args = Array( "-i", resourcePath("/paired01a.fq")) diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/SummaryToTsvTest.scala b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/SummaryToTsvTest.scala index a53c4dd31abbaa326ff6c5f251c0ee3eeb3ac1c6..e791d8093876fc36057823b862d7a09cc0d8dfd1 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/SummaryToTsvTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/SummaryToTsvTest.scala @@ -77,13 +77,13 @@ class SummaryToTsvTest extends TestNGSuite with MockitoSugar with Matchers { val line = values.head._2.keys.map(x => createLine(paths, values, x)).head line should equal("value\t") - val sample_values = fetchValues(summary, paths, true, false) - val sample_line = sample_values.head._2.keys.map(x => createLine(paths, sample_values, x)).head - sample_line should equal("016\t") + val sampleValues = fetchValues(summary, paths, true, false) + val sampleLine = sampleValues.head._2.keys.map(x => createLine(paths, sampleValues, x)).head + sampleLine should equal("016\t") - val lib_values = fetchValues(summary, paths, false, true) - val lib_line = lib_values.head._2.keys.map(x => createLine(paths, lib_values, x)).head - lib_line should equal("016-L001\tfalse") + val libValues = fetchValues(summary, paths, false, true) + val libLine = libValues.head._2.keys.map(x => createLine(paths, libValues, x)).head + libLine should equal("016-L001\tfalse") } } diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/ValidateFastqTest.scala
b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/ValidateFastqTest.scala index d3926e68f59ba80c4df459f2f9fb2aa37077d1bd..c7ac7aecef00e5fdf47ed37ba47186d8cc6ca1ba 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/ValidateFastqTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/ValidateFastqTest.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.tools import java.nio.file.Paths diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfFilterTest.scala b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfFilterTest.scala index afe161575dfb1cfd9d01d360e7a49473aeadfdb4..50b342bc9509803fe03bb175d4c50b121b282297 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfFilterTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfFilterTest.scala @@ -39,31 +39,31 @@ class VcfFilterTest extends TestNGSuite with MockitoSugar with Matchers { Paths.get(getClass.getResource(p).toURI).toString } - val vepped_path = resourcePath("/VEP_oneline.vcf") - val vepped = new File(vepped_path) + val veppedPath = resourcePath("/VEP_oneline.vcf") + val vepped = new File(veppedPath) val rand = new Random() @Test def testOutputTypeVcf() = { val tmp = File.createTempFile("VcfFilter", ".vcf") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - val arguments: Array[String] = Array("-I", vepped_path, "-o", tmp_path) + val tmpPath = tmp.getAbsolutePath + val arguments: Array[String] = Array("-I", veppedPath, "-o", tmpPath) main(arguments) } @Test def testOutputTypeBcf() = { val tmp = File.createTempFile("VcfFilter", ".bcf") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - val arguments: Array[String] = Array("-I", vepped_path, "-o", tmp_path) + val tmpPath = tmp.getAbsolutePath + val arguments: Array[String] = Array("-I", veppedPath, "-o", tmpPath) main(arguments) } @Test def testOutputTypeVcfGz() = { val tmp = File.createTempFile("VcfFilter", ".vcf.gz") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - val arguments: Array[String] = Array("-I", vepped_path, "-o", tmp_path) + val tmpPath = tmp.getAbsolutePath + val arguments: Array[String] = Array("-I", veppedPath, "-o", tmpPath) main(arguments) } @@ -73,22 +73,22 @@ class VcfFilterTest extends TestNGSuite with MockitoSugar with Matchers { */ val tmp = 
File.createTempFile("VCfFilter", ".vcf.gz") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - val arguments: Array[String] = Array("-I", vepped_path, "-o", tmp_path, + val tmpPath = tmp.getAbsolutePath + val arguments: Array[String] = Array("-I", veppedPath, "-o", tmpPath, "--mustHaveGenotype", "Sample_101:HET") main(arguments) - val size = new VCFFileReader(new File(tmp_path), false).size + val size = new VCFFileReader(new File(tmpPath), false).size size shouldBe 1 val tmp2 = File.createTempFile("VcfFilter", ".vcf.gz") tmp2.deleteOnExit() - val tmp2_path = tmp2.getAbsolutePath - val arguments2: Array[String] = Array("-I", vepped_path, "-o", tmp2_path, + val tmpPath2 = tmp2.getAbsolutePath + val arguments2: Array[String] = Array("-I", veppedPath, "-o", tmpPath2, "--mustHaveGenotype", "Sample_101:HOM_VAR") main(arguments2) - val size2 = new VCFFileReader(new File(tmp2_path), false).size + val size2 = new VCFFileReader(new File(tmpPath2), false).size size2 shouldBe 0 } diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfToTsvTest.scala b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfToTsvTest.scala index ee327392cf6f073088a5bea9b2d449593b7e3fd3..76379526a63c486d362762245487e6b456e4309e 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfToTsvTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfToTsvTest.scala @@ -44,31 +44,31 @@ class VcfToTsvTest extends TestNGSuite with MockitoSugar with Matchers { @Test def testAllFields() = { val tmp = File.createTempFile("VcfToTsv", ".tsv") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - val arguments = Array("-I", unvepped, "-o", tmp_path, "--all_info") + val tmpPath = tmp.getAbsolutePath + val arguments = Array("-I", unvepped, "-o", tmpPath, "--all_info") main(arguments) } @Test def testSpecificField() = { val tmp = File.createTempFile("VcfToTsv", ".tsv") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - 
val arguments = Array("-I", vepped, "-o", tmp_path, "-i", "CSQ") + val tmpPath = tmp.getAbsolutePath + val arguments = Array("-I", vepped, "-o", tmpPath, "-i", "CSQ") main(arguments) } @Test def testNewSeparators() = { val tmp = File.createTempFile("VcfToTsv", ".tsv") tmp.deleteOnExit() - val tmp_path = tmp.getAbsolutePath - val arguments = Array("-I", vepped, "-o", tmp_path, "--all_info", "--separator", ",", "--list_separator", "|") + val tmpPath = tmp.getAbsolutePath + val arguments = Array("-I", vepped, "-o", tmpPath, "--all_info", "--separator", ",", "--list_separator", "|") main(arguments) } @Test(expectedExceptions = Array(classOf[IllegalArgumentException])) def testIdenticalSeparators() = { - val tmp_path = "/tmp/VcfToTsv_" + rand.nextString(10) + ".tsv" - val arguments = Array("-I", vepped, "-o", tmp_path, "--all_info", "--separator", ",") + val tmpPath = "/tmp/VcfToTsv_" + rand.nextString(10) + ".tsv" + val arguments = Array("-I", vepped, "-o", tmpPath, "--all_info", "--separator", ",") main(arguments) } diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfWithVcfTest.scala b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfWithVcfTest.scala index a6a70881012480a8e92b9aac1c689e4456f9f8c7..2f1814210f0ccc760da2235969910e5a335524f7 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfWithVcfTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VcfWithVcfTest.scala @@ -105,7 +105,7 @@ class VcfWithVcfTest extends TestNGSuite with MockitoSugar with Matchers { @Test def testFieldMap = { - val unvep_record = new VCFFileReader(new File(unveppedPath)).iterator().next() + val unvepRecord = new VCFFileReader(new File(unveppedPath)).iterator().next() var fields = List(new Fields("FG", "FG")) fields :::= List(new Fields("FD", "FD")) @@ -133,7 +133,7 @@ class VcfWithVcfTest extends TestNGSuite with MockitoSugar with Matchers { fields :::= List(new Fields("VQSLOD", "VQSLOD")) fields 
:::= List(new Fields("culprit", "culprit")) - val fieldMap = createFieldMap(fields, List(unvep_record)) + val fieldMap = createFieldMap(fields, List(unvepRecord)) fieldMap("FG") shouldBe List("intron") fieldMap("FD") shouldBe List("unknown") @@ -163,26 +163,26 @@ class VcfWithVcfTest extends TestNGSuite with MockitoSugar with Matchers { } @Test def testGetSecondaryRecords = { - val unvep_record = new VCFFileReader(new File(unveppedPath)).iterator().next() - val vep_reader = new VCFFileReader(new File(veppedPath)) - val vep_record = vep_reader.iterator().next() + val unvepRecord = new VCFFileReader(new File(unveppedPath)).iterator().next() + val vepReader = new VCFFileReader(new File(veppedPath)) + val vepRecord = vepReader.iterator().next() - val secRec = getSecondaryRecords(vep_reader, unvep_record, false) + val secRec = getSecondaryRecords(vepReader, unvepRecord, false) - secRec.foreach(x => identicalVariantContext(x, vep_record) shouldBe true) + secRec.foreach(x => identicalVariantContext(x, vepRecord) shouldBe true) } @Test def testCreateRecord = { - val unvep_record = new VCFFileReader(new File(unveppedPath)).iterator().next() - val vep_reader = new VCFFileReader(new File(veppedPath)) - val header = vep_reader.getFileHeader - val vep_record = vep_reader.iterator().next() + val unvepRecord = new VCFFileReader(new File(unveppedPath)).iterator().next() + val vepReader = new VCFFileReader(new File(veppedPath)) + val header = vepReader.getFileHeader + val vepRecord = vepReader.iterator().next() - val secRec = getSecondaryRecords(vep_reader, unvep_record, false) + val secRec = getSecondaryRecords(vepReader, unvepRecord, false) val fieldMap = createFieldMap(List(new Fields("CSQ", "CSQ")), secRec) - val created_record = createRecord(fieldMap, unvep_record, List(new Fields("CSQ", "CSQ")), header) - identicalVariantContext(created_record, vep_record) shouldBe true + val createdRecord = createRecord(fieldMap, unvepRecord, List(new Fields("CSQ", "CSQ")), header) + 
identicalVariantContext(createdRecord, vepRecord) shouldBe true } } diff --git a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VepNormalizerTest.scala b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VepNormalizerTest.scala index 53aeddfaf1d6ba87f9e89ac29b31edff8fc5e01b..d84db877f68b41bf0861d4e03cd96cb25be9c767 100644 --- a/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VepNormalizerTest.scala +++ b/public/biopet-tools/src/test/scala/nl/lumc/sasc/biopet/tools/VepNormalizerTest.scala @@ -42,49 +42,49 @@ class VepNormalizerTest extends TestNGSuite with MockitoSugar with Matchers { val vepped = new File(resourcePath("/VEP_oneline.vcf")) val unvepped = new File(resourcePath("/unvepped.vcf")) - val vepped_path = resourcePath("/VEP_oneline.vcf") + val veppedPath = resourcePath("/VEP_oneline.vcf") val rand = new Random() @Test def testGzOutputExplode(): Unit = { val tmpFile = File.createTempFile("VepNormalizer_", ".vcf.gz") tmpFile.deleteOnExit() - val arguments: Array[String] = Array("-I", vepped_path, "-O", tmpFile.getAbsolutePath, "-m", "explode") + val arguments: Array[String] = Array("-I", veppedPath, "-O", tmpFile.getAbsolutePath, "-m", "explode") main(arguments) } @Test def testVcfOutputExplode(): Unit = { val tmpFile = File.createTempFile("VepNormalizer_", ".vcf") tmpFile.deleteOnExit() - val arguments: Array[String] = Array("-I", vepped_path, "-O", tmpFile.getAbsolutePath, "-m", "explode") + val arguments: Array[String] = Array("-I", veppedPath, "-O", tmpFile.getAbsolutePath, "-m", "explode") main(arguments) } @Test def testBcfOutputExplode(): Unit = { val tmpFile = File.createTempFile("VepNormalizer_", ".bcf") tmpFile.deleteOnExit() - val arguments: Array[String] = Array("-I", vepped_path, "-O", tmpFile.getAbsolutePath, "-m", "explode") + val arguments: Array[String] = Array("-I", veppedPath, "-O", tmpFile.getAbsolutePath, "-m", "explode") main(arguments) } @Test def testGzOutputStandard(): Unit = { val tmpFile = 
File.createTempFile("VepNormalizer_", ".vcf.gz") tmpFile.deleteOnExit() - val arguments: Array[String] = Array("-I", vepped_path, "-O", tmpFile.getAbsolutePath, "-m", "standard") + val arguments: Array[String] = Array("-I", veppedPath, "-O", tmpFile.getAbsolutePath, "-m", "standard") main(arguments) } @Test def testVcfOutputStandard(): Unit = { val tmpFile = File.createTempFile("VepNormalizer_", ".vcf") tmpFile.deleteOnExit() - val arguments: Array[String] = Array("-I", vepped_path, "-O", tmpFile.getAbsolutePath, "-m", "standard") + val arguments: Array[String] = Array("-I", veppedPath, "-O", tmpFile.getAbsolutePath, "-m", "standard") main(arguments) } @Test def testBcfOutputStandard(): Unit = { val tmpFile = File.createTempFile("VepNormalizer_", ".bcf") tmpFile.deleteOnExit() - val arguments: Array[String] = Array("-I", vepped_path, "-O", tmpFile.getAbsolutePath, "-m", "standard") + val arguments: Array[String] = Array("-I", veppedPath, "-O", tmpFile.getAbsolutePath, "-m", "standard") main(arguments) } @@ -97,22 +97,22 @@ class VepNormalizerTest extends TestNGSuite with MockitoSugar with Matchers { @Test def testExplodeVEPLength() = { val reader = new VCFFileReader(vepped, false) val header = reader.getFileHeader - val new_infos = parseCsq(header) - explodeTranscripts(reader.iterator().next(), new_infos, removeCsq = true).length should be(11) + val newInfos = parseCsq(header) + explodeTranscripts(reader.iterator().next(), newInfos, removeCsq = true).length should be(11) } @Test def testStandardVEPLength() = { val reader = new VCFFileReader(vepped, false) val header = reader.getFileHeader - val new_infos = parseCsq(header) - Array(standardTranscripts(reader.iterator().next(), new_infos, removeCsq = true)).length should be(1) + val newInfos = parseCsq(header) + Array(standardTranscripts(reader.iterator().next(), newInfos, removeCsq = true)).length should be(1) } @Test def testStandardVEPAttributeLength() = { val reader = new VCFFileReader(vepped, false) val header = 
reader.getFileHeader - val new_infos = parseCsq(header) - val record = standardTranscripts(reader.iterator().next(), new_infos, removeCsq = true) + val newInfos = parseCsq(header) + val record = standardTranscripts(reader.iterator().next(), newInfos, removeCsq = true) def checkItems(items: Array[String]) = { items.foreach { check } } diff --git a/public/biopet-utils/pom.xml b/public/biopet-utils/pom.xml index 8b70e105c01ea1fd2eda061129fd4cdc8ae1ff3c..722d1e6c629a51fd7f962d14dd5019e8e9201e06 100644 --- a/public/biopet-utils/pom.xml +++ b/public/biopet-utils/pom.xml @@ -22,7 +22,7 @@ <parent> <artifactId>Biopet</artifactId> <groupId>nl.lumc.sasc</groupId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> <modelVersion>4.0.0</modelVersion> diff --git a/public/biopet-utils/src/main/scala/nl/lumc/sasc/biopet/utils/BamUtils.scala b/public/biopet-utils/src/main/scala/nl/lumc/sasc/biopet/utils/BamUtils.scala index 10bb4a5880fd4e4628652936bb18bd59cdd2d45e..2540970887f7b920cd46e1ecf74bed46e39fb19a 100644 --- a/public/biopet-utils/src/main/scala/nl/lumc/sasc/biopet/utils/BamUtils.scala +++ b/public/biopet-utils/src/main/scala/nl/lumc/sasc/biopet/utils/BamUtils.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.utils import java.io.File diff --git a/public/biopet-utils/src/main/scala/nl/lumc/sasc/biopet/utils/LazyCheck.scala b/public/biopet-utils/src/main/scala/nl/lumc/sasc/biopet/utils/LazyCheck.scala index 6eb063d96138ca3c3ef7a559dd625dd5b402d6ab..b6317daef6c260e851a876616af655cdd4c5aa44 100644 --- a/public/biopet-utils/src/main/scala/nl/lumc/sasc/biopet/utils/LazyCheck.scala +++ b/public/biopet-utils/src/main/scala/nl/lumc/sasc/biopet/utils/LazyCheck.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.utils /** diff --git a/public/biopet-utils/src/test/scala/nl/lumc/sasc/biopet/utils/BamUtilsTest.scala b/public/biopet-utils/src/test/scala/nl/lumc/sasc/biopet/utils/BamUtilsTest.scala index 149f7c8b6f7cc4e2b9b81c5196a3541f269c01d3..e37884f8adb0939d1a95a6206ad1500ec2ec7e69 100644 --- a/public/biopet-utils/src/test/scala/nl/lumc/sasc/biopet/utils/BamUtilsTest.scala +++ b/public/biopet-utils/src/test/scala/nl/lumc/sasc/biopet/utils/BamUtilsTest.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. 
But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.utils import java.io.File diff --git a/public/carp/pom.xml b/public/carp/pom.xml index 67f4da89022ba48aff4401fbcfd7ffe52714aa02..ec0e1903e0a6175221dfbcf44030442197d8b765 100644 --- a/public/carp/pom.xml +++ b/public/carp/pom.xml @@ -25,7 +25,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/public/flexiprep/pom.xml b/public/flexiprep/pom.xml index 97e2df77149f5dd060e1253984804c8a716cb3a8..15e140159ca1d60866854b9bb327097343b3ec53 100644 --- a/public/flexiprep/pom.xml +++ b/public/flexiprep/pom.xml @@ -25,7 +25,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepBaseSummary.ssp b/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepBaseSummary.ssp index 91c2550725fc7536fe58efc2229544ae42403674..5f0e3a13abc232739cd25eac363ba25677ad1616 100644 --- a/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepBaseSummary.ssp +++ b/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepBaseSummary.ssp @@ 
-81,9 +81,10 @@ #else <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#QC_BasesTable">Show table</button> #end - <i class="glyphicon glyphicon-file"></i> <a href="QC_Bases_R1.tsv">R1 reads stats</a> + + <a href="QC_Bases_R1.tsv"><button type="button" class="btn btn-info"><i class="glyphicon glyphicon-cloud-download"></i> R1 base stats</button></a> #if (paired) - - <i class="glyphicon glyphicon-file"></i> <a href="QC_Bases_R2.tsv">R2 reads stats</a> + <a href="QC_Bases_R2.tsv"><button type="button" class="btn btn-info"><i class="glyphicon glyphicon-cloud-download"></i> R2 base stats</button></a> #end </div> #end @@ -105,8 +106,8 @@ #for (sample <- samples.toList.sorted) #{ val libs = libId match { - case Some(libId) => List(libId.toString) - case _ => summary.libraries(sample).toList + case Some(libId) => List(libId.toString).sorted + case _ => summary.libraries(sample).toList.sorted } val sampleRowspan = { diff --git a/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepFastaqcPlot.ssp b/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepFastQcPlot.ssp similarity index 60% rename from public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepFastaqcPlot.ssp rename to public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepFastQcPlot.ssp index 0ad776bd7cd861d02b3e8be11b9fa9dad6234008..f8dd3e533d3ff2eacb2420af05efeee136acf090 100644 --- a/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepFastaqcPlot.ssp +++ b/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepFastQcPlot.ssp @@ -14,14 +14,19 @@ def getPlot(read:String) = { summary.getLibraryValue(sampleId.get, libId.get, "flexiprep", "files", read, plot, "path").collect { - case value => { - val file = new File(value.toString) + case path => { + val file = new File(path.toString) val newFile = new 
File(outputDir, read + "_" + file.getName) if (file.exists()) FileUtils.copyFile(file, newFile) newFile.getName } } } + + def plotAvailable(read:String) = { + new File(summary.getLibraryValue(sampleId.get, libId.get, "flexiprep", "files", read, plot, "path").get.toString).exists() + } + }# <div class="row"> @@ -31,20 +36,40 @@ </div> <div class="row"> <div class="col-md-1"><b>R1</b></div> - <div class="col-md-5"><img class="img-responsive" src="${getPlot("fastqc_R1")}" /></div> + <div class="col-md-5"> + #if (plotAvailable( "fastqc_R1" )) + <img class="img-responsive" src="${getPlot("fastqc_R1")}" /> + #else + Image was not generated by FastQC + #end + </div> #if (!skipTrim || !skipClip) <div class="col-md-5"> - <img class="img-responsive" src="${getPlot("fastqc_R1_qc")}" /> + #if (plotAvailable( "fastqc_R1_qc" )) + <img class="img-responsive" src="${getPlot("fastqc_R1_qc")}" /> + #else + Image was not generated by FastQC + #end </div> #end </div> #if (paired) <div class="row"> <div class="col-md-1"><b>R2</b></div> - <div class="col-md-5"><img class="img-responsive" src="${getPlot("fastqc_R2")}" /></div> + <div class="col-md-5"> + #if (plotAvailable( "fastqc_R2" )) + <img class="img-responsive" src="${getPlot("fastqc_R2")}" /> + #else + Image was not generated by FastQC + #end + </div> #if (!skipTrim || !skipClip) <div class="col-md-5"> - <img class="img-responsive" src="${getPlot("fastqc_R2_qc")}" /> + #if (plotAvailable( "fastqc_R2_qc" )) + <img class="img-responsive" src="${getPlot("fastqc_R2_qc")}" /> + #else + Image was not generated by FastQC + #end </div> #end </div> diff --git a/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepInputfiles.ssp b/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepInputfiles.ssp index dc0ee78a3334c19eb30733bb76e0d97b73c789d8..7bba535209ea0971d84cbbf55325932f78aaa817 100644 --- 
a/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepInputfiles.ssp +++ b/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepInputfiles.ssp @@ -35,8 +35,8 @@ #for (sample <- samples.toList.sorted) #{ val libs = libId match { - case Some(libId) => List(libId.toString) - case _ => summary.libraries(sample).toList + case Some(libId) => List(libId.toString).sorted + case _ => summary.libraries(sample).toList.sorted } val sampleRowspan = { diff --git a/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepOutputfiles.ssp b/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepOutputfiles.ssp index f91ba1ea26cadc287ec469a6d4205d7aa6ad5c5e..0d9fca92abdc3c37326b57062484a3a7328ed844 100644 --- a/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepOutputfiles.ssp +++ b/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepOutputfiles.ssp @@ -35,8 +35,8 @@ #for (sample <- samples.toList.sorted) #{ val libs = libId match { - case Some(libId) => List(libId.toString) - case _ => summary.libraries(sample).toList + case Some(libId) => List(libId.toString).sorted + case _ => summary.libraries(sample).toList.sorted } val sampleRowspan = { diff --git a/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepReadSummary.ssp b/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepReadSummary.ssp index f9b77db7cd38ab93643f7c97e3a7c1363f051861..fc8b8ce55816d6b45b9937157ca6ae048e8f7a3c 100644 --- a/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepReadSummary.ssp +++ b/public/flexiprep/src/main/resources/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepReadSummary.ssp @@ -13,8 +13,8 @@ <%@ var multisample: Boolean = true %> #{ val samples = sampleId match { - case Some(sample) => List(sample.toString) - 
case _ => summary.samples.toList + case Some(sample) => List(sample.toString) + case _ => summary.samples.toList } val trimCount = summary.getLibraryValues("flexiprep", "settings", "skip_trim").count(_._2 == Some(false)) val clipCount = summary.getLibraryValues("flexiprep", "settings", "skip_clip").count(_._2 == Some(false)) @@ -44,17 +44,17 @@ #end </p> <p> - #if(sampleId.isDefined && libId.isDefined) - Here we show aggregated quality statistics for sequencing library ${libId} for sample ${sampleId}. It shows the total number of reads used after quality control, and the total number of reads discarded during quality control. This is done for both forward and reverse reads. - #elseif(sampleId.isDefined) - Here we show aggregated quality statistics for every sequencing library for sample ${sampleId}. It shows the total number of reads used after quality control, and the total number of reads discarded during quality control. This is done for both forward and reverse reads. - #else - Here we show aggregated quality statistics for every sequencing library. It shows the total number of reads used after quality control, and the total number of reads discarded during quality control. This is done for both forward and reverse reads. - We show two plots; one for the forward read in the pair, and another one of the reverse read in the pair. - Red denotes number of reads left after QC. Green denotes reads filtered by adaptor clipping. - Blue denotes number of reads filtered by read trimming. - Purple denotes the amount of <em>synced</em> reads. That is, reads removed in one orientation should be removed in the other as well to ensure correctness. - #end + #if(sampleId.isDefined && libId.isDefined) + Here we show aggregated quality statistics for sequencing library ${libId} for sample ${sampleId}. It shows the total number of reads used after quality control, and the total number of reads discarded during quality control. This is done for both forward and reverse reads. 
+ #elseif(sampleId.isDefined) + Here we show aggregated quality statistics for every sequencing library for sample ${sampleId}. It shows the total number of reads used after quality control, and the total number of reads discarded during quality control. This is done for both forward and reverse reads. + #else + Here we show aggregated quality statistics for every sequencing library. It shows the total number of reads used after quality control, and the total number of reads discarded during quality control. This is done for both forward and reverse reads. + We show two plots; one for the forward read in the pair, and another one of the reverse read in the pair. + Red denotes number of reads left after QC. Green denotes reads filtered by adaptor clipping. + Blue denotes number of reads filtered by read trimming. + Purple denotes the amount of <em>synced</em> reads. That is, reads removed in one orientation should be removed in the other as well to ensure correctness. + #end </p> </div> </div> @@ -69,28 +69,29 @@ if (paired) FlexiprepReport.readSummaryPlot(outputDir, "QC_Reads_R2","R2", summary, sampleId = sampleId) }# <div class="panel-body"> - <div class="row"> - <div class="col-sm-6 col-md-6"> - <img src="QC_Reads_R1.png" class="img-responsive"> - </div> + <div class="row"> + <div class="col-sm-6 col-md-6"> + <img src="QC_Reads_R1.png" class="img-responsive"> + </div> #if (paired) - <div class="col-sm-6 col-md-6"> - <img src="QC_Reads_R2.png" class="img-responsive"> - </div> - #end + <div class="col-sm-6 col-md-6"> + <img src="QC_Reads_R2.png" class="img-responsive"> </div> + #end + </div> </div> <div class="panel-footer"> - #if (showTable) - <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#QC_ReadsTable">Hide table</button> - #else - <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#QC_ReadsTable">Show table</button> - #end - <i class="glyphicon glyphicon-file"></i> <a href="QC_Reads_R1.tsv">R1 reads 
stats</a> - #if (paired) - - <i class="glyphicon glyphicon-file"></i> <a href="QC_Reads_R2.tsv">R2 reads stats</a> - #end + #if (showTable) + <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#QC_ReadsTable">Hide table</button> + #else + <button type="button" class="btn btn-info" data-toggle="collapse" data-target="#QC_ReadsTable">Show table</button> + #end + + <a href="QC_Reads_R1.tsv"><button type="button" class="btn btn-info"><i class="glyphicon glyphicon-cloud-download"></i> R1 reads stats</button></a> + #if (paired) + <a href="QC_Reads_R2.tsv"><button type="button" class="btn btn-info"><i class="glyphicon glyphicon-cloud-download"></i> R2 reads stats</button></a> + #end </div> #end @@ -110,8 +111,8 @@ #for (sample <- samples.toList.sorted) #{ val libs = libId match { - case Some(libId) => List(libId.toString) - case _ => summary.libraries(sample).toList + case Some(libId) => List(libId.toString).sorted + case _ => summary.libraries(sample).toList.sorted } val sampleRowspan = { libs.size + @@ -143,7 +144,7 @@ val afterTotal = summary.getLibraryValue(sample, libId, "flexiprep", "stats", "seqstat_" + read + "_qc", "reads", "num_total") val clippingDiscardedToShort = summary.getLibraryValue(sample, libId, "flexiprep", "stats", "clipping_" + read, "num_reads_discarded_too_short").getOrElse(0).toString.toLong val clippingDiscardedToLong = summary.getLibraryValue(sample, libId, "flexiprep", "stats", "clipping_" + read, "num_reads_discarded_too_long").getOrElse(0).toString.toLong - val trimmingDiscarded = summary.getLibraryValue(sample, libId, "flexiprep", "stats", "trimming", "num_reads_discarded_" + read).getOrElse(0).toString.toLong + val trimmingDiscarded = summary.getLibraryValue(sample, libId, "flexiprep", "stats", "trimming_" + read, "num_reads_discarded_total").getOrElse(0).toString.toLong }# <td>${read}</td> <td>${beforeTotal}</td> diff --git a/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Cutadapt.scala 
b/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Cutadapt.scala index f974f8c9a43f685390ee0c510ffa0064d07167b5..fc8db7ab30f7581c7638f15c48bba6e9443eb195 100644 --- a/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Cutadapt.scala +++ b/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Cutadapt.scala @@ -30,17 +30,19 @@ import nl.lumc.sasc.biopet.utils.config.Configurable class Cutadapt(root: Configurable, fastqc: Fastqc) extends nl.lumc.sasc.biopet.extensions.Cutadapt(root) { /** Clipped adapter names from FastQC */ - protected def seqToName = fastqc.foundAdapters + protected def seqToName: Map[String, String] = fastqc.foundAdapters .map(adapter => adapter.seq -> adapter.name).toMap override def summaryStats: Map[String, Any] = { val initStats = super.summaryStats + // translationTable of sequences to the sequence-name, run once + val seqToNameMap: Map[String, String] = seqToName // Map of adapter sequence and how many times it is found val adapterCounts: Map[String, Any] = initStats.get(adaptersStatsName) match { // "adapters" key found in statistics case Some(m: Map[_, _]) => m.flatMap { case (seq: String, count) => - seqToName.get(seq) match { + seqToNameMap.get(seq) match { // adapter sequence is found by FastQC case Some(n) => Some(n -> Map("sequence" -> seq, "count" -> count)) // adapter sequence is clipped but not found by FastQC ~ should not happen since all clipped adapter diff --git a/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Fastqc.scala b/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Fastqc.scala index 8de84c5081db4b9801222c6853e2b185a54741e6..1bbc7b3f520b549933ec650b7eed847c97113692 100644 --- a/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Fastqc.scala +++ b/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Fastqc.scala @@ -18,12 +18,13 @@ package 
nl.lumc.sasc.biopet.pipelines.flexiprep import java.io.{ File, FileNotFoundException } -import nl.lumc.sasc.biopet.utils.config.Configurable import nl.lumc.sasc.biopet.core.summary.Summarizable -import org.broadinstitute.gatk.utils.commandline.Output +import nl.lumc.sasc.biopet.utils.config.Configurable import scala.io.Source +import htsjdk.samtools.util.SequenceUtil.reverseComplement + /** * FastQC wrapper with added functionality for the Flexiprep pipeline * @@ -32,6 +33,9 @@ import scala.io.Source */ class Fastqc(root: Configurable) extends nl.lumc.sasc.biopet.extensions.Fastqc(root) with Summarizable { + /** Allow reporting of all found (potentially adapter) sequences in the FastQC */ + var sensitiveAdapterSearch: Boolean = config("sensitiveAdapterSearch", default = false) + /** Class for storing a single FastQC module result */ protected case class FastQCModule(name: String, status: String, lines: Seq[String]) @@ -51,7 +55,8 @@ class Fastqc(root: Configurable) extends nl.lumc.sasc.biopet.extensions.Fastqc(r * @throws IllegalStateException if the module lines have no content or mapping is empty. */ def qcModules: Map[String, FastQCModule] = { - val fqModules = Source.fromFile(dataFile) + val fastQCLog = Source.fromFile(dataFile) + val fqModules: Map[String, FastQCModule] = fastQCLog // drop all the characters before the first module delimiter (i.e. 
'>>') .dropWhile(_ != '>') // pull everything into a string @@ -77,6 +82,7 @@ class Fastqc(root: Configurable) extends nl.lumc.sasc.biopet.extensions.Fastqc(r } .toMap + fastQCLog.close() if (fqModules.isEmpty) throw new IllegalStateException("Empty FastQC data file " + dataFile.toString) else fqModules } @@ -165,7 +171,10 @@ class Fastqc(root: Configurable) extends nl.lumc.sasc.biopet.extensions.Fastqc(r } yield AdapterSequence(values(0), values(1))).toSet } - val found = qcModules.get("Overrepresented sequences") match { + val adapterSet = getFastqcSeqs(adapters) + val contaminantSet = getFastqcSeqs(contaminants) + + val foundAdapterNames: Seq[String] = qcModules.get("Overrepresented sequences") match { case None => Seq.empty[String] case Some(qcModule) => for ( @@ -176,8 +185,34 @@ class Fastqc(root: Configurable) extends nl.lumc.sasc.biopet.extensions.Fastqc(r // select full sequences from known adapters and contaminants // based on overrepresented sequences results - (getFastqcSeqs(adapters) ++ getFastqcSeqs(contaminants)) - .filter(x => found.exists(_.startsWith(x.name))) + val fromKnownList: Set[AdapterSequence] = (adapterSet ++ contaminantSet) + .filter(x => foundAdapterNames.exists(_.startsWith(x.name))) + + val fromKnownListRC: Set[AdapterSequence] = fromKnownList.map { + x => AdapterSequence(x.name + "_RC", reverseComplement(x.seq)) + } + + // list all sequences found by FastQC + val fastQCFoundSequences: Seq[AdapterSequence] = if (sensitiveAdapterSearch) { + qcModules.get("Overrepresented sequences") match { + case None => Seq.empty + case Some(qcModule) => + for ( + line <- qcModule.lines if !(line.startsWith("#") || line.startsWith(">")); + values = line.split("\t") if values.size >= 4 + ) yield AdapterSequence(values(3), values(0)) + } + } else { + Seq.empty + } + + // we only want to keep adapter sequences which are known by FastQC + // sequences such as "Adapter01 (100% over 12bp)" are valid because "Adapter01" is in FastQC + 
val filteredFastQCFoundSequences = fastQCFoundSequences.filter(x => { + (adapterSet ++ contaminantSet).count(y => x.name.startsWith(y.name)) == 1 + }) + + fromKnownList ++ filteredFastQCFoundSequences ++ fromKnownListRC } else Set() } diff --git a/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Flexiprep.scala b/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Flexiprep.scala index eb26cfa038dfdee9b6bc2be1bc8881a59f1c07e8..c1a04ed85efe48a27fff3fe14cb789e98d3504a7 100644 --- a/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Flexiprep.scala +++ b/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/Flexiprep.scala @@ -27,11 +27,11 @@ import org.broadinstitute.gatk.queue.QScript class Flexiprep(val root: Configurable) extends QScript with SummaryQScript with SampleLibraryTag { def this() = this(null) - @Input(doc = "R1 fastq file (gzipped allowed)", shortName = "R1", required = true) - var input_R1: File = _ + @Input(doc = "R1 fastq file (gzipped allowed)", shortName = "R1", fullName = "inputR1", required = true) + var inputR1: File = _ - @Input(doc = "R2 fastq file (gzipped allowed)", shortName = "R2", required = false) - var input_R2: Option[File] = None + @Input(doc = "R2 fastq file (gzipped allowed)", shortName = "R2", fullName = "inputR2", required = false) + var inputR2: Option[File] = None /** Skip Trim fastq files */ var skipTrim: Boolean = config("skip_trim", default = false) @@ -47,21 +47,21 @@ class Flexiprep(val root: Configurable) extends QScript with SummaryQScript with /** Returns files to store in summary */ def summaryFiles: Map[String, File] = { - Map("input_R1" -> input_R1, "output_R1" -> fastqR1Qc) ++ - (if (paired) Map("input_R2" -> input_R2.get, "output_R2" -> fastqR2Qc.get) else Map()) + Map("input_R1" -> inputR1, "output_R1" -> fastqR1Qc) ++ + (if (paired) Map("input_R2" -> inputR2.get, "output_R2" -> fastqR2Qc.get) else Map()) } /** returns settings to store in summary */ def summarySettings = 
Map("skip_trim" -> skipTrim, "skip_clip" -> skipClip, "paired" -> paired) - var paired: Boolean = input_R2.isDefined - var R1_name: String = _ - var R2_name: String = _ + var paired: Boolean = inputR2.isDefined + var R1Name: String = _ + var R2Name: String = _ - var fastqc_R1: Fastqc = _ - var fastqc_R2: Fastqc = _ - var fastqc_R1_after: Fastqc = _ - var fastqc_R2_after: Fastqc = _ + var fastqcR1: Fastqc = _ + var fastqcR2: Fastqc = _ + var fastqcR1After: Fastqc = _ + var fastqcR2After: Fastqc = _ override def reportClass = { val flexiprepReport = new FlexiprepReport(this) @@ -76,19 +76,19 @@ class Flexiprep(val root: Configurable) extends QScript with SummaryQScript with /** Function that's need to be executed before the script is accessed */ def init() { require(outputDir != null, "Missing output directory on flexiprep module") - require(input_R1 != null, "Missing input R1 on flexiprep module") + require(inputR1 != null, "Missing input R1 on flexiprep module") require(sampleId != null, "Missing sample ID on flexiprep module") require(libId != null, "Missing library ID on flexiprep module") - paired = input_R2.isDefined + paired = inputR2.isDefined - inputFiles :+= new InputFile(input_R1) - input_R2.foreach(inputFiles :+= new InputFile(_)) + inputFiles :+= new InputFile(inputR1) + inputR2.foreach(inputFiles :+= new InputFile(_)) - R1_name = getUncompressedFileName(input_R1) - input_R2.foreach { fileR2 => + R1Name = getUncompressedFileName(inputR1) + inputR2.foreach { fileR2 => paired = true - R2_name = getUncompressedFileName(fileR2) + R2Name = getUncompressedFileName(fileR2) } } @@ -96,27 +96,27 @@ class Flexiprep(val root: Configurable) extends QScript with SummaryQScript with def biopetScript() { runInitialJobs() - if (paired) runTrimClip(input_R1, input_R2, outputDir) - else runTrimClip(input_R1, outputDir) + if (paired) runTrimClip(inputR1, inputR2, outputDir) + else runTrimClip(inputR1, outputDir) - val R1_files = for ((k, v) <- outputFiles if 
k.endsWith("output_R1")) yield v - val R2_files = for ((k, v) <- outputFiles if k.endsWith("output_R2")) yield v - runFinalize(R1_files.toList, R2_files.toList) + val R1Files = for ((k, v) <- outputFiles if k.endsWith("output_R1")) yield v + val R2Files = for ((k, v) <- outputFiles if k.endsWith("output_R2")) yield v + runFinalize(R1Files.toList, R2Files.toList) } /** Add init non chunkable jobs */ def runInitialJobs() { - outputFiles += ("fastq_input_R1" -> input_R1) - if (paired) outputFiles += ("fastq_input_R2" -> input_R2.get) + outputFiles += ("fastq_input_R1" -> inputR1) + if (paired) outputFiles += ("fastq_input_R2" -> inputR2.get) - fastqc_R1 = Fastqc(this, input_R1, new File(outputDir, R1_name + ".fastqc/")) - add(fastqc_R1) - addSummarizable(fastqc_R1, "fastqc_R1") - outputFiles += ("fastqc_R1" -> fastqc_R1.output) + fastqcR1 = Fastqc(this, inputR1, new File(outputDir, R1Name + ".fastqc/")) + add(fastqcR1) + addSummarizable(fastqcR1, "fastqc_R1") + outputFiles += ("fastqc_R1" -> fastqcR1.output) val validateFastq = new ValidateFastq(this) - validateFastq.r1Fastq = input_R1 - validateFastq.r2Fastq = input_R2 + validateFastq.r1Fastq = inputR1 + validateFastq.r2Fastq = inputR2 validateFastq.jobOutputFile = new File(outputDir, ".validate_fastq.log.out") add(validateFastq) @@ -128,22 +128,22 @@ class Flexiprep(val root: Configurable) extends QScript with SummaryQScript with } if (paired) { - fastqc_R2 = Fastqc(this, input_R2.get, new File(outputDir, R2_name + ".fastqc/")) - add(fastqc_R2) - addSummarizable(fastqc_R2, "fastqc_R2") - outputFiles += ("fastqc_R2" -> fastqc_R2.output) + fastqcR2 = Fastqc(this, inputR2.get, new File(outputDir, R2Name + ".fastqc/")) + add(fastqcR2) + addSummarizable(fastqcR2, "fastqc_R2") + outputFiles += ("fastqc_R2" -> fastqcR2.output) } - val seqstat_R1 = SeqStat(this, input_R1, outputDir) - seqstat_R1.isIntermediate = true - add(seqstat_R1) - addSummarizable(seqstat_R1, "seqstat_R1") + val seqstatR1 = SeqStat(this, inputR1, 
outputDir) + seqstatR1.isIntermediate = true + add(seqstatR1) + addSummarizable(seqstatR1, "seqstat_R1") if (paired) { - val seqstat_R2 = SeqStat(this, input_R2.get, outputDir) - seqstat_R2.isIntermediate = true - add(seqstat_R2) - addSummarizable(seqstat_R2, "seqstat_R2") + val seqstatR2 = SeqStat(this, inputR2.get, outputDir) + seqstatR2.isIntermediate = true + add(seqstatR2) + addSummarizable(seqstatR2, "seqstat_R2") } } @@ -176,17 +176,17 @@ class Flexiprep(val root: Configurable) extends QScript with SummaryQScript with var R1 = R1_in var R2 = R2_in - val qcCmdR1 = new QcCommand(this, fastqc_R1) + val qcCmdR1 = new QcCommand(this, fastqcR1) qcCmdR1.input = R1_in qcCmdR1.read = "R1" qcCmdR1.output = if (paired) new File(outDir, fastqR1Qc.getName.stripSuffix(".gz")) else fastqR1Qc - qcCmdR1.deps :+= fastqc_R1.output + qcCmdR1.deps :+= fastqcR1.output qcCmdR1.isIntermediate = paired || !keepQcFastqFiles addSummarizable(qcCmdR1, "qc_command_R1") if (paired) { - val qcCmdR2 = new QcCommand(this, fastqc_R2) + val qcCmdR2 = new QcCommand(this, fastqcR2) qcCmdR2.input = R2_in.get qcCmdR2.output = new File(outDir, fastqR2Qc.get.getName.stripSuffix(".gz")) qcCmdR2.read = "R2" @@ -222,8 +222,8 @@ class Flexiprep(val root: Configurable) extends QScript with SummaryQScript with } } - pipe.deps ::= fastqc_R1.output - pipe.deps ::= fastqc_R2.output + pipe.deps ::= fastqcR1.output + pipe.deps ::= fastqcR2.output pipe.isIntermediate = !keepQcFastqFiles add(pipe) @@ -236,14 +236,14 @@ class Flexiprep(val root: Configurable) extends QScript with SummaryQScript with R1 = qcCmdR1.output } - val seqstat_R1_after = SeqStat(this, R1, outDir) - add(seqstat_R1_after) - addSummarizable(seqstat_R1_after, "seqstat_R1_qc") + val seqstatR1After = SeqStat(this, R1, outDir) + add(seqstatR1After) + addSummarizable(seqstatR1After, "seqstat_R1_qc") if (paired) { - val seqstat_R2_after = SeqStat(this, R2.get, outDir) - add(seqstat_R2_after) - addSummarizable(seqstat_R2_after, "seqstat_R2_qc") + 
val seqstatR2After = SeqStat(this, R2.get, outDir) + add(seqstatR2After) + addSummarizable(seqstatR2After, "seqstat_R2_qc") } outputFiles += (chunk + "output_R1" -> R1) @@ -283,14 +283,14 @@ class Flexiprep(val root: Configurable) extends QScript with SummaryQScript with outputFiles += ("output_R1_gzip" -> fastqR1Qc) if (paired) outputFiles += ("output_R2_gzip" -> fastqR2Qc.get) - fastqc_R1_after = Fastqc(this, fastqR1Qc, new File(outputDir, R1_name + ".qc.fastqc/")) - add(fastqc_R1_after) - addSummarizable(fastqc_R1_after, "fastqc_R1_qc") + fastqcR1After = Fastqc(this, fastqR1Qc, new File(outputDir, R1Name + ".qc.fastqc/")) + add(fastqcR1After) + addSummarizable(fastqcR1After, "fastqc_R1_qc") if (paired) { - fastqc_R2_after = Fastqc(this, fastqR2Qc.get, new File(outputDir, R2_name + ".qc.fastqc/")) - add(fastqc_R2_after) - addSummarizable(fastqc_R2_after, "fastqc_R2_qc") + fastqcR2After = Fastqc(this, fastqR2Qc.get, new File(outputDir, R2Name + ".qc.fastqc/")) + add(fastqcR2After) + addSummarizable(fastqcR2After, "fastqc_R2_qc") } addSummaryJobs() diff --git a/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FlexiprepReport.scala b/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FlexiprepReport.scala index c715f1338e944bd560b6c2103615e620398b88fc..83b78b657cdeed9e8110109c50abd7fdaafe6c00 100644 --- a/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FlexiprepReport.scala +++ b/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FlexiprepReport.scala @@ -62,7 +62,7 @@ object FlexiprepReport extends ReportBuilder { fastqcPlotSection("Sequence quality", "plot_per_sequence_quality"), fastqcPlotSection("Base GC content", "plot_per_base_gc_content"), fastqcPlotSection("Sequence GC content", "plot_per_sequence_gc_content"), - fastqcPlotSection("Base seqeunce content", "plot_per_base_sequence_content"), + fastqcPlotSection("Base sequence content", "plot_per_base_sequence_content"), 
fastqcPlotSection("Duplication", "plot_duplication_levels"), fastqcPlotSection("Kmers", "plot_kmer_profiles"), fastqcPlotSection("Length distribution", "plot_sequence_length_distribution") @@ -71,7 +71,7 @@ object FlexiprepReport extends ReportBuilder { ) protected def fastqcPlotSection(name: String, tag: String) = { - name -> ReportSection("/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepFastaqcPlot.ssp", Map("plot" -> tag)) + name -> ReportSection("/nl/lumc/sasc/biopet/pipelines/flexiprep/flexiprepFastQcPlot.ssp", Map("plot" -> tag)) } /** diff --git a/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/QcCommand.scala b/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/QcCommand.scala index 2b05933820c2b985f733ac5c8df0c44e8163ed55..5d92bf92b5aa8fe4b3c98a2dbc71efa76c5a2588 100644 --- a/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/QcCommand.scala +++ b/public/flexiprep/src/main/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/QcCommand.scala @@ -115,18 +115,18 @@ class QcCommand(val root: Configurable, val fastqc: Fastqc) extends BiopetComman trim = if (!flexiprep.skipTrim) { val sickle = new Sickle(root) - sickle.output_stats = new File(flexiprep.outputDir, s"${flexiprep.sampleId.getOrElse("x")}-${flexiprep.libId.getOrElse("x")}.$read.trim.stats") - sickle.input_R1 = clip match { + sickle.outputStats = new File(flexiprep.outputDir, s"${flexiprep.sampleId.getOrElse("x")}-${flexiprep.libId.getOrElse("x")}.$read.trim.stats") + sickle.inputR1 = clip match { case Some(c) => c.fastqOutput case _ => seqtk.output } - sickle.output_R1 = new File(output.getParentFile, input.getName + ".sickle.fq") + sickle.outputR1 = new File(output.getParentFile, input.getName + ".sickle.fq") addPipeJob(sickle) Some(sickle) } else None val outputFile = (clip, trim) match { - case (_, Some(t)) => t.output_R1 + case (_, Some(t)) => t.outputR1 case (Some(c), _) => c.fastqOutput case _ => seqtk.output } diff --git 
a/public/flexiprep/src/test/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FastqcV0101Test.scala b/public/flexiprep/src/test/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FastqcV0101Test.scala index 0951bea84834b611c323c8e0b1b77ae55f0461b1..4cb68fdfc44d5a30c3ed76aabc9570d6f62529f3 100644 --- a/public/flexiprep/src/test/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FastqcV0101Test.scala +++ b/public/flexiprep/src/test/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FastqcV0101Test.scala @@ -71,10 +71,16 @@ class FastqcV0101Test extends TestNGSuite with Matchers { val fqc = new Fastqc(null) fqc.output = outputv0101 fqc.contaminants = Option(resourceFile("fqc_contaminants_v0101.txt")) + // found adapters also contain the adapters in reverse complement (done within flexiprep/fastqc only) val adapters = fqc.foundAdapters - adapters.size shouldBe 1 - adapters.head.name should ===("TruSeq Adapter, Index 1") - // from fqc_contaminants_v0101.txt - adapters.head.seq should ===("GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG") + // we find 1 adapter which comes with the Reverse Complement counterpart + adapters.size shouldBe 2 + + adapters.head.name shouldEqual "TruSeq Adapter, Index 1_RC" + adapters.head.seq shouldEqual "CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC" + + adapters.last.name shouldEqual "TruSeq Adapter, Index 1" + adapters.last.seq shouldEqual "GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG" + } } \ No newline at end of file diff --git a/public/flexiprep/src/test/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FlexiprepTest.scala b/public/flexiprep/src/test/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FlexiprepTest.scala index ee5e5a291f6c105b79effa7b7af2179e65253417..2af324fad3d2815be4c51d191e4b849692c81945 100644 --- a/public/flexiprep/src/test/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FlexiprepTest.scala +++ b/public/flexiprep/src/test/scala/nl/lumc/sasc/biopet/pipelines/flexiprep/FlexiprepTest.scala @@ 
-70,8 +70,8 @@ class FlexiprepTest extends TestNGSuite with Matchers { ), Map(FlexiprepTest.executables.toSeq: _*)) val flexiprep: Flexiprep = initPipeline(map) - flexiprep.input_R1 = (if (zipped) FlexiprepTest.r1Zipped else FlexiprepTest.r1) - if (paired) flexiprep.input_R2 = Some((if (zipped) FlexiprepTest.r2Zipped else FlexiprepTest.r2)) + flexiprep.inputR1 = (if (zipped) FlexiprepTest.r1Zipped else FlexiprepTest.r1) + if (paired) flexiprep.inputR2 = Some((if (zipped) FlexiprepTest.r2Zipped else FlexiprepTest.r2)) flexiprep.sampleId = Some("1") flexiprep.libId = Some("1") flexiprep.script() diff --git a/public/gears/pom.xml b/public/gears/pom.xml index 07c199380f19a5bc1e295af96ca33ab49d83b4c9..c436ee0fd0d8b40238a4392f4e9c3c3a27debaa5 100644 --- a/public/gears/pom.xml +++ b/public/gears/pom.xml @@ -22,7 +22,7 @@ <parent> <artifactId>Biopet</artifactId> <groupId>nl.lumc.sasc</groupId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> <modelVersion>4.0.0</modelVersion> diff --git a/public/gears/src/main/resources/nl/lumc/sasc/biopet/pipelines/gears/report/ext/js/krona-2.0.js b/public/gears/src/main/resources/nl/lumc/sasc/biopet/pipelines/gears/report/ext/js/krona-2.0.js index 74025705723f56045b5cbb058d4b597271a8df20..4593453af4a00057bf7b31205d829c1433a86587 100644 --- a/public/gears/src/main/resources/nl/lumc/sasc/biopet/pipelines/gears/report/ext/js/krona-2.0.js +++ b/public/gears/src/main/resources/nl/lumc/sasc/biopet/pipelines/gears/report/ext/js/krona-2.0.js @@ -1,3 +1,18 @@ +/* + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. 
+ * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ {//----------------------------------------------------------------------------- // // PURPOSE diff --git a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/CombineReads.scala b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/CombineReads.scala index af2f46af6b0a88e7e09804f4d35ac723cf0a1481..634ff6c697e73f7a4125fcb1e3dade7f1948d180 100644 --- a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/CombineReads.scala +++ b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/CombineReads.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.gears import nl.lumc.sasc.biopet.core.SampleLibraryTag diff --git a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/ExtractUnmappedReads.scala b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/ExtractUnmappedReads.scala index 5d64ef6ffddf3e7715ad9508415bc9224c7daef5..eb98b725afa1fb497421fe410b83c9c10e0195d3 100644 --- a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/ExtractUnmappedReads.scala +++ b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/ExtractUnmappedReads.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.pipelines.gears import nl.lumc.sasc.biopet.core.BiopetQScript diff --git a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/Gears.scala b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/Gears.scala index 37c6f110acab86db6ffb243037e7b5a1d7c22aed..ac2348ed8e66b6c099a735e333f2e75433e29fcb 100644 --- a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/Gears.scala +++ b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/Gears.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. 
It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.pipelines.gears import nl.lumc.sasc.biopet.core.BiopetQScript.InputFile @@ -97,8 +112,8 @@ class Gears(val root: Configurable) extends QScript with MultiSampleQScript { qs lazy val flexiprep = new Flexiprep(qscript) flexiprep.sampleId = Some(sampleId) flexiprep.libId = Some(libId) - flexiprep.input_R1 = config("R1") - flexiprep.input_R2 = config("R2") + flexiprep.inputR1 = config("R1") + flexiprep.inputR2 = config("R2") flexiprep.outputDir = new File(libDir, "flexiprep") lazy val gs = new GearsSingle(qscript) @@ -108,8 +123,8 @@ class Gears(val root: Configurable) extends QScript with MultiSampleQScript { qs /** Function that add library jobs */ protected def addJobs(): Unit = { - inputFiles :+= InputFile(flexiprep.input_R1, config("R1_md5")) - flexiprep.input_R2.foreach(inputFiles :+= InputFile(_, config("R2_md5"))) + inputFiles :+= InputFile(flexiprep.inputR1, config("R1_md5")) + flexiprep.inputR2.foreach(inputFiles :+= InputFile(_, config("R2_md5"))) add(flexiprep) gs.fastqR1 = Some(flexiprep.fastqR1Qc) diff --git a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsKraken.scala b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsKraken.scala index 3e1a7b48d88360b8b473929a746c0a778265daa3..e908d94b254a0dbfbad021447e36bb9fa8f7ab0b 100644 --- 
a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsKraken.scala +++ b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsKraken.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.pipelines.gears import java.io.{ File, PrintWriter } @@ -58,18 +73,18 @@ class GearsKraken(val root: Configurable) extends QScript with SummaryQScript wi krakenAnalysis.paired = fastqR2.isDefined - krakenAnalysis.classified_out = Some(new File(outputDir, s"$outputName.krkn.classified.fastq")) - krakenAnalysis.unclassified_out = Some(new File(outputDir, s"$outputName.krkn.unclassified.fastq")) + krakenAnalysis.classifiedOut = Some(new File(outputDir, s"$outputName.krkn.classified.fastq")) + krakenAnalysis.unclassifiedOut = Some(new File(outputDir, s"$outputName.krkn.unclassified.fastq")) add(krakenAnalysis) outputFiles += ("kraken_output_raw" -> krakenAnalysis.output) - outputFiles += ("kraken_classified_out" -> krakenAnalysis.classified_out.getOrElse("")) - outputFiles += ("kraken_unclassified_out" -> krakenAnalysis.unclassified_out.getOrElse("")) + outputFiles += ("kraken_classified_out" -> krakenAnalysis.classifiedOut.getOrElse("")) + outputFiles += ("kraken_unclassified_out" -> krakenAnalysis.unclassifiedOut.getOrElse("")) 
// create kraken summary file val krakenReport = new KrakenReport(this) krakenReport.input = krakenAnalysis.output - krakenReport.show_zeros = true + krakenReport.showZeros = true krakenReport.output = new File(outputDir, s"$outputName.krkn.full") add(krakenReport) diff --git a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeClosed.scala b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeClosed.scala index 7fa102d1111b64967c2b5a71088729feed06c804..9005e72dd6f282cf702f039b7585b2b02d0e7018 100644 --- a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeClosed.scala +++ b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeClosed.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.gears import java.io.{ File, PrintWriter } @@ -42,7 +57,7 @@ class GearsQiimeClosed(val root: Configurable) extends QScript with SummaryQScri val splitLib = new SplitLibrariesFastq(this) splitLib.input :+= fastqInput splitLib.outputDir = new File(outputDir, "split_libraries_fastq") - sampleId.foreach(splitLib.sample_ids :+= _) + sampleId.foreach(splitLib.sampleIds :+= _) add(splitLib) val closedReference = new PickClosedReferenceOtus(this) diff --git a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeRtax.scala b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeRtax.scala index f50a38a8652c4f14f26a62eb3f4a55f6a0cce9f9..106480489e426b7f9738da9429478e89d3de7793 100644 --- a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeRtax.scala +++ b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeRtax.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.gears import nl.lumc.sasc.biopet.core.{ SampleLibraryTag, BiopetQScript } @@ -45,14 +60,14 @@ class GearsQiimeRtax(val root: Configurable) extends QScript with BiopetQScript val slfR1 = new SplitLibrariesFastq(this) slfR1.input :+= fastqR1 slfR1.outputDir = new File(outputDir, "split_libraries_fastq_R1") - sampleId.foreach(slfR1.sample_ids :+= _) + sampleId.foreach(slfR1.sampleIds :+= _) add(slfR1) lazy val slfR2 = fastqR2.map { file => val j = new SplitLibrariesFastq(this) j.input :+= file j.outputDir = new File(outputDir, "split_libraries_fastq_R2") - sampleId.foreach(j.sample_ids :+= _) + sampleId.foreach(j.sampleIds :+= _) add(j) j } @@ -75,8 +90,8 @@ class GearsQiimeRtax(val root: Configurable) extends QScript with BiopetQScript assignTaxonomy.outputDir = new File(outputDir, "assign_taxonomy") assignTaxonomy.jobOutputFile = new File(assignTaxonomy.outputDir, ".assign_taxonomy.out") assignTaxonomy.inputFasta = pickRepSet.outputFasta.get - assignTaxonomy.read_1_seqs_fp = Some(slfR1.outputSeqs) - assignTaxonomy.read_2_seqs_fp = slfR2.map(_.outputSeqs) + assignTaxonomy.read1SeqsFp = Some(slfR1.outputSeqs) + assignTaxonomy.read2SeqsFp = slfR2.map(_.outputSeqs) add(assignTaxonomy) } } diff --git a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsReport.scala b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsReport.scala index aa819f08c3ddd78d96853e88c4f660417f89ae8f..63426654522fec2b4df03c272e96bcf6e174f817 100644 --- a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsReport.scala +++ b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsReport.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. 
But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.pipelines.gears import java.io.File diff --git a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsSeqCount.scala b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsSeqCount.scala index 70b850ce1ae85d8f9dcf70cab4484e126b6eb589..a81df0bb04062440372201e79bfafd38009a478e 100644 --- a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsSeqCount.scala +++ b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsSeqCount.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.gears import nl.lumc.sasc.biopet.core.{ BiopetQScript, SampleLibraryTag } diff --git a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsSingle.scala b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsSingle.scala index abe9edf5a78e486caea88625c7af12f878483783..a01b088b47224078cc36f2f937715d10d3e5adf8 100644 --- a/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsSingle.scala +++ b/public/gears/src/main/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsSingle.scala @@ -79,8 +79,8 @@ class GearsSingle(val root: Configurable) extends QScript with SummaryQScript wi protected def executeFlexiprep(r1: File, r2: Option[File]): (File, Option[File]) = { if (!skipFlexiprep) { val flexiprep = new Flexiprep(this) - flexiprep.input_R1 = r1 - flexiprep.input_R2 = r2 + flexiprep.inputR1 = r1 + flexiprep.inputR2 = r2 flexiprep.outputDir = new File(outputDir, "flexiprep") add(flexiprep) (flexiprep.fastqR1Qc, flexiprep.fastqR2Qc) diff --git a/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsKrakenTest.scala b/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsKrakenTest.scala index 5f209cf0bb1ba9a3c8c4091e98ed842015fc6d00..150b22b058372537fd9db85823686047fa4756ae 100644 --- a/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsKrakenTest.scala +++ b/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsKrakenTest.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. 
The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.pipelines.gears import java.io.File diff --git a/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeClosedTest.scala b/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeClosedTest.scala index 58fdd3aaafc2e3d6e798c458f39d2ea8f0601b67..e314866be64643a4baad988629ff6e5c1c152503 100644 --- a/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeClosedTest.scala +++ b/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsQiimeClosedTest.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.gears import java.io.File diff --git a/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsTest.scala b/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsTest.scala index 638d21b21c1c6669360cb7fa7b13fd27a0d3806e..cda46755da89d7af2679a77f4e23edb142567fcb 100644 --- a/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsTest.scala +++ b/public/gears/src/test/scala/nl/lumc/sasc/biopet/pipelines/gears/GearsTest.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.gears import java.io.File @@ -27,7 +42,7 @@ class GearsTest extends TestNGSuite with Matchers { } @DataProvider(name = "gearsOptions") - def shivaOptions = { + def gearsOptions = { val bool = Array(true, false) for ( diff --git a/public/generate-indexes/pom.xml b/public/generate-indexes/pom.xml index 5e748e8636e83a1e5f49d613e9f34bfb8c1fb806..36a972c9cadc1d8c1a131fe76df9b0df3f4c5b46 100644 --- a/public/generate-indexes/pom.xml +++ b/public/generate-indexes/pom.xml @@ -27,7 +27,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/public/gentrap/pom.xml b/public/gentrap/pom.xml index 2a43fbddea3f257e9955ccf74a9c62360e15c2d1..33fc9fd99d3cf3e7f822fce8e6d83b0c6ac6c0ff 100644 --- a/public/gentrap/pom.xml +++ b/public/gentrap/pom.xml @@ -25,7 +25,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/public/gentrap/src/main/resources/nl/lumc/sasc/biopet/pipelines/gentrap/measure_plotreport.ssp b/public/gentrap/src/main/resources/nl/lumc/sasc/biopet/pipelines/gentrap/measure_plotreport.ssp new file mode 100644 index 0000000000000000000000000000000000000000..9193dd91157a5bd739c03bc13117dac9c84c6db7 --- /dev/null +++ b/public/gentrap/src/main/resources/nl/lumc/sasc/biopet/pipelines/gentrap/measure_plotreport.ssp @@ -0,0 +1,37 @@ +#import(nl.lumc.sasc.biopet.utils.summary.Summary) +#import(org.apache.commons.io.FileUtils) +#import(java.io.File) +<%@ var summary: Summary %> +<%@ var outputDir: File %> +<%@ var pipelineName: String %> +<%@ var plotPath: Option[Any] %> +#{ + +def getPlot(path:Option[Any], targetPath:String) = { + path.collect { + case value => + new File(targetPath).mkdirs() + val file = new File(value.toString) + val 
newFile = new File(outputDir, targetPath + File.separator + file.getName) + + if (file.exists()) FileUtils.copyFile(file, newFile) + targetPath + File.separator + file.getName + } +} + +}# +<div class="panel-body"> +<!-- Table --> +<table class="table"> +<thead> +</thead> +<tbody> + <tr> + <td> + <img src="${getPlot(plotPath, "measurements" )}" class="img-responsive center-block" /> + </td> + </tr> +</tbody> +</table> + +</div> diff --git a/public/gentrap/src/main/resources/nl/lumc/sasc/biopet/pipelines/gentrap/scripts/plot_heatmap.R b/public/gentrap/src/main/resources/nl/lumc/sasc/biopet/pipelines/gentrap/scripts/plot_heatmap.R index 3f5d8bb5515b84e05f1a7e351851a82481843bf4..c12058b385d069f22c0bdd3186d7a9a65f579dcf 100755 --- a/public/gentrap/src/main/resources/nl/lumc/sasc/biopet/pipelines/gentrap/scripts/plot_heatmap.R +++ b/public/gentrap/src/main/resources/nl/lumc/sasc/biopet/pipelines/gentrap/scripts/plot_heatmap.R @@ -118,8 +118,8 @@ plotHeatmap <- function(in.data, out.name=OUTPUT.PLOT, count.type=COUNT.TYPE, tm img.margin <- 8 } else { img.len <- 800 - img.margin <- max(11, max(sapply(rownames(in.data), nchar))) - } + img.margin <- min(16, max(sapply(rownames(in.data), nchar))) +} png(out.name, height=img.len, width=img.len, res=100) heatmap.2(in.data, col=brewer.pal(9, "YlGnBu"), trace="none", density.info="histogram", main=title, margins=c(img.margin, img.margin)) diff --git a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/Gentrap.scala b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/Gentrap.scala index adee53966eac02109d44bf402dea34d83a5a55d4..37f3cf7a4894f58fe9dabef41811c388898b6267 100644 --- a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/Gentrap.scala +++ b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/Gentrap.scala @@ -215,10 +215,10 @@ class Gentrap(val root: Configurable) extends QScript } else bamFile /** Whether all libraries are paired or not */ - def 
allPaired: Boolean = libraries.values.forall(_.mapping.forall(_.input_R2.isDefined)) + def allPaired: Boolean = libraries.values.forall(_.mapping.forall(_.inputR2.isDefined)) /** Whether all libraries are single or not */ - def allSingle: Boolean = libraries.values.forall(_.mapping.forall(_.input_R2.isEmpty)) + def allSingle: Boolean = libraries.values.forall(_.mapping.forall(_.inputR2.isEmpty)) /** Adds all jobs for the sample */ override def addJobs(): Unit = { diff --git a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/BaseCounts.scala b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/BaseCounts.scala index ef1d47e45d7cc953d24725debb89e92499dbfafd..0ba6640de4ace503c750ef1e6a44acf90bd5477c 100644 --- a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/BaseCounts.scala +++ b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/BaseCounts.scala @@ -1,8 +1,22 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.gentrap.measures import nl.lumc.sasc.biopet.core.annotations.AnnotationRefFlat import nl.lumc.sasc.biopet.extensions.tools.BaseCounter -import nl.lumc.sasc.biopet.pipelines.gentrap.Gentrap import nl.lumc.sasc.biopet.utils.config.Configurable import org.broadinstitute.gatk.queue.QScript diff --git a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksBlind.scala b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksBlind.scala index 8b963c3315a33cf8c484aa2ce1cb68b9abdc8e90..7e4ffa0b32db16d4d690dab566ea3d3e23f9b2aa 100644 --- a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksBlind.scala +++ b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksBlind.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.gentrap.measures import nl.lumc.sasc.biopet.core.annotations.AnnotationGtf diff --git a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksGuided.scala b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksGuided.scala index 30bdcb5893227c0c53e052317ac8984b088c4610..a555585a561df62c405e7549ca70eb7e92093349 100644 --- a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksGuided.scala +++ b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksGuided.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.gentrap.measures import nl.lumc.sasc.biopet.core.annotations.AnnotationGtf @@ -10,7 +25,7 @@ import org.broadinstitute.gatk.queue.QScript class CufflinksGuided(val root: Configurable) extends QScript with CufflinksMeasurement with AnnotationGtf { override def makeCufflinksJob(id: String, bamFile: File) = { val cufflinks = super.makeCufflinksJob(id, bamFile) - cufflinks.GTF_guide = Some(annotationGtf) + cufflinks.gtfGuide = Some(annotationGtf) cufflinks } } diff --git a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksMeasurement.scala b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksMeasurement.scala index 26616991e3b74985e38431e4cc19e99bc785d389..54e14057cee8a0c194793203dfdd908954799e49 100644 --- a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksMeasurement.scala +++ b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksMeasurement.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.gentrap.measures import nl.lumc.sasc.biopet.extensions.{ Ln, Cufflinks } @@ -11,7 +26,7 @@ trait CufflinksMeasurement extends QScript with Measurement { def makeCufflinksJob(id: String, bamFile: File) = { val cufflinks = new Cufflinks(this) cufflinks.input = bamFile - cufflinks.output_dir = new File(outputDir, id) + cufflinks.outputDir = new File(outputDir, id) cufflinks } @@ -25,14 +40,14 @@ trait CufflinksMeasurement extends QScript with Measurement { val genesFpkmFiles = jobs.toList.map { case (id, job) => - val file = new File(job.output_dir, s"$id.genes_fpkm.counts") + val file = new File(job.outputDir, s"$id.genes_fpkm.counts") add(Ln(this, job.outputGenesFpkm, file)) file } val isoFormFpkmFiles = jobs.toList.map { case (id, job) => - val file = new File(job.output_dir, s"$id.iso_form_fpkn.counts") + val file = new File(job.outputDir, s"$id.iso_form_fpkn.counts") add(Ln(this, job.outputIsoformsFpkm, file)) file } diff --git a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksStrict.scala b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksStrict.scala index 7a531820f4a1317d0a5c8730c6829ef4fe2415aa..685e36033ecc8ce52f3327a28ec15d70df6cfda2 100644 --- a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksStrict.scala +++ b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/CufflinksStrict.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. 
The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.pipelines.gentrap.measures import nl.lumc.sasc.biopet.core.annotations.AnnotationGtf diff --git a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/FragmentsPerExon.scala b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/FragmentsPerExon.scala index dbca36eed14d1203473b5b1d5d5ddaa33846733a..c68b7ffcf7fa4288d072edea455f38a290e0be88 100644 --- a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/FragmentsPerExon.scala +++ b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/FragmentsPerExon.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.gentrap.measures import nl.lumc.sasc.biopet.utils.config.Configurable diff --git a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/FragmentsPerGene.scala b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/FragmentsPerGene.scala index e6323699e4a488cf2c26b20ceb0ebeab3fbbf1b9..bcfc417864cc16279e0a886306f0ebce1ca1816c 100644 --- a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/FragmentsPerGene.scala +++ b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/FragmentsPerGene.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.pipelines.gentrap.measures import nl.lumc.sasc.biopet.core.annotations.AnnotationGtf @@ -9,7 +24,9 @@ import org.broadinstitute.gatk.queue.QScript * Created by pjvan_thof on 1/12/16. 
*/ class FragmentsPerGene(val root: Configurable) extends QScript with Measurement with AnnotationGtf { - def mergeArgs = MergeArgs(List(1), 2, numHeaderLines = 1, fallback = "0") + def mergeArgs = MergeArgs(idCols = List(1), valCol = 2, numHeaderLines = 0, fallback = "0") + + override def fixedValues: Map[String, Any] = Map("htseqcount" -> Map("order" -> "pos")) /** Pipeline itself */ def biopetScript(): Unit = { @@ -23,10 +40,6 @@ class FragmentsPerGene(val root: Configurable) extends QScript with Measurement job.output = new File(outputDir, s"$id.$name.counts") job.format = Option("bam") add(job) - // We are forcing the sort order to be ID-sorted, since HTSeq-count often chokes when using position-sorting due - // to its buffer not being large enough. - //TODO: ID sorting job - //job.order = Option("name") id -> job } diff --git a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/Measurement.scala b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/Measurement.scala index c3d389b0e8219320df6b945a74bda72a496293db..b07d295a88d9044f351f738320c23692f53711bf 100644 --- a/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/Measurement.scala +++ b/public/gentrap/src/main/scala/nl/lumc/sasc/biopet/pipelines/gentrap/measures/Measurement.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. 
The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.pipelines.gentrap.measures import nl.lumc.sasc.biopet.core.Reference @@ -14,7 +29,7 @@ trait Measurement extends SummaryQScript with Reference { qscript: QScript => /** * Method to add a bamFile to the pipeline - * @param id Uniqe id used for this bam file, most likly to be a sampleName + * @param id Unique id used for this bam file, most likely to be a sampleName * @param file Location of the bam file */ def addBamfile(id: String, file: File): Unit = { diff --git a/public/gentrap/src/test/scala/nl/lumc/sasc/biopet/pipelines/gentrap/GentrapTest.scala b/public/gentrap/src/test/scala/nl/lumc/sasc/biopet/pipelines/gentrap/GentrapTest.scala index 9125f75074bdea460841803fba7508bd3af3900b..73a25ef561cb2363f7b4e930b8e0fb5187a8827a 100644 --- a/public/gentrap/src/test/scala/nl/lumc/sasc/biopet/pipelines/gentrap/GentrapTest.scala +++ b/public/gentrap/src/test/scala/nl/lumc/sasc/biopet/pipelines/gentrap/GentrapTest.scala @@ -15,7 +15,7 @@ */ package nl.lumc.sasc.biopet.pipelines.gentrap -import java.io.{File, FileOutputStream} +import java.io.{ File, FileOutputStream } import com.google.common.io.Files import nl.lumc.sasc.biopet.extensions._ @@ -25,7 +25,7 @@ import nl.lumc.sasc.biopet.utils.config.Config import org.broadinstitute.gatk.queue.QSettings import org.scalatest.Matchers import org.scalatest.testng.TestNGSuite -import org.testng.annotations.{DataProvider, Test} +import org.testng.annotations.{ DataProvider, Test } abstract class GentrapTestAbstract(val expressionMeasure: String) extends TestNGSuite with Matchers { diff --git a/public/kopisu/pom.xml b/public/kopisu/pom.xml index 720016ce0a8af0070bbdc9b0560c2d72ab5b4761..ecbd27e5c70a608daf228fe8c6a60d9bbeae4a82 100644 --- 
a/public/kopisu/pom.xml +++ b/public/kopisu/pom.xml @@ -25,7 +25,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/public/mapping/pom.xml b/public/mapping/pom.xml index 0ce243639822c6b338ad71ceafc3f08500818824..deff3be527f6f23fb935b56e7c24d14f7b3396ab 100644 --- a/public/mapping/pom.xml +++ b/public/mapping/pom.xml @@ -25,7 +25,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/public/mapping/src/main/resources/nl/lumc/sasc/biopet/pipelines/mapping/multisampleMappingFront.ssp b/public/mapping/src/main/resources/nl/lumc/sasc/biopet/pipelines/mapping/multisampleMappingFront.ssp index a7027c4681c22f8119456fd445c5ddf83bb2366b..1f31856e4e3069afb64ee97306789fb98d047743 100644 --- a/public/mapping/src/main/resources/nl/lumc/sasc/biopet/pipelines/mapping/multisampleMappingFront.ssp +++ b/public/mapping/src/main/resources/nl/lumc/sasc/biopet/pipelines/mapping/multisampleMappingFront.ssp @@ -2,14 +2,15 @@ #import(nl.lumc.sasc.biopet.core.report.ReportPage) <%@ var summary: Summary %> <%@ var rootPath: String %> +<%@ var pipeline: String %> <table class="table"> <tbody> - <tr><th>Pipeline</th><td>Shiva</td></tr> + <tr><th>Pipeline</th><td>${pipeline}</td></tr> <tr><th>Version</th><td>${summary.getValue("meta", "pipeline_version")}</td></tr> <tr><th>Last commit hash</th><td>${summary.getValue("meta", "last_commit_hash")}</td></tr> <tr><th>Output directory</th><td>${summary.getValue("meta", "output_dir")}</td></tr> - <tr><th>Reference</th><td>${summary.getValue("shiva", "settings", "reference", "species")} - ${summary.getValue("shiva", "settings", "reference", "name")}</td></tr> + <tr><th>Reference</th><td>${summary.getValue(pipeline, "settings", "reference", "species")} - 
${summary.getValue(pipeline, "settings", "reference", "name")}</td></tr> <tr><th>Number of samples</th><td>${summary.samples.size}</td></tr> </tbody> </table> @@ -18,7 +19,7 @@ <div class="col-md-1"></div> <div class="col-md-6"> <p> - In this web document you can find your <em>Shiva</em> pipeline report. + In this web document you can find your <em>${pipeline}</em> pipeline report. Different categories of data can be found in the left-side menu. Statistics per sample and library can be accessed through the top-level menu. Some statistics for target regions can be found in the regions tab. diff --git a/public/mapping/src/main/resources/nl/lumc/sasc/biopet/pipelines/mapping/outputBamfiles.ssp b/public/mapping/src/main/resources/nl/lumc/sasc/biopet/pipelines/mapping/outputBamfiles.ssp index 41d8249e75c416bebc27a3acff58c1e1498e17ae..9d6a6932c385495d07945f97e4f399680e4828bf 100644 --- a/public/mapping/src/main/resources/nl/lumc/sasc/biopet/pipelines/mapping/outputBamfiles.ssp +++ b/public/mapping/src/main/resources/nl/lumc/sasc/biopet/pipelines/mapping/outputBamfiles.ssp @@ -40,8 +40,8 @@ #{ val libs = (libId, sampleLevel) match { case (_, true) => List("") - case (Some(libId), _) => List(libId.toString) - case _ => summary.libraries(sample).toList + case (Some(libId), _) => List(libId.toString).sorted + case _ => summary.libraries(sample).toList.sorted } }# <tr><td rowspan="${libs.size}"><a href="${rootPath}Samples/${sample}/index.html">${sample}</a></td> diff --git a/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/Mapping.scala b/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/Mapping.scala index d37f8031b0f22bc06384b1d2febe7f7a58c49196..47532b3297d6b96fe99391e3b39a657a22ca7b9e 100644 --- a/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/Mapping.scala +++ b/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/Mapping.scala @@ -41,11 +41,11 @@ class Mapping(val root: Configurable) extends 
QScript with SummaryQScript with S def this() = this(null) - @Input(doc = "R1 fastq file", shortName = "R1", required = true) - var input_R1: File = _ + @Input(doc = "R1 fastq file", shortName = "R1", fullName = "inputR1", required = true) + var inputR1: File = _ - @Input(doc = "R2 fastq file", shortName = "R2", required = false) - var input_R2: Option[File] = None + @Input(doc = "R2 fastq file", shortName = "R2", fullName = "inputR2", required = false) + var inputR2: Option[File] = None /** Output name */ var outputName: String = _ @@ -107,9 +107,16 @@ class Mapping(val root: Configurable) extends QScript with SummaryQScript with S ) /** File to add to the summary */ - def summaryFiles: Map[String, File] = Map("output_bam" -> finalBamFile, "input_R1" -> input_R1, + def summaryFiles: Map[String, File] = Map("output_bam" -> finalBamFile, "input_R1" -> inputR1, "reference" -> referenceFasta()) ++ - (if (input_R2.isDefined) Map("input_R2" -> input_R2.get) else Map()) + (if (inputR2.isDefined) Map("input_R2" -> inputR2.get) else Map()) ++ + (bam2wig match { + case Some(b) => Map( + "output_wigle" -> b.outputWigleFile, + "output_tdf" -> b.outputTdfFile, + "output_bigwig" -> b.outputBwFile) + case _ => Map() + }) /** Settings to add to summary */ def summarySettings = Map( @@ -134,14 +141,14 @@ class Mapping(val root: Configurable) extends QScript with SummaryQScript with S /** Will be executed before script */ def init() { require(outputDir != null, "Missing output directory on mapping module") - require(input_R1 != null, "Missing output directory on mapping module") + require(inputR1 != null, "Missing input R1 on mapping module") require(sampleId.isDefined, "Missing sample ID on mapping module") require(libId.isDefined, "Missing library ID on mapping module") - inputFiles :+= new InputFile(input_R1) - input_R2.foreach(inputFiles :+= new InputFile(_)) + inputFiles :+= new InputFile(inputR1) + inputR2.foreach(inputFiles :+= new InputFile(_)) - paired = 
input_R2.isDefined + paired = inputR2.isDefined if (readgroupId == null) readgroupId = config("readgroup_id", default = sampleId.get + "-" + libId.get) @@ -153,8 +160,8 @@ class Mapping(val root: Configurable) extends QScript with SummaryQScript with S if (config.contains("numberchunks")) numberChunks = config("numberchunks", default = None) else { val chunkSize: Int = config("chunksize", 1 << 30) - val filesize = if (input_R1.getName.endsWith(".gz") || input_R1.getName.endsWith(".gzip")) input_R1.length * 3 - else input_R1.length + val filesize = if (inputR1.getName.endsWith(".gz") || inputR1.getName.endsWith(".gzip")) inputR1.length * 3 + else inputR1.length numberChunks = Option(ceil(filesize.toDouble / chunkSize).toInt) } } @@ -166,40 +173,40 @@ class Mapping(val root: Configurable) extends QScript with SummaryQScript with S def biopetScript() { if (!skipFlexiprep) { flexiprep.outputDir = new File(outputDir, "flexiprep") - flexiprep.input_R1 = input_R1 - flexiprep.input_R2 = input_R2 + flexiprep.inputR1 = inputR1 + flexiprep.inputR2 = inputR2 flexiprep.sampleId = this.sampleId flexiprep.libId = this.libId flexiprep.init() flexiprep.runInitialJobs() } var bamFiles: List[File] = Nil - var fastq_R1_output: List[File] = Nil - var fastq_R2_output: List[File] = Nil + var fastqR1Output: List[File] = Nil + var fastqR2Output: List[File] = Nil val chunks: Map[File, (File, Option[File])] = { if (chunking) (for (t <- 1 to numberChunks.getOrElse(1)) yield { val chunkDir = new File(outputDir, "chunks" + File.separator + t) - chunkDir -> (new File(chunkDir, input_R1.getName), - if (paired) Some(new File(chunkDir, input_R2.get.getName)) else None) + chunkDir -> (new File(chunkDir, inputR1.getName), + if (paired) Some(new File(chunkDir, inputR2.get.getName)) else None) }).toMap - else if (skipFlexiprep) Map(outputDir -> (input_R1, if (paired) input_R2 else None)) - else Map(outputDir -> (flexiprep.input_R1, flexiprep.input_R2)) + else if (skipFlexiprep) Map(outputDir -> 
(inputR1, if (paired) inputR2 else None)) + else Map(outputDir -> (flexiprep.inputR1, flexiprep.inputR2)) } if (chunking) { - val fastSplitter_R1 = new FastqSplitter(this) - fastSplitter_R1.input = input_R1 - for ((chunkDir, fastqfile) <- chunks) fastSplitter_R1.output :+= fastqfile._1 - fastSplitter_R1.isIntermediate = true - add(fastSplitter_R1) + val fastSplitterR1 = new FastqSplitter(this) + fastSplitterR1.input = inputR1 + for ((chunkDir, fastqfile) <- chunks) fastSplitterR1.output :+= fastqfile._1 + fastSplitterR1.isIntermediate = true + add(fastSplitterR1) if (paired) { - val fastSplitter_R2 = new FastqSplitter(this) - fastSplitter_R2.input = input_R2.get - for ((chunkDir, fastqfile) <- chunks) fastSplitter_R2.output :+= fastqfile._2.get - fastSplitter_R2.isIntermediate = true - add(fastSplitter_R2) + val fastSplitterR2 = new FastqSplitter(this) + fastSplitterR2.input = inputR2.get + for ((chunkDir, fastqfile) <- chunks) fastSplitterR2.output :+= fastqfile._2.get + fastSplitterR2.isIntermediate = true + add(fastSplitterR2) } } @@ -211,8 +218,8 @@ class Mapping(val root: Configurable) extends QScript with SummaryQScript with S logger.debug(chunkDir + " - " + flexiout) R1 = flexiout._1 if (paired) R2 = flexiout._2 - fastq_R1_output :+= R1 - R2.foreach(R2 => fastq_R2_output :+= R2) + fastqR1Output :+= R1 + R2.foreach(R2 => fastqR2Output :+= R2) } val outputBam = new File(chunkDir, outputName + ".bam") @@ -234,7 +241,7 @@ class Mapping(val root: Configurable) extends QScript with SummaryQScript with S addAll(BamMetrics(this, outputBam, new File(chunkDir, "metrics"), sampleId, libId).functions) } if (!skipFlexiprep) { - flexiprep.runFinalize(fastq_R1_output, fastq_R2_output) + flexiprep.runFinalize(fastqR1Output, fastqR2Output) addAll(flexiprep.functions) // Add function of flexiprep to curent function pool addSummaryQScript(flexiprep) } @@ -270,12 +277,15 @@ class Mapping(val root: Configurable) extends QScript with SummaryQScript with S add(gears) } - if 
(config("generate_wig", default = false).asBoolean) - addAll(Bam2Wig(this, finalBamFile).functions) + bam2wig.foreach(add(_)) addSummaryJobs() } + protected lazy val bam2wig = if (config("generate_wig", default = false)) { + Some(Bam2Wig(this, finalBamFile)) + } else None + /** Add bwa aln jobs */ def addBwaAln(R1: File, R2: Option[File], output: File): File = { val bwaAlnR1 = new BwaAln(this) @@ -356,15 +366,15 @@ class Mapping(val root: Configurable) extends QScript with SummaryQScript with S val tophat = new Tophat(this) tophat.R1 = tophat.R1 :+ R1 if (paired) tophat.R2 = tophat.R2 :+ R2.get - tophat.output_dir = new File(outputDir, "tophat_out") + tophat.outputDir = new File(outputDir, "tophat_out") // always output BAM - tophat.no_convert_bam = false + tophat.noConvertBam = false // and always keep input ordering - tophat.keep_fasta_order = true + tophat.keepFastaOrder = true add(tophat) // fix unmapped file coordinates - val fixedUnmapped = new File(tophat.output_dir, "unmapped_fixup.sam") + val fixedUnmapped = new File(tophat.outputDir, "unmapped_fixup.sam") val fixer = new TophatRecondition(this) fixer.inputBam = tophat.outputAcceptedHits fixer.outputSam = fixedUnmapped.getAbsoluteFile @@ -372,14 +382,14 @@ class Mapping(val root: Configurable) extends QScript with SummaryQScript with S add(fixer) // sort fixed SAM file - val sorter = SortSam(this, fixer.outputSam, new File(tophat.output_dir, "unmapped_fixup.sorted.bam")) + val sorter = SortSam(this, fixer.outputSam, new File(tophat.outputDir, "unmapped_fixup.sorted.bam")) sorter.sortOrder = "coordinate" sorter.isIntermediate = true add(sorter) // merge with mapped file val mergeSamFile = MergeSamFiles(this, List(tophat.outputAcceptedHits, sorter.output), - new File(tophat.output_dir, "fixed_merged.bam"), sortOrder = "coordinate") + new File(tophat.outputDir, "fixed_merged.bam"), sortOrder = "coordinate") mergeSamFile.createIndex = true mergeSamFile.isIntermediate = true add(mergeSamFile) @@ -443,7 +453,7 
@@ class Mapping(val root: Configurable) extends QScript with SummaryQScript with S /** Add bowtie2 jobs **/ def addBowtie2(R1: File, R2: Option[File], output: File): File = { val bowtie2 = new Bowtie2(this) - bowtie2.rg_id = Some(readgroupId) + bowtie2.rgId = Some(readgroupId) bowtie2.rg +:= ("LB:" + libId.get) bowtie2.rg +:= ("PL:" + platform) bowtie2.rg +:= ("PU:" + platformUnit) diff --git a/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/MultisampleMappingReport.scala b/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/MultisampleMappingReport.scala index f2933717de04c873281a38acd4809d0ab4bf2fed..845f1fc1ee76a8c513c57eaac5be5a1525b9415a 100644 --- a/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/MultisampleMappingReport.scala +++ b/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/MultisampleMappingReport.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.mapping import nl.lumc.sasc.biopet.core.report.{ ReportBuilderExtension, ReportSection, ReportPage, MultisampleReportBuilder } @@ -72,7 +87,7 @@ trait MultisampleMappingReportTrait extends MultisampleReportBuilder { Map("showPlot" -> true, "showTable" -> false)) ) else Nil), - pageArgs + pageArgs ++ Map("pipeline" -> pipelineName) ) } diff --git a/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/MultisampleMappingTrait.scala b/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/MultisampleMappingTrait.scala index a69160a6a7928add1503c8a332fb84978a3d03fd..f0d3c7313cef6f247ede75d1d2eab235f1d3bb7d 100644 --- a/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/MultisampleMappingTrait.scala +++ b/public/mapping/src/main/scala/nl/lumc/sasc/biopet/pipelines/mapping/MultisampleMappingTrait.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.mapping import java.io.File @@ -110,8 +125,8 @@ trait MultisampleMappingTrait extends MultiSampleQScript if (inputR1.isDefined) { mapping.foreach { m => - m.input_R1 = inputR1.get - m.input_R2 = inputR2 + m.inputR1 = inputR1.get + m.inputR2 = inputR2 add(m) } } else if (inputBam.isDefined) { @@ -122,8 +137,8 @@ trait MultisampleMappingTrait extends MultiSampleQScript samToFastq.isIntermediate = libraries.size > 1 qscript.add(samToFastq) mapping.foreach(m => { - m.input_R1 = samToFastq.fastqR1 - m.input_R2 = Some(samToFastq.fastqR2) + m.inputR1 = samToFastq.fastqR1 + m.inputR2 = Some(samToFastq.fastqR2) add(m) }) } else { diff --git a/public/mapping/src/test/scala/nl/lumc/sasc/biopet/pipelines/mapping/MappingTest.scala b/public/mapping/src/test/scala/nl/lumc/sasc/biopet/pipelines/mapping/MappingTest.scala index efd54e8d3c34c4eafdbf7a3a2ef6df11ca18b85a..7b8f289df558dfa2667fd44cbdcfc0b3874926b6 100644 --- a/public/mapping/src/test/scala/nl/lumc/sasc/biopet/pipelines/mapping/MappingTest.scala +++ b/public/mapping/src/test/scala/nl/lumc/sasc/biopet/pipelines/mapping/MappingTest.scala @@ -73,11 +73,11 @@ abstract class AbstractTestMapping(val aligner: String) extends TestNGSuite with val mapping: Mapping = initPipeline(map) if (zipped) { - mapping.input_R1 = r1Zipped - if (paired) mapping.input_R2 = Some(r2Zipped) + mapping.inputR1 = r1Zipped + if (paired) mapping.inputR2 = Some(r2Zipped) } else { - mapping.input_R1 = r1 - if (paired) mapping.input_R2 = Some(r2) + mapping.inputR1 = r1 + if (paired) mapping.inputR2 = Some(r2) } mapping.sampleId = Some("1") mapping.libId = Some("1") diff --git a/public/pom.xml b/public/pom.xml index 28cd0dc788e7b704d8930f94cd9b6a02759d57c6..4171089d31ef05ec2c544c29154b809638004dd5 100644 --- a/public/pom.xml +++ b/public/pom.xml @@ -22,25 +22,24 @@ <groupId>nl.lumc.sasc</groupId> <name>Biopet</name> <packaging>pom</packaging> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> 
<modules> - <!--<module>biopet-framework</module>--> <module>biopet-public-package</module> + <module>bam2wig</module> <module>bammetrics</module> + <module>basty</module> + <module>carp</module> <module>flexiprep</module> + <module>gears</module> + <module>generate-indexes</module> <module>gentrap</module> + <module>kopisu</module> <module>mapping</module> <module>sage</module> - <module>kopisu</module> - <!-- <module>yamsvp</module> --> - <module>gears</module> - <module>bam2wig</module> - <module>carp</module> - <module>toucan</module> <module>shiva</module> - <module>basty</module> - <module>generate-indexes</module> + <module>tinycap</module> + <module>toucan</module> <module>biopet-core</module> <module>biopet-utils</module> <module>biopet-tools</module> diff --git a/public/sage/pom.xml b/public/sage/pom.xml index e3eab0bf3eac563b68aee8d71ff58acc7dba4875..b2070910037c653e38d70ef76be6cae3a36101f4 100644 --- a/public/sage/pom.xml +++ b/public/sage/pom.xml @@ -25,7 +25,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git a/public/sage/src/main/scala/nl/lumc/sasc/biopet/pipelines/sage/Sage.scala b/public/sage/src/main/scala/nl/lumc/sasc/biopet/pipelines/sage/Sage.scala index 8d28a6ea13e1e09ce16afbec4faccc29631945c8..3ef45bc4ea2f141fb6c5cbb17b043121d2d5c60e 100644 --- a/public/sage/src/main/scala/nl/lumc/sasc/biopet/pipelines/sage/Sage.scala +++ b/public/sage/src/main/scala/nl/lumc/sasc/biopet/pipelines/sage/Sage.scala @@ -92,7 +92,7 @@ class Sage(val root: Configurable) extends QScript with MultiSampleQScript { inputFiles :+= new InputFile(inputFastq, config("R1_md5")) flexiprep.outputDir = new File(libDir, "flexiprep/") - flexiprep.input_R1 = inputFastq + flexiprep.inputR1 = inputFastq flexiprep.init() flexiprep.biopetScript() qscript.addAll(flexiprep.functions) @@ -105,7 +105,7 @@ class Sage(val root: 
Configurable) extends QScript with MultiSampleQScript { pf.deps +:= flexiprep.outputFiles("fastq_input_R1") qscript.add(pf) - mapping.input_R1 = pf.outputFastq + mapping.inputR1 = pf.outputFastq mapping.outputDir = libDir mapping.init() mapping.biopetScript() diff --git a/public/shiva/pom.xml b/public/shiva/pom.xml index 2638bcfdc1fe39c5dce7c8e2c9a463ec8d8f2e48..dd49b39beb655a851b5893afec09a71c1427f081 100644 --- a/public/shiva/pom.xml +++ b/public/shiva/pom.xml @@ -22,7 +22,7 @@ <parent> <artifactId>Biopet</artifactId> <groupId>nl.lumc.sasc</groupId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> </parent> <modelVersion>4.0.0</modelVersion> diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaSvCalling.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaSvCalling.scala index 60931d62c77df32203f59d0a90891e64ce4c2beb..35fa0d4ae7a0625efcfa4e35adb546335e23470d 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaSvCalling.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaSvCalling.scala @@ -17,9 +17,10 @@ package nl.lumc.sasc.biopet.pipelines.shiva import nl.lumc.sasc.biopet.core.summary.SummaryQScript import nl.lumc.sasc.biopet.core.{ PipelineCommand, Reference, SampleLibraryTag } +import nl.lumc.sasc.biopet.extensions.Pysvtools import nl.lumc.sasc.biopet.pipelines.shiva.svcallers._ -import nl.lumc.sasc.biopet.utils.{ BamUtils, Logging } import nl.lumc.sasc.biopet.utils.config.Configurable +import nl.lumc.sasc.biopet.utils.{ BamUtils, Logging } import org.broadinstitute.gatk.queue.QScript /** @@ -32,6 +33,9 @@ class ShivaSvCalling(val root: Configurable) extends QScript with SummaryQScript def this() = this(null) + var outputMergedVCFbySample: Map[String, File] = Map() + var outputMergedVCF: File = _ + @Input(doc = "Bam files (should be deduped bams)", shortName = "BAM", required = true) protected[shiva] var inputBamsArg: 
List[File] = Nil @@ -40,6 +44,7 @@ class ShivaSvCalling(val root: Configurable) extends QScript with SummaryQScript /** Executed before script */ def init(): Unit = { if (inputBamsArg.nonEmpty) inputBams = BamUtils.sampleBamMap(inputBamsArg) + outputMergedVCF = new File(outputDir, "allsamples.merged.vcf") } /** Variantcallers requested by the config */ @@ -55,7 +60,7 @@ class ShivaSvCalling(val root: Configurable) extends QScript with SummaryQScript val callers = callersList.filter(x => configCallers.contains(x.name)) require(inputBams.nonEmpty, "No input bams found") - require(callers.nonEmpty, "must select at least 1 SV caller, choices are: " + callersList.map(_.name).mkString(", ")) + require(callers.nonEmpty, "Please select at least 1 SV caller, choices are: " + callersList.map(_.name).mkString(", ")) callers.foreach { caller => caller.inputBams = inputBams @@ -63,6 +68,30 @@ class ShivaSvCalling(val root: Configurable) extends QScript with SummaryQScript add(caller) } + // merge VCF by sample + for ((sample, bamFile) <- inputBams) { + var sampleVCFS: List[Option[File]] = List.empty + callers.foreach { caller => + sampleVCFS ::= caller.outputVCF(sample) + } + val mergeSVcalls = new Pysvtools(this) + mergeSVcalls.input = sampleVCFS.flatten + mergeSVcalls.output = new File(outputDir, sample + ".merged.vcf") + add(mergeSVcalls) + outputMergedVCFbySample += (sample -> mergeSVcalls.output) + } + + // merge all files from all samples in project + val mergeSVcallsProject = new Pysvtools(this) + mergeSVcallsProject.input = outputMergedVCFbySample.values.toList + mergeSVcallsProject.output = outputMergedVCF + add(mergeSVcallsProject) + + // merging the VCF calls by project + // basically this will do all samples from this pipeline run + // group by "tags" + // sample tagging is, however, not available within this pipeline + addSummaryJobs() } @@ -79,7 +108,10 @@ class ShivaSvCalling(val root: Configurable) extends QScript with SummaryQScript def summaryFiles: Map[String, 
File] = { val callers: Set[String] = configCallers //callersList.filter(x => callers.contains(x.name)).map(x => x.name -> x.outputFile).toMap + ("final" -> finalFile) - Map() + Map( + "final_mergedvcf" -> outputMergedVCF + + ) } } diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Breakdancer.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Breakdancer.scala index d7ff2ac12e78f7e149a3beea14eaea766d385079..b9d1457401a5da9dae08e5abc911f97b3765900e 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Breakdancer.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Breakdancer.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.shiva.svcallers import nl.lumc.sasc.biopet.extensions.breakdancer.{ BreakdancerVCF, BreakdancerCaller, BreakdancerConfig } @@ -18,6 +33,8 @@ class Breakdancer(val root: Configurable) extends SvCaller { val breakdancer = BreakdancerCaller(this, bdcfg.output, new File(breakdancerSampleDir, sample + ".breakdancer.tsv")) val bdvcf = BreakdancerVCF(this, breakdancer.output, new File(breakdancerSampleDir, sample + ".breakdancer.vcf")) add(bdcfg, breakdancer, bdvcf) + + addVCF(sample, bdvcf.output) } } } diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Clever.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Clever.scala index ff98b32e5f9fe23f6f9e8b52334742b29973da5c..c68a0e1f6819a1c995fac2266431a7b4f309d2c3 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Clever.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Clever.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.shiva.svcallers import nl.lumc.sasc.biopet.extensions.clever.CleverCaller @@ -13,6 +28,8 @@ class Clever(val root: Configurable) extends SvCaller { val cleverDir = new File(outputDir, sample) val clever = CleverCaller(this, bamFile, cleverDir) add(clever) + + addVCF(sample, clever.outputvcf) } } } diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Delly.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Delly.scala index ef82d4ffd2ce377a603b19391e6b142ceb599b6a..98fe0e0a06342cee60db461acc33f1a64b5c23b2 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Delly.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Delly.scala @@ -1,10 +1,25 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.shiva.svcallers import nl.lumc.sasc.biopet.extensions.delly.DellyCaller import nl.lumc.sasc.biopet.extensions.gatk.CatVariants import nl.lumc.sasc.biopet.utils.config.Configurable -/** Script for sv caler delly */ +/** Script for sv caller delly */ class Delly(val root: Configurable) extends SvCaller { def name = "delly" @@ -20,7 +35,6 @@ class Delly(val root: Configurable) extends SvCaller { val catVariants = new CatVariants(this) catVariants.outputFile = new File(dellyDir, sample + ".delly.vcf.gz") - /// start delly and then copy the vcf into the root directory "<sample>.delly/" if (del) { val delly = new DellyCaller(this) delly.input = bamFile @@ -57,6 +71,7 @@ class Delly(val root: Configurable) extends SvCaller { require(catVariants.inputFiles.nonEmpty, "Must atleast 1 SV-type be selected for Delly") add(catVariants) + addVCF(sample, catVariants.outputFile) } } } diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Pindel.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Pindel.scala index 68cebb07c5eb80a6b2c4659a7c7daae75ef8e2dd..25281ec11f2f26b608378008c2976a33afff4849 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Pindel.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/Pindel.scala @@ -36,11 +36,11 @@ class Pindel(val root: Configurable) extends SvCaller { for ((sample, bamFile) <- inputBams) { val pindelDir = new File(outputDir, sample) - val config_file: File = new File(pindelDir, sample + ".pindel.cfg") + val configFile: File = new File(pindelDir, sample + ".pindel.cfg") val cfg = new PindelConfig(this) cfg.input = bamFile cfg.sampleName = sample - cfg.output = config_file + cfg.output = configFile add(cfg) val pindel = PindelCaller(this, cfg.output, pindelDir) @@ -57,6 +57,8 @@ class Pindel(val root: Configurable) extends SvCaller { pindelVcf.rDate = 
todayformat.format(today) // officially, we should enter the date of the genome here pindelVcf.outputVCF = new File(pindelDir, s"${sample}.pindel.vcf") add(pindelVcf) + + addVCF(sample, pindelVcf.outputVCF) } } diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/SvCaller.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/SvCaller.scala index 1c63aa7b86954ea2d1fba84473b0cc67eb9a76d0..14fea623b201618076f56e9a72c0bc813efe199e 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/SvCaller.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/svcallers/SvCaller.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.shiva.svcallers import nl.lumc.sasc.biopet.core.{ Reference, BiopetQScript } @@ -13,7 +28,15 @@ trait SvCaller extends QScript with BiopetQScript with Reference { var namePrefix: String = _ - var inputBams: Map[String, File] = _ + var inputBams: Map[String, File] = Map.empty + + def outputVCF(sample: String): Option[File] = outputVCFs.get(sample) + + protected var outputVCFs: Map[String, File] = Map.empty + + protected def addVCF(sampleId: String, outputVCF: File) = { + outputVCFs += (sampleId -> outputVCF) + } def init() = {} } diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Bcftools.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Bcftools.scala index 515b233977e3903df86238ba316c8a5887a6848f..5b4cdfd0c0ea8612e14ac42ccee48c856e90ecf0 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Bcftools.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Bcftools.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.shiva.variantcallers import nl.lumc.sasc.biopet.extensions.Tabix diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/BcftoolsSingleSample.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/BcftoolsSingleSample.scala index f0363101ceb099b25e40a9abe6f5862df135fa48..b123dd0bc87a3b799d0d8e8cb411b3c71122d2df 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/BcftoolsSingleSample.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/BcftoolsSingleSample.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.shiva.variantcallers import nl.lumc.sasc.biopet.extensions.{ Ln, Tabix } diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Freebayes.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Freebayes.scala index 3227c2dc2fc845fdeb18f3915ae4c692369fdcca..58131a52aa7ebabf99f58564d9eee00742bdd596 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Freebayes.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Freebayes.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.shiva.variantcallers import java.io.File @@ -14,15 +29,8 @@ class Freebayes(val root: Configurable) extends Variantcaller { val fb = new nl.lumc.sasc.biopet.extensions.Freebayes(this) fb.bamfiles = inputBams.values.toList fb.outputVcf = new File(outputDir, namePrefix + ".freebayes.vcf") - fb.isIntermediate = true - add(fb) + add(fb | new Bgzip(this) > outputFile) - //TODO: need piping for this, see also issue #114 - val bz = new Bgzip(this) - bz.input = List(fb.outputVcf) - bz.output = outputFile - add(bz) - - add(Tabix.apply(this, bz.output)) + add(Tabix.apply(this, outputFile)) } } diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/RawVcf.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/RawVcf.scala index 4d4208d68b2ac3140c90222475b72efd5083185c..d978be672b518ff3246162cab3f5b8d0371be33e 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/RawVcf.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/RawVcf.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.shiva.variantcallers import java.io.File diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Variantcaller.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Variantcaller.scala index eb0a9ffccd11e6bb9d3a4d92af18ba381ce009eb..a2e13a46602dc9ea6bee2b3cfd19e2c251d8f905 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Variantcaller.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/Variantcaller.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.shiva.variantcallers import nl.lumc.sasc.biopet.core.{ BiopetQScript, Reference } diff --git a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/VarscanCnsSingleSample.scala b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/VarscanCnsSingleSample.scala index 157013e5bbd0f22c8d74bf04355872f4bdb6cfaa..9a0fb2839413948de68d3d16101fc4ce912df5b3 100644 --- a/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/VarscanCnsSingleSample.scala +++ b/public/shiva/src/main/scala/nl/lumc/sasc/biopet/pipelines/shiva/variantcallers/VarscanCnsSingleSample.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ package nl.lumc.sasc.biopet.pipelines.shiva.variantcallers import java.io.PrintWriter diff --git a/public/shiva/src/test/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaSvCallingTest.scala b/public/shiva/src/test/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaSvCallingTest.scala index d5d73c284dd542f2532b2622c635f3104e438e20..b0eaaa4303b9725ce88c2856d3d050060196140f 100644 --- a/public/shiva/src/test/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaSvCallingTest.scala +++ b/public/shiva/src/test/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaSvCallingTest.scala @@ -217,6 +217,10 @@ object ShivaSvCallingTest { "pindelvcf" -> Map("exe" -> "test"), "clever" -> Map("exe" -> "test"), "delly" -> Map("exe" -> "test"), - "varscan_jar" -> "test" + "varscan_jar" -> "test", + "pysvtools" -> Map( + "exe" -> "test", + "exclusion_regions" -> "test", + "translocations_only" -> false) ) } \ No newline at end of file diff --git a/public/shiva/src/test/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaVariantcallingTest.scala b/public/shiva/src/test/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaVariantcallingTest.scala index 3144e3f4a1fac3392139bef0506b7d1486ec637a..d86e46c7e9ef21dfcdd5a67f779e46adc2b23483 100644 --- a/public/shiva/src/test/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaVariantcallingTest.scala +++ b/public/shiva/src/test/scala/nl/lumc/sasc/biopet/pipelines/shiva/ShivaVariantcallingTest.scala @@ -18,16 +18,17 @@ package nl.lumc.sasc.biopet.pipelines.shiva import java.io.{ File, FileOutputStream } import com.google.common.io.Files -import nl.lumc.sasc.biopet.utils.config.Config +import nl.lumc.sasc.biopet.core.BiopetPipe import nl.lumc.sasc.biopet.extensions.Freebayes +import nl.lumc.sasc.biopet.extensions.bcftools.{ BcftoolsCall, BcftoolsMerge } import nl.lumc.sasc.biopet.extensions.gatk.CombineVariants -import nl.lumc.sasc.biopet.extensions.tools.VcfFilter +import nl.lumc.sasc.biopet.extensions.tools.{ MpileupToVcf, VcfFilter } import 
nl.lumc.sasc.biopet.utils.ConfigUtils -import org.apache.commons.io.FileUtils +import nl.lumc.sasc.biopet.utils.config.Config import org.broadinstitute.gatk.queue.QSettings import org.scalatest.Matchers import org.scalatest.testng.TestNGSuite -import org.testng.annotations.{ AfterClass, DataProvider, Test } +import org.testng.annotations.{ DataProvider, Test } import scala.collection.mutable.ListBuffer @@ -88,11 +89,13 @@ class ShivaVariantcallingTest extends TestNGSuite with Matchers { pipeline.init() pipeline.script() + val pipesJobs = pipeline.functions.filter(_.isInstanceOf[BiopetPipe]).flatMap(_.asInstanceOf[BiopetPipe].pipesJobs) + pipeline.functions.count(_.isInstanceOf[CombineVariants]) shouldBe (1 + (if (raw) 1 else 0) + (if (varscanCnsSinglesample) 1 else 0)) - //pipeline.functions.count(_.isInstanceOf[Bcftools]) shouldBe (if (bcftools) 1 else 0) - //FIXME: Can not check for bcftools because of piping - pipeline.functions.count(_.isInstanceOf[Freebayes]) shouldBe (if (freebayes) 1 else 0) - //pipeline.functions.count(_.isInstanceOf[MpileupToVcf]) shouldBe (if (raw) bams else 0) + pipesJobs.count(_.isInstanceOf[BcftoolsCall]) shouldBe (if (bcftools) 1 else 0) + (if (bcftoolsSinglesample) bams else 0) + pipeline.functions.count(_.isInstanceOf[BcftoolsMerge]) shouldBe (if (bcftoolsSinglesample && bams > 1) 1 else 0) + pipesJobs.count(_.isInstanceOf[Freebayes]) shouldBe (if (freebayes) 1 else 0) + pipesJobs.count(_.isInstanceOf[MpileupToVcf]) shouldBe (if (raw) bams else 0) pipeline.functions.count(_.isInstanceOf[VcfFilter]) shouldBe (if (raw) bams else 0) } } diff --git a/public/tinycap/pom.xml b/public/tinycap/pom.xml new file mode 100644 index 0000000000000000000000000000000000000000..18e93a26e5b04df0b3b3971cf4708939a14af17b --- /dev/null +++ b/public/tinycap/pom.xml @@ -0,0 +1,57 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + + Biopet is built on top of GATK Queue for building bioinformatic + pipelines. 
It is mainly intended to support LUMC SHARK cluster which is running + SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + should also be able to execute Biopet tools and pipelines. + + Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + + Contact us at: sasc@lumc.nl + + A dual licensing mode is applied. The source code within this project that are + not part of GATK Queue is freely available for non-commercial use under an AGPL + license; For commercial users or users who do not want to follow the AGPL + license, please contact us to obtain a separate license. + +--> +<project xmlns="http://maven.apache.org/POM/4.0.0" + xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" + xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> + <parent> + <artifactId>Biopet</artifactId> + <groupId>nl.lumc.sasc</groupId> + <version>0.7.0-SNAPSHOT</version> + </parent> + <modelVersion>4.0.0</modelVersion> + + <inceptionYear>2016</inceptionYear> + <artifactId>TinyCap</artifactId> + + <dependencies> + <dependency> + <groupId>nl.lumc.sasc</groupId> + <artifactId>Mapping</artifactId> + <version>${project.version}</version> + </dependency> + <dependency> + <groupId>nl.lumc.sasc</groupId> + <artifactId>Gentrap</artifactId> + <version>${project.version}</version> + </dependency> + <dependency> + <groupId>org.testng</groupId> + <artifactId>testng</artifactId> + <version>6.8</version> + <scope>test</scope> + </dependency> + <dependency> + <groupId>org.scalatest</groupId> + <artifactId>scalatest_2.10</artifactId> + <version>2.2.1</version> + <scope>test</scope> + </dependency> + </dependencies> + +</project> \ No newline at end of file diff --git a/public/tinycap/src/main/resources/nl/lumc/sasc/biopet/pipelines/tinycap/tinycapFront.ssp b/public/tinycap/src/main/resources/nl/lumc/sasc/biopet/pipelines/tinycap/tinycapFront.ssp new file mode 100644 index 
0000000000000000000000000000000000000000..c9827c399e92cb1a80e31c4e3d416b0cd77c9dcb --- /dev/null +++ b/public/tinycap/src/main/resources/nl/lumc/sasc/biopet/pipelines/tinycap/tinycapFront.ssp @@ -0,0 +1,37 @@ +#import(nl.lumc.sasc.biopet.utils.summary.Summary) +#import(nl.lumc.sasc.biopet.core.report.ReportPage) +<%@ var summary: Summary %> +<%@ var rootPath: String %> +<%@ var pipeline: String %> + +<table class="table"> +<tbody> + <tr><th>Pipeline</th><td>${pipeline}</td></tr> + <tr><th>Version</th><td>${summary.getValue("meta", "pipeline_version")}</td></tr> + <tr><th>Last commit hash</th><td>${summary.getValue("meta", "last_commit_hash")}</td></tr> + <tr><th>Output directory</th><td>${summary.getValue("meta", "output_dir")}</td></tr> + <tr><th>Reference</th><td>${summary.getValue(pipeline, "settings", "reference", "species")} - ${summary.getValue(pipeline, "settings", "reference", "name")}</td></tr> + <tr><th>Number of samples</th><td>${summary.samples.size}</td></tr> +</tbody> +</table> +<br/> +<div class="row"> + <div class="col-md-1"></div> + <div class="col-md-10"> + <p> + In this web document you can find your <em><strong>${pipeline}</strong></em> pipeline report. + Different categories of data can be found in the left-side menu. + Statistics per sample and library can be accessed through the top-level menu. + Furthermore, you can view all versions of software tools used by selecting <em><a href="./Versions/index.html">Versions</a></em> from the top menu. + </p> + + <p> + <small>Brought to you by <a href="https://sasc.lumc.nl" target="_blank"><abbr + title="Sequence Analysis Support Core">SASC</abbr></a> and <a + href="https://www.lumc.nl/org/klinische-genetica/" target="_blank"><abbr title="Clinical Genetics LUMC">KG</abbr></a>, + LUMC. 
+ </small> + </p> + </div> + <div class="col-md-1"></div> +</div> \ No newline at end of file diff --git a/public/tinycap/src/main/scala/nl/lumc/sasc/biopet/pipelines/tinycap/TinyCap.scala b/public/tinycap/src/main/scala/nl/lumc/sasc/biopet/pipelines/tinycap/TinyCap.scala new file mode 100644 index 0000000000000000000000000000000000000000..70300b345fc8bd5f94ae753366a5fb845f89e3fe --- /dev/null +++ b/public/tinycap/src/main/scala/nl/lumc/sasc/biopet/pipelines/tinycap/TinyCap.scala @@ -0,0 +1,131 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ +package nl.lumc.sasc.biopet.pipelines.tinycap + +import java.io.File + +import nl.lumc.sasc.biopet.core.annotations.{ AnnotationGff, AnnotationGtf, AnnotationRefFlat } +import nl.lumc.sasc.biopet.core.report.ReportBuilderExtension +import nl.lumc.sasc.biopet.core.{ PipelineCommand, Reference } +import nl.lumc.sasc.biopet.pipelines.gentrap.measures.{ BaseCounts, FragmentsPerGene } +import nl.lumc.sasc.biopet.pipelines.mapping.MultisampleMappingTrait +import nl.lumc.sasc.biopet.pipelines.tinycap.measures.FragmentsPerSmallRna +import nl.lumc.sasc.biopet.utils.config.Configurable +import org.broadinstitute.gatk.queue.QScript +import picard.analysis.directed.RnaSeqMetricsCollector.StrandSpecificity + +/** + * Created by pjvan_thof on 12/29/15. 
+ * Design based on work from Henk Buermans (e-Mir) + * Implementation by wyleung started 19/01/16 + */ +class TinyCap(val root: Configurable) extends QScript + with MultisampleMappingTrait + with AnnotationRefFlat + with AnnotationGff + with AnnotationGtf + with Reference { + qscript => + def this() = this(null) + + var annotateSam: Boolean = config("annotate_sam", default = false) + + override def defaults = Map( + "igvtoolscount" -> Map( + "strands" -> "reads", + "includeDuplicates" -> true + ), + "merge_strategy" -> "preprocessmergesam", + "keep_merged_files" -> true, + "mapping" -> Map( + "aligner" -> "bowtie", + "generate_wig" -> true, + "skip_markduplicates" -> true + ), + "bammetrics" -> Map( + "wgs_metrics" -> false, + "rna_metrics" -> false, + "collectrnaseqmetrics" -> Map( + "strand_specificity" -> StrandSpecificity.SECOND_READ_TRANSCRIPTION_STRAND.toString + ) + ), + "bowtie" -> Map( + "chunkmbs" -> 256, + "seedmms" -> 3, + "seedlen" -> 25, + "k" -> 5, + "best" -> true + ), + "sickle" -> Map( + "lengthThreshold" -> 15 + ), + "fastqc" -> Map( + "sensitiveAdapterSearch" -> true + ), + "cutadapt" -> Map( + "error_rate" -> 0.2, + "minimum_length" -> 15, + "q" -> 30, + "default_clip_mode" -> "both", + "times" -> 2 + ) + ) + + lazy val fragmentsPerGene = new FragmentsPerGene(this) + lazy val fragmentsPerSmallRna = new FragmentsPerSmallRna(this) + lazy val baseCounts = new BaseCounts(this) + + def executedMeasures = (fragmentsPerGene :: fragmentsPerSmallRna :: baseCounts :: Nil) + + override def init = { + super.init() + executedMeasures.foreach(x => x.outputDir = new File(outputDir, "expression_measures" + File.separator + x.name)) + } + + override def makeSample(id: String) = new Sample(id) + + class Sample(sampleId: String) extends super.Sample(sampleId) { + override def addJobs(): Unit = { + super.addJobs() + + preProcessBam.foreach { file => + executedMeasures.foreach(_.addBamfile(sampleId, file)) + } + } + } + + override def summaryFile = new 
File(outputDir, "tinycap.summary.json") + + override def summaryFiles: Map[String, File] = super.summaryFiles ++ Map( + "annotation_refflat" -> annotationRefFlat(), + "annotationGtf" -> annotationGtf, + "annotationGff" -> annotationGff + ) + + override def reportClass: Option[ReportBuilderExtension] = { + val report = new TinyCapReport(this) + report.outputDir = new File(outputDir, "report") + report.summaryFile = summaryFile + Some(report) + } + + override def addMultiSampleJobs = { + super.addMultiSampleJobs + executedMeasures.foreach(add) + } +} + +object TinyCap extends PipelineCommand \ No newline at end of file diff --git a/public/tinycap/src/main/scala/nl/lumc/sasc/biopet/pipelines/tinycap/TinyCapReport.scala b/public/tinycap/src/main/scala/nl/lumc/sasc/biopet/pipelines/tinycap/TinyCapReport.scala new file mode 100644 index 0000000000000000000000000000000000000000..b69509e2aeaf06aec38bffae7ae75099422cb0e3 --- /dev/null +++ b/public/tinycap/src/main/scala/nl/lumc/sasc/biopet/pipelines/tinycap/TinyCapReport.scala @@ -0,0 +1,50 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. 
+ */ +package nl.lumc.sasc.biopet.pipelines.tinycap + +import nl.lumc.sasc.biopet.core.report.{ ReportBuilderExtension, ReportSection } +import nl.lumc.sasc.biopet.pipelines.mapping.MultisampleMappingReportTrait +import nl.lumc.sasc.biopet.utils.config.Configurable + +/** + * Created by wyleung on 4-2-16. + */ +class TinyCapReport(val root: Configurable) extends ReportBuilderExtension { + def builder = TinyCapReport + +} + +object TinyCapReport extends MultisampleMappingReportTrait { + /** Name of the report */ + def reportName = "TinyCap Report" + + /** Front section for the report */ + override def frontSection: ReportSection = ReportSection("/nl/lumc/sasc/biopet/pipelines/tinycap/tinycapFront.ssp") + + override def additionalSections = List( + "Fragments per gene" -> ReportSection("/nl/lumc/sasc/biopet/pipelines/gentrap/measure_plotreport.ssp", + Map("pipelineName" -> pipelineName, + "plotName" -> "fragmentspergene", + "plotPath" -> summary.getValue("fragmentspergene", "files", "pipeline", "fragments_per_gene_heatmap", "path") + )), + "Fragments per microRNA" -> ReportSection("/nl/lumc/sasc/biopet/pipelines/gentrap/measure_plotreport.ssp", + Map("pipelineName" -> pipelineName, + "plotName" -> "fragmentspersmallrna", + "plotPath" -> summary.getValue("fragmentspersmallrna", "files", "pipeline", "fragments_per_smallrna_heatmap", "path"))) + ) + + override def pipelineName = "tinycap" +} diff --git a/public/tinycap/src/main/scala/nl/lumc/sasc/biopet/pipelines/tinycap/measures/FragmentsPerSmallRna.scala b/public/tinycap/src/main/scala/nl/lumc/sasc/biopet/pipelines/tinycap/measures/FragmentsPerSmallRna.scala new file mode 100644 index 0000000000000000000000000000000000000000..0cfa32db62ca5011e20c262fd7d0f583e9677995 --- /dev/null +++ b/public/tinycap/src/main/scala/nl/lumc/sasc/biopet/pipelines/tinycap/measures/FragmentsPerSmallRna.scala @@ -0,0 +1,56 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. 
It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ +package nl.lumc.sasc.biopet.pipelines.tinycap.measures + +import nl.lumc.sasc.biopet.core.annotations.AnnotationGff +import nl.lumc.sasc.biopet.extensions.HtseqCount +import nl.lumc.sasc.biopet.pipelines.gentrap.measures.Measurement +import nl.lumc.sasc.biopet.utils.config.Configurable +import org.broadinstitute.gatk.queue.QScript + +/** + * Created by wyleung on 11-2-16. 
+ */ +class FragmentsPerSmallRna(val root: Configurable) extends QScript with Measurement with AnnotationGff { + def mergeArgs = MergeArgs(List(1), 2, numHeaderLines = 1, fallback = "0") + + /** Pipeline itself */ + def biopetScript(): Unit = { + val jobs = bamFiles.map { + case (id, file) => + // Do expression counting for miRNA and siRNA + val job = new HtseqCount(this) + job.inputAlignment = file + job.inputAnnotation = annotationGff + job.format = Option("bam") + job.stranded = Option("yes") + job.featuretype = Option("miRNA") + job.idattr = Option("Name") + job.output = new File(outputDir, s"$id.$name.counts") + add(job) + + id -> job + } + + addMergeTableJob(jobs.values.map(_.output).toList, mergedTable, "fragments_per_smallrna", s".$name.counts") + addHeatmapJob(mergedTable, heatmap, "fragments_per_smallrna") + + addSummaryJobs() + } + + def mergedTable = new File(outputDir, s"$name.fragments_per_smallrna.tsv") + def heatmap = new File(outputDir, s"$name.fragments_per_smallrna.png") +} diff --git a/public/tinycap/src/test/scala/nl/lumc/sasc/biopet/pipelines/tinycap/TinyCapTest.scala b/public/tinycap/src/test/scala/nl/lumc/sasc/biopet/pipelines/tinycap/TinyCapTest.scala new file mode 100644 index 0000000000000000000000000000000000000000..b8ec5c3ab159b584c5935bc87e29892799216a71 --- /dev/null +++ b/public/tinycap/src/test/scala/nl/lumc/sasc/biopet/pipelines/tinycap/TinyCapTest.scala @@ -0,0 +1,135 @@ +/** + * Created by wyleung on 11-2-16. + */ + +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. 
The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ +package nl.lumc.sasc.biopet.pipelines.tinycap + +import java.io.File + +import com.google.common.io.Files +import nl.lumc.sasc.biopet.extensions.HtseqCount +import nl.lumc.sasc.biopet.utils.ConfigUtils +import nl.lumc.sasc.biopet.utils.config.Config +import org.apache.commons.io.FileUtils +import org.broadinstitute.gatk.queue.QSettings +import org.scalatest.Matchers +import org.scalatest.testng.TestNGSuite +import org.testng.annotations.{ AfterClass, DataProvider, Test } + +class TinyCapTest extends TestNGSuite with Matchers { + + def initPipeline(map: Map[String, Any]): TinyCap = { + new TinyCap() { + override def configName = "tinycap" + + override def globalConfig = new Config(map) + + qSettings = new QSettings + qSettings.runName = "test" + } + } + + @DataProvider(name = "tinyCapOptions") + def tinyCapOptions = { + val bool = Array(true) + + for ( + s1 <- bool + ) yield Array("", s1) + } + + @Test(dataProvider = "tinyCapOptions") + def testTinyCap(dummy: String, sample1: Boolean): Unit = { + val map = { + var m: Map[String, Any] = TinyCapTest.config + if (sample1) m = ConfigUtils.mergeMaps(TinyCapTest.sample1, m) + m + } + + if (!sample1) { // When no samples + intercept[IllegalArgumentException] { + initPipeline(map).script() + } + } + + val pipeline = initPipeline(map) + pipeline.script() + // expect 2 instances of HtseqCount, one for mirna.gff, the other for transcripts.gtf + pipeline.functions.count(_.isInstanceOf[HtseqCount]) shouldBe 2 + + } + + // remove temporary run directory after all tests in the class have been run + @AfterClass def removeTempOutputDir() = { + FileUtils.deleteDirectory(TinyCapTest.outputDir) + } + +} + +object TinyCapTest { + val outputDir = Files.createTempDir() + new 
File(outputDir, "input").mkdirs() + + val r1 = new File(outputDir, "input" + File.separator + "R1.fq.gz") + Files.touch(r1) + val bam = new File(outputDir, "input" + File.separator + "bamfile.bam") + Files.touch(bam) + + val referenceFasta = new File(outputDir, "ref.fa") + Files.touch(referenceFasta) + val referenceFastaDict = new File(outputDir, "ref.dict") + Files.touch(referenceFastaDict) + val bowtieIndex = new File(outputDir, "ref.1.ebwt") + Files.touch(bowtieIndex) + + val annotationGFF = new File(outputDir, "annot.gff") + val annotationGTF = new File(outputDir, "annot.gtf") + val annotationRefflat = new File(outputDir, "annot.refflat") + Files.touch(annotationGFF) + Files.touch(annotationGTF) + Files.touch(annotationRefflat) + + val config = Map( + "output_dir" -> outputDir, + "reference_fasta" -> (referenceFasta.getAbsolutePath), + "bowtie_index" -> (bowtieIndex.getAbsolutePath), + + "annotation_gff" -> annotationGFF, + "annotation_gtf" -> annotationGTF, + "annotation_refflat" -> annotationRefflat, + + "md5sum" -> Map("exe" -> "test"), + "fastqc" -> Map("exe" -> "test"), + "seqtk" -> Map("exe" -> "test"), + "sickle" -> Map("exe" -> "test"), + "cutadapt" -> Map("exe" -> "test"), + "bowtie" -> Map("exe" -> "test"), + "htseqcount" -> Map("exe" -> "test"), + "igvtools" -> Map("exe" -> "test"), + "wigtobigwig" -> Map("exe" -> "test") + ) + + val sample1 = Map( + "samples" -> Map("sample1" -> Map("libraries" -> Map( + "lib1" -> Map( + "R1" -> r1.getAbsolutePath + ) + ) + ))) + +} \ No newline at end of file diff --git a/public/toucan/pom.xml b/public/toucan/pom.xml index 64edb91cb6e6da1bcdbc8724591239c5756d1af2..781e458c31cc8128843b55873781e3aaa9f8b1e0 100644 --- a/public/toucan/pom.xml +++ b/public/toucan/pom.xml @@ -25,7 +25,7 @@ <parent> <groupId>nl.lumc.sasc</groupId> <artifactId>Biopet</artifactId> - <version>0.6.0-SNAPSHOT</version> + <version>0.7.0-SNAPSHOT</version> <relativePath>../</relativePath> </parent> diff --git 
a/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/ManweActivateAfterAnnotImport.scala b/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/ManweActivateAfterAnnotImport.scala index 9ed3b00d01d42e21eec88ac3e3ccb5a03af4203d..9326bb9248a89fe1c07ae995c59492754136e6cb 100644 --- a/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/ManweActivateAfterAnnotImport.scala +++ b/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/ManweActivateAfterAnnotImport.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.pipelines.toucan import java.io.File diff --git a/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/ManweDownloadAfterAnnotate.scala b/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/ManweDownloadAfterAnnotate.scala index e49ff7e5ed4d7ba91fd0bd1ef8b877bf58c3c296..04733ff9fa7ab9ce81a16e631196b6f1cdbbd4bc 100644 --- a/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/ManweDownloadAfterAnnotate.scala +++ b/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/ManweDownloadAfterAnnotate.scala @@ -1,3 +1,18 @@ +/** + * Biopet is built on top of GATK Queue for building bioinformatic + * pipelines. 
It is mainly intended to support LUMC SHARK cluster which is running + * SGE. But other types of HPC that are supported by GATK Queue (such as PBS) + * should also be able to execute Biopet tools and pipelines. + * + * Copyright 2014 Sequencing Analysis Support Core - Leiden University Medical Center + * + * Contact us at: sasc@lumc.nl + * + * A dual licensing mode is applied. The source code within this project that are + * not part of GATK Queue is freely available for non-commercial use under an AGPL + * license; For commercial users or users who do not want to follow the AGPL + * license, please contact us to obtain a separate license. + */ package nl.lumc.sasc.biopet.pipelines.toucan import java.io.File diff --git a/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/Toucan.scala b/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/Toucan.scala index c04211a9db60a35c1256b735ef30c19d755e474f..dc1dc476e829493ffecc63fcc44a6c59537bc7c1 100644 --- a/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/Toucan.scala +++ b/public/toucan/src/main/scala/nl/lumc/sasc/biopet/pipelines/toucan/Toucan.scala @@ -68,6 +68,7 @@ class Toucan(val root: Configurable) extends QScript with BiopetQScript with Sum vep.output = new File(outputDir, inputVCF.getName.stripSuffix(".gz").stripSuffix(".vcf") + ".vep.vcf") vep.isIntermediate = true add(vep) + addSummarizable(vep, "variant_effect_predictor") val normalizer = new VepNormalizer(this) normalizer.inputVCF = vep.output
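The TinyCapTest above assembles its settings as a Scala `Map`; a user would supply the same information as a sample config in the JSON/YAML style this repository documents. A minimal YAML sketch of such a config is shown below — every path, sample name, and library name is a hypothetical placeholder, not a value taken from this patch:

``` yaml
# Hypothetical TinyCap sample config; all paths below are placeholders.
output_dir: /home/user/tinycap_out
reference_fasta: /path/to/ref.fa
bowtie_index: /path/to/ref          # bowtie index, matching the reference
annotation_gff: /path/to/annot.gff
annotation_gtf: /path/to/annot.gtf
annotation_refflat: /path/to/annot.refflat
samples:
  sample1:
    libraries:
      lib1:
        R1: /path/to/R1.fq.gz
```

This mirrors the keys the test's `config` and `sample1` maps set (`reference_fasta`, `bowtie_index`, the three annotation files, and a single-library sample); tool `exe` overrides are omitted since on a real system the executables are found on `PATH` or configured per tool.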