- Aug 21, 2020
van den Berg authored
Add a test to make sure the markdup and baserecal groups receive the correct inputs when a sample has multiple read groups.
- Aug 12, 2020
van den Berg authored
The directory output of the fastqc tasks is causing issues on the shared file system of the cluster, since the age of a directory cannot be determined reliably there. As a result, the fastqc tasks are re-run every time the workflow is restarted, regardless of whether they have already completed. To prevent this, a single dummy output file '.done' has been added to the fastqc tasks; it is written when fastqc exits successfully.
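As a rough illustration of the flag-file approach, a rule could look like the sketch below. This assumes a Snakemake-style rule; the rule name, wildcards, and paths are placeholders rather than the pipeline's actual definitions.

```
rule fastqc:
    input:
        fq="{sample}/{readgroup}.fastq.gz",
    params:
        outdir="{sample}/fastqc/{readgroup}",
    # touch() creates the empty '.done' marker only after the command has
    # exited successfully, so the scheduler checks the timestamp of a plain
    # file instead of the output directory.
    output:
        done=touch("{sample}/fastqc/{readgroup}/.done"),
    shell:
        "mkdir -p {params.outdir} && fastqc --outdir {params.outdir} {input.fq}"
```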
- Aug 07, 2020
van den Berg authored
The base recalibration (BQSR) step of the pipeline can take up to 7 hours for WGS samples, which is a significant part of the total run time. The developers of GATK state that BQSR requires at least 100M bases per read group: "We usually expect to see more than 100M bases per read group; as a rule of thumb, larger numbers will work better." A human WGS sample with an average read depth of 43x has almost 1300 times that number of bases. The analysis of these samples would therefore be sped up greatly by restricting BQSR to a single chromosome.
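As a sketch of what restricting the recalibration model to one contig could look like (assuming GATK4's BaseRecalibrator and a Snakemake-style rule; the chromosome name, file paths, and known-sites resource are placeholders):

```
rule baserecal:
    input:
        bam="{sample}/markdup.bam",
        ref="reference.fasta",
        known="known_sites.vcf.gz",
    output:
        table="{sample}/bqsr.table",
    # -L limits the recalibration model to a single chromosome; the resulting
    # table can still be applied to the full BAM afterwards.
    shell:
        "gatk BaseRecalibrator -I {input.bam} -R {input.ref} "
        "--known-sites {input.known} -L chr1 -O {output.table}"
```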
van den Berg authored
Use 8 cores instead of just 1; according to the documentation, speed should scale almost linearly with the number of cores provided. Also reduce the compression level on the output files, since most of cutadapt's time is spent re-compressing the data after trimming the adapters.
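A rough sketch of the corresponding cutadapt invocation, assuming a Snakemake-style rule; the adapter sequences and paths are placeholders, and -Z is cutadapt's shortcut for gzip compression level 1 on the output files:

```
rule cutadapt:
    input:
        r1="{sample}/{readgroup}_R1.fastq.gz",
        r2="{sample}/{readgroup}_R2.fastq.gz",
    output:
        r1="{sample}/trimmed/{readgroup}_R1.fastq.gz",
        r2="{sample}/trimmed/{readgroup}_R2.fastq.gz",
    threads: 8
    # -j spreads the trimming over all provided cores; -Z writes the gzipped
    # output at compression level 1 so less time is spent re-compressing.
    shell:
        "cutadapt -j {threads} -Z -a AGATCGGAAGAG -A AGATCGGAAGAG "
        "-o {output.r1} -p {output.r2} {input.r1} {input.r2}"
```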
- Jul 24, 2020
van den Berg authored
Revert "Remove explicit tmp folder" See merge request !17
van den Berg authored
This reverts commit 1b7d807f
- Jul 22, 2020
van den Berg authored
This should no longer be needed on the Slurm cluster, where each task can explicitly request the amount of tmp space it requires. Moving the tmp directory off the shared filesystem and back onto the host running the analysis should also improve the performance of the pipeline.
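For illustration, a per-task tmp request could look roughly like the sketch below; this assumes Snakemake-style rules submitted to Slurm, and the rule, paths, and resource name are placeholders, not the pipeline's actual configuration.

```
rule markdup:
    input:
        bam="{sample}/mapped.bam",
    output:
        bam="{sample}/markdup.bam",
        metrics="{sample}/markdup_metrics.txt",
    # The requested scratch space on the execution host; a cluster submission
    # wrapper can forward this to Slurm, for example as sbatch --tmp.
    resources:
        tmp_gb=50,
    shell:
        "picard MarkDuplicates I={input.bam} O={output.bam} "
        "M={output.metrics} TMP_DIR=$TMPDIR"
```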
van den Berg authored
This way, the same version of picard is used in all tasks in the pipeline. This fixes issue #38.
- Jun 26, 2020
van den Berg authored
Add optional bed coverage output files. See merge request !16.