1. 07 Aug, 2020 1 commit
    • Speed up cutadapt · b585ff80
      van den Berg authored
      Use 8 cores instead of just 1. According to the documentation, speed
      should scale almost linearly with the number of cores provided.
      
      Also reduce the compression level on the output file, since most time
      for cutadapt is spent re-compressing the data after trimming the
      adapters.
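
As an illustration of the change described above, here is a minimal sketch that builds a paired-end cutadapt command line with 8 cores and a low output compression level. It assumes a cutadapt version that provides `--cores` and `--compression-level`. The file names, the adapter sequences and the compression level of 1 are hypothetical, since the commit message does not state them.

```python
import shlex


def cutadapt_command(r1, r2, out_r1, out_r2, adapter_fwd, adapter_rev,
                     cores=8, compression_level=1):
    """Build a paired-end cutadapt command line as a list of arguments.

    cores and compression_level are illustrative defaults, not values
    taken from the pipeline itself.
    """
    return [
        "cutadapt",
        "--cores", str(cores),                          # parallel trimming
        "--compression-level", str(compression_level),  # cheaper gzip output
        "-a", adapter_fwd,   # adapter on the first read
        "-A", adapter_rev,   # adapter on the second read
        "-o", out_r1,
        "-p", out_r2,
        r1, r2,
    ]


# Hypothetical file names and adapter sequences, for illustration only.
cmd = cutadapt_command(
    "sample_R1.fastq.gz", "sample_R2.fastq.gz",
    "trimmed_R1.fastq.gz", "trimmed_R2.fastq.gz",
    adapter_fwd="AGATCGGAAGAG", adapter_rev="AGATCGGAAGAG",
)
print(shlex.join(cmd))
```

A lower compression level makes the output files somewhat larger but reduces the time spent in gzip, which is the trade-off the commit message refers to.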
  2. 29 Jul, 2020 2 commits
  3. 28 Jul, 2020 5 commits
  4. 24 Jul, 2020 4 commits
  5. 22 Jul, 2020 5 commits
  6. 26 Jun, 2020 2 commits
  7. 25 Jun, 2020 2 commits
  8. 24 Jun, 2020 3 commits
  9. 23 Jun, 2020 1 commit
    • Add picard DuplicationMetrics to stats.json · 50e50edd
      van den Berg authored
      Unfortunately, this adds a limitation on the sample names that can be
      used with Hutspot, since the naming of the samples in the multiQC parsed
      output of picard MarkDuplicates is partly ambiguous.
      
      This limitation has been added to the readme, and a check has been added
      to the pipeline snakefile to throw an error when overlapping sample
      names are detected.
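
A check like the one described above could look roughly like the following sketch. It assumes that "overlapping" means one sample name is contained in another (for example `sample1` and `sample10`). The commit message does not spell out the exact rule, and the function names are hypothetical.

```python
from itertools import permutations


def overlapping_sample_names(names):
    """Return sorted (short, long) pairs where one name is contained in another."""
    return sorted(
        (a, b) for a, b in permutations(names, 2) if a != b and a in b
    )


def check_sample_names(names):
    """Raise an error when any sample name overlaps with another one."""
    overlaps = overlapping_sample_names(names)
    if overlaps:
        details = ", ".join(f"'{a}' is contained in '{b}'" for a, b in overlaps)
        raise ValueError(f"Overlapping sample names: {details}")


# 'sample1' is a substring of 'sample10', so this pair is reported.
print(overlapping_sample_names(["sample1", "sample10", "micro"]))
```

Calling `check_sample_names` once, before any rule runs, makes the pipeline fail early with a readable message instead of silently mixing up metrics between samples.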
  10. 22 Jun, 2020 1 commit
    • Simplify the structure of the coverage data · 14a82100
      van den Berg authored
      Previously, Hutspot supported multiple bed files to calculate coverage
      against. Because of this, the stats.json file had a nested structure
      in which the coverage based on each bed file was stored, including the
      name of the bed file and the gender according to that bed file.
      
      Since the current version of Hutspot only supports a single bed file,
      this structure has been simplified. All coverage statistics are now
      directly under 'coverage' for each sample, and the 'gender' has been
      moved out of the 'coverage' statistics to the sample level.
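
To make the restructuring concrete, here is a small sketch that converts one sample entry from a nested layout to the simplified one. The key names (`name`, `bedfile`, `mean_coverage`, `median_coverage`) and the exact shape of the old layout are assumptions for illustration. Only `coverage` and `gender` are named in the commit message.

```python
import json


def simplify_sample(sample):
    """Convert one sample from the assumed old layout to the new layout."""
    entries = sample["coverage"]
    if len(entries) != 1:
        raise ValueError("expected coverage for exactly one bed file")
    entry = dict(entries[0])            # copy, so the input is not modified
    entry.pop("bedfile", None)          # assumed key holding the bed file name
    gender = entry.pop("gender", None)  # gender moves up to the sample level
    new = {k: v for k, v in sample.items() if k != "coverage"}
    new["gender"] = gender
    new["coverage"] = entry             # statistics sit directly under 'coverage'
    return new


# Hypothetical old-style entry: a list with one item per bed file.
old_sample = {
    "name": "sample1",
    "coverage": [
        {"bedfile": "targets.bed", "gender": "female",
         "mean_coverage": 31.2, "median_coverage": 30},
    ],
}
new_sample = simplify_sample(old_sample)
print(json.dumps(new_sample, indent=2))
```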
  11. 17 Jun, 2020 2 commits
  12. 03 Jun, 2020 5 commits
  13. 02 Jun, 2020 7 commits
    • Clean up collect stats no file passing · ca0c28db
      van den Berg authored
      Pass 'no file' to collect_stats.py by using an empty list in the
      Snakefile and nargs='?' in the python script. This is cleaner than using
      '.' as a special file and parsing that logic in the collect stats
      script.
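
The `nargs='?'` behaviour mentioned above can be sketched as follows. The option name `--cov-stats` and the file name are hypothetical. The point is only that argparse accepts the flag both with and without a value, so an input that expands to nothing on the command line (presumably what an empty list in the Snakefile produces) is parsed as `None`.

```python
import argparse


def parse(argv):
    """Parse a hypothetical optional-file argument using nargs='?'."""
    parser = argparse.ArgumentParser()
    # With nargs='?', the flag may be followed by zero or one value.
    # A missing value yields `const`, which defaults to None.
    parser.add_argument("--cov-stats", nargs="?", default=None)
    return parser.parse_args(argv)


with_file = parse(["--cov-stats", "coverage.json"]).cov_stats
without_file = parse(["--cov-stats"]).cov_stats
print(with_file)
print(without_file)
```

The script can then test the parsed value with `is None` rather than comparing it against a placeholder such as '.'.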
    • e3633143
    • Add picard CollectHsMetrics to the pipeline · e35be6a7
      van den Berg authored
      If both a target and a bait bed file have been specified, calculate the
      hybrid-selection (HS) statistics using picard.
    • Add a singularity prefix to the snakemake profile · c9b4fd84
      van den Berg authored
      Multiple (pytest) processes trying to write the same image to
      /tmp/singularity can lead to corruption, leading to intermittent
      failures in the gitlab-ci tests. By specifying a singularity prefix in
      the snakemake profile, the same images can be re-used, so we only have
      to worry about concurrent processes writing the same image when a new
      image is added to the pipeline.
    • Add picard insertSize metrics to stats.json · 3396fedd
      van den Berg authored
      This also required reordering some snakemake rules to make sure that the
      correct input files are available. When using rule based inputs, the
      rules in the Snakefile have to be sorted, and only rule based inputs
      from rules that occur earlier in the Snakefile can be used.
    • Consolidate collectstats into a single rule · 583aed74
      van den Berg authored
      The implementation is a bit hacky, since snakemake does not allow for
      optional input files. As a workaround, "." is passed when the bedfile is
      not defined, and the collect_stats.py script has been made aware of the
      special meaning of "."
      
      Additionally, Click has been removed as a dependency for collect stats,
      and the structure of the stats.json file has been updated to only allow
      for a single entry of coverage stats instead of a list. This has been
      done to match an earlier change in Hutspot where support for multiple
      bed files has been dropped.
    • Remove newlines · e64950fe
      van den Berg authored