  1. 01 Apr, 2019 1 commit
  2. 29 Mar, 2019 1 commit
  3. 19 Mar, 2019 1 commit
    • TSSV v2.0.0 · bf7a9aff
      Hoogenboom, Jerry authored
        - Removed dependency on external tssv package (it is no longer compatible).
        - Greatly increased performance by deduplicating the input reads.
        - Removed the -q/--is-fastq option in favour of automatic detection.
        - Changed the default value for -m/--mismatches from 0.08 to 0.1.
        - Changed the default value for -n/--indel-score from 1 to 2.
        - Added the -X/--no-deduplicate option to disable deduplication.
        - Fixed potential crash that could occur under very specific circumstances.
      bf7a9aff
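The read deduplication mentioned in this commit can be pictured roughly as follows. This is only a minimal sketch, not the TSSV code: `analyse_read` is a hypothetical stand-in for the expensive per-read work, and reads are assumed to be plain strings.

```python
from collections import Counter


def analyse_read(read):
    """Hypothetical stand-in for an expensive per-read analysis."""
    return len(read)


def analyse_deduplicated(reads):
    """Analyse each distinct read once and keep its occurrence count."""
    counts = Counter(reads)  # distinct read -> number of occurrences
    results = {}
    for read, count in counts.items():
        results[read] = (analyse_read(read), count)
    return results


def analyse_all(reads):
    """Reference behaviour without deduplication (cf. -X/--no-deduplicate)."""
    return [analyse_read(read) for read in reads]
```

The gain comes from calling the expensive function once per distinct read rather than once per read; the stored count lets downstream totals stay the same.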
  4. 03 Jul, 2018 1 commit
  5. 15 Mar, 2017 1 commit
  6. 14 Mar, 2017 1 commit
  7. 08 Mar, 2017 1 commit
    • FDSTools v1.1.0.dev3: Fixes and pipelining enhancements · be8dbe46
      Hoogenboom, Jerry authored
      * General changes in v1.1.0.dev3:
        * Allele name heuristics: don't produce insertions at the end of the
          prefix or at the beginning of the suffix; just include extra STR
          blocks.
        * FDSTools will no longer crash with a 'column not found' error when
          an input file is empty. This situation is now treated as if the
          expected columns existed, but no lines of actual data were present.
          This greatly helps in tracking down issues in pipelines involving
          multiple tools, as tools will now shut down gracefully if an upstream
          tool fails to write output.
      * Allelefinder v1.0.1:
        * Fixed crash that occurred when converting sequences to allele names
          format while no library file was provided.
        * Don't crash when output pipe is closed.
      * BGAnalyse v1.0.1:
        * Don't crash when output pipe is closed.
      * BGCorrect v1.0.2:
        * Don't crash on empty input files.
        * Don't crash when output pipe is closed.
      * BGEstimate v1.1.2:
        * Don't crash when output pipe is closed.
      * BGHomRaw v1.0.1:
        * Clarified the 'Allele x of marker y has 0 reads' error message with
          the sample tag.
        * Don't crash when output pipe is closed.
      * BGHomStats v1.0.1:
        * Error messages about the input data now contain the sample tag of
          the sample that triggered the error.
        * Don't crash when output pipe is closed.
      * BGMerge v1.0.3:
        * Don't crash when output pipe is closed.
      * BGPredict v1.0.2:
        * Don't crash on empty input files.
        * Don't crash when output pipe is closed.
      * FindNewAlleles v1.0.1:
        * Don't crash on empty input files.
        * Don't crash when output pipe is closed.
      * Libconvert v1.1.2:
        * Don't crash when output pipe is closed.
      * Library v1.0.3:
        * Don't crash when output pipe is closed.
      * Seqconvert v1.0.2:
        * Don't crash when output pipe is closed.
      * Samplestats v1.1.1:
        * Don't crash on empty input files.
        * Don't crash when output pipe is closed.
      * Stuttermark v1.5.1:
        * Don't crash on empty input files.
        * Don't crash when output pipe is closed.
      * Stuttermodel v1.1.2:
        * Don't crash when output pipe is closed.
      * TSSV v1.1.0 (additionally):
        * When running analysis in parallel, make tasks of 1 million alignments.
          Previously, this was 10k reads, with the number of alignments per task
          depending on the size of the library file. This caused memory issues for
          huge libraries like whole mt interval libraries.
        * Don't crash when output pipe is closed.
      * Vis v1.0.4:
        * Don't crash when output pipe is closed.
      be8dbe46
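The empty-input behaviour described under the general changes of this commit could look something like the sketch below. It is an illustration under assumptions, not the FDSTools implementation: the function name `read_table` and the tab-separated layout are hypothetical choices made for the example.

```python
import csv
import io


def read_table(handle, expected_columns):
    """Read a tab-separated table; treat an empty file as header-only input."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None:
        # Empty input: behave as if the expected columns were present
        # but no data lines followed.
        return list(expected_columns), []
    missing = [col for col in expected_columns if col not in header]
    if missing:
        raise ValueError("column not found: " + ", ".join(missing))
    return header, [dict(zip(header, row)) for row in reader]
```

With this shape, a downstream tool that receives nothing from a failed upstream tool simply processes zero rows instead of raising a 'column not found' error.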
  8. 09 Feb, 2017 1 commit
  9. 07 Feb, 2017 1 commit
    • FDSTools v1.1.0.dev1: Visualisation fixes · 4a23b60f
      Hoogenboom, Jerry authored
      * Samplevis v2.2.0:
        * Fixed incorrect calculation of 'percentage of highest' if the 'sequence'
          with the highest read count within a marker is the aggregated 'Other
          sequences' data. In exceptional cases, this could have resulted in the
          erroneous omission of an allele in the visualisation (graphs and/or
          tables).
      * Stuttermodelvis v2.0.3:
        * Fixed bug that caused HTML visualisations with embedded data to fail
          while loading.
        * Fixed glitch where, in HTML visualisations with embedded data and a custom
          title, the custom title was truncated to the last '.' as if it were a file
          name.
      * Pipeline v1.0.3:
        * Fixed glitch that caused the 'bgprofiles.html' output file of the
          reference-database analysis to lack a proper title.
      * BGRawVis v2.0.1:
        * Fixed glitch where, in HTML visualisations with embedded data and a custom
          title, the custom title was truncated to the last '.' as if it were a file
          name.
        * Changed default save filen...
      4a23b60f
  10. 21 Dec, 2016 2 commits
    • Release v1.0.1 · bbb08573
      Hoogenboom, Jerry authored
      bbb08573
    • FDSTools v1.0.1.dev3 · 14b17206
      Hoogenboom, Jerry authored
      * General changes in v1.0.1:
        * Fixed crash that occurred when using the -i option to run the same command
          on multiple input files.
        * The usage string now always starts with 'fdstools', even if FDSTools was
          invoked using some other command (e.g. on Windows, FDSTools gets invoked
          through a file called 'fdstools-script.py').
        * Fixed bug with the -d/--debug option being ignored if placed before the
          tool name on systems running Python 2.7.9 or later.
        * FDSTools library files may now contain IUPAC ambiguous bases in the
          prefix and suffix sequences of STR markers (except the first sequence,
          as it is used as the reference).  Additionally, optional bases may be
          represented by lowercase letters.
        * If no explicit prefix/suffix is given for an alias, the prefix/suffix of
          the corresponding marker is assumed instead. This situation was not
          handled correctly when converting from raw sequences to TSSV or allelename
          format, which resulted in the alias remaining unused.
      * Stuttermodelvis v2.0.2:
        * Added filtering option for the stutter amount (-1, +1, -2, etc.).
        * Added filtering option for the coefficient of determination (r squared
          value) of the fit functions.
      * Libconvert v1.1.1:
        * Adjustments for supporting IUPAC notation in prefix and suffix sequences
          when converting from FDSTools to TSSV library format.
      * Library v1.0.2:
        * Added documentation for IUPAC support to the descriptive comment of the
          [prefix] section.
      14b17206
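One way to picture the IUPAC support for prefix and suffix sequences described in this commit is to translate such a sequence into a regular expression. The sketch below assumes the standard IUPAC nucleotide codes and takes the commit message's statement that lowercase letters mark optional bases at face value; the function name `iupac_to_regex` is hypothetical, and FDSTools itself may handle these sequences differently.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(sequence):
    """Build an anchored regex; lowercase letters denote optional bases."""
    parts = []
    for base in sequence:
        pattern = IUPAC[base.upper()]
        parts.append(pattern + "?" if base.islower() else pattern)
    return re.compile("^" + "".join(parts) + "$")
```

For example, `iupac_to_regex("ACRtN")` accepts a sequence with either A or G in the third position, and with or without the T that follows it.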
  11. 26 Oct, 2016 1 commit
    • FDSTools v1.0.1.dev2 · 98c434ed
      Hoogenboom, Jerry authored
      * Samplevis v2.1.2 (additionally):
        * The net effect of the allele calling thresholds (table filtering options)
          is now visualised in the graphs as a dashed vertical red line.
        * Fixed issue with allele calling thresholds not working anymore after having
          used the 'Save page' link in HTML visualisations.
      98c434ed
  12. 13 Oct, 2016 1 commit
    • FDSTools v1.0.1.dev1 · f2ccd67d
      Hoogenboom, Jerry authored
      * Samplevis v2.1.2:
        * Added 'Save page' link to HTML visualisations, which offers for download a
          copy of the entire HTML visualisation including the user's changes.
        * Added automatic allele calling to static visualisations.
      * Pipeline v1.0.2:
        * Added -A/--in-allelelist option to the pipeline tool to provide an existing
          allele list file when running the ref-db analysis, bypassing Allelefinder.
      * Vis v1.0.3:
        * The -n/--min-abs and -s/--min-per-strand options now accept non-integer
          values as well.
        * Added six options to control the Table Filtering Options of Samplevis.
        * The Display Options now have a separate option group on the command line.
      f2ccd67d
  13. 03 Oct, 2016 3 commits
    • FDSTools v1.0.0 Release Candidate 1 · de3593a8
      Hoogenboom, Jerry authored
      * General changes in v1.0.0rc1:
        * Fixed bug that caused variant descriptions in allele names of
          non-STR markers to be prepended with plus signs similar to suffix variants
          in STR markers.  When attempting to convert these allele names back to raw
          sequences, FDSTools would crash with an 'Invalid allele name' error.
      * Allelevis v2.0.1 (additionally):
        * In the tooltip in HTML visualisations, a line break may now only be
          inserted in allele names after an underscore character (_) or after a
          repeat block in STR allele names.  If the input file contains raw
          sequences, line breaks may now be introduced anywhere in the sequence.
      * Samplevis v2.1.1:
        * Added tooltip support to HTML visualisations.  Moving the mouse pointer
          over one of the alleles in the graph now displays a tooltip giving
          per-strand read counts of that allele.  The tooltip may include a
          'new allele' note if the input sample was analysed with FindNewAlleles.
        * The allele t...
      de3593a8
  14. 20 Sep, 2016 1 commit
  15. 06 Sep, 2016 1 commit
    • FDSTools v0.0.5: new tools, changed defaults · abba1c04
      Hoogenboom, Jerry authored
      * General changes in v0.0.5:
        * The TSSV tool now depends on version 0.4.0 of TSSV.
        * Added new Pipeline tool that runs one of three default analysis pipelines
          automatically given a configuration file with tool options and input/output
          file names. The three available pipeline options are 'reference-sample',
          analysing a single reference sample with TSSV and Stuttermark;
          'reference-database', analysing a collection of reference samples with
          BGEstimate and Stuttermodel; and 'case-sample', analysing a single case
          sample with TSSV, BGPredict, BGMerge, BGCorrect, and Samplestats.
        * Added new Library tool that creates an empty FDSTools library file. Users
          may optionally specify the intended use of the library (STR markers,
          non-STR markers, or both). Only the sections that apply to the given types
          of markers will be included in the output. The [aliases] section is not
          included by default, but an option is available to add it...
      abba1c04
  16. 26 Jul, 2016 1 commit
    • Various bug fixes and refinements throughout FDSTools · 08cf6ddd
      Hoogenboom, Jerry authored
      * Global changes in v0.0.4:
        * FDSTools will now print profiling information to stdout when the -d/--debug
          option is specified.
        * Fixed bug where specifying '-' as the output filename would be taken
          literally, while it should have been interpreted as 'write to standard out'
          (Affected tools: BGCorrect, Samplestats, Seqconvert, Stuttermark).
        * Added more detailed license information to FDSTools.
      * BGEstimate v1.1.0:
        * Added a new option -g/--min-genotypes (default: 3). Only alleles that occur
          in at least this number of unique heterozygous genotypes will be
          considered. This is to avoid 'contamination' of the noise profile of one
          allele with the noise of another. If homozygous samples are available for
          an allele, this filter is not applied to that allele. Setting this option
          to 1 effectively disables it. This option has the same cascading effect as
          the -s/--min-samples option, that is, if one allele does not meet...
      08cf6ddd
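The -g/--min-genotypes filter described in this commit can be sketched as below. This is an illustration only: the input structure (a mapping from sample name to that sample's set of true alleles for one marker) and the function name are hypothetical, and the cascading effect mentioned in the truncated part of the message is not modelled.

```python
from collections import defaultdict


def alleles_passing_min_genotypes(sample_genotypes, min_genotypes=3):
    """Return alleles seen homozygous or in enough unique heterozygous genotypes.

    sample_genotypes maps a sample name to the set of its true alleles
    for one marker (a hypothetical input structure).
    """
    het_genotypes = defaultdict(set)  # allele -> unique heterozygous genotypes
    homozygous = set()
    for alleles in sample_genotypes.values():
        genotype = frozenset(alleles)
        if len(genotype) == 1:
            # The filter is not applied to alleles with homozygous samples.
            homozygous |= genotype
        else:
            for allele in genotype:
                het_genotypes[allele].add(genotype)
    passing = set(homozygous)
    for allele, genotypes in het_genotypes.items():
        if len(genotypes) >= min_genotypes:
            passing.add(allele)
    return passing
```

Note that two samples with the same heterozygous genotype count once, since the message speaks of unique genotypes; with `min_genotypes=1` every observed allele passes, which matches the statement that this setting effectively disables the filter.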
  17. 02 Feb, 2016 1 commit
    • Big update: Bumped version to v0.0.3 · ebf700a7
      Hoogenboom, Jerry authored
      Updated Stuttermark to v1.5. WARNING: This version of Stuttermark is
      INCOMPATIBLE with output from previous versions of FDSTools and TSSV.
      
      Introducing TSSV-Lite
      * New tool tssv acts as a wrapper around TSSV-Lite (tssvl). Its primary
        purpose is to allow running TSSV-Lite without having to convert the
        FDSTools library to TSSV format, and to offer allelename output. Like
        all other tools in FDSTools, it also works with TSSV library files but
        its allele name generation capabilities are limited in that case.
      
      Changed:
      * TSSV-Lite and the new TSSV tool in FDSTools have two columns renamed
        w.r.t. the original TSSV program: 'name' has been changed to 'marker',
        and 'allele' has been changed to 'sequence'. All tools in FDSTools
        have been updated to use the new column names. This change affects
        Allelefinder, BGCorrect, BGEstimate, BGHomRaw, BGHomStats, BGPredict,
        Blame, Samplestats, Samplevis, Stuttermark, Stuttermodel, and
        Seqconvert. Note that this change will BREAK COMPATIBILITY of these
        tools with old data files.
      
      Fixed:
      * In Samplevis HTML visualisations, the "percentage recovery" table
        filtering option used the absolute number of recovered reads instead.
      * Added PctRecovery to the tables in Samplevis HTML visualisations.
      * BGPredict will now print a nice error message if the -n/--min-pct
        option is set to zero or a negative number, to avoid division by zero.
      * Samplestats would crash if the input file contained the flags column.
      * FDSTools would crash when trying to convert sequences to allele names
        using a TSSV library.
      
      Improved:
      * Libconvert will no longer include duplicate sequences in the STR
        definition when converting to TSSV format and the reference sequence
        of one of the markers is the same as one of its aliases, or when
        aliases of one marker share one or more prefix or suffix sequences.
      * Updated add_input_output_args() such that the output file is a
        positional argument (instead of -o) for tools that have a single input
        file and no support for batches.
      * Updated add_sequence_format_args() such that the library file can be
        made a required argument.
      * Refined the FDSTools package description, since FDSTools does more
        than just noise filtering.
      * FDSTools will now do a marginally better job at producing allele names
        for sequences that do not exactly match the provided STR pattern. When
        seeking the longest matching portion of the sequence, it will now also
        test the reversed sequence with a reversed pattern, which sometimes
        yields a longer match. The result is still not optimal, but some
        refactoring has been done to move away from regular expressions.
      * BGCorrect will now also fill in correction_flags for newly added
        sequences.
      * Adjusted the help text of Samplestats to include the fact that the -c
        and -y options have an OR relation instead of an AND relation.
      * BGCorrect, BGEstimate, BGHomRaw, BGHomStats, BGPredict, and
        Stuttermodel will now ignore special values that may appear in the
        place of a sequence (currently: 'Other sequences' and 'No data').
      
      Removed:
      * The -m/--marker-column and -a/--allele-column arguments of BGPredict
        had no effect and have been removed.
      
      Visualisations:
      * Updated bundled D3 to v3.5.12.
      * In HTML visualisations, if the page is scrolled to the right edge when
        an option is changed that causes the graphs to become wider, the page
        now remains scrolled to the right.
      * Samplevis HTML visualisations:
        * Added 'Clear manually added/removed' link to the table filtering.
        * Reduced flicker of the mouse cursor in Internet Explorer.
        * Added 'Common axis range' checkbox (only available when 'Split
          markers' is off).
        * Added 'Save table' link to save the table of selected alleles to a
          tab-separated file.
        * Added 'PctRecovery' column to the tables of selected alleles.
        * An alert box is now shown when a data file is loaded that contains
          markers that have 'No data'.
        * Added 'Percentage of total reads' to the graph filtering options.
        * Added a note to the table filtering options to explain that the
          minimum percentage correction and recovery have an OR relation.
      ebf700a7
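The column renaming listed under "Changed" in this commit ('name' to 'marker', 'allele' to 'sequence') suggests a small header rewrite for old tab-separated data files. The sketch below is an assumption-laden illustration, not a tool shipped with FDSTools: `rename_header` is a hypothetical helper, and it touches the header line only. Because the same commit warns that Stuttermark v1.5 is incompatible with older output, renaming the header alone may not be enough for every tool.

```python
import io

# Old TSSV column name -> new name, as listed in the commit message.
RENAMED_COLUMNS = {"name": "marker", "allele": "sequence"}


def rename_header(in_handle, out_handle):
    """Copy a tab-separated file, renaming old column names in the header."""
    header = in_handle.readline().rstrip("\r\n")
    if header:
        columns = [RENAMED_COLUMNS.get(col, col) for col in header.split("\t")]
        out_handle.write("\t".join(columns) + "\n")
    # Data lines are copied unchanged.
    for line in in_handle:
        out_handle.write(line)
```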
  18. 04 Nov, 2015 1 commit
    • Implemented support for non-STR markers, improved file handling and more · 1083919c
      Hoogenboom, Jerry authored
      Additions and improvements to the FDSTools library file format:
      * New [genome_position] section in FDSTools-style library files allows
      for specifying the chromosome and position of each marker.
      * New [no_repeat] section in FDSTools-style library files allows for
      including non-STR markers.
      * Comma/semicolon/space-separated values in FDSTools-style library files
      can now also be separated by tab characters and multiple consecutive
      separators are no longer collapsed (with the exception of whitespace).
      * If no prefix and/or suffix has been specified for an alias, the
      prefix/suffix of the marker itself is used.
      * Implemented support for non-STR markers (e.g. SNP clusters) and mtDNA
      markers. Allele names of the latter follow mtDNA nomenclature.
      * Improved the logic of generating STR allele names for sequences that
      have a prefix or suffix sequence that was not included in the library
      file.
      * Updated and clarified various explanatory texts in...
      1083919c
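The separator rule for library values described in this commit (comma, semicolon, space, or tab; consecutive separators not collapsed, except whitespace) can be read as the following splitting behaviour. This is one possible interpretation written as a sketch; `split_values` is a hypothetical name and the actual library parser may differ in its details.

```python
import re

# Either a comma or semicolon with optional surrounding whitespace,
# or a run of whitespace (spaces and tabs) on its own.
SEPARATOR = re.compile(r"\s*[,;]\s*|\s+")


def split_values(value):
    """Split a library value; only runs of whitespace are collapsed."""
    return SEPARATOR.split(value.strip())
```

Under this reading, `"a,,b"` yields an empty middle value because the two commas are kept apart, whereas `"a \t b"` yields just two values because the whitespace run counts as a single separator.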
  19. 29 Jul, 2015 1 commit
    • Introducing bgestimate · be745e64
      jhoogenboom authored
      I could write about all its features here, but instead I will point
      out some future plans to highlight the things that are possibly not
      optimal in their current implementation.
      
      There are a number of things I plan to change in the future:
      * The output format is currently JSON; perhaps a carefully designed
        tabular format is a better choice. The benefit of switching to a
        tabular format is that the data can be loaded into e.g. Excel as
        well.
      * The profiles are currently produced separately for forward and
        reverse reads. I would prefer to integrate these into a single
        computation that estimates allele balance in the heterozygotes
        using both strands as well.
      * I would like to add information about strand bias of the alleles
        as well. The most straightforward way to do this is to set only
        the forward reads of the true allele to 100 and treat the reverse
        reads the same as all background products. You will then obtain a
        number of reverse reads observed for every 100 forward reads of
        the true allele.
      * I think it would be appropriate to make sure the values in the
        allele balance matrices of each sample ('Ax' in the source code)
        add up to 1. For homozygotes, it is currently a scalar 1, whereas
        for heterozygotes the sum of the elements tends to be more than 1.
        This means that a heterozygous sample has a stronger influence on
        the profiles than a homozygous sample.
      be745e64
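The strand-bias idea in the future plans above comes down to simple scaling arithmetic. The sketch below only illustrates that arithmetic; it describes a plan from the commit message rather than implemented bgestimate behaviour, and the function name and input structure are hypothetical.

```python
def reverse_per_100_forward(forward_true, reverse_counts):
    """Scale counts so the true allele's forward reads equal 100.

    forward_true is the forward read count of the true allele;
    reverse_counts maps each sequence to its reverse read count.
    """
    if forward_true <= 0:
        raise ValueError("true allele needs at least one forward read")
    factor = 100.0 / forward_true
    return {seq: count * factor for seq, count in reverse_counts.items()}
```

For a true allele with 200 forward reads and 180 reverse reads, the scaled reverse value is 90, i.e. 90 reverse reads for every 100 forward reads.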
  20. 23 Jul, 2015 1 commit
    • Laying foundations · 160594c5
      jhoogenboom authored
      * Introducing a new, extended library file format to support
        allele name generation.  The new libconvert tool can convert
        TSSV libraries to the new format and vice versa.
      * Added functions for converting between raw sequences, TSSV-style
        sequences, and allele names.
      * Added global -d/--debug option.
      
      Stuttermark updates:
      * Stuttermark now automatically converts input sequences to
        TSSV-style if a library is provided.
      * Stuttermark will no longer crash if there is no 'name' column.
        Instead, all sequences are taken to belong to the same marker.
      
      New tools:
      * libconvert converts between FDSTools and TSSV library formats.
      * seqconvert converts between raw sequences, TSSV-style sequences,
        and allele names.
      * allelefinder detects the true alleles in reference samples.
      160594c5
  21. 02 Jul, 2015 1 commit
    • Initial commit · 668970ed
      jhoogenboom authored
      FDSTools v0.0.1 with Stuttermark v1.3.
      Other tools will come later.
      668970ed