1. 29 Feb, 2016 1 commit
    • Expected allele lengths, and more · 29fcc171
      Hoogenboom, Jerry authored
      Added:
      * Added a new section expected_allele_length to the FDSTools library
        format. In this section, the minimum and (optionally) maximum allele
        length of each marker can be specified.
      * Added -L/--check-length option to the TSSV tool. If specified, the
        tool will use the expected_allele_length values to filter the results.
      * Samplevis can now truncate long allele names to a given number of
        characters (defaulting to 70).
      * Added an option to Samplestats to keep negatives when filtering (abs
        filter).
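The length check described above can be pictured roughly as follows. This is a minimal sketch, not FDSTools code: the function name `length_in_range`, the inclusive bounds, and the choice to measure the whole sequence string are assumptions made for illustration.

```python
def length_in_range(sequence, min_length, max_length=None):
    """Return True if len(sequence) lies within the expected range.

    Hypothetical helper: bounds are treated as inclusive (an assumption),
    and max_length=None means that no maximum was specified.
    """
    length = len(sequence)
    if length < min_length:
        return False
    return max_length is None or length <= max_length


# Example: a marker with an expected allele length of 8 to 12 bases.
sequences = ["AGAT", "AGATAGAT", "AGATAGATAGATAGAT"]
kept = [s for s in sequences if length_in_range(s, 8, 12)]
```

Leaving `max_length` optional mirrors the library format, where only the minimum is mandatory.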
      
      Changed:
      * Renamed the --aggregate-below-minimum option of the TSSV tool to
        --aggregate-filtered.
      
      Improved:
      * Added an option to read_sample_data_file such that other code can
        request or require that the X_corrected columns are used.
      * Samplestats will now round to 4 or 5 significant digits if a value is
        above 1000 or 10000, respectively.
      * BGHomRaw will no longer round the forward, reverse, and total columns.
      * When generating mtDNA allele names, FDSTools will now try to avoid
        creating gaps in the alignment of the sequences against the reference.
      * Grouped the filtering options of the TSSV tool in its help text.
      * Cleaned up some leftover code for special sequence value handling
        (more specifically: code that expected ensure_sequence_format to
        return False for special sequence values, which it no longer does).
      * Cleaned up some dead legacy code in reduce_read_counts.
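The magnitude-dependent rounding mentioned for Samplestats could look something like the sketch below. The helper names and the default of 3 significant digits for smaller values are assumptions; the commit message only states the 4- and 5-digit cases.

```python
import math


def round_significant(value, digits):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


def round_stat(value, base_digits=3):
    """Use 5 digits above 10000, 4 above 1000, else base_digits.

    base_digits=3 is an assumed default, not taken from the source.
    """
    magnitude = abs(value)
    if magnitude > 10000:
        digits = 5
    elif magnitude > 1000:
        digits = 4
    else:
        digits = base_digits
    return round_significant(value, digits)
```

With these thresholds, values between 1000 and 100000 keep their full integer part instead of being cut to the smaller default precision.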
  2. 02 Feb, 2016 1 commit
    • Big update: Bumped version to v0.0.3 · ebf700a7
      Hoogenboom, Jerry authored
      Updated Stuttermark to v1.5. WARNING: This version of Stuttermark is
      INCOMPATIBLE with output from previous versions of FDSTools and TSSV.
      
      Introducing TSSV-Lite
      * New tool tssv acts as a wrapper around TSSV-Lite (tssvl). Its primary
        purpose is to allow running TSSV-Lite without having to convert the
        FDSTools library to TSSV format, and to offer allelename output. Like
        all other tools in FDSTools, it also works with TSSV library files but
        its allele name generation capabilities are limited in that case.
      
      Changed:
      * TSSV-Lite and the new TSSV tool in FDSTools have two columns renamed
        w.r.t. the original TSSV program: 'name' has been changed to 'marker',
        and 'allele' has been changed to 'sequence'. All tools in FDSTools
        have been updated to use the new column names. This change affects
        Allelefinder, BGCorrect, BGEstimate, BGHomRaw, BGHomStats, BGPredict,
        Blame, Samplestats, Samplevis, Stuttermark, Stuttermodel, and
        Seqconvert. Note that this change will BREAK COMPATIBILITY of these
        tools with old data files.
      
      Fixed:
      * In Samplevis HTML visualisations, the "percentage recovery" table
        filtering option used the absolute number of recovered reads instead.
      * Added PctRecovery to the tables in Samplevis HTML visualisations.
      * BGPredict will now print a nice error message if the -n/--min-pct
        option is set to zero or a negative number, to avoid division by zero.
      * Samplestats would crash if the input file contained the flags column.
      * FDSTools would crash when trying to convert sequences to allele names
        using a TSSV library.
      
      Improved:
      * Libconvert will no longer include duplicate sequences in the STR
        definition when converting to TSSV format and the reference sequence
        of one of the markers is the same as one of its aliases, or when
        aliases of one marker share one or more prefix or suffix sequences.
      * Updated add_input_output_args() such that the output file is a
        positional argument (instead of -o) for tools that have a single input
        file and no support for batches.
      * Updated add_sequence_format_args() such that the library file can be
        made a required argument.
      * Refined the FDSTools package description, since FDSTools does more
        than just noise filtering.
      * FDSTools will now do a marginally better job at producing allele names
        for sequences that do not exactly match the provided STR pattern. When
        seeking the longest matching portion of the sequence, it will now also
        test the reversed sequence with a reversed pattern, which sometimes
        yields a longer match. The result is still not optimal, but some
        refactoring has been done to move away from regular expressions.
      * BGCorrect will now also fill in correction_flags for newly added
        sequences.
      * Adjusted the help text of Samplestats to include the fact that the -c
        and -y options have an OR relation instead of an AND relation.
      * BGCorrect, BGEstimate, BGHomRaw, BGHomStats, BGPredict, and
        Stuttermodel will now ignore special values that may appear in the
        place of a sequence (currently: 'Other sequences' and 'No data').
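The idea of also testing the reversed sequence against a reversed pattern can be illustrated with a small greedy matcher. This is only a sketch under stated assumptions: the pattern representation (a list of `(unit, min_repeats, max_repeats)` tuples), the greedy strategy, and all function names are hypothetical and do not reproduce the FDSTools implementation.

```python
def greedy_match(seq, pattern, start):
    """Greedily match every block of pattern from start; return end or None."""
    pos = start
    for unit, min_rep, max_rep in pattern:
        count = 0
        while count < max_rep and seq.startswith(unit, pos):
            pos += len(unit)
            count += 1
        if count < min_rep:
            return None
    return pos


def longest_greedy(seq, pattern):
    """Return (start, end) of the longest greedy match, or (0, 0) if none."""
    best = (0, 0)
    for start in range(len(seq)):
        end = greedy_match(seq, pattern, start)
        if end is not None and end - start > best[1] - best[0]:
            best = (start, end)
    return best


def longest_match(seq, pattern):
    """Try the forward direction and the reversed direction; keep the longer."""
    forward = longest_greedy(seq, pattern)
    # Reverse the block order and each unit (plain reversal, not complement).
    reversed_pattern = [(unit[::-1], lo, hi) for unit, lo, hi in reversed(pattern)]
    r_start, r_end = longest_greedy(seq[::-1], reversed_pattern)
    # Map the reversed coordinates back onto the original sequence.
    backward = (len(seq) - r_end, len(seq) - r_start)
    if backward[1] - backward[0] > forward[1] - forward[0]:
        return backward
    return forward


pattern = [("AG", 1, 10), ("AGAT", 1, 10)]
```

For the sequence `AGAGAT`, the forward greedy pass consumes both `AG` units and then cannot place `AGAT`, whereas the reversed pass matches `TAGA` followed by `GA` and covers the whole sequence.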
      
      Removed:
      * The -m/--marker-column and -a/--allele-column arguments of BGPredict
        had no effect and have been removed.
      
      Visualisations:
      * Updated bundled D3 to v3.5.12.
      * In HTML visualisations, if the page is scrolled to the right edge when
        an option is changed that causes the graphs to become wider, the page
        now remains scrolled to the right.
      * Samplevis HTML visualisations:
        * Added 'Clear manually added/removed' link to the table filtering.
        * Reduced flicker of the mouse cursor in Internet Explorer.
        * Added 'Common axis range' checkbox (only available when 'Split
          markers' is off).
        * Added 'Save table' link to save the table of selected alleles to a
          tab-separated file.
        * Added 'PctRecovery' column to the tables of selected alleles.
        * An alert box is now shown when a data file is loaded that contains
          markers that have 'No data'.
        * Added 'Percentage of total reads' to the graph filtering options.
        * Added a note to the table filtering options to explain that the
          minimum percentage correction and recovery have an OR relation.
  3. 04 Nov, 2015 1 commit
    • Implemented support for non-STR markers, improved file handling and more · 1083919c
      Hoogenboom, Jerry authored
      Additions and improvements to the FDSTools library file format:
      * New [genome_position] section in FDSTools-style library files allows
      for specifying the chromosome and position of each marker.
      * New [no_repeat] section in FDSTools-style library files allows for
      including non-STR markers.
      * Comma/semicolon/space-separated values in FDSTools-style library files
      can now also be separated by tab characters and multiple consecutive
      separators are no longer collapsed (with the exception of whitespace).
      * If no prefix and/or suffix has been specified for an alias, the
      prefix/suffix of the marker itself is used.
      * Implemented support for non-STR markers (e.g. SNP clusters) and mtDNA
      markers. Allele names of the latter follow mtDNA nomenclature.
      * Improved the logic of generating STR allele names for sequences that
      have a prefix or suffix sequence that was not included in the library
      file.
      * Updated and clarified various explanatory texts in generated FDSTools
      library files.
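The separator rule for library values (tabs accepted, consecutive commas or semicolons not collapsed, runs of whitespace collapsed) might be sketched with a regular expression like the one below. The pattern and the function name `split_values` are assumptions for illustration, not the parser used by FDSTools.

```python
import re

# A comma or semicolon (with optional surrounding whitespace) is one separator
# each; a run of spaces and/or tabs on its own counts as a single separator.
SEPARATOR = re.compile(r"\s*[,;]\s*|\s+")


def split_values(value):
    """Split a library value into fields, keeping empty fields between
    consecutive commas or semicolons."""
    return SEPARATOR.split(value.strip())
```

Under this sketch, `a,,b` yields an empty middle field, while `a  b` (two spaces) yields just two fields.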
      
      Fixed:
      * Fixed a bug that caused prefix/suffix variants in aliases to go
      missing in allele names.
      
      Improved file handling:
      * Library files are now closed immediately after parsing them.
      * Sample data input files are opened one at a time now.
      
      Visualisations:
      * Updated Vega to version 2.3.1.
      * Worked around a bug in Google Chrome that caused the 'Save image' link
      to stop working after having been used once.
  4. 03 Sep, 2015 1 commit
    • Introducing StuttermodelVis (not complete yet) · e0eef88d
      jhoogenboom authored
      * Added StuttermodelVis HTML file and JSON spec. The rendering
        works, but some of the options are not implemented yet. It is
        also not yet added to the Vis tool.
      * Changed the order of stuttermodel's coefficients: 'a' used to be
        the most significant coefficient, now it is the least significant
        coefficient (the shift). The benefit of this is that when moving
        to higher-order polynomials, the extra coefficients do not change
        the meaning of the others. So 'a' is now always the shift, 'b' is
        the linear component, 'c' the quadratic, etc.
      * Added some development notes (including todo list) that I had
        kept outside of the project until now.
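The new coefficient order can be made concrete with a short evaluation sketch. The function name and the list representation are assumptions; the point is only that with 'a' first, each coefficient's position equals its power.

```python
def stutter_polynomial(coefficients, repeat_length):
    """Evaluate a + b*x + c*x**2 + ... where coefficients = [a, b, c, ...].

    Hypothetical helper: the shift 'a' comes first, so the list index of
    each coefficient is the power of x it multiplies.
    """
    return sum(c * repeat_length ** power
               for power, c in enumerate(coefficients))
```

Extending a linear model `[a, b]` to a quadratic one `[a, b, c]` leaves the meaning of `a` and `b` untouched, which is the benefit the commit message describes.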
  5. 01 Sep, 2015 1 commit
    • Cleanup and minor enhancements · 03fc3d49
      jhoogenboom authored
      * BGCorrect and Stuttermark will now exit with an error message if
        more than one input file for the same sample is specified and no
        separate output files are given. Previously these tools would
        just overwrite the output file repeatedly, discarding the output
        of all but the last data file of the sample.
      * Removed the main() functions and related stubs from the tools
        because they are not actually runnable directly anyway.
      * Added some more help text to some of the tools.
      * Doubled the size of the marker name filter input element on the
        HTML visualisations.
  6. 12 Aug, 2015 1 commit
    • Introducing BGMerge · 6207d485
      jhoogenboom authored
      * New tool BGMerge can be used to merge background noise profiles
        (e.g., merge BGPredict output with a database previously
        obtained from BGEstimate).
      * Fixed two major bugs in BGPredict that resulted in incorrect fit
        functions being used.
      * BGEstimate, BGPredict, BGHomStats, Blame, and StutterModel no
        longer crash if a library file is specified.
      * Added reverse strand profile estimation to BGPredict.
  7. 11 Aug, 2015 1 commit
    • Introducing BGPredict · 276a0439
      jhoogenboom authored
      * New tool BGPredict predicts background noise profiles (containing
        only stutter products) for user-supplied alleles/sequences using
        a trained stutter model obtained from Stuttermodel. Currently
        only the amounts of the forward strand are predicted.
      * New option -L/--min-lengths for Stuttermodel sets the minimum
        required number of unique repeat lengths to base the fits on
        (default: 5).
      * Updated formatting of output of Stuttermodel: added '+' sign to
        positive stutter, limited r2 scores to 3 decimal places, and now
        all coefficients are written in scientific notation with 3
        decimal places.
      * The --output-column option of SeqConvert now defaults to using
        the value of --allele-column.
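The output formatting and the minimum-lengths requirement described for Stuttermodel could be approximated as below. Both function names, the argument layout, and the tab-separated output are assumptions for illustration; only the '+' sign, the 3 decimal places, the scientific notation, and the default of 5 come from the commit message.

```python
def format_fit(stutter, r2, coefficients):
    """Format one fit: signed stutter amount, r2 with 3 decimals, and each
    coefficient in scientific notation with 3 decimals (tab-separated)."""
    fields = ["%+i" % stutter, "%.3f" % r2]
    fields.extend("%.3e" % c for c in coefficients)
    return "\t".join(fields)


def enough_lengths(repeat_lengths, min_lengths=5):
    """Return True if there are enough unique repeat lengths to fit on."""
    return len(set(repeat_lengths)) >= min_lengths
```

Counting unique lengths (rather than observations) means that many samples sharing the same few repeat lengths would still not qualify for a fit.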
  8. 10 Aug, 2015 1 commit
    • Introducing StutterModel · 818ddd2b
      jhoogenboom authored
      * New tool StutterModel fits polynomials to stutter ratio vs repeat
        length.
      * Changed -R to -Q (--limit-reads) so that I can reassign -R to an
        option that is used more often.
      * Changed -r to -R (--report) to make sure it will not collide with
        the -r option in Stuttermark, if I ever want to add report output
        to Stuttermark.
      * BGHomStats now checks whether all alleles are detected.