Forensic DNA Sequencing Tools
=============================

Tools for filtering and interpretation of Next Generation Sequencing data of
forensic DNA samples. To obtain a list of included tools with a brief
description of each tool, run:

    ``fdstools -h``

For a complete description of a specific tool and its command line arguments,
run:

    ``fdstools -h TOOLNAME``


Installation
------------

The recommended way to install FDSTools is by using the ``pip`` package
installer. If you have ``pip`` installed, you can easily install FDSTools by
typing:

    ``pip install fdstools``

Alternatively, FDSTools can be installed by running:

    ``python setup.py install``


Release Notes
-------------
v1.2.0 (2019-03-29)
    Major improvements and fixes to the TSSV tool. Most notably, it no longer
    relies on the external ``tssvl`` program because that is no longer
    compatible with FDSTools. Furthermore, the new TSSV tool v2.0.0 comes with
    a major performance upgrade and has some updated command-line arguments.

    This release also fixes an issue in Samplestats and adds the ability to
    apply graph filtering before noise correction in Samplevis, making the
    effects of noise correction more apparent.

v1.1.1 (2017-03-15)
    Fixed incorrect calculation of tLeft, fLeft, rLeft, tRight and fRight
    columns in the report output file of TSSV, when -T/--num-threads was set to
    2 or higher. The primary output was unaffected.

v1.1.0 (2017-03-14)
    In STR allele names for sequences that don't exactly match the description
    given in the library file, no more insertions are produced at the end of
    the prefix or the beginning of the suffix, in favour of extra STR blocks.

    Empty input files and broken pipelines are now handled gracefully across
    all tools. Specifically, an empty input file is now treated as if the
    expected columns existed, but no lines of actual data were present. This
    greatly helps in tracking down issues in pipelines involving multiple
    tools, as tools will now shut down gracefully if an upstream tool fails to
    write output. Only the failing tool will output an error.
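
The empty-input behaviour described above can be pictured with a small sketch. This is not FDSTools' own code: the helper name ``read_table`` and the ``total`` column are hypothetical, and tab-separated input is assumed.

```python
import csv
import io


def read_table(stream, expected_columns):
    """Read tab-separated data from a text stream.

    A completely empty stream is treated as a table that has the
    expected columns but no data rows. Hypothetical helper for
    illustration only; not taken from FDSTools.
    """
    reader = csv.reader(stream, delimiter="\t")
    header = next(reader, None)
    if header is None:
        # Empty input: behave as if the expected header had been present.
        return list(expected_columns), []
    missing = [c for c in expected_columns if c not in header]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    # Skip blank lines, keep every real data row.
    return header, [row for row in reader if row]


columns = ["marker", "sequence", "total"]  # 'total' is a made-up column
empty_header, empty_rows = read_table(io.StringIO(""), columns)
header, rows = read_table(
    io.StringIO("marker\tsequence\ttotal\nD1\tACGT\t10\n"), columns)
```

With this approach, a downstream step can loop over zero rows and exit normally instead of failing on a missing header.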

    Furthermore, a new option has been added to the TSSV tool, enabling
    multithreading support. This can greatly reduce analysis time by using
    more (or all) cores of the system's processor simultaneously.

    Finally, various small bugs and glitches were fixed.

v1.0.1 (2016-12-21)
    FDSTools library files may now contain IUPAC ambiguous bases in the
    prefix and suffix sequences of STR markers (except the first sequence, as
    it is used as the reference). Additionally, optional bases may be
    represented by lowercase letters.
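
One way to picture how such a flanking sequence could be matched is to translate it into a regular expression. The sketch below uses the standard IUPAC nucleotide codes and assumes that a lowercase letter marks a base that may be absent. The function name ``flank_to_regex`` is hypothetical, and FDSTools may handle these sequences differently internally.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def flank_to_regex(flank):
    """Translate a flank containing IUPAC codes into a compiled regex.

    Lowercase letters are treated as optional bases (an assumption made
    for this illustration).
    """
    parts = []
    for base in flank:
        options = IUPAC[base.upper()]
        part = options if len(options) == 1 else "[" + options + "]"
        if base.islower():
            part += "?"  # optional base
        parts.append(part)
    return re.compile("".join(parts))


# R matches A or G; the lowercase n is an optional base of any kind.
pattern = flank_to_regex("ACRTnG")
```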

    An option was added to the Pipeline tool to skip running Allelefinder,
    using a user-supplied allele list file instead. Multiple options have been
    added to the Vis tool and some have been regrouped to more easily find the
    option you are looking for.

    It is now possible to save a Samplevis HTML visualisation after having
    made changes, preserving the changes made.

    And various minor bug fixes and improvements throughout.

v1.0.0 (2016-10-03)
    Fixed an issue with variant descriptions in allele names of non-STR markers
    that made it impossible to convert those back to raw sequences.

    Added various useful options. Most notably, Samplevis now displays a
    tooltip when the mouse pointer is over an allele, providing various details
    about that allele.

    And various minor bug fixes.

v0.0.5 (2016-09-06)
    Added the Library tool, for creating a template library file that includes
    helpful commentary and examples to get new users started. Creating an empty
    library file used to be a somewhat confusing option in the Libconvert tool.
    Also, the Blame tool was replaced with the more advanced BGAnalyse tool.

    Added the Pipeline tool, which implements some ready-made pipelines
    involving most of the other tools in FDSTools. Three pipelines are
    provided: one for noise reference sample analysis, one for case sample
    analysis, and one for generating a background noise database from the
    reference samples.

    In Samplestats, the default allele calling option thresholds have changed:
        - Changed default value of -m/--min-pct-of-max from 5.0 to 2.0
        - Changed default value of -p/--min-pct-of-sum from 3.0 to 1.5
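
To give a feel for what the lower defaults mean in practice, here is a minimal sketch. It assumes that a sequence is called as an allele only when it meets both thresholds, taken as a percentage of the highest read count and of the total read count of the marker. The function ``call_alleles`` and the read counts are made up for illustration and do not reproduce Samplestats itself.

```python
def call_alleles(read_counts, min_pct_of_max=2.0, min_pct_of_sum=1.5):
    """Return the sequences whose read count passes both thresholds.

    Assumption for this sketch: both percentage thresholds must be met.
    """
    highest = max(read_counts.values())
    total = sum(read_counts.values())
    return sorted(
        seq for seq, reads in read_counts.items()
        if 100.0 * reads / highest >= min_pct_of_max
        and 100.0 * reads / total >= min_pct_of_sum
    )


# Made-up read counts for a single marker.
counts = {"allele_a": 1000, "allele_b": 900, "stutter": 40, "error": 10}
new_calls = call_alleles(counts)            # new defaults: 2.0 and 1.5
old_calls = call_alleles(counts, 5.0, 3.0)  # previous defaults
```

In this example the sequence with 40 reads (4% of the maximum, about 2.1% of the sum) is called under the new defaults but not under the old ones.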

    The TSSV tool was updated with an option to increase the penalty given to
    insertions and deletions in the flanking sequences. It now requires TSSV
    version 0.4.0 to be installed.

    Various upgrades to visualisations, bringing a new responsive design to all
    HTML visualisations and fixing various issues.

v0.0.4 (2016-07-26)
    Improved debugging: FDSTools will now print profiling information to stdout
    when the -d/--debug option is specified. Also, all tools now correctly
    interpret '-' given as the output filename as 'write to standard output'.
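
The '-' convention for output filenames is commonly implemented along the following lines. This is a generic sketch with a hypothetical helper name, ``resolve_output``, and is not taken from FDSTools.

```python
import sys


def resolve_output(filename):
    """Return a writable text stream for the given output filename.

    '-' means standard output; any other value is opened as a file.
    The caller should close file handles, but never sys.stdout.
    """
    if filename == "-":
        return sys.stdout
    return open(filename, "w")


out = resolve_output("-")
out.write("marker\tsequence\n")  # goes to the terminal or the next pipe stage
```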

    BGEstimate has gained a new option to require a minimum number of unique
    genotypes in which a specific allele must have been seen before it will be
    considered for noise estimation. This is to avoid 'contamination' of the
    noise profile of one allele with the noise of another. If homozygous
    samples are available for an allele, this filter is not applied to that
    allele.

    Reduced the memory usage of BGPredict and BGMerge. Also, BGPredict will now
    output nonzero values below the threshold set by -n/--min-pct if the
    predicted noise ratio of the same stutter on the other strand is above the
    threshold. Previously, values below the threshold were clipped to zero,
    which may cause unnecessarily high strand bias in the predicted profile.
    Similarly, by default Stuttermodel will no longer output a fit on one
    strand if no fit could be obtained on the other strand.
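
The BGPredict clipping change described above can be sketched as follows. Both function names are hypothetical. Treating a value equal to the threshold as passing, and clipping both strands to zero when both fall below it, are assumptions of this sketch.

```python
def clip_independently(forward, reverse, min_pct):
    """Earlier behaviour: each strand is clipped on its own."""
    return (forward if forward >= min_pct else 0.0,
            reverse if reverse >= min_pct else 0.0)


def clip_noise(forward, reverse, min_pct):
    """Newer behaviour: keep a low value if the other strand passes.

    Values are clipped to zero only when both strands are below the
    threshold (an assumption of this sketch).
    """
    if forward < min_pct and reverse < min_pct:
        return 0.0, 0.0
    return forward, reverse


old_pair = clip_independently(0.4, 0.6, 0.5)  # forward strand clipped to zero
new_pair = clip_noise(0.4, 0.6, 0.5)          # forward strand value kept
both_low = clip_noise(0.2, 0.3, 0.5)          # both strands clipped to zero
```

Clipping only one strand of a pair, as in ``old_pair``, is what creates the artificial strand bias that the release note mentions.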

    Changes have been made to rounding and column order in Samplestats.

    Various minor fixes and enhancements have been made, mostly to the
    visualisations.

v0.0.3 (2016-02-02)
    First version of FDSTools with all strings attached. Introduces 15 new tools
    and five visualisations.

    In Stuttermark, the column names 'name' and 'allele' have been changed to
    'marker' and 'sequence', respectively, reflecting those of all the other
    tools. WARNING: Stuttermark is now INCOMPATIBLE with output from TSSV, but
    made compatible with TSSV-Lite and the new, bundled TSSV tool instead.

v0.0.2 (2015-07-23)
    Added a new global option: -d/--debug. This option disables the suppression
    of technical details that would normally be visible when an error occurs.

    Stuttermark now accepts raw sequences and allele names as input, which are
    automatically rewritten as TSSV-style sequences using a specified library
    file. Also, the 'name' column is now optional.

v0.0.1 (2015-07-02)
    Initial version of FDSTools, featuring a single tool: Stuttermark v1.3.