fdstools/__init__.py · be745e642d3327812bc732ae6b6d7b60e0948bfd · Hoogenboom / fdstools

jhoogenboom authored Jul 29, 2015
I could write about all its features here, but instead I will point
out some future plans to highlight the things that are possibly not
optimal in their current implementation.

There are a number of things I plan to change in the future:
* The output format is currently JSON; a carefully designed tabular
  format may be a better choice. The benefit of switching to a
  tabular format is that the data can also be loaded into e.g.
  Excel.
* The profiles are currently produced separately for forward and
  reverse reads. I would prefer to integrate these into a single
  computation that estimates allele balance in the heterozygotes
  using both strands as well.
* I would like to add information about strand bias of the alleles
  as well. The most straightforward way to do this is to set only
  the forward reads of the true allele to 100 and treat the reverse
  reads the same as all background products. You will then obtain a
  number of reverse reads observed for every 100 forward reads of
  the true allele.
* I think it would be appropriate to make sure that the values in
  the allele balance matrices of each sample ('Ax' in the source
  code) add up to 1. For homozygotes, it is currently a scalar 1,
  but for heterozygotes the sum of the elements tends to be more
  than 1. This means that a heterozygous sample has a stronger
  influence on the profiles than a homozygous sample.
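
The last two plans can be sketched roughly as follows. This is only a
minimal illustration and not the code in this commit: the function
names, the use of nested lists for 'Ax', and the example numbers are
all assumptions made for the sake of the sketch.

```python
def normalise_allele_balance(ax):
    """Scale the values of one sample's allele balance matrix so that
    they add up to 1.  `ax` is assumed to be a list of rows."""
    total = float(sum(sum(row) for row in ax))
    if total == 0:
        raise ValueError("allele balance matrix sums to zero")
    return [[value / total for value in row] for row in ax]


def reverse_per_100_forward(forward_true, reverse_counts):
    """Express reverse read counts per 100 forward reads of the true
    allele, i.e. with the forward reads of the true allele set to 100."""
    if forward_true <= 0:
        raise ValueError("the true allele needs forward reads")
    return [count * 100.0 / forward_true for count in reverse_counts]


# Hypothetical heterozygote whose elements add up to more than 1.
heterozygote = normalise_allele_balance([[0.6, 0.1], [0.1, 0.6]])

# A homozygote (scalar 1) is left unchanged by the normalisation.
homozygote = normalise_allele_balance([[1.0]])

# Hypothetical counts: 250 forward reads of the true allele, with 200
# reverse reads of the true allele and 25 of a background product.
strand_profile = reverse_per_100_forward(250, [200, 25])
```

With this kind of normalisation every sample would carry the same
total weight, so a heterozygous sample would no longer have a stronger
influence on the profiles than a homozygous one.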