    Introducing bgestimate · be745e64
    jhoogenboom authored
    I could write about all its features here, but instead I will point
    out some future plans to highlight the things that are possibly not
    optimal in their current implementation.
    
    There are a number of things I plan to change in the future:
    * The output format is currently JSON; a carefully designed tabular
      format may be a better choice. The benefit of switching to a
      tabular format is that the data can also be loaded into e.g.
      Excel.
    * The profiles are currently produced separately for forward and
      reverse reads. I would prefer to integrate these into a single
      computation that estimates allele balance in the heterozygotes
      using both strands as well.
    * I would like to add information about strand bias of the alleles
      as well. The most straightforward way to do this is to set only
      the forward reads of the true allele to 100 and treat the reverse
      reads the same as all background products. You will then obtain a
      number of reverse reads observed for every 100 forward reads of
      the true allele.
    * I think it would be appropriate to make sure the values in the
      allele balance matrices of each sample ('Ax' in the source code)
      add up to 1. For homozygotes, this is currently a scalar 1, but
      for heterozygotes the sum of the elements tends to be more than
      1. This means that a heterozygous sample has a stronger influence
      on the profiles than a homozygous sample.
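
    The last two plans can be illustrated with a minimal sketch. This
    is not the code in lib.py: the function names, the dict-of-counts
    input and the flat list standing in for 'Ax' are all assumptions
    made for illustration only.

```python
def per_100_forward(forward, reverse, true_allele):
    """Scale read counts so the true allele's forward reads equal 100.

    `forward` and `reverse` are hypothetical dicts mapping a sequence
    to its read count on that strand. Reverse reads of the true allele
    are scaled like any background product, so the result reads as
    "reverse reads per 100 forward reads of the true allele".
    """
    base = forward[true_allele]
    if base <= 0:
        raise ValueError("true allele has no forward reads")
    factor = 100.0 / base
    scaled_forward = {seq: n * factor for seq, n in forward.items()}
    scaled_reverse = {seq: n * factor for seq, n in reverse.items()}
    return scaled_forward, scaled_reverse


def normalise_balance(values):
    """Rescale one sample's allele balance values so they sum to 1.

    `values` is a flat list standing in for that sample's 'Ax'
    (an assumption; the real structure is a matrix).
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("allele balance values must have a positive sum")
    return [v / total for v in values]


# Made-up counts: allele "A" is the true allele, "B" a background product.
fwd, rev = per_100_forward({"A": 200, "B": 10}, {"A": 150, "B": 8}, "A")
# A heterozygote whose balance values sum to 1.5 is rescaled to sum to 1,
# giving it the same total weight as a homozygote.
balanced = normalise_balance([0.75, 0.75])
```

    With the values normalised this way, every sample would contribute
    the same total weight to the profiles, whether it is homozygous or
    heterozygous.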
lib.py 27 KB