Option to penalise indels more heavily than substitutions
Made two changes:
- Added the -n/--indel-score argument. Insertions and deletions in the flanking sequences are penalised this many times more heavily than mismatches. The default is 1, i.e., insertions and deletions carry the same penalty as mismatches, so default behaviour has not changed. We set this value to 2 in our analysis pipeline because TSSV was favoring a deletion in the flanking sequence over a substitution. As a result, some alleles ended with the first base of the right flank attached, which showed up in downstream analysis as an insertion.
- TSSV now installs the compiled `_sg_align` module inside the tssv package directory. Earlier versions installed this module directly into the top-level site-packages or dist-packages directory. Installing the new version leaves the previous installation's `_sg_align.so` file behind in the site-packages or dist-packages directory, but TSSV will use the new file in the tssv package directory. The old file can safely be deleted.
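
To illustrate the first change, here is a minimal sketch of how weighting indels affects the choice between a substitution and a deletion in a flank. It is a plain global edit distance written for this description, not the `_sg_align` implementation. The function name `weighted_distance` and the example sequences are hypothetical.

```python
def weighted_distance(flank, target, indel_score=1):
    """Edit distance in which each indel costs `indel_score` mismatches.

    Illustrative only: a simple global alignment, not TSSV's aligner.
    """
    # Row 0: aligning an empty flank prefix against target[:j] needs j insertions.
    prev = [j * indel_score for j in range(len(target) + 1)]
    for i, f in enumerate(flank, start=1):
        # Column 0: aligning flank[:i] against an empty target needs i deletions.
        curr = [i * indel_score]
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j - 1] + (f != t),     # match (0) or mismatch (1)
                prev[j] + indel_score,      # flank base missing from target
                curr[j - 1] + indel_score,  # extra base in target
            ))
        prev = curr
    return prev[-1]


flank = "GATTACA"              # hypothetical right flank
with_substitution = "CATTACA"  # read carries a substitution at the first flank base
with_deletion = "ATTACA"       # alternative placement: first flank base deleted

for n in (1, 2):
    print(n,
          weighted_distance(flank, with_substitution, n),
          weighted_distance(flank, with_deletion, n))
# prints:
# 1 1 1
# 2 1 2
```

With an indel score of 1 the two placements tie, so the aligner may pick the deletion and leave the substituted base attached to the allele. With an indel score of 2 the substitution is strictly cheaper.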
This version has been tested over the past month or so with no issues. I propose releasing it as v0.3.1 and have taken the liberty of adding this version to the `README.md` as well.