Splitting a profile and palindromes
As correctly noted in !6 (merged), splitting a profile currently has no special handling of palindromes. Perhaps we should change that, but I'm not really sure.
Let's first make it clear that we can never really know what to do with palindrome counts. We just have to decide on a behaviour here.
Currently, palindromes end up in both lists, effectively increasing the combined total k-mer count. But we might want to maintain the total k-mer count as an invariant.
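To make the effect concrete, here is a toy sketch of the current behaviour. It is not the project's actual code: the `naive_split` function and the dict-based profile are hypothetical, and it assumes the split pairs each k-mer with its reverse complement, so that a palindrome (a k-mer equal to its own reverse complement) is paired with itself.

```python
# Toy illustration only; names and data layout are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA k-mer."""
    return kmer.translate(COMPLEMENT)[::-1]


def naive_split(counts):
    """Split counts into two lists without special palindrome handling.

    Each k-mer goes to the first list and its reverse complement to the
    second. A palindrome is its own reverse complement, so its full count
    ends up in both lists.
    """
    first, second = {}, {}
    for kmer in sorted(counts):
        rc = reverse_complement(kmer)
        if kmer <= rc:  # visit each pair once; true for palindromes
            first[kmer] = counts[kmer]
            second[kmer] = counts.get(rc, 0)
    return first, second


counts = {"AC": 3, "GT": 2, "AT": 5}  # "AT" is a palindrome
first, second = naive_split(counts)
total_before = sum(counts.values())
total_after = sum(first.values()) + sum(second.values())
print(total_before, total_after)
```

In this example the palindrome count of 5 is counted twice, so the combined total after splitting exceeds the original total by exactly that count.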
The alternative to the current approach is dividing each palindrome count over both lists, where we have several options for what to do with an odd palindrome count:
- Always add the remainder to the first list.
- Add the remainder to a random list.
- Add the remainder to one of the lists, depending on the k-mer (e.g., starting with A or T to the first list).
I would probably prefer option 3. It maintains the invariant, is deterministic, and on average does not favor one list over the other.
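A minimal sketch of what the third option could look like, just to make the proposal concrete. The function name `split_palindrome_count` is hypothetical, and the rule used (first base A or T sends the remainder to the first list) is the example rule mentioned above, not a settled choice.

```python
def split_palindrome_count(kmer, count):
    """Divide a palindrome count over two lists, keeping the total intact.

    The count is halved; for an odd count, the remainder of 1 goes to the
    first list if the k-mer starts with A or T, otherwise to the second.
    Returns a (first, second) tuple.
    """
    half, remainder = divmod(count, 2)
    if kmer[0] in "AT":
        return half + remainder, half
    return half, half + remainder


for kmer, count in [("AT", 5), ("CG", 5), ("GC", 4)]:
    first, second = split_palindrome_count(kmer, count)
    # The invariant: nothing is gained or lost by splitting.
    assert first + second == count
    print(kmer, count, first, second)
```

The result depends only on the k-mer and its count, so repeated splits of the same profile give the same lists.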
@j.f.j.laros any thoughts?