- 17 Nov, 2020 1 commit
-
-
Mihai authored
* Fix #428
* Improve warning messages for positions outside of the sequence range (#479).
* Improve intronic positions with non-genomic references warning (#464).
* Improve duplication warning (#466).
-
- 04 Dec, 2018 1 commit
-
-
Mihai authored
-
- 14 Jun, 2018 1 commit
-
-
Mihai authored
-
- 11 Jun, 2018 1 commit
-
-
Mihai authored
- Switch to new LRG location (fix for #447).
- Extract only the gene name from the `updatable` section.
- Fix for LRG transcripts with no coding region.
- Adapted LRG examples on the website.
- Extract the annotation set based on attribute type.
- More informative error message (part of #135).
- Makes #338 obsolete.
-
- 24 May, 2018 1 commit
-
-
Mihai authored
-
- 22 Jun, 2016 1 commit
-
-
mkroon authored
-
- 15 Jun, 2016 1 commit
-
- 12 Jun, 2016 1 commit
-
-
Vermaat authored
The NCBI is phasing out the use of sequence GIs, so we have no other choice than to drop support for them. https://www.ncbi.nlm.nih.gov/news/03-02-2016-phase-out-of-GI-numbers/ Fixes #349
-
- 25 May, 2016 1 commit
-
-
Vermaat authored
This is not perfect yet, but a slight improvement for input variants of a type we don't support. Fixes for example #375
-
- 18 Dec, 2015 1 commit
-
-
Vermaat authored
This fixes a bug where transcripts created from CDS by construction did not show up in the legend because the legend was created before that construction.
-
- 23 Sep, 2015 3 commits
-
-
Vermaat authored
The alternative variant protein sequence translated from a non-reference start codon (created by the variant) was not color-diffed as normal variant protein sequences are. In the process we also rename the `oldprotein` and `newprotein` fields in the output object to `oldProtein` and `newProtein` to be more consistent with other field names.
-
Vermaat authored
In the case of an alternative start codon (in the reference CDS), protein changes were not visualised. This is fixed and a WALTSTART warning is also issued. Also, if a new non-reference start codon is created by the variant, visualise this as such.
-
Vermaat authored
In case of an alternative start codon, the variant CDS was not translated to a protein starting with M. This caused the protein description machinery to conclude a variant affecting the start codon, hence reporting `p.?`. We fix this by always translating the start codon to M (except when the variant actually affects it). Example: `NM_024426.4:c.1107A>G` (a synonymous mutation) should yield `NM_024426.4(WT1_i001):p.(=)`, not `p.?`. The start codon for that protein is `CTG`.
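The start-codon rule described above can be sketched in plain Python. This is only an illustration, not Mutalyzer's actual code: the helper `translate_cds`, its `start_affected` flag, and the short example sequences are hypothetical, and the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds, start_affected=False):
    """Translate a CDS, forcing the first codon to M unless the variant hit it.

    Hypothetical helper: `start_affected` stands in for whatever check decides
    that the variant actually changes the start codon.
    """
    # Ignore a trailing partial codon, if any.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    protein = "".join(CODON_TABLE[codon] for codon in codons)
    if protein and not start_affected:
        protein = "M" + protein[1:]
    return protein


# A CDS with a CTG start codon and a synonymous change further downstream.
reference_protein = translate_cds("CTGGCATAA")
variant_protein = translate_cds("CTGGCGTAA")
```

With both proteins starting with M, a comparison of `reference_protein` and `variant_protein` sees no difference, which corresponds to `p.(=)` rather than `p.?`.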
-
- 15 Jul, 2015 1 commit
-
-
Vermaat authored
When a variant results in a frame shift or extension and we don't see a new stop codon in the RNA, the protein description should use the notation for an uncertain stop codon, e.g., `p.(Gln730Profs*?)` instead of `p.(Gln730Profs*96)` where 96 is just the last codon in our transcript [1]. To detect this, we now use `to_stop=False` in our `.translate()` calls, since that will explicitly return `*` characters for stop codons. We also slightly fix the coloring of changes in the protein sequence where previously changed stop codon characters were not included.

[1] http://www.hgvs.org/mutnomen/FAQ.html#nostop
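The check for an uncertain stop codon can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `frameshift_suffix` is a hypothetical helper, and the variant protein is assumed to have been translated without truncation at the stop codon, so that a stop shows up as a `*` character.

```python
def frameshift_suffix(variant_protein, first_changed):
    """Return the `fs*N` or `fs*?` suffix for a frame shift.

    `variant_protein` is assumed to contain `*` for every stop codon.
    `first_changed` is the 0-based index of the first changed amino acid
    (hypothetical parameter, chosen for this sketch).
    """
    stop = variant_protein.find("*", first_changed)
    if stop == -1:
        # No stop codon in the translated sequence: its position is unknown.
        return "fs*?"
    # The first changed amino acid counts as position 1.
    return "fs*{}".format(stop - first_changed + 1)
```

If the translation had been cut off at the first stop codon, a protein that ends at a stop and one that simply runs off the end of the transcript would look the same, which is why the `*` characters are needed.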
-
- 20 Oct, 2014 1 commit
-
-
Vermaat authored
Don't fix what ain't broken. Unfortunately, string handling in Mutalyzer really is broken. So we fix it.

Internally, all strings should be represented by unicode strings as much as possible. The main exception is large reference sequence strings. These can often better be BioPython sequence objects, since that is how we usually get them in the first place.

These changes will hopefully make Mutalyzer more reliable in working with incoming data. As a bonus, they're a first (small) step towards Python 3 compatibility [1].

Our strategy is as follows:

1. We use `from __future__ import unicode_literals` at the top of every file.
2. All incoming strings are decoded to unicode (if necessary) as soon as possible.
3. Outgoing strings are encoded to UTF8 (if necessary) as late as possible.
4. BioPython sequence objects can be based on byte strings as well as unicode strings.
5. In the database, everything is UTF8.
6. We worry about uploaded and downloaded reference files and batch jobs in a later commit.

Point 1 will ensure that all string literals in our source code will be unicode strings [2].

As for point 4, sometimes this may even change under our eyes (e.g., calling `.reverse_complement()` will change it to a byte string). We don't care as long as they're BioPython objects, only when we get the sequence out we must have it as unicode string. Their contents are always in the ASCII range anyway.

Although `Bio.Seq.reverse_complement` works fine on Python byte strings (and we used to rely on that), it crashes on a Python unicode string. So we take care to only use it on BioPython sequence objects and wrote our own reverse complement function for unicode strings (`mutalyzer.util.reverse_complement`).

As for point 5, SQLAlchemy already does a very good job at decoding from and encoding to UTF8 for us.

The Spyne documentation has the following to say about their `String` and `Unicode` types [3]:

> There are two string types in Spyne: `spyne.model.primitive.Unicode` and
> `spyne.model.primitive.String` whose native types are `unicode` and `str`
> respectively.
>
> Unlike the Python `str`, the Spyne `String` is not for arbitrary byte
> streams. You should not use it unless you are absolutely, positively sure
> that you need to deal with text data with an unknown encoding. In all other
> cases, you should just use the `Unicode` type. They actually look the same
> from outside, this distinction is made just to properly deal with the quirks
> surrounding Python-2's `unicode` type.
>
> Remember that you have the `ByteArray` and `File` types at your disposal
> when you need to deal with arbitrary byte streams.
>
> The `String` type will be just an alias for `Unicode` once Spyne gets ported
> to Python 3. It might even be deprecated and removed in the future, so make
> sure you are using either `Unicode` or `ByteArray` in your interface
> definitions.

So let's not ignore that and never use `String` anymore in our webservice interface.

For the command line interface it's a bit more complicated, since there seems to be no reliable way to get the encoding of command line arguments. We use `sys.stdin.encoding` as a best guess.

For us to interpret a sequence of bytes as text, it's key to be aware of their encoding. Once decoded, a text string can be safely used without having to worry about bytes.

Without unicode we're nothing, and nothing will help us. Maybe we're lying, then you better not stay. But we could be safer, just for one day. Oh-oh-oh-ohh, oh-oh-oh-ohh, just for one day.

[1] https://docs.python.org/2.7/howto/pyporting.html
[2] http://python-future.org/unicode_literals.html
[3] http://spyne.io/docs/2.10/manual/03_types.html#strings
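The "decode early, encode late" strategy and a reverse complement that works on text strings can be sketched as below. Note the assumptions: this is written for Python 3, where `str` is already a text type, whereas the commit targets Python 2. The function `handle_request` is hypothetical, and `reverse_complement` here is an illustration, not the actual `mutalyzer.util.reverse_complement`.

```python
# Complement table for unambiguous DNA bases, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence given as a text string."""
    return sequence.translate(_COMPLEMENT)[::-1]


def handle_request(raw_bytes, encoding="utf-8"):
    """Decode on the way in, work on text, encode on the way out.

    Hypothetical boundary function; the caller is assumed to know the
    encoding of the incoming bytes.
    """
    text = raw_bytes.decode(encoding)  # decode as early as possible
    result = reverse_complement(text.strip())
    return result.encode("utf-8")  # encode as late as possible
```

Everything between the two boundary calls only ever sees text, so no code in the middle has to reason about encodings.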
-
- 15 Oct, 2014 1 commit
-
-
Vermaat authored
The `getGS` website view for LOVD2 would report "transcript not found" if the genomic reference has multiple transcripts annotated or if the variant description raises an error in the variant checker.
-
- 01 Jul, 2014 1 commit
-
- 01 Mar, 2014 1 commit
-
-
Vermaat authored
The name checker supports reverse complement ranges in insertions and insertion-deletions, for example `3_4ins8_12inv`. Reverse complement range insertions and insertion-deletions are not part of the current HGVS nomenclature, but will be proposed.
-
- 28 Feb, 2014 1 commit
-
-
Vermaat authored
The name checker supports ranges in insertions and insertion-deletions, for example `3_4ins8_12`, and compound insertions and insertion-deletions, for example `3_4ins[ATC;8_12]`. The inserted sequences are accepted and concatenated before any further processing, so reported descriptions show only the concatenated sequences. The support for ranges is limited to genomic descriptions. The position converter supports compound insertions and insertion-deletions, not ranges. Compound insertions and insertion-deletions are not part of the current HGVS nomenclature, but will be proposed.
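How the parts of a compound insertion collapse into one sequence can be sketched as follows. This is an illustration only, not the name checker's parser: `resolve_inserted` is a hypothetical helper that receives the text between the brackets, and ranges are assumed to be 1-based and inclusive on the reference sequence.

```python
def resolve_inserted(parts, reference):
    """Concatenate the parts of a compound insertion into one sequence.

    Each part is either a literal sequence such as "ATC" or a range such as
    "8_12", read here as 1-based inclusive positions on `reference`.
    """
    pieces = []
    for part in parts.split(";"):
        if "_" in part:
            first, last = (int(position) for position in part.split("_"))
            pieces.append(reference[first - 1:last])
        else:
            pieces.append(part)
    return "".join(pieces)
```

For a made-up reference `AAACCCGGGTTTACG`, the parts `ATC;8_12` resolve to `ATC` followed by positions 8 to 12 of the reference, and only that concatenated sequence would appear in the reported description.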
-
- 16 Jan, 2014 1 commit
-
-
Vermaat authored
This includes changing a lot of routes and parameter names to be more consistent. We try to remain backwards compatible as much as possible by providing redirects from old routes and parameter names.
-
- 10 Jan, 2014 1 commit
-
-
Vermaat authored
This introduces a proper notion of genome assemblies. Transcript mappings for all genome assemblies are in the same database, which is better for maintenance. Updating transcript mappings is also simplified a lot, especially from NCBI mapview files where we now require a preprocessing sort on the input file. Overall, this port touches a lot of Mutalyzer code, so beware.
-
- 04 Jan, 2014 1 commit
-
-
Vermaat authored
-
- 23 Dec, 2013 1 commit
-
- 19 Dec, 2013 1 commit
-
-
Vermaat authored
Remove the dependency on configobj and have default values for all configuration settings. User settings are defined in a Python module pointed to by the MUTALYZER_SETTINGS environment variable. We also clean up many configuration settings and remove some that are no longer used.
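The configuration scheme described above, defaults for everything plus a user module named by `MUTALYZER_SETTINGS`, might look roughly like this. It is a sketch under assumptions: the setting names in `DEFAULTS` are made up, the environment variable is assumed to hold a file path, and only upper-case names in the user module are treated as settings.

```python
import os
import runpy

# Hypothetical defaults; the real setting names are not given in the source.
DEFAULTS = {"DEBUG": False, "CACHE_DIR": "/tmp/cache"}


def load_settings(environ=os.environ):
    """Return the defaults, overridden by the user's settings module, if any."""
    settings = dict(DEFAULTS)
    path = environ.get("MUTALYZER_SETTINGS")
    if path:
        # Execute the user's Python file and collect its module globals.
        user = runpy.run_path(path)
        # Only upper-case names count as settings.
        settings.update(
            {name: value for name, value in user.items() if name.isupper()}
        )
    return settings
```

Because every setting has a default, a missing or empty `MUTALYZER_SETTINGS` still yields a complete configuration.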
-
- 25 Mar, 2013 1 commit
-
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/mapping-mouse@683 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 12 Feb, 2013 1 commit
-
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@667 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 15 Dec, 2012 1 commit
-
-
Laros authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@655 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 29 Nov, 2012 1 commit
-
-
Vermaat authored
Changed codes:
  WCDSSELECTED -> WCDS
  WCDS -> WCDS_OTHER
  WSPLICESELECTED -> WSPLICE
  WSPLICE -> WSPLICE_OTHER

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@644 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 26 Nov, 2012 1 commit
-
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@641 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 04 Oct, 2012 2 commits
-
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@617 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@616 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 26 Jul, 2012 1 commit
-
-
Laros authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@589 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 21 Jun, 2012 1 commit
-
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@557 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 21 May, 2012 1 commit
-
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@528 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 20 Mar, 2012 1 commit
-
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@499 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 21 Feb, 2012 1 commit
-
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@488 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 18 Feb, 2012 1 commit
-
-
Laros authored
Added simple error handling for the description extractor. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@480 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- 31 Jan, 2012 3 commits
-
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@473 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@472 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
Vermaat authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@471 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-