Commits · d94f20cf326b4c9116e8ea672298621e6f96d36b · Mirrors / mutalyzer

Oct 13, 2015
- Refactor unit tests using common py.test layout and fixtures · d94f20cf
  Vermaat authored 9 years ago
  
  d94f20cf
Jul 09, 2015
- Convert DNA to uppercase when reading from plain text · 93159a0e
  Vermaat authored 9 years ago
  
  93159a0e
May 31, 2015
- Configurable maximum input length for description extractor · ee390387
  Vermaat authored 9 years ago
  
  Adds an `EXTRACTOR_MAX_INPUT_LENGTH` configuration setting, defaulting to 50 Kbp.
  ee390387
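
A minimal sketch of how such a limit might be enforced. The function name `check_extractor_input` is hypothetical, and reading "50 Kbp" as 50,000 bases is an assumption; neither is taken from the Mutalyzer code.

```python
# Assumed default: the commit only says "50 Kbp", read here as 50,000 bases.
EXTRACTOR_MAX_INPUT_LENGTH = 50 * 1000


def check_extractor_input(sequence, max_length=EXTRACTOR_MAX_INPUT_LENGTH):
    """Return the sequence unchanged, or raise ValueError if it is too long."""
    if len(sequence) > max_length:
        raise ValueError(
            'Input sequence is {} bp, maximum is {} bp'.format(
                len(sequence), max_length))
    return sequence
```

Raising early keeps an oversized input from ever reaching the (potentially expensive) extraction step.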
May 18, 2015
- New description extractor web interface · 55d10b82
  Jeroen F.J. Laros authored 9 years ago and Vermaat committed 9 years ago
  
  We can now compare two sequences by supplying their sequence strings, accession numbers, or uploaded files.
  55d10b82
Nov 24, 2014
- Fix form buttons and general language issues · 9e6ca731
  Vermaat authored 10 years ago
  
  9e6ca731
- Many fixes in templates · 5fc78480
  Vermaat authored 10 years ago
  
  5fc78480
- New website layout by Landscape · 5010bbec
  Jeroen Laros authored 10 years ago and Vermaat committed 10 years ago
  
  5010bbec
Oct 21, 2014
- Unit tests for unicode strings · 66629914
  Vermaat authored 10 years ago
  
  66629914
Oct 20, 2014

Vermaat authored 10 years ago

Don't fix what ain't broken. Unfortunately, string handling in Mutalyzer
really is broken. So we fix it.

Internally, all strings should be represented by unicode strings as much as
possible. The main exception is large reference sequence strings. These can
often better be BioPython sequence objects, since that is how we usually get
them in the first place.

These changes will hopefully make Mutalyzer more reliable in working with
incoming data. As a bonus, they're a first (small) step towards Python 3
compatibility [1].

Our strategy is as follows:

1. We use `from __future__ import unicode_literals` at the top of every file.
2. All incoming strings are decoded to unicode (if necessary) as soon as
   possible.
3. Outgoing strings are encoded to UTF8 (if necessary) as late as possible.
4. BioPython sequence objects can be based on byte strings as well as unicode
   strings.
5. In the database, everything is UTF8.
6. We worry about uploaded and downloaded reference files and batch jobs in a
   later commit.

Point 1 will ensure that all string literals in our source code will be
unicode strings [2].
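
Points 2 and 3 (decode early, encode late) can be sketched as follows. The helper names `to_text` and `to_bytes` are hypothetical and not taken from the Mutalyzer code, and UTF-8 as the default encoding is an assumption.

```python
# Under Python 2, a file like this would start with
# `from __future__ import unicode_literals` (point 1).


def to_text(value, encoding='utf-8'):
    """Decode incoming bytes to a text string; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding='utf-8'):
    """Encode outgoing text to bytes; pass bytes through unchanged."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


# Decode at the boundary, work with text internally, encode on the way out.
incoming = b'NM_003002.2:c.274G>T \xe2\x80\x93 note'
description = to_text(incoming)
message = 'Checked: ' + description
outgoing = to_bytes(message)
```

Everything between `to_text` and `to_bytes` only ever sees text, so the encoding has to be known at exactly two places.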

As for point 4, sometimes this may even change under our eyes (e.g., calling
`.reverse_complement()` will change it to a byte string). We don't care as
long as they're BioPython objects. Only when we get the sequence out must we
have it as a unicode string. Their contents are always in the ASCII range
anyway.

Although `Bio.Seq.reverse_complement` works fine on Python byte strings (and
we used to rely on that), it crashes on a Python unicode string. So we take
care to only use it on BioPython sequence objects and wrote our own reverse
complement function for unicode strings (`mutalyzer.util.reverse_complement`).
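
A reverse complement on plain text strings could look roughly like this. It is a sketch only, not the actual implementation of `mutalyzer.util.reverse_complement`, and it handles just A, C, G, T and N (IUPAC ambiguity codes are left out).

```python
# Complement table for unambiguous DNA bases plus N, in both cases.
_COMPLEMENT = {
    'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A',
    'a': 't', 'c': 'g', 'g': 'c', 't': 'a',
    'N': 'N', 'n': 'n',
}


def reverse_complement(sequence):
    """Reverse complement of a DNA text string.

    Characters without an entry in the table are kept as they are.
    """
    return ''.join(_COMPLEMENT.get(base, base) for base in reversed(sequence))
```

Because it only uses a dictionary lookup and `join`, the function behaves the same on Python 2 unicode strings and Python 3 strings.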

As for point 5, SQLAlchemy already does a very good job of decoding from and
encoding to UTF8 for us.

The Spyne documentation has the following to say about their `String` and
`Unicode` types [3]:

> There are two string types in Spyne: `spyne.model.primitive.Unicode` and
> `spyne.model.primitive.String` whose native types are `unicode` and `str`
> respectively.
>
> Unlike the Python `str`, the Spyne `String` is not for arbitrary byte
> streams. You should not use it unless you are absolutely, positively sure
> that you need to deal with text data with an unknown encoding. In all other
> cases, you should just use the `Unicode` type. They actually look the same
> from outside, this distinction is made just to properly deal with the quirks
> surrounding Python-2's `unicode` type.
>
> Remember that you have the `ByteArray` and `File` types at your disposal
> when you need to deal with arbitrary byte streams.
>
> The `String` type will be just an alias for `Unicode` once Spyne gets ported
> to Python 3. It might even be deprecated and removed in the future, so make
> sure you are using either `Unicode` or `ByteArray` in your interface
> definitions.

So let's not ignore that and never use `String` anymore in our webservice
interface.

For the command line interface it's a bit more complicated, since there seems
to be no reliable way to get the encoding of command line arguments. We use
`sys.stdin.encoding` as a best guess.
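
The best-guess decoding of command line arguments might be sketched like this. The helper name `decode_argument` is hypothetical, and the fallback to UTF-8 when no encoding is reported is an assumption, not something stated above.

```python
import sys


def decode_argument(argument, encoding=None):
    """Decode a command line argument to text if it arrives as bytes."""
    if not isinstance(argument, bytes):
        # Already text (always the case for sys.argv on Python 3).
        return argument
    # sys.stdin.encoding may be None (on Python 2, for example, when input
    # is piped), so fall back to UTF-8 as a last resort.
    encoding = encoding or getattr(sys.stdin, 'encoding', None) or 'utf-8'
    return argument.decode(encoding)
```

A script would typically apply this once, right after argument parsing, so that the rest of the program only handles text.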

For us to interpret a sequence of bytes as text, it's key to be aware of their
encoding. Once decoded, a text string can be safely used without having to
worry about bytes. Without unicode we're nothing, and nothing will help
us. Maybe we're lying, then you better not stay. But we could be safer, just
for one day. Oh-oh-oh-ohh, oh-oh-oh-ohh, just for one day.

[1] https://docs.python.org/2.7/howto/pyporting.html
[2] http://python-future.org/unicode_literals.html
[3] http://spyne.io/docs/2.10/manual/03_types.html#strings

2a4dc3c1

Oct 15, 2014

Fix several error cases in LOVD2 getGS call · bcef1633

Vermaat authored 10 years ago

The `getGS` website view for LOVD2 would report "transcript not found" if
the genomic reference has multiple transcripts annotated or if the variant
description raises an error in the variant checker.

bcef1633

Sep 22, 2014
- Announcement in info webservice method · 763ab1f7
  Vermaat authored 10 years ago
  
  Closes #11
  763ab1f7
Aug 27, 2014
- Move from nose to pytest for unit tests · e6f19d1c
  Vermaat authored 10 years ago
  
  See http://pytest.org/
  e6f19d1c
Feb 17, 2014
- Update BioPython dependency to 1.63 · 0de48334
  Vermaat authored 11 years ago
  
  0de48334
Jan 22, 2014

Use fixtures in the unit tests · c49d49f0

Vermaat authored 11 years ago

This is The Good Stuff. The entire test suite can now be run without
having to set up a database or run the batch checker, any of the web
services, or the website. It even passes without an internet connection.
In, like, 30 seconds! Awesome!

This means tests don't randomly fail after some reference sequence
changes on the NCBI server and it doesn't take an entire configured
server with mapping database setup to run the tests. Those are things
of the past! No more frustrations, Mutalyzer is testable!
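
The idea behind such fixtures can be illustrated with a self-contained sketch: the test gets a throwaway database preloaded with known data, so it needs neither a configured server nor the network. The table name, accession and sequence below are made up for illustration and do not come from the Mutalyzer test suite.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def database_fixture():
    """Yield an in-memory database preloaded with one reference record."""
    connection = sqlite3.connect(':memory:')
    connection.execute(
        'CREATE TABLE reference_cache '
        '(accession TEXT PRIMARY KEY, sequence TEXT)')
    connection.execute(
        'INSERT INTO reference_cache VALUES (?, ?)',
        ('TEST_REF.1', 'ATGCATGC'))
    try:
        yield connection
    finally:
        # The database disappears as soon as the connection is closed.
        connection.close()


def test_reference_is_cached():
    with database_fixture() as connection:
        row = connection.execute(
            'SELECT sequence FROM reference_cache WHERE accession = ?',
            ('TEST_REF.1',)).fetchone()
    assert row == ('ATGCATGC',)
```

Since every test builds its own state, a change on a remote server can no longer make the suite fail.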

Going down now...

The mountain screamed three times today
I guess it thought it'd like to play
How much does one have to pay
To fry a peak and melt away
Launching titan's breath on mine
The sweating measure lands on time

And the old man, down by the river
Well he walks up and he walks on down
To the spaceship that's parked at your doorstep
And it's waiting to take you away now

Goin' down now
Goin' down now

Looking for the rate that crowed
He's hooked up down in Mexico
Slap my nerve now give me more
It's my disaster friend, not yours

And the old man, down by the river
Well he walks up and he walks on down
To the spaceship that's parked at your doorstep
And it's waiting to take you away now

And the last one, it's down by the river
Where he gets up and he walks on down
To the spaceship that's parked at your doorstep
And it's waiting to take you away now

It's down by the river, it's always this way now
It's down by the river, it's always this way now

Going down now
Going down now
now, now, now

down, down, down

c49d49f0

Jan 10, 2014

Port Mapping database module to SQLAlchemy · e9bf1bc9

Vermaat authored 11 years ago

This introduces a proper notion of genome assemblies. Transcript
mappings for all genome assemblies are in the same database, which
is better for maintenance. Updating transcript mappings is also
simplified a lot, especially from NCBI mapview files where we now
require a preprocessing sort on the input file.

Overall, this port touches a lot of Mutalyzer code, so beware.

e9bf1bc9

Jan 04, 2014
- Some fixes for running the unit tests · f2a6cc59
  Vermaat authored 11 years ago
  
  f2a6cc59
- Temporarily skip tests using AL449423.14 (no longer valid) · 323a8be1
  Vermaat authored 11 years ago
  
  323a8be1
Dec 23, 2013

Fix unit tests with SQLAlchemy · 94df7c07

Vermaat authored 11 years ago

This involves making the SQLAlchemy session reconfigurable at run-time,
which is done automatically on updating the Mutalyzer configuration using
configuration update callbacks.
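
The callback mechanism can be sketched in plain Python. The class names and the `DATABASE_URI` key are hypothetical, and `SessionFactory` is only a stand-in: a real version would rebind the SQLAlchemy session instead of storing a string.

```python
class Settings(object):
    """Dictionary-like settings that notify callbacks after each update."""

    def __init__(self, **values):
        self._values = dict(values)
        self._callbacks = []

    def on_update(self, callback):
        """Register a callback that receives the current settings as a dict."""
        self._callbacks.append(callback)

    def configure(self, **values):
        """Update settings and notify every registered callback."""
        self._values.update(values)
        for callback in self._callbacks:
            callback(dict(self._values))

    def __getitem__(self, key):
        return self._values[key]


class SessionFactory(object):
    """Stand-in for a database session factory bound to one URI."""

    def __init__(self):
        self.uri = None

    def reconfigure(self, values):
        # A real implementation would rebind the SQLAlchemy session here.
        self.uri = values['DATABASE_URI']


settings = Settings(DATABASE_URI='sqlite:///production.db')
sessions = SessionFactory()
settings.on_update(sessions.reconfigure)

# A test can now point the session at a throwaway in-memory database.
settings.configure(DATABASE_URI='sqlite://')
```

The test code only touches the configuration; the session follows along without being told explicitly.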

94df7c07

Dec 19, 2013
- Update unit tests for new style configuration · 135866c3
  Vermaat authored 11 years ago
  
  135866c3
Dec 13, 2013
- Fix unit tests in virtualenv · 37935324
  Vermaat authored 11 years ago
  
  37935324
Sep 18, 2013

Update unit tests to pass with latest NCBI data · 4ddd28ad

Vermaat authored 11 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@743 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

4ddd28ad

Jan 14, 2013

Accept trailing tabs in batch input · 2195ace0

Vermaat authored 12 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@663 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

2195ace0

Oct 05, 2012

Fix unit test for restriction site analysis · 8f31e51d

Vermaat authored 12 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@620 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

8f31e51d

Aug 21, 2012

Rename 'webservice' to 'web service' (for Peter) · 028f66b7

Vermaat authored 12 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@601 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

028f66b7

Jul 11, 2012

Update unit tests for r563 · 6445361f

Vermaat authored 12 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@567 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

6445361f

Jun 07, 2012

Unit test for non-interactive links · 6a90c42e

Vermaat authored 12 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@548 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

6a90c42e

May 30, 2012

Documentation on HTTP/RPC+JSON webservice · 450ed861

Vermaat authored 12 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/rpclib-branch@544 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

450ed861

May 09, 2012

Refactor name checker to accept GET requests · d5f1e49d

Vermaat authored 12 years ago

Until now, the name checker form used POST requests. As a special case for
linking from LOVD, GET requests were handled by showing results without the
full interface (header, menu, etc). Ordinary linking was done to a separate
checkForward location which then set a cookie with the variant name and
redirected to the name checker results. This was all very hackish and somewhat
broken (see Trac issue #94).

This commit refactors the name checker to use GET requests, so ordinary
bookmarking of result pages is possible. All old entrypoints are handled with
a redirect for backwards compatibility.

Worth mentioning is that variant descriptions in the name checker are now
limited in length by the maximum query string length. However, this maximum
is several thousand characters, with Internet Explorer having the lowest
maximum of just over 2000. In practice, longer descriptions are not checked
with the name checker web interface anyway, so this should not be a problem.
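
A sketch of building such a bookmarkable URL and guarding against the length limit. The `/check` path, the `name` parameter and the 2000-character cutoff are assumptions for illustration, not the actual Mutalyzer routes or limits.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

# Assumed conservative limit, based on the "just over 2000" figure above.
MAX_URL_LENGTH = 2000


def name_checker_url(description, base='https://mutalyzer.nl/check'):
    """Build a GET URL for a variant description (hypothetical route)."""
    url = base + '?' + urlencode({'name': description})
    if len(url) > MAX_URL_LENGTH:
        raise ValueError('Variant description too long for a GET request')
    return url
```

Note that percent-encoding makes the URL longer than the raw description, so the check is done on the encoded result.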

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@521 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

d5f1e49d

Feb 20, 2012

Unit tests for describe module · f143415d

Vermaat authored 13 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@483 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

f143415d

Jan 26, 2012

Fix unit test for commit -r449 · 8bdac8dd

Vermaat authored 13 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@456 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

8bdac8dd

Jan 08, 2012

Fix BED track for UCSC Genome Browser · 1f06518e

Vermaat authored 13 years ago

Previously, the 'color' field in the BED track we provide for the Genome
Browser was set to the empty string, but Gerard informed us that this is not
(always) handled correctly:

   Expecting number field 5 line 2 of https://mutalyzer.nl/bed?..., got +

We now explicitly set the 'color' field to '0'.
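
As an illustration, a BED line writer that never emits an empty field might look like this. The function name is hypothetical, the column order follows the standard nine-column BED layout (with the color in the last column), and the example values used with it are placeholders rather than real annotation.

```python
def bed_line(chrom, start, end, name, strand, score=0, color='0'):
    """Format one nine-column BED line; start is 0-based, end is exclusive."""
    # Columns: chrom, chromStart, chromEnd, name, score, strand,
    # thickStart, thickEnd, itemRgb (the color field).
    fields = [chrom, start, end, name, score, strand, start, end, color]
    fields = [str(field) for field in fields]
    if any(field == '' for field in fields):
        raise ValueError('BED fields must not be empty')
    return '\t'.join(fields)
```

Rejecting empty fields outright makes the kind of parse error quoted above show up on our side rather than in the Genome Browser.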


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@439 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

1f06518e

Nov 24, 2011

Add strandedness to BED tracks · 325130bb

Vermaat authored 13 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/browser-link-branch@423 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

325130bb

Nov 10, 2011

Add restriction site effects to batch output · 383b55f8

Vermaat authored 13 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@417 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

383b55f8

Oct 21, 2011

Fix url in BED track · df171de1

Vermaat authored 13 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/browser-link-branch@396 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

df171de1

Oct 10, 2011

Give an error on protein reference sequences · 18e228a6

Vermaat authored 13 years ago

Until protein reference sequences are supported, give an error with a message
instead of crashing. Fixes #62.

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@388 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

18e228a6

Accept IVS locations in Variant_info LOVD interface · 302ba055

Vermaat authored 13 years ago

The fix is a hack, basically copying the same functionality from the
namechecker. I think this should eventually be merged somehow.

Fixes #63


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@387 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

302ba055

Sep 29, 2011

Rename GenBank Uploader to Reference File Loader · 234ff625

Vermaat authored 13 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/refactor-mutalyzer-branch@370 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

234ff625

Sep 27, 2011

Add HEAD method to /Reference download · f8d9bab7

Vermaat authored 13 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/refactor-mutalyzer-branch@369 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

f8d9bab7

Fix crash in VariantInfo (thanks Ivo) · 6f20208c

Vermaat authored 13 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/refactor-mutalyzer-branch@368 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

6f20208c

Sep 20, 2011

Fix uploading non-genbank files · 3fbe50a1

Vermaat authored 13 years ago

- Issue a warning instead of a UD number when a user uploads something
  that is not a genbank file (fixes #59).
- Fix a Heisenbug in the batch job unit tests. Wait a little before
  downloading the batch job results.


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/refactor-mutalyzer-branch@365 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

3fbe50a1