Commit 2f33e62c authored by Vermaat

Move to Sphinx for developer documentation

This is quite a large commit, touching many things related to developer
documentation. It is all focussed on getting as much of this as possible
into the new Sphinx-based documentation.

Some highlights:

- Start Sphinx-based developer documentation, including fairly complete
  instructions for installation and configuration.
- Remove epydoc API docs.
- Rework some docstrings to conform to reStructuredText, so they can be
  used in the API docs generated by Sphinx.
- Move all of the top-level text files to reStructuredText so they can be
  linked from the Sphinx-based docs and for consistency.
- Remove many obsolete things from the extras/ directory, including old
  installation scripts and migrations.

Much of the installation-related documentation and many of the scripts are
removed or adapted in light of the new automated deployment using Ansible.
parent 47d5633b
Leiden University Medical Center department of Human Genetics <>
Jeroen Laros <>
Martijn Vermaat <>
Jonathan Vis <>
Gerben Stouten <>
Gerard Schaafsma <>
Daria Dvorkina <>
Alisa Muraveva <>
Mutalyzer 2 is designed and developed by Jeroen F.J. Laros at the department
of Human Genetics at Leiden University Medical Center where it is currently
maintained by Martijn Vermaat.
The following people have worked on developing Mutalyzer:
- Jeroen Laros
- Martijn Vermaat
- Jonathan Vis
- Gerben Stouten
- Gerard Schaafsma
- Daria Dvorkina
- Alisa Muraveva
Specifications are given by Peter E.M. Taschner and Johan T. den Dunnen.
Furthermore we would like to thank the following people for their valuable
work on previous versions that acted as a guideline for the development of the
current version:
- Ernest van Ophuizen
- Martin Wildeman
- Corinne Bareil
This is a record of changes made between each Mutalyzer release.
Version 2.0.beta-31
Release date to be decided.
Version 2.0.beta-30
Released on February 18th 2014.
- Handle NCBI Entrez response validation errors (fixes, among other things,
`LOVD#29 <>`_).
- Loosen error severity when CDS cannot be translated.
- Mutalyzer development migrated from Subversion to Git for version control.
Version 2.0.beta-29
Released on October 11th 2013.
- Add Jonathan Vis attribution and COMMIT logo to about page.
Version 2.0.beta-28
Released on September 18th 2013.
- Enable the HTTP/RPC+JSON web service to be used with POST requests.
Version 2.0.beta-27
Released on June 18th 2013.
- Fix caching transcript-protein links from NCBI, reducing impact of NCBI
communication problems.
Version 2.0.beta-26
Released on April 9th 2013.
- Added mm10 (Mouse) transcript mappings to position converter.
- LRG parser updated to LRG 1.7 schema (`#127
Version 2.0.beta-25
Released on March 25th 2013.
- Detect incorrect exon annotation in transcript references.
- Move documentation to Trac.
- Exon table is included in `runMutalyzer` webservice results.
- Temporarily disable frameshift detection in experimental description
extractor (`#124
- Allow selectors on transcript references in position converter.
- Syntax checker now supports protein level variant descriptions.
Version 2.0.beta-24
Released on December 10th 2012.
- Rename some warning codes (webservice API) (`#98
- Variants on mtDNA in position converter.
Version 2.0.beta-23
Released on November 8th 2012.
No user-visible changes.
Version 2.0.beta-22
Released on November 2nd 2012.
- Submitting batch jobs via the web services (`#115
- Allow for leading whitespace in batch job input (`#107
- New `descriptionExtract` webservice function.
- Name checker now includes description extractor output as an experimental
- Slice chromosome by gene name in reference file loader is now case
insensitive (`#118
- Warn on missing positioning scheme (`#114
Version 2.0.beta-21
Released on July 23rd 2012.
- Support compound variants in position converter.
- Support non-coding transcripts in position converter (`#102
- Move to new RPC library version, causing slight change in HTTP/RPC+JSON
webservice output (more wrappers around output), but fixes #104.
- Fix position converter for delins with explicit deleted sequence.
- Fix description update from Version 2.0.beta-20 to use- notation instead of
Version 2.0.beta-20
Released on July 21st 2012.
- Disabled the ``-u`` and ``+d`` convention in favour of the official HGVS
Version 2.0.beta-19
Released on June 21st 2012.
- Fix crash on inversions (`#99
Version 2.0.beta-18
Released on June 7th 2012.
- Moved from soaplib to rpclib for webservices (`#66
- Added HTTP/RPC+JSON webservice (`#18
- Fixed name checker errors in some adjacent variants (`#83
- Name checker form now uses GET requests to support easier linking to result
- You can now specify chromosomes by name in the reference file loader (`#92
- Made batch daemon not crash on MySQL restarts (`#91
- Position converter now detects incorrect order in position ranges (`#95
- Added NBIC logo to 'about' page.
Version 2.0.beta-17
Released on April 2nd 2012.
- Fixed crossmapping bug for some transcripts.
- Fixes for NCBI Entrez EFetch Version 2.0 release.
- Better chromosomal variant descriptions.
- Various smaller features and bugfixes.
Version 2.0.beta-16
Released on March 1st 2012.
- Fixed position converter mapping info for some transcripts.
- Fixed deletion with deleted sequence length as argument.
Version 2.0.beta-15
Released on February 20th 2012.
- Added 'Description Extractor' (see the main menu).
- Fixes for NCBI Entrez EFetch Version 2.0 release.
- Added chromosomal positions to `getTranscriptsAndInfo` webservice.
- Fixed chromosome slicing on reverse complement
- Fixed describing NOP variants with ``=``.
- Added Reference sequence info in `runMutalyzer` SOAP function response.
- Fixed mapping info for genes mapped to more than one chromosome.
- Various smaller features and bugfixes.
Version 2.0.beta-14
Released on January 26th 2012.
- Added a SOAP service `getTranscriptsMapping`.
- Various smaller features and bugfixes.
Version 2.0.beta-13
Released on January 25th 2012.
- Accept EX positioning scheme.
- Fix handling of LRG reference sequences.
- Various smaller features and bugfixes.
Version 2.0.beta-12
Released on November 25th 2011.
- Accept plasmid reference sequences.
- View variant position in UCSC Genome Browser (only for transcript
- Retry querying dbSNP if it does not respond the first time.
- Support reference GenBank files built from contigs.
- Add optional argument to SOAP service `numberConversion` to map chromosomal
locations to any gene.
- Various smaller features and bugfixes.
Version 2.0.beta-11
Released on September 30th 2011.
- Major code refactoring:
- Mutalyzer is now structured as a proper Python package.
- Reworked installation and upgrade procedure.
- Remote installation using Fabric.
- Batch scheduler is now a proper system daemon.
- Use mod_wsgi (with instead of the deprecated mod_python.
- Added a lot of internal documentation.
- Introduce unit tests.
- Handle deletions of entire exons.
- Added a SOAP service `info`.
- Handle unknown (fuzzy) intronic positions.
- Automatic synchronization of database and cache between Mutalyzer
- Use NCBI instead of UCSC for transcript mapping info.
- Added a SOAP service `getdbSNPDescriptions`.
- Moved Trac and Subversion repository to new server.
- Implement HTTP HEAD method for ``/Reference/*`` locations.
- Added a SOAP service `ping`.
- Added an optional versions parameter to the SOAP service `getTranscripts`.
- Various smaller features and bugfixes.
Version 2.0.beta-10
Released on July 21st 2011.
- Greatly reduce runtime for large batch jobs.
Version 2.0.beta-9
Released on June 27th 2011.
- Reworked the calculation of new splice site positions.
- Optionally restrict SOAP service `getTranscriptsAndInfo` transcripts to a
- Add raw variants to SOAP service `runMutalyzer` results.
- Provide webservice client examples.
- Various smaller features and bugfixes.
Older versions
The first lines of code for Mutalyzer 2.0 were written July 28th 2009, and
version 2.0.beta-8 was released on January 31st 2011. As far as Mutalyzer 1 is
concerned, archaeology is not really our field of research.
Mutalyzer development
Development of Mutalyzer happens on the GitLab server:
Coding style
In general, try to follow the PEP 8 guidelines for Python code and
PEP 257 for docstrings.
Unit tests
The unit tests depend on a running batch daemon, webserver, and SOAP
web service:
sudo /etc/init.d/mutalyzer-batchd start
sudo /etc/init.d/apache2 start
Now run the tests with:
nosetests -v
Or, if you are in a hurry, skip the long-running tests with:
Working with feature branches
New features are best implemented in their own branches, isolating the work
from unrelated developments. In fact, it's good practice to **never work
directly on the master branch** but always in a separate branch. For this
reason, the master branch on the GitLab server is locked. Feature branches can
be merged back into master via a **merge request** in GitLab.
Before starting work on your feature, create a branch for it:
git checkout -b your-feature
Commit changes on this branch. If you're happy with it, push to GitLab:
git push origin your-feature -u
Now create a merge request to discuss the implementation with your
colleagues. This might involve adding additional commits which are included in
the merge request by pushing your branch again:
git commit
git push
You may also be asked to rebase your branch on the master branch if it has
changed since you started your work. This will require a forced push:
git fetch
git rebase origin/master
git push -f
If the work is done, a developer can merge your branch and close the merge
request. After the branch was merged you can safely delete it:
git branch -d your-feature
Release management
The current Mutalyzer version is recorded in `mutalyzer/`. See the
comments in that file for more information on the versioning scheme.
In the event of a new release, the following is done:
emacs mutalyzer/
Update `__date__`, remove `dev` from `__version_info__` and set `RELEASE` to
git commit -am 'Bump version to 2.0.beta-XX'
git tag -a 'mutalyzer-2.0.beta-XX'
git push --tags
emacs mutalyzer/
Set `__version_info__` to a new version ending with `'dev'` and set `RELEASE`
to `FALSE`.
git commit -am 'Open development for 2.0.beta-YY'
Be sure to upgrade your installations to the new version as described in the
INSTALL file (e.g. `sudo python develop` for development checkouts).
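The version bookkeeping in the release steps above can be pictured with the sketch below. It is only an illustration: the tuple contents and the `format_version` helper are assumptions made for this example, not a copy of the actual version module in `mutalyzer/`.

```python
# Hypothetical sketch; the real version module in mutalyzer/ may differ.

def format_version(version_info, release):
    """Build a display string from a version tuple and a release flag.

    Raises ValueError when the 'dev' suffix and the release flag disagree,
    mirroring the two states described in the release steps.
    """
    parts = [str(part) for part in version_info]
    has_dev = bool(parts) and parts[-1] == 'dev'
    if release and has_dev:
        raise ValueError("a release must not carry the 'dev' suffix")
    if not release and not has_dev:
        raise ValueError("a development version must end with 'dev'")
    return '.'.join(parts)


# State at release time: 'dev' removed and the release flag set.
released = format_version(('2', '0', 'beta-30'), release=True)

# State right after opening development for the next version.
in_development = format_version(('2', '0', 'beta-31', 'dev'), release=False)

print(released)
print(in_development)
```

A check like this could catch a release commit that still carries the `dev` suffix, or a development commit that forgot to add it back.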
Development notes
Todo list:
- Improve the web interface design :)
- Test all uses of mkstemp().
- Use naming conventions for modules Crossmap, Db, File, GenRecord, Retriever
and Scheduler.
- Use standard logging module, with rotating functionality. Race conditions
on the log file are probably a problem in the current setup.
Instead of rotating the log ourselves, we could also use logrotate:
- Setup continuous integration. Currently, I'm most impressed with Hudson.
Or perhaps Jenkins.
- Migrate JavaScript to jQuery.
- I think in the long run, the Output object is not really the way to go. It
obscures the control flow. The logging part should use the standard logging
module. The data gathering by the Output object is probably better handled
by explicitly returning data objects from functions.
- Migrate from TAL to a more modern and maintained Python template library,
for example jinja.
- Develop a large test suite.
- Create a web interface url to watch the progress of a batch job.
- Create web services for the batch jobs (steal ideas from Jeroen's DVD
web service).
- Use virtualenv?
- Use SQLAlchemy?
- Password for MySQL user.
- In deployment, remove old versions of Mutalyzer package?
- Check for os.path.join vulnerabilities.
- Use a standard solution for the database migrations in extras/migrations.
- Use something like Sphinx to generate development documentation from code.
- There are some problems with the batch architecture, especially that there
cannot be multiple workers without synchronisation problems.
Good read:
- Have a normal 404 page.
- Maintenance (and/or read-only) mode.
- Cleanup this document.
- Be more explicit about all the types of descriptions we don't currently
support.
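For the todo item on moving to the standard logging module with rotation, a minimal sketch might look like the following. The logger name, size limit, backup count, and temporary log path are assumptions chosen for illustration. Note that `RotatingFileHandler` does not synchronise between processes, so on its own it would not resolve the race conditions mentioned in the list when several processes write to one file.

```python
import logging
import logging.handlers
import os
import tempfile


def make_rotating_logger(path, max_bytes=1024 * 1024, backup_count=5):
    """Return a logger that rotates its file once it reaches max_bytes."""
    # 'mutalyzer-sketch' is a hypothetical logger name for this example.
    logger = logging.getLogger('mutalyzer-sketch')
    logger.setLevel(logging.INFO)
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(
        logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
    logger.addHandler(handler)
    return logger


# Write to a temporary directory here; a real setup would use the
# configured log file location instead.
log_path = os.path.join(tempfile.mkdtemp(), 'mutalyzer.log')
logger = make_rotating_logger(log_path)
logger.info('Batch job started')
```

With `backup_count=5`, up to five rotated files are kept next to the active log file before the oldest is discarded.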
Code style guide:
- Follow PEP 8 (code) and PEP 257 (docstrings).
Read the Google Python Style guide:
- Use Epydoc style documentation in docstrings.
- End class and method definitions with their name as comment.
- Executables are in the bin/ directory.
- For examples, check established Python projects:
- A lot of code does not yet adhere to these points, this is an ongoing
Obsoleted features:
- On
/etc/apache2/mods-enabled/rewrite.load contains a rewrite rule that converts
"Variant_info.php" to "Variant_info".
When all LOVD versions are above 2.0-23, this rule can be deleted and the
rewrite module can be disabled.
- In the Variant_info() function a substitution on error messages is
When all LOVD versions are above 2.0-23, this check can be deleted.
Mutalyzer depends on the following (Debian/Ubuntu) packages:
- mysql-server >= 5.1
- python >= 2.6
- python-mysqldb >= 1.2.2
- python-biopython >= 1.54
- python-pyparsing >= 1.5.0
- python-configobj >= 4.4.0
- python-magic >= 5.04-2
- python-psutil >= 0.1.3-1
- python-xlrd >= 0.6.1-2
- python-daemon >= 1.5.5
- python-soappy >= 0.12.0-2
- python-suds >= 0.3.9-1
The web and SOAP interfaces depend on the following packages:
- apache2 >= 2.2.11
- libapache2-mod-wsgi >= 2.8
- python-webpy >= 0.33
- python-rpclib >= 2.8.0-beta
- python-simpletal >= 4.1-6
Automatic remote deployment depends on Fabric:
- fabric >= 0.9.0-2
The unit tests depend on the following packages:
- python-nose >= 0.11
- python-webtest >= 1.2.3
As of 2011-08-23, snakefood reports the following imports from the Mutalyzer
source code (excluding the standard library imports):
Mutalyzer installation instructions
Default configuration notes
The instructions in this file are quite specific to the standard Mutalyzer
environment. This consists of a Debian stable (Squeeze) system with Apache
and Mutalyzer using its mod_wsgi module. Debian conventions are used
The following is an overview of default locations used by Mutalyzer:
Package files            /usr/local/lib/python2.6/dist-packages/...
Configuration            /etc/mutalyzer/config
Log file                 /var/log/mutalyzer.log
Cache directory          /var/cache/mutalyzer
Batchd init script       /etc/init.d/mutalyzer-batchd
Mapping update crontab   /etc/cron.d/mutalyzer-mapping-update
Apache configuration     /etc/apache2/conf.d/mutalyzer.conf
Static website files     /var/www/mutalyzer/base
The default database user is 'mutalyzer' with no password and the database
names are 'mutalyzer', 'hg18', and 'hg19'.
By default, Mutalyzer is exposed under the '/mutalyzer' url by Apache.
All Mutalyzer processes run under the www-data user and files created and/or
modified by Mutalyzer are owned by this user.
If you have a different environment, or want to customize the default
locations, you can read through these instructions and modify them to your
Short version
Run the following commands:
git clone
cd mutalyzer
sudo bash extras/
sudo python install
sudo bash extras/
sensible-browser http://localhost/mutalyzer
Or follow the more detailed instructions below.
Automated deployment on a remote host
To deploy Mutalyzer on a remote (production or testing) host, we recommend
automating the steps described below using Fabric and the included
fabfile. You need Fabric installed on your local machine:
easy_install fabric
To do a deployment on a server with an existing configured Mutalyzer
fab deploy -H
To do a fresh deployment on a new server:
fab deploy:boostrap=yes -H
Get Mutalyzer
Since you are reading this, you can probably skip this step. Otherwise, get
your hands on a tarball and:
tar -zxvf mutalyzer-XXX.tar.gz
cd mutalyzer-XXX
Or get the source from GitLab directly:
git clone
cd mutalyzer
Install dependencies
If you are on Debian or Ubuntu, you can use the following command to install
all dependencies:
sudo bash extras/
Otherwise, install them manually (perhaps have a look in the above script for
a useful dependency list).
Install Mutalyzer
Mutalyzer can be installed using Python setuptools. For a production
sudo python install
Alternatively, if you want to have a development environment, use: