Newer
Older
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
.. highlight:: none
.. _admin:
Administration
==============
For administrative tasks, Mutalyzer comes with the ``mutalyzer-admin`` command
line utility. Use its ``-h`` argument in combination with any subcommand for
detailed usage information, for example::
$ mutalyzer-admin setup-database -h
usage: mutalyzer-admin setup-database [-h] [--destructive] [-c ALEMBIC_CONFIG]
Setup database tables (if they do not yet exist).
optional arguments:
-h, --help show this help message and exit
--destructive delete any existing tables and data
-c ALEMBIC_CONFIG, --alembic-config ALEMBIC_CONFIG
path to Alembic configuration file
If Alembic config is given (--alembic-config), this also prepares the database
for future migrations with Alembic (recommended).
Managing genome assemblies
--------------------------
Mutalyzer can be loaded with any number of genome assemblies. Each assembly
includes a list of chromosomes. To list the currently loaded genome
assemblies::
$ mutalyzer-admin assemblies list
GRCh37 (hg19), Homo sapiens (9606)
GRCm38 (mm10), Mus musculus (10090)
Loading a new genome assembly is done with information in a JSON file, of
which there are some examples in the Mutalyzer source tree under the
``extras/assemblies`` directory, for example::
$ mutalyzer-admin assemblies add extras/assemblies/GRCh37.json
For any genome assembly, transcript mappings can be imported. These include
genomic coordinate mappings of their CDS and exons. Currently, three sources
of transcript mappings are supported.
.. note:: The following ``mutalyzer-admin assemblies`` subcommands all accept
an optional ``--assembly`` argument to specify the genome assembly
to import to.
Import mappings from an NCBI mapview file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The NCBI provides FTP downloads of transcript mappings for a large number of
genome assemblies as used by their `Map Viewer
<http://www.ncbi.nlm.nih.gov/mapview/>`_ service. These can be imported with
``mutalyzer-admin``, but only after sorting by the *feature_id* and
*chromosome* columns.
For example, to import transcript mappings for the GRCh37 assembly, run the
following::
$ wget ftp://ftp.ncbi.nlm.nih.gov/genomes/MapView/Homo_sapiens/sequence/ANNOTATION_RELEASE.105/initial_release/seq_gene.md.gz
$ zcat seq_gene.md.gz | sort -t $'\t' -k 11,11 -k 2,2 > seq_gene.sorted.md
$ mutalyzer-admin assemblies import-mapview seq_gene.sorted.md 'GRCh37.p13-Primary Assembly'
.. note:: The last argument, ``GRCh37.p13-Primary Assembly``, defines the group
label to filter the file on. You would usually want to include it.
Examples for other assemblies can be found in `this Gist
<https://gist.github.com/martijnvermaat/ce84945d05b4e42d3584>`_.
Import mappings from an EBI LRG transcripts map file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The EBI provides `FTP downloads <ftp://ftp.ebi.ac.uk/pub/databases/lrgex/>`_
of transcript mappings for all of the LRG sequences on the latest human genome
assembly. These can be imported with ``mutalyzer-admin``.
For example, to import LRG transcript mappings for the GRCh37 assembly, run
the following::
$ wget ftp://ftp.ebi.ac.uk/pub/databases/lrgex/list_LRGs_transcripts_GRCh37.txt -O /tmp/hg19.lrgmap.txt
$ mutalyzer-admin assemblies import-lrgmap -a hg19 /tmp/hg19.lrgmap.txt
Import mappings from the UCSC Genome Browser MySQL database
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Transcript mappings from the `UCSC Genome Browser MySQL database
<https://genome.ucsc.edu/goldenPath/help/mysql.html>`_ can be imported on a
per-gene basis. This is useful when the NCBI mappings do not (yet) include a
certain gene or transcript.
For example, to import all TTN transcript mappings::
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
.. note:: This subcommand chooses the UCSC genome assembly by using the alias
of the specified Mutalyzer genome assembly (`hg19` by default).
Import mappings from a reference file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For transcript mappings that are not available from our usual sources,
importing from a genomic reference is supported::
$ mutalyzer-admin assemblies import-reference NC_012920.1
.. note:: Currently this subcommand is restricted to importing mtDNA
transcripts, since it has the chromosome hard coded and only
supports one exon per transcript.
Showing announcements to users
------------------------------
It is possible to define an announcement to be shown on the website
interface. For example, to display *Hello World!* with a link to the `GNU
Hello World! <http://www.gnu.org/fun/jokes/helloworld.html>`_ page::
$ mutalyzer-admin announcement set 'Hello World!' \
--url http://www.gnu.org/fun/jokes/helloworld.html
To remove the announcement, use ``unset``::
$ mutalyzer-admin announcement unset
Synchronizing the cache with other installations
------------------------------------------------
Using the ``sync-cache`` subcommand, the reference file cache of a remote
Mutalyzer installation can be queried for new entries which are then retrieved
and added to the local cache.
The primary purpose for this is synchronizing reference files loaded by users
with the reference file loader between different servers. These reference
files are assigned a unique accession number (starting with ``UD_``) upon
creation, which is at that point unknown to any other Mutalyzer server.
For example, to synchronize the local reference file cache with the `primary
Mutalyzer server <https://mutalyzer.nl/>`_::
$ mutalyzer-admin sync-cache 'https://mutalyzer.nl/services/?wsdl' \
'https://mutalyzer.nl/Reference/{file}'
Mutalyzer database setup
------------------------
After installation, a database needs to be setup for Mutalyzer to run (see
:ref:`install-setup`)::
$ mutalyzer-admin setup-database --alembic-config migrations/alembic.ini
The ``--alembic-config`` argument points to the ``alembic.ini`` file in the
Mutalyzer source tree and it enables initialization of database migration
management. It is recommended to include it, but you don't need it if you
don't plan to ever upgrade your Mutalyzer installation.
This subcommand also takes an optional ``--destructive`` argument, which can
be used to remove any existing database content.