Commits · 48bb332db0d8e6b89dea8957d1c4cdf16ce5c964 · Mirrors / mutalyzer

Jun 14, 2010

Commit to do a merge with web_dev. This version is not suitable for · 48bb332d

Laros authored 14 years ago

distribution as it is under heavy development.

Most modules will have minor changes because of a difference in set up of both
the Db and Config module.

install.sh:
- Added functionality to enable the cron restart of the Batch Checker.
- Added the auto-generation of a .htaccess file.
- Added permission settings.

mutalyzer.conf:
- Added configuration options for the Scheduler, File and GenRecord modules.

Db.txt:
- Described how to make the new ChrName tables for hg18 and hg19.

errorcodes.txt:
- Added classifications to the messages.

doc:
- Made a set up for the documentation.
  TechnicalReference:
  - This will be a technical document that describes the internals of the
    project. It is only meant for developers.
  API:
  - This is a description of the API; it is auto-generated by the mkapidoc.sh
    script. Also only meant for developers.

Mutalyzer.py:
- Added a new roll function that will always find both boundaries.
- Implemented a new protein naming scheme.
- Fixed the trimming of a delins.
- Rewrote the processing of a variant. 
  - Moved post processing of the GenBank record to the GenRecord module.
  - Moved the crossmapper instance to the GenBank module, to make one instance
    per transcript variant.
  - Moved the naming of a variant to the GenBank module, as it strongly
    interacts with the crossmapper instance.
- Moved the constructCDS function to the GenRecord module.
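
The roll function mentioned above (finding both boundaries) can be sketched as
follows. This is a minimal illustration, not the code from this commit: the
name roll() comes from the message, but the signature and the 1-based inclusive
coordinates are assumptions.

```python
def roll(sequence, first, last):
    """Return (left, right): how many positions the range [first, last]
    (1-based, inclusive; an assumption of this sketch) can be shifted in
    each direction without changing the result of deleting it."""
    pattern = sequence[first - 1:last]
    length = len(pattern)
    if length == 0:
        raise ValueError("empty range")

    # Walk to the left while the preceding base matches the pattern cyclically.
    left = first - 2
    j = length - 1
    while left > -1 and sequence[left] == pattern[j % length]:
        j -= 1
        left -= 1

    # Walk to the right in the same way.
    right = last
    j = 0
    while right < len(sequence) and sequence[right] == pattern[j % length]:
        j += 1
        right += 1

    return first - left - 2, right - last
```

For example, roll("GATTTTC", 3, 3) returns (0, 3): deleting the T at position 3
is equivalent to deleting any T up to position 6.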

handler.py:
- Added functionality for the batch checker (retrieve results).
- Added functionality for the genbank uploader (retrieve GenBank files).

webservice.py:
- Modified to work with the new Db module.

UCSC_update.py:
- Modified to work with the new Db module.

GenRecord.py:
- Replaced the dictionary structure with a nested list structure to make
  iteration more convenient.
- Added names to the Locus and Gene objects.
- Added all information needed to do a crossmapping in the Locus object.
- Wrote functions to find Loci and Genes.
- Wrote a function that expands a description of a variant (coupled to a Locus).

Mutator.py:
- Added documentation.

Parser.py:
- Added documentation.

Web.py:
- Added documentation.
- Added a function that checks whether a string is an e-mail address.
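
A check of the kind described for Web.py could look like the sketch below. The
function name and the pattern are assumptions made for illustration; the commit
does not say how the check is implemented, and the pattern is deliberately
simple rather than a full address validator.

```python
import re

# Hypothetical, deliberately simple pattern: local part, "@", and a domain
# that contains at least one dot.
_EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def is_email(text):
    """Return True if the whole string looks like an e-mail address."""
    return _EMAIL_RE.fullmatch(text) is not None
```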

Scheduler.py:
- Implemented a batch scheduler that uses a MySQL database for queueing.

File.py:
- Implemented a CSV, XLS and ODS parser for use in the Scheduler module.

Output.py:
- Added documentation.

Mapper.py:
- Modified the complex object initialisation.

Config.py:
- Made subclasses to configure the separate modules.

Db.py:
- Added documentation.
- Split the Db module into different classes according to functionality; they
  all inherit the query function from the Db base class.
- Added chromosome accession number to name conversion functions and vice versa.
- Added functionality for the batch checker.
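
The conversion between chromosome accession numbers and names noted for Db.py
can be illustrated with the sketch below. In the commit this information lives
in the ChrName database tables; the dictionary and the function names here are
stand-ins chosen for the example, and only two hg19 entries are shown.

```python
# Stand-in for the ChrName table of one build (hg19). The real data is
# stored in a database; this dictionary is only for illustration.
ACC_TO_NAME = {
    "NC_000001.10": "chr1",
    "NC_000023.10": "chrX",
}
NAME_TO_ACC = {name: acc for acc, name in ACC_TO_NAME.items()}


def chrom_name(accession):
    """Return the chromosome name for an accession number, or None."""
    return ACC_TO_NAME.get(accession)


def chrom_acc(name):
    """Return the accession number for a chromosome name, or None."""
    return NAME_TO_ACC.get(name)
```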

Crossmap.py:
- Added documentation.

Retriever.py:
- Added documentation.
- Added fall back functionality when searching for a gene.

index.py:
- Added a batch submit interface.

batch.html:
- The layout of the batch submit interface.



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@30 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

May 19, 2010

Merged /branches/web_dev:r20-26 into the trunk. · bdee40a2

Laros authored 14 years ago

Modified the merged changes to work with the new output module.

Modified the layout of the project.
- Removed the Clients, Interfaces and Services subdirectories.

handler.py, Web.py, index.py:
- Modified for the new layout.

Variant_info.py is renamed to VarInfo.py because of a conflict of names with 
the function Variant_info() in index.py.
VarInfo.py:
- Modified for the new layout.

webservice.py:
- Modified for the new layout and for the new Output module.

Mapper.py:
- Added by the merge and modified for the new Output module.



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@27 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

May 18, 2010

Altered the Output module in such a way that all messages are stored in a · 4a8a9986

Laros authored 14 years ago

list with priorities, and all output is stored in a dictionary. This dictionary
can be read at a later time.
The Retriever module is changed to accommodate for uploaded GenBank files and
slices of a chromosome.
A Scheduler was added for batch checking.

mutalyzer.conf:
- Added variables for the Retriever module:
  - maxDldSize, minDldSize ; Maximum and minimum sizes for slices and uploaded 
    GenBank files.
- Added variables for the Output module:
  - loglevel, outputlevel ; Specify default verbosity levels for logging and 
    output.
- Added variables for the Scheduler module:
  - processName ; Name of the running scheduler.

Db.txt:
- Added information on how to create the newly used tables GbInfo, BatchQueue
  and BatchJob.

errorcodes.txt:
- Short description of the error codes used in the Output module.

webservice.py:
- Added a Complex class to test more complicated return types (see the web_dev
  branch).
- Modified the code to work with the new Output module.

index.py:
- Made a first upload page.

Mutalyzer.py:
- Modified the code to work with the new Output module.

UCSC_update.py:
- Modified the code to work with the new Output module.

Variant_info.py:
- Modified the code to work with the new Output module.

GenRecord.py:
- Added documentation.

Mutator.py:
- Modified the code to work with the new Output module.

Misc.py:
- New file, used for generating unique IDs.

Parser.py:
- Made a change to the definition of an UD accession number.
- Modified the code to work with the new Output module.

Scheduler.py:
- Made a batch checker scheduler.
  - isDaemonRunning() ; See if we need to be started.
  - process() ; Start the batch checker.
  - addJob() ; Add jobs to a queue in the database.
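
The queueing idea behind addJob() and process() can be sketched as below. The
commit uses a MySQL database; this sketch swaps in an in-memory SQLite database
so that it is self-contained. The table name BatchQueue appears in Db.txt, but
the column names, the function signatures and the check callback are
assumptions.

```python
import sqlite3


def open_queue():
    """Create an in-memory queue table (SQLite stand-in for MySQL)."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE BatchQueue ("
        "QueueID INTEGER PRIMARY KEY AUTOINCREMENT, "
        "JobID TEXT NOT NULL, Input TEXT NOT NULL)")
    return connection


def add_job(connection, job_id, variants):
    """Add every variant of a job to the queue."""
    connection.executemany(
        "INSERT INTO BatchQueue (JobID, Input) VALUES (?, ?)",
        [(job_id, variant) for variant in variants])
    connection.commit()


def process(connection, check):
    """Take entries from the queue in order until it is empty."""
    results = []
    while True:
        row = connection.execute(
            "SELECT QueueID, JobID, Input FROM BatchQueue "
            "ORDER BY QueueID LIMIT 1").fetchone()
        if row is None:
            break
        queue_id, job_id, variant = row
        results.append((job_id, variant, check(variant)))
        connection.execute(
            "DELETE FROM BatchQueue WHERE QueueID = ?", (queue_id,))
        connection.commit()
    return results
```

Keeping the queue in a database rather than in memory means that a scheduler
restarted from cron can continue where the previous process stopped.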

Output.py:
- Added a Message class to store all debug, info, warning, error and fatal
  messages.
  - If a message is given that exceeds the configured log level, it will be
    logged immediately.
- A function is added to the Output class to read all messages that exceed a 
  certain verbosity level.
- A function is added to create a named list as an output node.
- With the getOutput function the content of this list can be retrieved.
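
A rough sketch of the message and output handling described for Output.py is
given below. The class layout, the method names and the numeric levels are
assumptions; only the ideas (a list of messages with priorities, immediate
logging above a threshold, named output lists) come from the notes above.

```python
class Message:
    """A message with its origin, a numeric level and a description."""

    def __init__(self, origin, level, text):
        self.origin = origin
        self.level = level  # Assumed scale: 0 = debug ... 4 = fatal.
        self.text = text


class Output:
    def __init__(self, log_level=3):
        self.log_level = log_level
        self.messages = []  # All messages, in order of arrival.
        self.output = {}    # Named output lists.
        self.log = []       # Stand-in for the log file.

    def add_message(self, origin, level, text):
        self.messages.append(Message(origin, level, text))
        # Log immediately when the level reaches the configured threshold
        # ("at or above" is an assumption of this sketch).
        if level >= self.log_level:
            self.log.append("%s: %s" % (origin, text))

    def get_messages(self, min_level):
        """Return the text of all messages at or above min_level."""
        return [m.text for m in self.messages if m.level >= min_level]

    def add_output(self, name, value):
        self.output.setdefault(name, []).append(value)

    def get_output(self, name):
        return self.output.get(name, [])
```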

Config.py:
- Several sub-classes were added for each configurable module.

Db.py:
- Added documentation.
- Added functionality that is used by the Retriever module.
- Added functionality that is used by the Scheduler module.

Retriever.py:
- Added functionality to be able to use custom GenBank files and chromosome
  slices.
  - Information on these created files is stored in a database to be able to
    re-create them when the cache is cleaned.
  - The hash of each file is stored for error detection.
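
Storing a hash for error detection can be sketched as follows. The commit does
not say which hash function is used; MD5 and the function names below are
assumptions for this example.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, stored_hash):
    """Compare the current digest of a file with a previously stored one."""
    return file_md5(path) == stored_hash
```

A re-created or re-downloaded file whose digest differs from the stored value
can then be rejected.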

BatchChecker.py:
- A wrapper that is called either from the addJob() function from the Scheduler
  module, or from cron. It dispatches a background process that processes the
  batch jobs.

gbupload.html:
- Test template for uploading files (copied from Mutalyzer 1.0.4).

sp.py:
- Some tests with a complex return type.

download.html:
- Did some first tests with a METAL template.



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@26 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

May 12, 2010

In mutalyzer.conf: · 6f527e78

Gerard Schaafsma authored 14 years ago

changed dbName = "hg18"
In webservice.py
changed the names of arguments v1, v2, v3 and v4 in all methods to 'more readable' names
changed C and D to Conf and Database in all methods

method extractChange was added, which extracts from a complete HGVS variant description the part after the coordinates (positions)
and the start position of the variant

method cTogConversion was added, which converts a complete HGVS variant description in c. notation to g. notation

method gTocConversion was added, which converts a complete HGVS variant description in g. notation to c. notation

In Mapper.py
method conversionToCoding was added, which converts non-star c. positions to star c.positions
this comment should have been added with the previous commit

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/web_dev@25 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

· 51dd89e6

Gerard Schaafsma authored 14 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/web_dev@24 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

May 03, 2010

In webservice.py, the method varInfo() was replaced by two methods: · c0561c57

Gerard Schaafsma authored 14 years ago

mappingInfo() and transcriptInfo(). These methods are used, respectively,
when a variant is present, or when it is not.
Mapper.py is the new version of Variant_info.py, with the two methods
described above defined here: mainMapping and mainTranscript
Config.py and Crossmap.py were not actually changed.

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/web_dev@23 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Apr 21, 2010

Added to Install.txt: · 37a61215

Gerard Schaafsma authored 15 years ago

	python-soaplib   >= 0.8.1
which had been forgotten

Added to webservice.py:
	from Variant_info import Complex
because you use the Complex object in the SOAP decorator
	@soapmethod(String, String, String, String, _returns = Complex)
Changed some stuff in the documentation
Replaced the return type in the SOAP decorator of varInfo()
	@soapmethod(String, String, String, String, _returns = Complex)
The whole definition of varInfo() was added:
	+        import Variant_info
	+        from Modules import Web
	+        from Modules import Config
	+        from Modules import Db
	+        from Modules import Output
	+    
	+        C = Config.Config()
	+        D = Db.Db(C, "local")
	+        L = Output.Output(C, __file__)
	+    
	+        L.LogMsg(__file__, "Reveived request varInfo(%s %s %s %s)" % (
	+                 v1, v2, v3, v4))
	+
	+        W = Web.Web()
	+        result = Variant_info.main(v1, v2, v3, v4)
	+        del W
	+
	+        L.LogMsg(__file__, "Finished processing varInfo(%s %s %s %s)" % (
	+                 v1, v2, v3, v4))
	+
	+        del L
	+        del D
	+        del C
	+#        return str(result.split("\n")[:-1])
	+        return result
Added to webservice.py:
	+    @soapmethod(String, String, String, String, _returns = Complex)
	+    def mapInfo(self, v1, v2, v3, v4) :
	and so on
because varInfo() now lacks the ability to deal with the possibility that
the variant (v4) is not provided, this functionality has been transferred to
varMap() in Variant_map.py

Variant_info.py has also been copied to /src/Services
Variant_map.py has also been copied to /src/Services

In Variant_info.py a new Python object was defined:
	+class Complex(ClassSerializer) :
	and further
to return an object holding the information about a variant, and not a string
as was previously done, and which had to be parsed.
This also includes the type code definition with TC.Struct:
	+Complex.typecode = TC.Struct(Complex, [ TC.Integer('startmain'),
	+                                        TC.Integer('startoffset'),
	+                                        TC.Integer('endmain'),
	+                                        TC.Integer('endoffset'),
	+                                        TC.Integer('start_g'),
	+                                        TC.Integer('end_g'),
	+                                        TC.String('mutationType') ], 'Complex')
	+
	+
Improved the following error messages:
	if not db_version :
	if db_version != version :
Added an error message:
	if not var :
because this functionality moved to Variant_map.py
Changed the return type from string to the Complex object V
And changed the main part to:
	-    __process(LOVD_ver, build, acc, var, Conf, O)
	+    result = __process(LOVD_ver, build, acc, var, Conf, O)
	+    return result
Changed the following stuff in Config.py:
	+        # Figure out where this program is located and go two levels up.
	+        import os
	+        myPath = os.path.dirname(__file__) + "/../.."
	+        os.chdir(myPath)
which is necessary for finding the mutalyzer.conf file



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/web_dev@22 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Apr 15, 2010

Added protein descriptions in case of a mutation to a stop codon, and a silent · aa545ff8

Laros authored 15 years ago

mutation.


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@21 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Second alpha release. · f87343a1

Laros authored 15 years ago

The classes Mutator, Output and Db are now derived classes of Config. The
class Retriever is a derived class of Output. This reduces the amount of
code and variable passing significantly.

mutalyzer.conf:
- Converted the dbName variable to a dbName list, to accommodate for more than
  one database.
- Added variables flanksize, maxvissize and flankclipsize for the visualisation
  in Mutator.py (instead of the previous alignment).

Mutalyzer.py:
- Resolved the range-swap issues.
- Resolved the reverse-complement (cosmetic) issues.
- Fixed a cosmetic bug in the __bprint() function.
- Added a new function __nsplice(), to accommodate for a CDS extension.
- Added a function __toProtDescr(), that gives a protein description in case
  of a simple substitution.
- Added functionality for n. m. and EST notations.
- Added functionality for other species (translation tables).
- Corrected the roll-rule for insertions.
- Added fallbacks for missing CDS and mRNA lists and positions (for an EST for
  example).
- Added an input check for wrong gene symbols.
- Added a temporary exception for in frame stop codons.

webservice.py:
- Added the private functions __checkBuild(), __checkChrom() and __checkPos()
  that do routine checks in a number of services.
- Added a 'build' variable to getTranscripts(), getTranscriptsRange() and
  getGeneName()
- Added exceptions that raise a Fault() object to make the client receive a
  SOAP exception.

UCSC_update.py:
- Added functionality for more than one database.

Variant_info.py:
- Added functionality for more than one database.
- Added functionality to deal with non-coding transcripts.

GenRecord.py:
- Made a RecordObj() object that consists of the old 'genelist' dictionary, 
  combined with 'mol_type' and 'organelle' variables. Also, a fake gene named
  'source' is included to accommodate for sequences that do not contain any
  annotated genes (an EST for example).
- The Locus object is extended with a 'txTable' variable to accommodate for
  different organelles (mitochondria) and other species.

Mutator.py:
- Replaced the alignment visualisation by a home-made one. Also see the
  variables that were added to the configuration file to alter the behaviour of
  this visualisation.
- Modified the shiftpos() function when inserting something on a splice site
  boundary (now it extends the exon).

Output.py:
- Minor modifications for the new inheritance scheme.

Config.py:
- Minor modifications for the new inheritance scheme.

Db.py:
- Minor modifications for the new inheritance scheme.
- Added functionality to handle multiple databases.
- Added an isChrom() function, used by the webservices check functions.

Crossmap.py:
- Made a change in the usage of the '__STOP' variable. It is set to
  transcription stop if there is no stop codon present. This makes conversion
  to an n. notation trivial.

Retriever.py:
- Minor modifications for the new inheritance scheme.
- Added a check for invalid accession numbers (or versions).
- Added a check for erroneous genbank files that can occur when the NCBI is
  overloaded. The erroneous file is purged and the user can try again.

templates/sp.py
- Modified the sample code to accommodate for the new 'build' variable.



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@20 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Feb 26, 2010

Branch web_dev created. · 2fab03f4

Laros authored 15 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/web_dev@19 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Made a first start with a METAL template, added a documentation web page · 1957ecfc

Laros authored 15 years ago

(autogenerated from source) and wrote checks for the Delins.

handler:
- Added handling of everything in the templates/base directory.

webservice:
- Documented the Var_info webservice.

index:
- Added a documentation page, which generates documentation from source.

Parser:
- Changed argument parsing for Indel, Ins and Inv.

Web:
- Added a tal2() function to test METAL templates.

Mutalyzer:
- Added checks for Delins (trim the longest common prefix and the longest
  common suffix of Arg1 and Arg2).
- Made a general function to check optional arguments.
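
The Delins check described above (trimming the longest common prefix and the
longest common suffix of Arg1 and Arg2) can be sketched as follows. The
function name and the return value are assumptions made for this example.

```python
def trim_common(arg1, arg2):
    """Remove the longest common prefix, then the longest common suffix,
    from two strings. Returns (arg1, arg2, prefix_length, suffix_length)."""
    # Longest common prefix.
    lcp = 0
    limit = min(len(arg1), len(arg2))
    while lcp < limit and arg1[lcp] == arg2[lcp]:
        lcp += 1
    arg1, arg2 = arg1[lcp:], arg2[lcp:]

    # Longest common suffix of what remains.
    lcs = 0
    limit = min(len(arg1), len(arg2))
    while lcs < limit and arg1[-1 - lcs] == arg2[-1 - lcs]:
        lcs += 1
    if lcs:
        arg1, arg2 = arg1[:-lcs], arg2[:-lcs]

    return arg1, arg2, lcp, lcs
```

For example, trim_common("ACGTT", "ACCTT") returns ("G", "C", 2, 2), which
shows that this delins is in fact a substitution. When one of the trimmed
strings is empty, the variant is a plain insertion or deletion.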



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@18 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Feb 19, 2010

Removed the obsoleted WSDL file. · fbf9b333

Laros authored 15 years ago


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@17 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Enhanced the mapping capabilities of the Db module, added a new webservice and · f49bc7f6

Laros authored 15 years ago

wrote documentation.

Made a cumulative mapping info table; this requires regular polling of new
data from the UCSC:
- Added:
  - install.sh: A preliminary installation script. Now only used for cron 
    entries.
  - Db.txt: Some loose documentation on how to make the new mapping table, to 
    be incorporated with an installation script.
  - src/UCSC_update.py: The update program, to be called from cron each day.
- Modified:
  - mutalyzer.conf: Added variables needed for the remote database of the UCSC.
  - Db.py: Rewrote nearly every SQL query to work with the new mapping table
    and to be able to download and import updates from the UCSC.
  - Config.py: Modifications to work with the new configuration variables.

Added:
- templates/sp.py: A webservice client template script.
- templates/download.html: The download page for developers.

Modified:
- Install.txt: Added more dependencies.

Switched to soaplib for the generation of a WSDL file. Webservices are now
published by adding functions to the MutalyzerService class in webservice.py;
each function should have a soapmethod decorator to specify the types.

Modified:
- handler.py: To work with soaplib.
- webservice.py: 
  - Put everything in a class to make soaplib able to generate a WSDL file.
  - Added the varInfo() webservice (calls the Variant_info script).
- index.py: Added documentation.
- Mutator.py: Added documentation.
- Web.py: Added documentation.
- Mutalyzer.py: Added generation of a new description in g. and c. notation.
- Db.py: Modified the get_Transcripts function to be able to work with
  overlapping and non-overlapping ranges.
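
One possible reading of the two range modes is sketched below: in overlapping
mode every transcript that touches the range is reported, otherwise only
transcripts that lie completely inside it. This interpretation, the function
name and the tuple layout are assumptions; the commit queries a database
instead of a list.

```python
def transcripts_in_range(transcripts, start, end, overlapping=True):
    """Select transcript names from (name, tx_start, tx_end) tuples."""
    hits = []
    for name, tx_start, tx_end in transcripts:
        if overlapping:
            # Any shared position counts.
            match = tx_start <= end and tx_end >= start
        else:
            # The transcript must lie completely inside the range.
            match = tx_start >= start and tx_end <= end
        if match:
            hits.append(name)
    return hits
```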



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@16 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Feb 03, 2010

Cleaned up the code for a new alpha release. · e4094f10

Laros authored 15 years ago

Added:
- Web.py: A module with some general functions used by the interfaces.
  - A version (this is deliberately kept out of the config file).
  - A run() wrapper that returns standard output of any function as a string.
  - A tal() function that parses a TAL template.
  - A read() function that returns the input of a file.
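
The run() wrapper described above can be sketched in present-day Python as
follows. The original code targeted Python 2 and mod_python, so this is an
illustration of the idea rather than the code from this commit; the signature
is an assumption.

```python
import contextlib
import io


def run(func, *args, **kwargs):
    """Call func and return everything it printed to standard output."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()
```

This lets an interface reuse a function that prints its results, without
changing that function, and place the captured text in a template.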

Modified:
- Install.txt: In the apache config, a PythonPath must be set now (dynamically
  setting it did not give consistent output).
- handler.py : Cleaned the source by using the Web class.
- webservice.py : Cleaned the source by using the Web class.
- index.py : Cleaned the source by using the Web class.



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@15 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Feb 02, 2010

In this version the whole project has been restructured. · 42267dda

Laros authored 15 years ago

The main structure is as follows:

/                     ; Root of the installation.
  - Install.txt
  - Todo.txt
  - Obsoleted.txt
  - mutalyzer.conf
  - var/              ; Variable data.
      - cache/
      - mutalyzer.log
  - templates/        ; HTML, XML, JavaScript, etc.
  - src/
    - Mutalyzer.py
    - Variant_info.py
    - Services/       ; Webservices.
    - Clients/        ; Example clients for webservices.
    - Modules/        ; The core modules.
    - Interfaces/     ; Interfaces to mod_python.

Apart from changes that were needed to deal with this new structure, no changes
in the code were made.


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@14 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Feb 01, 2010

Added: · 5371f394

Laros authored 15 years ago

- Todo.txt.
- handler.py: A general handler for mod_python. This handler dispatches
  SOAP services and a normal HTML publisher. Furthermore, it is able to handle
  raw requests to dump HTML or XML files. This handler is TAL enabled.
- webservice.py: A publisher for webservices. When a new webservice is
  added, this is the entry point for the server side (just like index.py, add
  a function).
- getTranscripts.py: A webservice that reports all transcripts that overlap
  with a certain genomic position (chromosome, position).
- getGeneName.py: A webservice that finds the gene name of a given transcript.
- service.wsdl: This is the definition of the interface for webservices.
  A client must download this file and parse it to obtain a programming
  interface, then the client can use this interface just like any local
  function.
- Obsoleted.txt: A list of things that will be deleted in the future (but are
  still functional for backwards compatibility).
- client/sp.py: A test client for the two webservices.

Renamed Main.py to Mutalyzer.py.

Modified:
- mutalyzer.conf: Added a configurable date prefix for logging.
- Install.txt:
  - To reflect the difference in configuration of apache to work with the new
    handler (requires less configuration).
  - Added TAL as a new dependency.
- html/check.html: Made it a full TAL template. Title, version and output are
  now separated from the HTML design.
- Mutator.py: Made the shiftpos() function public, this is needed for insertion
  checking.
- Parser.py: Updated the comments.
- Variant_info: Made all internal functions private.
- Output.py: Updated the comments.
- Config.py: Modified to reflect the changes in mutalyzer.conf.
- Db.py:
  - We now keep the handle to the database open until the object is deleted.
  - Added a destructor that closes the handle to the database.
  - Added getTranscripts(): Get a list of transcripts, given a chromosome and a
    position on that chromosome.
  - Added get_GeneName(): Get the gene name of a given transcript.
- Retriever.py: Updated the comments.
- Crossmap.py:
  - Made a patch that handles a CDS start on the first position of the
    transcript.
  - Added more unit tests.
- index.py:
  - Added a switch for older versions of LOVD, to generate the expected output
    in Variant_info.
  - Made this publisher compatible with TAL.
- Mutalyzer.py: Made all internal functions private.

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@13 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Dec 31, 2009

Added an html directory, where (slightly modified) html files reside. For normal · 2db71816

Laros authored 15 years ago

html includes, a server directive has to be altered (described in Install.txt).
- check.html: The main page for mutation checks (used to be a large string in 
  index.py).

Output.py:
- Added an instance variable, this is the name of the module that created the
  Output object. This variable is used for more verbose logging.

Variant_info.py:
- Modifications for the new Output.py.
- Added error handling for parse errors.

Main.py:
- Modifications for the new Output.py.
- Renamed function rrr() to main().

index.py:
- Removed the large html string, it is now loaded from file with the 
  __readhtml() function.
- Made a __run() function, that wraps any function, executes it and returns
  standard output of this function as a string.



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@12 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Dec 30, 2009

Obsoleted Variant_info.php: · bcdda3bc

Laros authored 15 years ago

- It is now handled by index.py and mod_python.
- Added a main() function in Variant_info.py to make it callable.

Eliminated the need for a whole CDS, only CDS start and stop are needed from
now on: 
- Variant_info.py: the CDS no longer needs to be built.
- Main.py: We now give the location of the CDS, not the CDS itself.

Crossmapper.py:
- Fixed a bug that occurred when the CDS starts on the last nucleotide of an
  exon.
- Added a ``small CDS'' and a ``CDS start on splice site'' test in the unit
  test of the crossmapper.



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@11 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Dec 23, 2009

Added Output.py: A logging facility. Important output like warnings and errors · 1c7b0777

Laros authored 15 years ago

can be sent to this object, which sends errors and explicit logging messages
to a log (defined in mutalyzer.conf) and warning messages to standard output.
Behaviour of this object may change in the future, adding severity and logging
above a certain severity level is one option that would increase debugging 
possibilities.
- NiceName() returns a short description of the calling program (we cannot
  use the default __name__ here).
- ErrorMsg() Print the nice name of the calling module, an error message and 
  log it. Also increase an error counter.
- WarningMsg() Print the nice name of the calling module and a warning message.
  Also increase a warning counter.
- LogMsg() Only log the message (nice name of the calling module and the 
  message itself).
- Summary() Give the number of errors and warnings.
- A unit test is also defined (it does not do much at this moment).

Added index.py: The web interface to mutalyzer. It depends on mod-python;
we chose this interface to eliminate the need for php. Apache is now also
added to the list of dependencies. The configuration of mod-python is described
in Install.txt.

GenRecord.py: 
- Moved the splice() function to Main.py. 
- Added an exon list to the Plist class. This list can be used as a fallback
  in case the mRNA tag is missing from a GenBank file.
- Added an empty unit test.

Mutator.py:
- Added a standard alignment for visualisation. This will probably be replaced
  in newer versions.

Parser.py:
- Added functionality for the new output module.
- Made the parser gracefully return, instead of exit on a parse error. This is
  needed for the web interface.

Variant_info.py:
- Added functionality for the new output module. Note that output generated by
  this program should go to a different log, something for a future version.
- Fixed a bug that occurred when a CDS start or stop was on an exon boundary.

Main.py (heavily under development, names of functions are not very descriptive
  yet):
- Added functionality for the new output module.
- Fixed a bug in the roll() function, it returned a wrong value for forward
  genes.
- Added the function bprint(), it formats a large string to be printed in an
  insightful way (like GenBank does it), it also prints the offsets at the
  beginning of each line.
- Obsoleted the ErrorMsg() and WarningMsg() functions.
- Added constructCDS(), a function that is able to construct a CDS from an
  mRNA list, CDS start and CDS stop. In the future we would like to work 
  without a CDS list, so this function will be obsoleted.
- Added the splice() function (from GenRecord.py).
- Made a function rv() that is able to process a RawVar. This function is
  separated from the ppp() function to be able to work with an allele
  description.
- Added splicing.
- Added translation to a protein.
- Added a function rrr() which is to be called from Main.py itself or from 
  index.py.
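
The formatting done by bprint() can be illustrated with the sketch below, which
returns the formatted text instead of printing it. The layout (blocks of ten,
six blocks per line, a right-aligned 1-based offset) follows the GenBank
sequence listing; the function name and the parameters are assumptions.

```python
def bformat(sequence, block=10, per_line=6):
    """Format a sequence GenBank-style: an offset, then blocks of bases."""
    lines = []
    width = block * per_line
    for offset in range(0, len(sequence), width):
        chunk = sequence[offset:offset + width]
        blocks = [chunk[i:i + block] for i in range(0, len(chunk), block)]
        # The offset at the start of each line is 1-based.
        lines.append("%9d %s" % (offset + 1, " ".join(blocks)))
    return "\n".join(lines)
```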

Config.py:
- Added a log variable for Output.py.

Crossmap.py:
- Fixed a bug concerning genes where the entire CDS is in one exon.
- Added more uncertainty handling.

Retriever.py:
- Added functionality for the new output module.
- Added handling of accession numbers with no version. It downloads the latest
  version, and gives a warning.



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@10 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Nov 30, 2009

Renamed c2g to Variant_info, because of the new functionality. It now does: · 0438e3f1

Laros authored 15 years ago

- c2g and g2c conversion.
- Returns information about transcription start, transcription end and CDS
  stop when no variant is given.

Variant_info.py:
- Added g2c conversion.
- Added transcription start, transcription end and CDS stop info.
- Made the variant optional (will return info if the variant is empty).
- Made a function getcoords() that returns a triple (main, offset, g) to do
  both c2g and g2c conversions.
- For these conversions to work seamlessly, changes to the crossmapper were
  made. g2x() now returns a tuple (main, offset) in non-star notation and
  helper functions are made to convert from and to HGVS notation.

Parser.py:
- Removed the `i' from the list of possible nucleotides, otherwise the word
  `delins' is ambiguous.
- Replaced the sign `>' with `subst' in the output of the parser, to make the
  downstream code more readable.
- Replaced the `ins' output with `delins' when a combination of `del' and
  `ins' is found.
- Added a PtLoc as a prefix for an indel, this used to be a range only.
- Added parentheses and a question mark as optional in RawVar and
  SimpleAlleleVar; both are used to indicate uncertainty. E.g.
  12del;(12del);(12del)?;12del?;(12del;12del)?
- Added a unit test.

Main.py:
- Use the more readable `subst' instead of `>'.
- Use the g2c() helper function instead of g2x().

Config.py:
- Added a unit test (will crash if no configuration file is found).

Crossmap.py:
- Renamed __star and __rstar to int2main and main2int. They are now helper
  functions that can be used externally.
- Added the __trans_start and __trans_end member variables, used for the new
  info() member function and for the `u' and `d' offset prefixes for upstream
  and downstream UTR positions.
- Added the functions int2offset() and offset2int() to translate a tuple 
  (main, offset) to a HGVS intronic position and an intronic position to an
  offset.
- Added tuple2string() that converts a tuple (main, offset) to a c. notation.
- Added helper function g2c(), see Main.py.
- Added an info() function that returns a triple (trans_start, trans_end, 
  CDS_stop).
- Removed the functions c2str(), star2abs(), off2int() and c2g().
- Convert a `?' as an intronic position to a 0. This may be changed in the
  future.
- Altered the unit test to reflect the changes.
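
The helper functions int2main() and main2int() convert between an integer
position and the star notation used for positions after the stop codon. The
sketch below shows the idea under the assumption that the stop position is
passed in explicitly; in the Crossmap module these are member functions, and
the signatures shown here are not taken from the source.

```python
def int2main(position, stop):
    """Convert an integer position to a main position string; positions
    beyond the stop position are written in star notation."""
    if position > stop:
        return "*%d" % (position - stop)
    return str(position)


def main2int(text, stop):
    """Convert a main position string (possibly in star notation) back to
    an integer position."""
    if text.startswith("*"):
        return stop + int(text[1:])
    return int(text)
```

With a stop position of 100, int2main(105, 100) returns "*5" and
main2int("*5", 100) returns 105.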
  


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@9 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1

Nov 17, 2009

Modified: · 36c6e7f3

Laros authored 15 years ago

- c2g.php:
  - Used escapeshellarg() to harden the code.
  - Added a content line to return plain text instead of html.
- c2g.py:
  - Fully documented the code.
  - Made a function to convert a list of strings to a list of integers (to 
    avoid excessive int() calls).
  - Added error handling for missing NM numbers and missing version numbers.
- Parser.py:
  - Made a unit test for a rather complicated variant.
- Config.py:
  - Made a unit test (will crash if no config file is found).
- Db.py:
  - Added error handling for the get_NM_version() function. It will now return
    0 if an error occurs. It will also return an integer instead of a string
    from now on.
  - Made a unit test that gets the username / db information from the config
    file and tests all query functions (will crash if the MySQL database is not
    configured correctly, or if the config file does not contain the right
    info).
- Crossmap.py:
  - Fixed a bug in the Crossmap module (positions shifted when the CDS starts
    in another exon).
  - Made an extensive unit test:
    - Check the splice sites in c. notation of a hypothetical gene.
    - Check whether the splice sites are the same for the same gene in reverse
      orientation.
    - Do some conversion checking from c. to g. and vice versa in both
      orientations.
    - Do all tests above for the n. notation too.
    - Check whether the c. notation of the start codon does not change if an
      upstream exon is removed (also see the previous bug).
  - Removed the __STAR variable and introduced the __STOP variable. The
    difference is that __STOP contains the real stop position, which makes
    conversion to a position relative to the start codon easier.
  - Made star2abs(), which converts a c. position in star-notation to one that
    is relative to the start codon (for c2g).
  - Made off2int(), which converts a sign, offset pair to an integer (for c2g).
- Retriever.py:
  - Added a unit test that retrieves an accession number (will crash if the
    config file does not contain the right info).

- Mutator.py, Scheduler.py:
  - Added an empty unit test (tests needed).
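
The string to integer helper mentioned for c2g.py could look roughly like the sketch below. The function names and the comma-separated input format are assumptions for illustration; the commit message does not show the actual code.

```python
def to_int_list(strings):
    """Convert a list of numeric strings to a list of integers."""
    return [int(s) for s in strings]


def parse_positions(field):
    """Split a comma-separated field (hypothetical format) into integers.

    Empty items, such as the one produced by a trailing comma, are skipped.
    """
    return to_int_list([item for item in field.split(",") if item.strip()])
```

Converting once up front keeps the int() calls out of the mapping code that uses the positions.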


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@8 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1


Nov 11, 2009

Added: · 41ac9420

Laros authored 15 years ago

- src/c2g.py
  - This program returns the genomic coordinates of a position in c. notation.
    It depends on the Config, Db, Crossmap and Parser modules.
  - The input consists of the following variables:
    - LOVD_ver      ; The version of LOVD, ignored for now.
    - build         ; The build of the human genome, assumed to be hg19 for now.
    - accno.version ; The accession number and version on which the c. 
                      positions are defined.
    - var           ; The variation (without an accession number).
  - If the accession number and version are present in the database, it returns
    all positions (one or two) in g. notation relative to the chromosome.
- c2g.php 
  - This is a web interface for c2g.py. It is called in the following way:
    c2g.php?LOVD_ver=a&build=b&acc=c.d&var=e where a, b, c.d and e are the
    variables described above (note that the variation (e) should be HTML
    encoded).
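
A call to c2g.php could be constructed as in the sketch below. The host name and all parameter values are made up for illustration. The variation is percent-encoded here, which is presumably what the note about encoding the variation is meant to achieve.

```python
from urllib.parse import urlencode

# Hypothetical location of the script; replace with the real installation.
BASE_URL = "http://example.org/c2g.php"


def build_c2g_url(lovd_ver, build, accno_version, var):
    """Build the query URL, encoding characters such as '>' in the variant."""
    query = urlencode({
        "LOVD_ver": lovd_ver,
        "build": build,
        "acc": accno_version,
        "var": var,
    })
    return BASE_URL + "?" + query


url = build_c2g_url("2.0", "hg19", "NM_002001.2", "c.274G>T")
```

The `>` in the substitution ends up as `%3E` in the resulting URL.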

Altered (with respect to c2g.py):
- Install.txt
  - Documented how to set up the c2g program.
- Db.py
  - Added two new query functions:
    - get_NM_info    ; Get exonStarts, exonEnds, cdsStart, cdsEnd and strand
                       information, given an NM accession number.
    - get_NM_version ; Get the NM version, given an NM accession number.
  - Made a general query function, that is called by the specific query 
    functions.
- Crossmap.py
  - Added helper functions for the output of c2g.
    - c2str ; Returns a string given mainsgn, main, offsgn and offset.
    - c2g   ; Returns a genomic position given mainsgn, main, offsgn and offset.
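
A possible shape of the c2str helper is sketched below. It assumes that mainsgn is one of '', '-' or '*', that offsgn is '+' or '-', and that an offset of 0 means there is no intronic part; none of these details are stated in the commit message.

```python
def c2str(mainsgn, main, offsgn, offset):
    """Format the four components as a c. position string (sketch)."""
    text = "{}{}".format(mainsgn, main)
    if offset:
        # Only add the intronic part when there is a non-zero offset.
        text += "{}{}".format(offsgn, offset)
    return text
```

For example, `c2str("-", 10, "-", 2)` yields `-10-2`.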

Altered (for Mutalyzer itself):
- GenRecord.py
  - Added default locus tag handling.
- Mutator.py
  - Added a duplication function.
- Parser.py
  - Added argument passing:
    - Substitutions: Arg1, Arg2 (Arg1>Arg2 for example).
    - Deletions: Arg1 (delArg1).
    - Ins: Arg1 (insArg1).
- Retriever.py
  - Added code to make the cache directory if it does not exist. This 
    eliminates the need for the clean.sh script.
- Main.py (most of the new functions have to be migrated elsewhere)
  - Altered the roll function to be able to roll in both directions.
  - Made a palindrome snoop function that finds the smallest string that is
    not invariant under reverse complement. This function is also used to
    detect perfect palindromes.
  - Made PtLoc2main and PtLoc2offset; get integers from the locations returned
    by the nomenclature parser.
  - Made Error and Warning message wrapper functions.
  - Made a function that tests if a string is an integer.
  - Made the parsing less verbose.
  - Extracted the positions from raw variations.
  - Wrote code to deal with reversed ranges.
  - Added error handling and warning messages for:
    - Substitutions.
    - Deletions.
    - Duplications.
    - Inversions.
    - Insertions.

Removed:
  - clean.sh
    - It is no longer needed, since all temporary files and directories are no
      longer in subversion.


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@7 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1


Oct 21, 2009

Modified GenRecord.py and Crossmap.py to deal with the deviating numbering · 85f2594f

Laros authored 15 years ago

used in BioPython.
- GenBank starts counting at 1, BioPython at 0.
- GenBank uses absolute positions, BioPython uses interbase positions.
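
The difference between the two numbering schemes can be shown with a small sketch. The function names are made up for this example; it treats a GenBank range as 1-based and inclusive, and an interbase range as 0-based and half-open, which matches Python slicing.

```python
def genbank_to_interbase(start, end):
    """Convert a 1-based inclusive range to a 0-based half-open range."""
    return start - 1, end


def interbase_to_genbank(start, end):
    """Convert a 0-based half-open range to a 1-based inclusive range."""
    return start + 1, end


sequence = "ACGTACGT"
start, end = genbank_to_interbase(3, 5)
fragment = sequence[start:end]  # bases 3..5 in GenBank numbering: "GTA"
```

Only the start coordinate changes, because the end of a half-open range already points one past the last base.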

Main.py still contains test cases to test the modules, so it's always under
development.


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@6 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1


Oct 09, 2009

Extracted the parsing of the configuration file from the Retriever module and · 5a36a807

Laros authored 15 years ago

put it in a new module named Config. This is done because another new module
named Db also needs the configuration file.

The file Db.py is new and contains a function that interfaces to a MySQL
database. This database is used for the mapping of NM to NP accession numbers.

The file mutalyzer.conf now has two more options: a MySQL user name and a 
database name. 

Started the documentation of a fresh installation, see Install.txt for more
details.

In Mutator.py: Added a function that calculates the positions of splice sites
after mutations.
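
One way to keep track of positions after mutations is sketched below. It is a toy version written for this explanation, not the Mutator module itself: it assumes 0-based interbase coordinates and mutations that do not overlap each other.

```python
class Mutator:
    """Toy mutator that accepts coordinates of the original string."""

    def __init__(self, original):
        self.original = original
        self.mutated = original
        self._shifts = []  # (original end position, change in length)

    def shift(self, position):
        """Map an original interbase position to the mutated string."""
        return position + sum(delta for end, delta in self._shifts
                              if end <= position)

    def mutate(self, start, end, replacement):
        """Replace original[start:end] by replacement."""
        new_start = self.shift(start)
        new_end = new_start + (end - start)
        self.mutated = (self.mutated[:new_start] + replacement +
                        self.mutated[new_end:])
        self._shifts.append((end, len(replacement) - (end - start)))

    def shift_sites(self, sites):
        """Give the positions of (splice) sites in the mutated string."""
        return [self.shift(site) for site in sites]
```

Because every mutation records where it ended and how much it changed the length, later mutations and splice sites can still be given in the coordinates of the original string.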

In Scheduler.py: Added some code (still commented out) that can parallelise
jobs by using threads and detecting the number of processors present on the
host.
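
The idea of running jobs in threads, with one worker per detected processor, can be sketched with the Python standard library. This uses modern library calls and a made-up function name; it is not the commented-out code in Scheduler.py.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_jobs(jobs, workers=None):
    """Run callables in a thread pool and return their results in order.

    By default the number of workers equals the number of processors
    reported by the host (falling back to 1 if it cannot be determined).
    """
    if workers is None:
        workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: job(), jobs))
```

Each job is a callable without arguments, for example `run_jobs([lambda: 1 + 1, lambda: 2 * 3])`.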



git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@5 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1


Oct 01, 2009

Added: · 03b99925

Laros authored 15 years ago

  GenRecord.py ; Provides a class whose main purpose is to extract relevant
                 data from GenBank parse objects. The output is a nested
                 dictionary with per-gene information:
                 - A list of genes, indexed by gene name.
                   - A list of locus tags.
                     - The orientation.
                     - An optional mRNA field.
                       - A location.
                       - An mRNA list.
                     - An optional CDS list.
                       - A location.
                       - A CDS list.
                 This structure may be changed in the future.

                 Furthermore, it contains a function that does the actual 
                 splicing.
  Mutator.py   ; Provides a class that is capable of mutating a string, while
                 keeping track of all mutations. This way, the coordinates of 
                 the original string can be used for mutations.
  Scheduler.py ; Provides a class that can schedule both interactive and batch
                 jobs.
                 - Once every two turns, an interactive job is selected.
                 - Each other turn, a batch queue is selected, from which a 
                   job is selected.
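
The alternation between interactive and batch jobs described for Scheduler.py can be sketched as follows. The class layout and method names are assumptions. So is the fallback to the other job type when the preferred one has no work, which the description above does not mention.

```python
from collections import deque


class Scheduler:
    """Toy scheduler: even turns prefer interactive jobs, odd turns batch."""

    def __init__(self):
        self.interactive = deque()
        self.batches = deque()  # a queue of batch queues
        self._turn = 0

    def add_interactive(self, job):
        self.interactive.append(job)

    def add_batch(self, jobs):
        self.batches.append(deque(jobs))

    def _next_interactive(self):
        return self.interactive.popleft() if self.interactive else None

    def _next_batch(self):
        # Select the next batch queue and take one job from it; the queue
        # goes to the back of the line if it still has jobs left.
        while self.batches:
            queue = self.batches.popleft()
            if queue:
                job = queue.popleft()
                if queue:
                    self.batches.append(queue)
                return job
        return None

    def next_job(self):
        pickers = [self._next_interactive, self._next_batch]
        if self._turn % 2 == 1:
            pickers.reverse()
        self._turn += 1
        for pick in pickers:
            job = pick()
            if job is not None:
                return job
        return None
```

Taking only one job from a batch queue per turn keeps a single large batch from starving the other batches.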

Modified:
  Main.py      ; Will always change, since it contains test functions and 
                 prototypes.
  Parser.py    ; Moved all the test functions to Main.py, added better error
                 messaging for parse errors, added comments. Made various
                 changes to the parser itself, not noteworthy of further
                 explanation since the parser is still under development.
  Crossmap.py  ; Changed some comments, commented out a debug function and
                 added some white space for readability.
  Retriever.py ; Changed some comments.


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@4 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1


Aug 28, 2009

Added a context-free grammar parser, the code is very similar to BNF, thus · 87b457c2

Laros authored 15 years ago

highly flexible and understandable. The output is a hierarchical parse object
which contains the relevant data. The parser accepts nested expressions. 
Detailed information on the parser should be provided in the future (when
all the functionality is tested).

In the file Main.py a couple of test functions are added (later to be
exported to separate modules).
- A function that finds the last occurrence of a pattern or one of its cyclic
  permutations within a repeating sequence.
- A function that extracts relevant data from the GenBank parse object. The
  output is a nested dictionary with per-gene information:
  - A list of genes, indexed by gene name.
    - A list of locus tags.
      - The orientation.
      - An optional mRNA field.
        - A location.
        - An mRNA list.
      - An optional CDS list.
        - A location.
        - A CDS list.
  This structure may be changed in the future.
- A function that extracts a raw list from the mRNA and CDS lists, to be used
  in the mapping functions (g. to c. and so on).
- A function that does the actual splicing, to be used as input for mutation
  functions.
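
The splicing step can be sketched as below. It assumes that the raw list is a flat list of alternating start and end positions, 1-based and inclusive; the actual layout of the lists extracted from the mRNA and CDS fields may differ.

```python
def splice(sequence, sites):
    """Concatenate the parts of sequence delimited by pairs in sites.

    sites is a flat list [start1, end1, start2, end2, ...] with 1-based
    inclusive positions (an assumption of this sketch).
    """
    if len(sites) % 2:
        raise ValueError("sites must contain start/end pairs")
    parts = []
    for i in range(0, len(sites), 2):
        start, end = sites[i], sites[i + 1]
        parts.append(sequence[start - 1:end])
    return "".join(parts)
```

For a sequence `AAACCCGGGTTT` with exons at 1..3 and 7..9, the spliced result is `AAAGGG`.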
  


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@3 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1


Jul 28, 2009

The first files for Mutalyzer version 2.0. · 926cfc9c

Laros authored 15 years ago

- Crossmap.py does the conversions g. -> c. or n. and vice versa.
- Retriever.py fetches a record from the NCBI or the cache.
- Main.py contains a couple of functions to test Crossmap.py and Retriever.py.
- clean.sh is a script that removes cache/* and src/*.pyc.
- mutalyzer.conf contains configuration variables.

Make sure to call all functions from . (not ./src), otherwise relative paths
can no longer be used (now used in Retriever.py and mutalyzer.conf).


git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@2 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1


Initial import. · 298874e0

Laros authored 15 years ago

git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@1 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
