1. 28 Jul, 2010 1 commit
      Rewrote the GBparser module to cope with multiple linking methods (to link · f299d5e6
      mRNA and CDS fields). Added a SNP converter. Implemented a first check for
      variants that hit splice sites. Added a chromosomal position for NC slices.
      Added a check for the use of intronic positions in a transcript reference 
      - Added variables for splice site mutation detection.
      - Merged the restriction sites (added, deleted) in one object.
      - Changed the return type of getIndexedOutput() and getOutput() from None to
        an empty list for convenience.
      - Completely rewrote this module.
        - Will now collect all CDS, mRNA and exon information.
        - Tries to match a CDS and mRNA based upon the range.
          - When this succeeds, try to match on protein, locus tag or product name.
          - If all fails and there is only one option left, link it.
        - Will remove genes that are not fully annotated (half outside the record
          for example).
      - Added a snpConvert() function.
      - Added a check for uploaded records or slices that have no sequence in them
        (a complete contig for example).
      - Added variables to cope with chromosomal coordinates.
      - Added a toChromPos() function to convert a g. notation to a g. on a
      - Added an addToChromDescription() function to generate a chromosomal
        description of a variant.
      - Modified some warnings concerning missing mRNA or missing CDS.
      - Added a checkIntron() function that gives a warning when a variant hits a
        splice site.
      - Added a nomenclature version variable.
      - Added a urlEncode() function to generate valid links.
      - Made the getMimeType() function public.
      - Added a __intronicPosition() function that checks whether the user used an
        intronic position.
      - Added checks for illegal use of intronic positions.
      - Fixed a bug in the __toProtDescr() function.
      - Added more checks for the translation and transcription of transcripts.
      - Added a better CDS start site mutation detection.
      - Added a chromosomal description if available.
      - Added more information to the legend (product and linking method).
      - Added guessing of mime types for downloadable files (it used to default to
      - Added a snp() function to interface with the snp() function of the Retriever.
      - Added a checkForward() function to accommodate for HTTP GET links.
      - Removed the `version' variable from all functions, it is now moved to the
      - Modified to cope with the new functionality.
      - Restructured the layout.
      - New page for the SNP converter.
      - Modified to cope with the new functionality.
      - Modified to make some better tables and helper boxes.
      - Completely rewrote it.
      - Restructured the layout to make it uniform with the name checker.
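
      The linking order described above for the rewritten GBparser module (range
      first, then protein, locus tag or product name, then the single remaining
      option) can be sketched roughly as follows. This is an illustration only,
      not the module's actual code: the Feature record, its field names and the
      link_cds_to_mrna() helper are hypothetical, and "match on range" is assumed
      to mean that the CDS lies within the span of the mRNA.

```python
from collections import namedtuple

# Hypothetical, simplified feature record; the real parser works on
# GenBank features and their qualifiers.
Feature = namedtuple("Feature", "start end protein_id locus_tag product")


def link_cds_to_mrna(cds, mrnas):
    """Return (mRNA, linking method) for `cds`, or (None, None) if unlinked."""
    # Step 1: keep only mRNAs whose span contains the CDS
    # (the assumed meaning of "match on range").
    candidates = [m for m in mrnas if m.start <= cds.start and cds.end <= m.end]
    if not candidates:
        return None, None

    # Step 2: among those, look for a qualifier that singles out one mRNA.
    for key in ("protein_id", "locus_tag", "product"):
        value = getattr(cds, key)
        if value is None:
            continue
        matches = [m for m in candidates if getattr(m, key) == value]
        if len(matches) == 1:
            return matches[0], key

    # Step 3: no qualifier decided it; link only if one candidate is left.
    if len(candidates) == 1:
        return candidates[0], "exhaustion"
    return None, None
```

      Returning the method alongside the mRNA mirrors the idea of reporting the
      linking method in the legend; the method names used here are made up.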
  2. 26 Jul, 2010 1 commit
      Major fix for the batch runs · 808ea897
          * Updated the batch output generator
          * Make result file downloadable
          * Make webservice compatible with the new Mapper module
          * Updated functions to reflect new Mapper module
          * Added a redirect option to move get messages to the check form
          * Updated check to use the Session from redirect
          * Implemented __checkInt to catch user errors in upload
          * Added a progress option to get updates about batch progress
          * Updated batch to show batch progress after submit
          * Added the AJAX request to get a batch progress counter
          * Started the linkify to change all Mutalyzer instances to clickable links
          * Added a hidden form to support the batch progress
          * Display Errors 
          * Added the getMessagesWithErrorCode
          * Made the seq a BioSequence for the restriction function
          * Added BatchFlags support to prevent jobs from hanging on the same problem
          * Updated some return values to reflect errors
          * Implemented the Flags to skip or alter other batch entries
          * Cleaned up the nameCheckerbatch significantly
          * Added a try/except clause around the namechecker to prevent batch hangs
          * Changed output file extension to txt
          * Cleaned up Module
          * mainTranscript function updated to reflect webservice needs
          * Correct errors when retrieving fields from DB in _FieldsFromDb
          * Implemented entriesLeftForJob to get info about how a job is doing
          * Added updateBatchDb and skipBatchDb to skip/alter batch entries
  3. 19 Jul, 2010 1 commit
      TODO: · 7e91c366
      		- update errorcodes.txt
      		- rewrite some webservice.py functions
      		- File.py ods files are not read in correctly. Temp file deleted before it
      			can be used
      		- mutalyzer.conf
      				Added pidfile reference
      				Added Output Headers for the Batch Jobs
      		- Db.txt
      				Updated the Db.txt file so that it reflects the database changes
      					* BatchJob now contains an Arg1 attribute, which is used by the
      						Conversion Batch to store the build version. This could be
      						changed to use the Filter Field.
      					* Dropped the Var table
      					* BatchQueue -> AccNo, Gene, Variant are dropped and replaced by
      						single column input. This field contains a single full
      						mutalyzer variant description.
      		- Install.txt
      				Added the python-daemon dependency
      		- handler.py
      				Added a downloads handler for the old and new batch files
      				Other handlers are changed so the filter for extensions now matches the
      				trailing characters
      		- webservice.py
      				Made the functions compatible with the new Mapper.Converter class
      				*** This webservice.py can use a bit more structure ***
      		- index.py
      				Moved a lot of logic from the index file to the Mapper.py module
      				Implemented the batch handler, which has multiple entry points
      				positionConverter now uses the Mapper.Converter class
      		- BatchChecker.py
      				Fixed the multiprocessing bugs. Now uses python-daemon to spawn a
      				process and uses the var/batch.pid as a lockfile for the process
      		---- Modules
      		- Output.py
      				Added getSoapMessages, which returns Soap compatible Messages for over
      				the wire
      				Added getBatchMessages, which returns messages above %level and filters
      				the ParseError to one line
      		- LRGparser.py
      				Included the usage of transcription.translate and transcribe for the
      				main transcript
      		- Config.py
      				Added the Batch Headers to the configfile
      				Added a Batch Config object for the PID file
      		- Scheduler.py
      				Added the SyntaxCheck Batch
      				Added the Conversion Batch
      				Have the BatchProcesser read the Argument from the Batch
      				Changed outputfile format to csv
      		- File.py
      				Changed the csv sniffer to drop the ":" delimiter in batch entries
      				Updated some Job logic to detect all lines with basic errors
      		- Mapper.py
      				Practically a complete refactor of the module
      				Added the Converter class for chromosome to c and back conversions
      				Added a SoapMessage class whose instances can be sent over the wire
      				Updated the Mapper.Mapping class to use SoapMessage messages
      				Basically create correct mappings between c 2 chrom 2 c [tested]
      				Reduced the number of Database calls needed significantly
      		- Db.py
      				Implemented getAllFields to retrieve the complete dataset of fields of
      				interest; this reduced the number of database calls
      				Updated the Db logic to encompass the new Batch Job logic:
      						Input entries are 1 field
      						Jobs have a type
      						Jobs can have arguments (1 at the moment, use BITFLAGS)
      		---- Templates
      				- batchCheck.html				-> All Batch jobs are united in batch.html
      				- batch_convert.html		->  								"
      				- convert.html					-> Replaced by converter.html
      				- batch.html
      						Included big file type help
      						Made template compatible with the three different batchTypes
      				- gbupload.html & menu.html
      						Fixed an onload javascript bug which caused a stream of javascript
      						errors on pages other than gbupload
      				- interface.js
      						Added the changeBatch and toggle_visibility functions
      				- downloads
      						Added example download files for old & new batchfiles
      		- change permission on var/batch.pid to have apache read/write it
  4. 14 Jun, 2010 1 commit
      Commit to do a merge with web_dev. This version is not suitable for · 48bb332d
      distribution as it is under heavy development.
      Most modules will have minor changes because of a difference in set up of both
      the Db and Config module.
      - Added functionality to enable the cron restart of the Batch Checker.
      - Added the auto-generation of a .htaccess file.
      - Added permission settings.
      - Added configuration options for the Scheduler, File and GenRecord modules.
      - Described how to make the new ChrName tables for hg18 and hg19.
      - Added classifications to the messages.
      - Made a set up for the documentation.
        - This will be a technical document that describes the internals of the
          project. It is only meant for developers.
        - This is a description of the API, it is auto generated by the mkapidoc.sh
          script. Also only meant for developers.
      - Added a new roll function that will always find both boundaries.
      - Implemented a new protein naming scheme.
      - Fixed the trimming of a delins.
      - Rewrote the processing of a variant. 
        - Moved post processing of the GenBank record to the GenRecord module.
        - Moved the crossmapper instance to the GenBank module, to make one instance
          per transcript variant.
        - Moved the naming of a variant to the GenBank module, as it strongly
          interacts with the crossmapper instance.
      - Moved the constructCDS function to the GenRecord module.
      - Added functionality for the batch checker (retrieve results).
      - Added functionality for the genbank uploader (retrieve GenBank files).
      - Modified to work with the new Db module.
      - Modified to work with the new Db module.
      - Replaced the dictionary structure with a nested list structure to make
        iteration more convenient.
      - Added names to the Locus and Gene objects.
      - Added all information needed to do a crossmapping in the Locus object.
      - Wrote functions to find Loci and Genes.
      - Wrote a function that expands a description of a variant (coupled to a Locus).
      - Added documentation.
      - Added documentation.
      - Added documentation.
      - Added a function that checks whether a string is an e-mail address.
      - Implemented a batch scheduler that uses a MySQL database for queueing.
      - Implemented a CSV, XLS and ODS parser for use in the Scheduler module.
      - Added documentation.
      - Modified the complex object initialisation.
      - Made subclasses to configure the separate modules.
      - Added documentation.
      - Split the Db modules into different classes, according to functionality, they
        all inherit the query function from the Db base class.
      - Added chromosome accession number to name conversion functions and vice versa.
      - Added functionality for the batch checker.
      - Added documentation.
      - Added documentation.
      - Added fall back functionality when searching for a gene.
      - Added a batch submit interface.
      - The layout of the batch submit interface.
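
      The idea behind a roll function that finds both boundaries can be sketched
      as below: given a reference sequence and the 1-based, inclusive range of a
      deleted or duplicated stretch, it reports how far that stretch can be
      shifted to the left and to the right without changing the resulting
      sequence. This is a minimal sketch under those assumptions; the signature
      and the return convention are chosen for the example and are not taken from
      the commit.

```python
def roll(sequence, first, last):
    """Return (left, right) shift distances for the slice first..last.

    Positions are 1-based and inclusive. A shift keeps the variant
    equivalent as long as the neighbouring base repeats the pattern.
    """
    pattern = sequence[first - 1:last]
    length = len(pattern)

    # Walk to the left while the preceding base continues the repeat.
    minimum = first - 2
    j = length - 1
    while minimum > -1 and sequence[minimum] == pattern[j % length]:
        j -= 1
        minimum -= 1

    # Walk to the right while the following base continues the repeat.
    maximum = last
    j = 0
    while maximum < len(sequence) and sequence[maximum] == pattern[j % length]:
        j += 1
        maximum += 1

    return first - minimum - 2, maximum - last
```

      For example, roll("AAATTTCC", 4, 4) gives (0, 2): the T at position 4 cannot
      move left, but removing it is equivalent to removing the T at position 5
      or 6.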
  5. 18 May, 2010 1 commit
      Altered the Output module in such a way that all messages are stored in a · 4a8a9986
      list with priorities, and all output is stored in a dictionary. This dictionary
      can be read at a later time.
      The Retriever module is changed to accommodate for uploaded GenBank files and
      slices of a chromosome.
      A Scheduler was added for batch checking.
      - Added variables for the Retriever module:
        - maxDldSize, minDldSize ; Maximum and minimum sizes for slices and uploaded 
          GenBank files.
      - Added variables for the Output module:
        - loglevel, outputlevel ; Specify default verbosity levels for logging and 
      - Added variables for the Scheduler module:
        - processName ; Name of the running scheduler.
      - Added information on how to create the newly used tables GbInfo, BatchQueue
        and BatchJob.
      - Short description of the error codes used in the Output module.
      - Added a Complex class to test more complicated return types (see the web_dev
      - Modified the code to work with the new Output module.
      - Made a first upload page.
      - Modified the code to work with the new Output module.
      - Modified the code to work with the new Output module.
      - Modified the code to work with the new Output module.
      - Added documentation.
      - Modified the code to work with the new Output module.
      - New file, used for generating unique IDs.
      - Made a change to the definition of an UD accession number.
      - Modified the code to work with the new Output module.
      - Made a batch checker scheduler.
        - isDaemonRunning() ; See if we need to be started.
        - process() ; Start the batch checker.
        - addJob() ; Add jobs to a queue in the database.
      - Added a Message class to store all debug, info, warning, error and fatal
        - If a message is given that exceeds the configured log level, it will be
          logged immediately.
      - A function is added to the Output class to read all messages that exceed a 
        certain verbosity level.
      - A function is added to create a named list as an output node.
      - With the getOutput function the content of this list can be retrieved.
      - Several sub-classes were added for each configurable module.
      - Added documentation.
      - Added functionality that is used by the Retriever module.
      - Added functionality that is used by the Scheduler module.
      - Added functionality to be able to use custom GenBank files and chromosome
        - Information on these created files is stored in a database to be able to
          re-create them when the cache is cleaned.
        - The hash of each file is stored for error detection.
      - A wrapper that is called either from the addJob() function from the Scheduler
        module, or from cron. It dispatches a background process that processes the
        batch jobs.
      - Test template for uploading files (copied from Mutalyzer 1.0.4).
      - Some test with a complex return type.
      - Did some first tests with a METAL template.
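
      The Output design described in this commit (messages kept in a list with a
      priority, output kept in a dictionary of named lists that can be read
      later) could look roughly like the sketch below. The class layout, method
      names, numeric levels and the error codes in the usage example are
      assumptions made for illustration; a plain list stands in for the log file,
      and "exceeds" is read here as "at or above".

```python
class Message:
    """One message with a severity level (0 = debug ... 4 = fatal)."""

    def __init__(self, origin, level, code, description):
        self.origin = origin
        self.level = level
        self.code = code
        self.description = description


class Output:
    def __init__(self, log_level=3):
        self._messages = []     # every message, in order of arrival
        self._output = {}       # named output nodes, each a list
        self._log_level = log_level
        self.logged = []        # stand-in for the real log file

    def add_message(self, origin, level, code, description):
        message = Message(origin, level, code, description)
        self._messages.append(message)
        # Messages at or above the log level are logged immediately.
        if level >= self._log_level:
            self.logged.append(message)

    def get_messages(self, level):
        """Return all messages at or above the given verbosity level."""
        return [m for m in self._messages if m.level >= level]

    def add_output(self, name, data):
        """Append data to the named output node, creating it if needed."""
        self._output.setdefault(name, []).append(data)

    def get_output(self, name):
        """Return the named list; an empty list if it does not exist
        (a choice made for this sketch)."""
        return self._output.get(name, [])
```

      Usage, with made-up codes:

```python
out = Output(log_level=3)
out.add_message("Retriever", 2, "WNOVER", "No version given.")
out.add_message("Retriever", 4, "ERETR", "Could not retrieve record.")
out.add_output("genes", "SDHD")
errors = out.get_messages(3)    # only the level 4 message
genes = out.get_output("genes")
```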
  6. 12 May, 2010 1 commit
  7. 15 Apr, 2010 1 commit
      Second alpha release. · f87343a1
      Laros authored
      The classes Mutator, Output and Db are now derived classes of Config. The
      class Retriever is a derived class of Output. This reduces the amount of
      code and variable passing significantly.
      - Converted the dbName variable to a dbName list, to accommodate for more than
        one database.
      - Added variables flanksize, maxvissize and flankclipsize for the visualisation
        in Mutator.py (instead of the previous alignment).
      - Resolved the range-swap issues.
      - Resolved the reverse-complement (cosmetic) issues.
      - Fixed a cosmetic bug in the __bprint() function.
      - Added a new function __nsplice(), to accommodate for a CDS extension.
      - Added a function __toProtDescr(), that gives a protein description in case
        of a simple substitution.
      - Added functionality for n. m. and EST notations.
      - Added functionality for other species (translation tables).
      - Corrected the roll-rule for insertions.
      - Added fallbacks for missing CDS and mRNA lists and positions (for an EST for
      - Added an input check for wrong gene symbols.
      - Added a temporary exception for in frame stop codons.
      - Added the private functions __checkBuild(), __checkChrom() and __checkPos()
        that do routine checks in a number of services.
      - Added a 'build' variable to getTranscripts(), getTranscriptsRange() and
      - Added exceptions that raise a Fault() object to make the client receive a
        SOAP exception.
      - Added functionality for more than one database.
      - Added functionality for more than one database.
      - Added functionality to deal with non-coding transcripts.
      - Made a RecordObj() object that consists of the old 'genelist' dictionary, 
        combined with 'mol_type' and 'organelle' variables. Also, a fake gene named
        'source' is included to accommodate for sequences that do not contain any
        annotated genes (an EST for example).
      - The Locus object is extended with a 'txTable' variable to accommodate for
        different organelles (mitochondria) and other species.
      - Replaced the alignment visualisation by a home-made one. Also see the
        variables that were added to the configuration file to alter the behaviour of
        this visualisation.
      - Modified the shiftpos() function when inserting something on a splice site
        boundary (now it extends the exon).
      - Minor modifications for the new inheritance scheme.
      - Minor modifications for the new inheritance scheme.
      - Minor modifications for the new inheritance scheme.
      - Added functionality to handle multiple databases.
      - Added an isChrom() function, used by the webservices check functions.
      - Made a change in the usage of the '__STOP' variable. It is set to
        transcription stop if there is no stop codon present. This makes conversion
        to an n. notation trivial.
      - Minor modifications for the new inheritance scheme.
      - Added a check for invalid accession numbers (or versions).
      - Added a check for erroneous genbank files that can occur when the NCBI is
        overloaded. The erroneous file is purged and the user can try again.
      - Modified the sample code to accommodate for the new 'build' variable.
  8. 19 Feb, 2010 1 commit
      Enhanced the mapping capabilities of the Db module, added a new webservice and · f49bc7f6
      Laros authored
      wrote documentation.
      Made an accumulative mapping info table, this requires regular polling of new
      data from the UCSC:
      - Added:
        - install.sh: A preliminary installation script. Now only used for cron 
        - Db.txt: Some loose documentation on how to make the new mapping table, to 
          be incorporated with an installation script.
        - src/UCSC_update.py: The update program, to be called from cron each day.
      - Modified:
        - mutalyzer.conf: Added variables needed for the remote database of the UCSC.
        - Db.py: Rewrote nearly every SQL query to work with the new mapping table
          and to be able to download and import updates from the UCSC.
        - Config.py: Modifications to work with the new configuration variables.
      - templates/sp.py: A webservice client template script.
      - templates/download.html: The download page for developers.
      - Install.txt: Added more dependencies.
      Switched to soaplib for the generation of a WSDL file. Webservices are now
      published by adding functions to the MutalyzerService class in webservice.py,
      each function should have a soapmethod decorator to specify the types.
      - handler.py: To work with soaplib.
      - webservice.py: 
        - Put everything in a class to make soaplib able to generate a WSDL file.
        - Added the varInfo() webservice (calls the Variant_info script).
      - index.py: Added documentation.
      - Mutator.py: Added documentation.
      - Web.py: Added documentation.
      - Mutalyzer.py: Added generation of a new description in g. and c. notation.
      - Db.py: Modified the get_Transcripts function to be able to work with
        overlapping and non-overlapping ranges.
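
      The difference between the two range modes can be illustrated with a small
      in-memory filter. This is only a sketch: the real get_Transcripts function
      queries the mapping table with SQL, and the tuple layout, the
      transcripts_in_range() name and the overlap flag are assumptions made for
      the example. "Non-overlapping" is taken here to mean transcripts that lie
      entirely within the range.

```python
def transcripts_in_range(transcripts, start, end, overlap):
    """Select accessions from (accession, tx_start, tx_end) tuples.

    overlap=True:  every transcript that touches [start, end].
    overlap=False: only transcripts completely inside [start, end].
    """
    hits = []
    for accession, tx_start, tx_end in transcripts:
        if overlap:
            selected = tx_start <= end and tx_end >= start
        else:
            selected = tx_start >= start and tx_end <= end
        if selected:
            hits.append(accession)
    return hits
```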
  9. 03 Feb, 2010 1 commit
      Cleaned up the code for a new alpha release. · e4094f10
      Laros authored
      - Web.py: A module with some general functions used by the interfaces.
        - A version (this is deliberately kept out of the config file).
        - A run() wrapper that returns standard output of any function as a string.
        - A tal() function that parses a TAL template.
        - A read() function that returns the contents of a file.
      - Install.txt: In the apache config, a PythonPath must be set now (dynamically
        setting it did not give consistent output).
      - handler.py : Cleaned the source by using the Web class.
      - webservice.py : Cleaned the source by using the Web class.
      - index.py : Cleaned the source by using the Web class.
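
      A run() wrapper of the kind listed under Web.py, one that returns whatever
      a function prints as a string, can be written in current Python 3 as shown
      below. This is a sketch using the standard library, not the code from this
      commit, which predates contextlib.redirect_stdout; the signature is an
      assumption.

```python
import io
from contextlib import redirect_stdout


def run(func, *args, **kwargs):
    """Call func and return everything it wrote to standard output."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()
```

      The return value of func itself is discarded here; only the captured text
      is handed back, which is what an interface needs when it embeds the output
      in a template.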
  10. 02 Feb, 2010 1 commit
      In this version the whole project has been restructured. · 42267dda
      Laros authored
      The main structure is as follows:
      /                     ; Root of the installation.
        - Install.txt
        - Todo.txt
        - Obsoleted.txt
        - mutalyzer.conf
        - var/              ; Variable data.
            - cache/
            - mutalyzer.log
        - templates/        ; HTML, XML, JavaScript, etc.
        - src/
          - Mutalyzer.py
          - Variant_info.py
          - Services/       ; Webservices.
          - Clients/        ; Example clients for webservices.
          - Modules/        ; The core modules.
          - Interfaces/     ; Interfaces to mod_python.
      Apart from changes that were needed to deal with this new structure, no changes
      in the code were made.
  11. 01 Feb, 2010 1 commit
      Added: · 5371f394
      Laros authored
      - Todo.txt.
      - handler.py: A general handler for mod_python. This handler dispatches
        SOAP services and a normal HTML publisher. Furthermore, it is able to handle
        raw requests to dump HTML or XML files. This handler is TAL enabled.
      - webservice.py: A publisher for webservices. When a new webservice is
        added, this is the entrypoint for the server side (just like index.py, add
        a function).
      - getTranscripts.py: A webservice that reports all transcripts that overlap 
        with a certain genomic position (chromosome, position).
      - getGeneName.py: A webservice that finds the gene name of a given transcript.
      - service.wsdl: This is the definition of the interface for webservices.
        A client must download this file and parse it to obtain a programming 
        interface, then the client can use this interface just like any local 
      - Obsoleted.txt: A list of things that will be deleted in the future (but are
        still functional for backwards compatibility).
      - client/sp.py: A test client for the two webservices.
      Renamed Main.py to Mutalyzer.py.
      - mutalyzer.conf: Added a configurable date prefix for logging.
      - Install.txt: 
        - To reflect the difference in configuration of apache to work with the new 
          handler (requires less configuration).
        - Added TAL as a new dependency.
      - html/check.html: Made it a full TAL template. Title, version and output are
        now separated from the HTML design.
      - Mutator.py: Made the shiftpos() function public, this is needed for insertion
      - Parser.py: Updated the comments.
      - Variant_info: Made all internal functions private.
      - Output.py: Updated the comments.
      - Config.py: Modified to reflect the changes in mutalyzer.conf.
      - Db.py:
        - We now keep the handle to the database open until the object is deleted.
        - Added a destructor that closes the handle to the database.
        - Added getTranscripts(): Get a list of transcripts, given a chromosome and a
          position on that chromosome.
        - Added get_GeneName(): Get the gene name of a given transcript.
      - Retriever.py: Updated the comments.
      - Crossmap.py:
        - Made a patch that handles a CDS start on the first position of the 
        - Added more unit tests.
      - index.py: 
        - Added a switch for older versions of LOVD, to generate the expected output 
          in Variant_info.
        - Made this publisher compatible with TAL.
      - Mutalyzer.py: Made all internal functions private.
  12. 23 Dec, 2009 1 commit
      Added Output.py: A logging facility. Important output like warnings and errors · 1c7b0777
      Laros authored
      can be sent to this object, which sends errors and explicit logging messages
      to a log (defined in mutalyzer.conf) and warning messages to standard output.
      Behaviour of this object may change in the future, adding severity and logging
      above a certain severity level is one option that would increase debugging 
      - NiceName() returns a short description of the calling program (we can not 
        use the default __name__ here).
      - ErrorMsg() Print the nice name of the calling module, an error message and 
        log it. Also increase an error counter.
      - WarningMsg() Print the nice name of the calling module and an error message.
        Also increase a warning counter.
      - LogMsg() Only log the message (nice name of the calling module and the 
        message itself).
      - Summary() Give the number of errors and warnings.
      - A unit test is also defined (it does not do much at this moment).
      Added index.py: The web interface to mutalyzer, it is dependent on mod-python,
      we chose this interface to eliminate the need for php. Also apache is now
      added to the list of dependencies. The configuration of mod-python is described
      in Install.txt.
      - Moved the splice() function to Main.py. 
      - Added an exon list to the Plist class. This list can be used as a fallback
        in case the mRNA tag is missing from a GenBank file.
      - Added an empty unit test.
      - Added a standard alignment for visualisation. This will probably be replaced
        in newer versions.
      - Added functionality for the new output module.
      - Made the parser gracefully return, instead of exit on a parse error. This is
        needed for the web interface.
      - Added functionality for the new output module. Note that output generated by
        this program should go to a different log, something for a future version.
      - Fixed a bug that occurred when a CDS start or stop was on an exon boundary.
      Main.py (heavily under development, names of functions are not very descriptive
      - Added functionality for the new output module.
      - Fixed a bug in the roll() function, it returned a wrong value for forward
      - Added the function bprint(), it formats a large string to be printed in an
        insightful way (like GenBank does it), it also prints the offsets at the
        beginning of each line.
      - Obsoleted the ErrorMsg() and WarningMsg() functions.
      - Added constructCDS(), a function that is able to construct a CDS from an
        mRNA list, CDS start and CDS stop. In the future we would like to work 
        without a CDS list, so this function will be obsoleted.
      - Added the splice() function (from GenRecord.py).
      - Made a function rv() that is able to process a RawVar. This function is
        separated from the ppp() function to be able to work with an allele 
      - Added splicing.
      - Added translation to a protein.
      - Added a function rrr() which is to be called from Main.py itself or from 
      - Added a log variable for Output.py.
      - Fixed a bug concerning genes where the entire CDS is in one exon.
      - Added more uncertainty handling.
      - Added functionality for the new output module.
      - Added handling of accession numbers with no version. It downloads the latest
        version, and gives a warning.
  13. 09 Oct, 2009 1 commit
      Extracted the parsing of the configuration file from the Retriever module and · 5a36a807
      Laros authored
      put it in a new module named Config. This is done because another new module
      named Db also needs the configuration file.
      The file Db.py is new and contains a function that interfaces to a MySQL 
      database. This database is used for the mapping of NM to NP accession numbers.
      The file mutalyzer.conf now has two more options: a MySQL user name and a 
      database name. 
      Started the documentation of a fresh installation, see Install.txt for more
      In Mutator.py: Added a function that calculates the positions of splice sites
      after mutations.
      In Scheduler.py: Added some code (still in comment) that can parallelise jobs
      by using threads and detecting the number of processors present on the host.
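
      Parallelising jobs with one thread per detected processor can be sketched
      in current Python 3 as below. This is not the commented-out code from
      Scheduler.py: it uses concurrent.futures, which did not exist at the time,
      and the run_jobs() name and its arguments are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_jobs(jobs, worker):
    """Run worker(job) for every job, using one thread per processor."""
    # os.cpu_count() may return None when the count cannot be determined.
    workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the results in the same order as the jobs.
        return list(pool.map(worker, jobs))
```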
  14. 28 Jul, 2009 1 commit
      The first files for Mutalyzer version 2.0. · 926cfc9c
      Laros authored
      - Crossmap.py does the conversions g. -> c. or n. and vice versa.
      - Retriever.py fetches a record from the NCBI or the cache.
      - Main.py contains a couple of functions to test Crossmap.py and Retriever.py.
      - clean.sh is a script that removes cache/* and src/*.pyc.
      - mutalyzer.conf contains configuration variables.
      Make sure to call all functions from . (not ./src), otherwise relative paths
      can no longer be used (now used in Retriever.py and mutalyzer.conf).
