- Jun 14, 2010
-
-
Laros authored
distribution as it is under heavy development. Most modules will have minor changes because of a difference in the setup of both the Db and Config modules.
install.sh:
  - Added functionality to enable the cron restart of the Batch Checker.
  - Added the auto-generation of a .htaccess file.
  - Added permission settings.
mutalyzer.conf:
  - Added configuration options for the Scheduler, File and GenRecord modules.
Db.txt:
  - Described how to make the new ChrName tables for hg18 and hg19.
errorcodes.txt:
  - Added classifications to the messages.
doc:
  - Made a setup for the documentation.
TechnicalReference:
  - This will be a technical document that describes the internals of the project. It is only meant for developers.
API:
  - This is a description of the API; it is auto-generated by the mkapidoc.sh script. Also only meant for developers.
Mutalyzer.py:
  - Added a new roll function that will always find both boundaries.
  - Implemented a new protein naming scheme.
  - Fixed the trimming of a delins.
  - Rewrote the processing of a variant.
  - Moved post processing of the GenBank record to the GenRecord module.
  - Moved the crossmapper instance to the GenBank module, to make one instance per transcript variant.
  - Moved the naming of a variant to the GenBank module, as it strongly interacts with the crossmapper instance.
  - Moved the constructCDS function to the GenRecord module.
handler.py:
  - Added functionality for the batch checker (retrieve results).
  - Added functionality for the genbank uploader (retrieve GenBank files).
webservice.py:
  - Modified to work with the new Db module.
UCSC_update.py:
  - Modified to work with the new Db module.
GenRecord.py:
  - Replaced the dictionary structure with a nested list structure to make iteration more convenient.
  - Added names to the Locus and Gene objects.
  - Added all information needed to do a crossmapping in the Locus object.
  - Wrote functions to find Loci and Genes.
  - Wrote a function that expands a description of a variant (coupled to a Locus).
Mutator.py:
  - Added documentation.
Parser.py:
  - Added documentation.
Web.py:
  - Added documentation.
  - Added a function that checks whether a string is an e-mail address.
Scheduler.py:
  - Implemented a batch scheduler that uses a MySQL database for queueing.
File.py:
  - Implemented a CSV, XLS and ODS parser for use in the Scheduler module.
Output.py:
  - Added documentation.
Mapper.py:
  - Modified the complex object initialisation.
Config.py:
  - Made subclasses to configure the separate modules.
Db.py:
  - Added documentation.
  - Split the Db module into different classes according to functionality; they all inherit the query function from the Db base class.
  - Added functions that convert a chromosome accession number to a name and vice versa.
  - Added functionality for the batch checker.
Crossmap.py:
  - Added documentation.
Retriever.py:
  - Added documentation.
  - Added fall back functionality when searching for a gene.
index.py:
  - Added a batch submit interface.
batch.html:
  - The layout of the batch submit interface.
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@30 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
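The e-mail check added to Web.py in this commit could look roughly like the sketch below. The function name `is_email` and the regular expression are assumptions made for illustration; the commit does not show the actual implementation, and the pattern is a deliberately loose syntactic check rather than a full address validator.

```python
import re

# Hypothetical pattern: one "@", no whitespace, and a dot in the domain part.
_EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_email(text):
    """Return True if text looks like an e-mail address (loose check)."""
    return bool(_EMAIL_PATTERN.match(text))
```

A check like this is useful before queueing a batch job, so that results are not sent to an obviously malformed address.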
-
- May 19, 2010
-
-
Laros authored
Modified the merged changes to work with the new output module. Modified the layout of the project. - Removed the Clients, Interfaces and Services subdirectories. handler.py, Web.py, index.py: - Modified for the new layout. Variant_info.py is renamed to VarInfo.py because of a conflict of names with the function Variant_info() in index.py. VarInfo.py: - Modified for the new layout. webservice.py: - Modified for the new layout and for the new Output module. Mapper.py: - Added by the merge and modified for the new Output module. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@27 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- May 18, 2010
-
-
Laros authored
list with priorities, and all output is stored in a dictionary. This dictionary can be read at a later time. The Retriever module is changed to accommodate uploaded GenBank files and slices of a chromosome. A Scheduler was added for batch checking.
mutalyzer.conf:
  - Added variables for the Retriever module:
    - maxDldSize, minDldSize ; Maximum and minimum sizes for slices and uploaded GenBank files.
  - Added variables for the Output module:
    - loglevel, outputlevel ; Specify default verbosity levels for logging and output.
  - Added variables for the Scheduler module:
    - processName ; Name of the running scheduler.
Db.txt:
  - Added information on how to create the newly used tables GbInfo, BatchQueue and BatchJob.
errorcodes.txt:
  - Short description of the error codes used in the Output module.
webservice.py:
  - Added a Complex class to test more complicated return types (see the web_dev branch).
  - Modified the code to work with the new Output module.
index.py:
  - Made a first upload page.
Mutalyzer.py:
  - Modified the code to work with the new Output module.
UCSC_update.py:
  - Modified the code to work with the new Output module.
Variant_info.py:
  - Modified the code to work with the new Output module.
GenRecord.py:
  - Added documentation.
Mutator.py:
  - Modified the code to work with the new Output module.
Misc.py:
  - New file, used for generating unique IDs.
Parser.py:
  - Made a change to the definition of a UD accession number.
  - Modified the code to work with the new Output module.
Scheduler.py:
  - Made a batch checker scheduler.
    - isDaemonRunning() ; See if we need to be started.
    - process() ; Start the batch checker.
    - addJob() ; Add jobs to a queue in the database.
Output.py:
  - Added a Message class to store all debug, info, warning, error and fatal messages.
  - If a message is given that exceeds the configured log level, it will be logged immediately.
  - A function is added to the Output class to read all messages that exceed a certain verbosity level.
  - A function is added to create a named list as an output node.
  - With the getOutput function the content of this list can be retrieved.
Config.py:
  - Several sub-classes were added for each configurable module.
Db.py:
  - Added documentation.
  - Added functionality that is used by the Retriever module.
  - Added functionality that is used by the Scheduler module.
Retriever.py:
  - Added functionality to be able to use custom GenBank files and chromosome slices.
  - Information on these created files is stored in a database to be able to re-create them when the cache is cleaned.
  - The hash of each file is stored for error detection.
BatchChecker.py:
  - A wrapper that is called either from the addJob() function from the Scheduler module, or from cron. It dispatches a background process that processes the batch jobs.
gbupload.html:
  - Test template for uploading files (copied from Mutalyzer 1.0.4).
sp.py:
  - Some test with a complex return type.
download.html:
  - Did some first tests with a METAL template.
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@26 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
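The message handling described for Output.py in this commit can be sketched as follows. Only the `Message` and `Output` class names and the `loglevel` / `outputlevel` settings come from the commit text; the method names `addMessage` and `getMessages`, the numeric level ordering, and the use of an in-memory list in place of a log file are assumptions for illustration. "Exceeds" is interpreted here as "greater than or equal to", which is also an assumption.

```python
class Message:
    """One stored message: its origin, severity level and text."""

    def __init__(self, origin, level, description):
        self.origin = origin
        self.level = level
        self.description = description


class Output:
    # Assumed ordering of the five message kinds named in the commit.
    LEVELS = {"debug": 0, "info": 1, "warning": 2, "error": 3, "fatal": 4}

    def __init__(self, loglevel, outputlevel):
        self.loglevel = loglevel
        self.outputlevel = outputlevel
        self.messages = []  # every message is kept for later retrieval
        self.log = []       # stands in for the log file in this sketch

    def addMessage(self, origin, level, description):
        """Store a message and log it immediately if it is severe enough."""
        self.messages.append(Message(origin, level, description))
        if level >= self.loglevel:
            self.log.append("%s: %s" % (origin, description))

    def getMessages(self):
        """Return the stored messages at or above the output level."""
        return [m for m in self.messages if m.level >= self.outputlevel]
```

Keeping all messages and filtering on read lets the same run serve both a terse web page and a verbose log without re-running the check.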
-
- May 12, 2010
-
-
Gerard Schaafsma authored
changed dbName = "hg18"
In webservice.py:
  - changed the names of arguments v1, v2, v3 and v4 in all methods to 'more readable' names
  - changed C and D to Conf and Database in all methods
  - method extractChange was added, which extracts from a complete HGVS variant description the part after the coordinates (positions) and the start position of the variant
  - method cTogConversion was added, which converts a complete HGVS variant description in c. notation to g. notation
  - method gTocConversion was added, which converts a complete HGVS variant description in g. notation to c. notation
In Mapper.py:
  - method conversionToCoding was added, which converts non-star c. positions to star c. positions
this comment should have been added with the previous commit
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/web_dev@25 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
Gerard Schaafsma authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/web_dev@24 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- May 03, 2010
-
-
Gerard Schaafsma authored
mappingInfo() and transcriptInfo. These methods are used, respectively, when a variant is present or when it is not. Mapper.py is the new version of Variant_info.py, with the two methods described above defined here: mainMapping and mainTranscript. Config.py and Crossmap.py were not actually changed. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/web_dev@23 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- Apr 21, 2010
-
-
Gerard Schaafsma authored
python-soaplib >= 0.8.1 which had been forgotten
Added to webservice.py:
from Variant_info import Complex
because you use the Complex object in the SOAP decorator
@soapmethod(String, String, String, String, _returns = Complex)
Changed some stuff in the documentation
Replaced the return type in the SOAP decorator of varInfo()
@soapmethod(String, String, String, String, _returns = Complex)
The whole definition of varInfo() was added:
+ import Variant_info
+ from Modules import Web
+ from Modules import Config
+ from Modules import Db
+ from Modules import Output
+
+ C = Config.Config()
+ D = Db.Db(C, "local")
+ L = Output.Output(C, __file__)
+
+ L.LogMsg(__file__, "Reveived request varInfo(%s %s %s %s)" % (
+ v1, v2, v3, v4))
+
+ W = Web.Web()
+ result = Variant_info.main(v1, v2, v3, v4)
+ del W
+
+ L.LogMsg(__file__, "Finished processing varInfo(%s %s %s %s)" % (
+ v1, v2, v3, v4))
+
+ del L
+ del D
+ del C
+# return str(result.split("\n")[:-1])
+ return result
Added to webservice.py:
+ @soapmethod(String, String, String, String, _returns = Complex)
+ def mapInfo(self, v1, v2, v3, v4) :
and so on
because varInfo() now lacks the ability to deal with the possibility that the variant (v4) is not provided, this functionality has been transferred to varMap() in Variant_map.py
Variant_info.py has also been copied to /src/Services
Variant_map.py has also been copied to /src/Services
In Variant_info.py a new Python object was defined:
+class Complex(ClassSerializer) :
and further
to return an object holding the information about a variant, and not a string as was previously done, and which had to be parsed.
This also includes the type code definition with TC.Struct:
+Complex.typecode = TC.Struct(Complex, [ TC.Integer('startmain'),
+ TC.Integer('startoffset'),
+ TC.Integer('endmain'),
+ TC.Integer('endoffset'),
+ TC.Integer('start_g'),
+ TC.Integer('end_g'),
+ TC.String('mutationType') ], 'Complex')
+
+
Improved the following error messages:
if not db_version :
if db_version != version :
Added an error message:
if not var :
because this functionality moved to Variant_map.py
Changed the return type from string to the Complex object V
And changed the main part to:
- __process(LOVD_ver, build, acc, var, Conf, O)
+ result = __process(LOVD_ver, build, acc, var, Conf, O)
+ return result
Changed the following stuff in Config.py:
+ # Figure out where this program is located and go two levels up.
+ import os
+ myPath = os.path.dirname(__file__) + "/../.."
+ os.chdir(myPath)
which is necessary for finding the mutalyzer.conf file
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/web_dev@22 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- Apr 15, 2010
-
-
Laros authored
mutation. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@21 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
Laros authored
The classes Mutator, Output and Db are now derived classes of Config. The class Retriever is a derived class of Output. This reduces the amount of code and variable passing significantly. mutalyzer.conf: - Converted the dbName variable to a dbName list, to accommodate for more than one database. - Added variables flanksize, maxvissize and flankclipsize for the visualisation in Mutator.py (instead of the previous alignment). Mutalyzer.py: - Resolved the range-swap issues. - Resolved the reverse-complement (cosmetic) issues. - Fixed a cosmetic bug in the __bprint() function. - Added a new function __nsplice(), to accommodate for a CDS extension. - Added a function __toProtDescr(), that gives a protein description in case of a simple substitution. - Added functionality for n. m. and EST notations. - Added functionality for other species (translation tables). - Corrected the roll-rule for insertions. - Added fallbacks for missing CDS and mRNA lists and positions (for an EST for example). - Added an input check for wrong gene symbols. - Added a temporary exception for in frame stop codons. webservice.py: - Added the private functions __checkBuild(), __checkChrom() and __checkPos() that do routine checks in a number of services. - Added a 'build' variable to getTranscripts(), getTranscriptsRange() and getGeneName() - Added exceptions that raise a Fault() object to make the client receive a SOAP exception. UCSC_update.py: - Added functionality for more than one database. Variant_info.py: - Added functionality for more than one database. - Added functionality to deal with non-coding transcripts. Genrecord.py: - Made a RecordObj() object that consists of the old 'genelist' dictionary, combined with 'mol_type' and 'organelle' variables. Also, a fake gene named 'source' is included to accommodate for sequences that do not contain any annotated genes (an EST for example). 
- The Locus object is extended with a 'txTable' variable to accommodate for different organelles (mitochondria) and other species. Mutator.py: - Replaced the alignment visualisation by a home-made one. Also see the variables that were added to the configuration file to alter the behaviour of this visualisation. - Modified the shiftpos() function when inserting something on a splice site boundary (now it extends the exon). Output.py: - Minor modifications for the new inheritance scheme. Config.py: - Minor modifications for the new inheritance scheme. Db.py: - Minor modifications for the new inheritance scheme. - Added functionality to handle multiple databases. - Added an isChrom() function, used by the webservices check functions. Crossmap.py: - Made a change in the usage of the '__STOP' variable. It is set to transcription stop if there is no stop codon present. This makes conversion to an n. notation trivial. Retriever.py: - Minor modifications for the new inheritance scheme. - Added a check for invalid accession numbers (or versions). - Added a check for erroneous genbank files that can occur when the NCBI is overloaded. The erroneous file is purged and the user can try again. templates/sp.py - Modified the sample code to accommodate for the new 'build' variable. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@20 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- Feb 26, 2010
-
-
Laros authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/branches/web_dev@19 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
Laros authored
(autogenerated from source) and wrote checks for the Delins. handler: - Added handling of everything in the templates/base directory. webservice: - Documented the Var_info webservice. index: - Added a documentation page, which generates documentation from source. Parser: - Changed argument parsing for Indel, Ins and Inv. Web: - Added a tal2() function to test METAL templates. Mutalyzer: - Added checks for Delins (trim the longest common prefix and the longest common suffix of Arg1 and Arg2). - Made a general function to check optional arguments. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@18 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
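The Delins check from this commit (trimming the longest common prefix and the longest common suffix of Arg1 and Arg2) can be illustrated with the sketch below. The function name `trim_delins` and its return shape are hypothetical; only the trimming rule itself is taken from the commit message.

```python
def trim_delins(deleted, inserted):
    """Strip the longest common prefix and suffix of two sequences.

    Returns (prefix_length, suffix_length, trimmed_deleted, trimmed_inserted).
    """
    # Longest common prefix.
    prefix = 0
    limit = min(len(deleted), len(inserted))
    while prefix < limit and deleted[prefix] == inserted[prefix]:
        prefix += 1
    deleted, inserted = deleted[prefix:], inserted[prefix:]

    # Longest common suffix of what remains after the prefix is removed.
    suffix = 0
    limit = min(len(deleted), len(inserted))
    while suffix < limit and deleted[-1 - suffix] == inserted[-1 - suffix]:
        suffix += 1
    if suffix:
        deleted, inserted = deleted[:-suffix], inserted[:-suffix]

    return prefix, suffix, deleted, inserted
```

After trimming, an empty second sequence suggests the variant is really a plain deletion, and a one-to-one remainder suggests a substitution; the prefix length tells the caller how far to shift the start position.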
-
- Feb 19, 2010
-
-
Laros authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@17 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
Laros authored
wrote documentation.
Made an accumulative mapping info table; this requires regular polling of new data from the UCSC:
  - Added:
    - install.sh: A preliminary installation script. Now only used for cron entries.
    - Db.txt: Some loose documentation on how to make the new mapping table, to be incorporated with an installation script.
    - src/UCSC_update.py: The update program, to be called from cron each day.
  - Modified:
    - mutalyzer.conf: Added variables needed for the remote database of the UCSC.
    - Db.py: Rewrote nearly every SQL query to work with the new mapping table and to be able to download and import updates from the UCSC.
    - Config.py: Modifications to work with the new configuration variables.
Added:
  - templates/sp.py: A webservice client template script.
  - templates/download.html: The download page for developers.
Modified:
  - Install.txt: Added more dependencies.
Switched to soaplib for the generation of a WSDL file. Webservices are now published by adding functions to the MutalyzerService class in webservice.py; each function should have a soapmethod decorator to specify the types.
Modified:
  - handler.py: To work with soaplib.
  - webservice.py:
    - Put everything in a class to make soaplib able to generate a WSDL file.
    - Added the varInfo() webservice (calls the Variant_info script).
  - index.py: Added documentation.
  - Mutator.py: Added documentation.
  - Web.py: Added documentation.
  - Mutalyzer.py: Added generation of a new description in g. and c. notation.
  - Db.py: Modified the get_Transcripts function to be able to work with overlapping and non-overlapping ranges.
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@16 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- Feb 03, 2010
-
-
Laros authored
Added: - Web.py: A module with some general functions used by the interfaces. - A version (this is deliberately kept out of the config file). - A run() wrapper that returns standard output of any function as a string. - A tal() function that parses a TAL template. - A read() function that returns the input of a file. Modified: - Install.txt: In the apache config, a PythonPath must be set now (dynamically setting it did not give consistent output). - handler.py : Cleaned the source by using the Web class. - webservice.py : Cleaned the source by using the Web class. - index.py : Cleaned the source by using the Web class. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@15 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
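The run() wrapper mentioned in this commit returns the standard output of any function as a string. A minimal sketch of that idea in present-day Python 3 is shown below; the original code targeted mod_python on an older Python and is not shown in the log, so this is an assumed equivalent, not the project's implementation.

```python
import io
from contextlib import redirect_stdout


def run(func, *args, **kwargs):
    """Call func and return everything it printed to standard output."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()
```

For example, `run(print, "hello")` yields the text that would otherwise have gone to the terminal, which a web handler can then place into a template.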
-
- Feb 02, 2010
-
-
Laros authored
The main structure is as follows: / ; Root of the installation. - Install.txt - Todo.txt - Obsoleted.txt - mutalyzer.conf - var/ ; Variable data. - cache/ - mutalyzer.log - templates/ ; HTML, XML, JavaScript, etc. - src/ - Mutalyzer.py - Variant_info.py - Services/ ; Webservices. - Clients/ ; Example clients for webservices. - Modules/ ; The core modules. - Interfaces/ ; Interfaces to mod_python. Apart from changes that were needed to deal with this new structure, no changes in the code were made. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@14 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- Feb 01, 2010
-
-
Laros authored
  - Todo.txt.
  - handler.py: A general handler for mod_python. This handler dispatches SOAP services and a normal HTML publisher. Furthermore, it is able to handle raw requests to dump HTML or XML files. This handler is TAL enabled.
  - webservice.py: A publisher for webservices. When a new webservice is added, this is the entry point for the server side (just like index.py, add a function).
  - getTranscripts.py: A webservice that reports all transcripts that overlap with a certain genomic position (chromosome, position).
  - getGeneName.py: A webservice that finds the gene name of a given transcript.
  - service.wsdl: This is the definition of the interface for webservices. A client must download this file and parse it to obtain a programming interface; then the client can use this interface just like any local function.
  - Obsoleted.txt: A list of things that will be deleted in the future (but are still functional for backwards compatibility).
  - client/sp.py: A test client for the two webservices.
Renamed Main.py to Mutalyzer.py.
Modified:
  - mutalyzer.conf: Added a configurable date prefix for logging.
  - Install.txt:
    - To reflect the difference in configuration of apache to work with the new handler (requires less configuration).
    - Added TAL as a new dependency.
  - html/check.html: Made it a full TAL template. Title, version and output are now separated from the HTML design.
  - Mutator.py: Made the shiftpos() function public; this is needed for insertion checking.
  - Parser.py: Updated the comments.
  - Variant_info: Made all internal functions private.
  - Output.py: Updated the comments.
  - Config.py: Modified to reflect the changes in mutalyzer.conf.
  - Db.py:
    - We now keep the handle to the database open until the object is deleted.
    - Added a destructor that closes the handle to the database.
    - Added getTranscripts(): Get a list of transcripts, given a chromosome and a position on that chromosome.
    - Added get_GeneName(): Get the gene name of a given transcript.
  - Retriever.py: Updated the comments.
  - Crossmap.py:
    - Made a patch that handles a CDS start on the first position of the transcript.
    - Added more unit tests.
  - index.py:
    - Added a switch for older versions of LOVD, to generate the expected output in Variant_info.
    - Made this publisher compatible with TAL.
  - Mutalyzer.py: Made all internal functions private.
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@13 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- Dec 31, 2009
-
-
Laros authored
html includes, a server directive has to be altered (described in Install.txt). - check.html: The main page for mutation checks (used to be a large string in index.py). Output.py: - Added an instance variable, this is the name of the module that created the Output object. This variable is used for more verbose logging. Variant_info.py: - Modifications for the new Output.py. - Added error handling for parse errors. Main.py: - Modifications for the new Output.py. - Renamed function rrr() to main(). index.py: - Removed the large html string, it is now loaded from file with the __readhtml() function. - Made a __run() function, that wraps any function, executes it and returns standard output of this function as a string. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@12 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- Dec 30, 2009
-
-
Laros authored
  - It is now handled by index.py and mod_python.
  - Added a main() function in Variant_info.py to make it callable.
Eliminated the need for a whole CDS; only CDS start and stop are needed from now on:
  - Variant_info.py: the CDS no longer needs to be built.
  - Main.py: We now give the location of the CDS, not the CDS itself.
Crossmapper.py:
  - Fixed a bug that occurred when the CDS starts on the last nucleotide of an exon.
  - Added a ``small CDS'' and a ``CDS start on splice site'' test in the unit test of the crossmapper.
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@11 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- Dec 23, 2009
-
-
Laros authored
can be sent to this object, which sends errors and explicit logging messages to a log (defined in mutalyzer.conf) and warning messages to standard output. Behaviour of this object may change in the future; adding severity and logging above a certain severity level is one option that would increase debugging possibilities.
  - NiceName() returns a short description of the calling program (we cannot use the default __name__ here).
  - ErrorMsg() Print the nice name of the calling module, an error message and log it. Also increase an error counter.
  - WarningMsg() Print the nice name of the calling module and an error message. Also increase a warning counter.
  - LogMsg() Only log the message (nice name of the calling module and the message itself).
  - Summary() Give the number of errors and warnings.
  - A unit test is also defined (it does not do much at this moment).
Added index.py: The web interface to mutalyzer. It depends on mod-python; we chose this interface to eliminate the need for php. Also apache is now added to the list of dependencies. The configuration of mod-python is described in Install.txt.
GenRecord.py:
  - Moved the splice() function to Main.py.
  - Added an exon list to the Plist class. This list can be used as a fallback in case the mRNA tag is missing from a GenBank file.
  - Added an empty unit test.
Mutator.py:
  - Added a standard alignment for visualisation. This will probably be replaced in newer versions.
Parser.py:
  - Added functionality for the new output module.
  - Made the parser gracefully return, instead of exit on a parse error. This is needed for the web interface.
Variant_info.py:
  - Added functionality for the new output module. Note that output generated by this program should go to a different log, something for a future version.
  - Fixed a bug that occurred when a CDS start or stop was on an exon boundary.
Main.py (heavily under development, names of functions are not very descriptive yet):
  - Added functionality for the new output module.
  - Fixed a bug in the roll() function; it returned a wrong value for forward genes.
  - Added the function bprint(); it formats a large string to be printed in an insightful way (like GenBank does it), and it also prints the offsets at the beginning of each line.
  - Obsoleted the ErrorMsg() and WarningMsg() functions.
  - Added constructCDS(), a function that is able to construct a CDS from an mRNA list, CDS start and CDS stop. In the future we would like to work without a CDS list, so this function will be obsoleted.
  - Added the splice() function (from GenRecord.py).
  - Made a function rv() that is able to process a RawVar. This function is separated from the ppp() function to be able to work with an allele description.
  - Added splicing.
  - Added translation to a protein.
  - Added a function rrr() which is to be called from Main.py itself or from index.py.
Config.py:
  - Added a log variable for Output.py.
Crossmap.py:
  - Fixed a bug concerning genes where the entire CDS is in one exon.
  - Added more uncertainty handling.
Retriever.py:
  - Added functionality for the new output module.
  - Added handling of accession numbers with no version. It downloads the latest version, and gives a warning.
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@10 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
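The bprint() function in this commit formats a long sequence the way GenBank does, with an offset at the start of each line. The sketch below assumes the usual GenBank layout of 60 characters per line in blocks of 10 with a right-aligned 1-based offset; those numbers, the parameter names and the choice to return a string instead of printing are assumptions, since the commit does not give the details.

```python
def bprint(sequence, line_width=60, block_width=10):
    """Format a sequence in GenBank-like blocks with a leading offset."""
    lines = []
    for start in range(0, len(sequence), line_width):
        chunk = sequence[start:start + line_width]
        # Split the line into space-separated blocks.
        blocks = [chunk[i:i + block_width]
                  for i in range(0, len(chunk), block_width)]
        # The offset is the 1-based position of the first character.
        lines.append("%9d %s" % (start + 1, " ".join(blocks)))
    return "\n".join(lines)
```

Returning the text rather than printing it makes the helper easy to test and to embed in a web template.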
-
- Nov 30, 2009
-
-
Laros authored
- c2g and g2c conversion. - Returns information about translation start, translation end and CDS stop when no variant is given. Variant_info.py: - Added g2c conversion. - Added transcription start, transcription end and CDS stop info. - Made the variant optional (will return info if the variant is empty). - Made a function getcoords() that returns a triple (main, offset, g) to do both c2g and g2c conversions. - For these conversions to work seamlessly, changes to the crossmapper were made. g2x() now returns a tuple (main, offset) in non-star notation and helper functions are made to convert from and to HGVS notation. Parser.py: - Removed the `i' from the list of possible nucleotides, otherwise the word `delins' is ambiguous. - Replaced the sign `>' to `subst' in the output of the parser, to make the downstream code more readable. - Replaced the `ins' output to `delins' when a combination of `del' and `ins' is found. - Added a PtLoc as a prefix for an indel, this used to be a range only. - Added parenthesis and a question mark as optional in RawVar and SimpleAlleleVar both are used to indicate uncertainty. E.g. 12del;(12del);(12del)?;12del?;(12del;12del)? - Added a unit test. Main.py: - Use the more readable `subst' instead of `>'. - Use the g2c() helper function instead of g2x(). Config.py: - Added a unit test (will crash if no configuration file is found). Crossmap.py: - Renamed __star and __rstar to int2main and main2int. They are now helper functions that can be used externally. - Added the __trans_start and __trans_end member variables, used for the new info() member function and for the `u' and `d' offset prefixes for upstream and downstream UTR positions. - Added the functions int2offset() and offset2int() to translate a tuple (main, offset) to a HGVS intronic position and an intronic position to an offset. - Added tuple2string() that converts a tuple (main, offset) to a c. notation. - Added helper function g2c(), see Main.py. 
- Added an info() function that returns a triple (trans_start, trans_end, CDS_stop). - Removed the functions c2str(), star2abs(), off2int() and c2g(). - Convert a `?' as an intronic position to a 0. This may be changed in the future. - Altered the unit test to reflect the changes. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@9 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-
- Nov 17, 2009
-
-
Laros authored
- c2g.php: - Used escapeshellargs to harden the code. - Added a content line to return plain text instead of html. - c2g.py: - Fully documented the code. - Made a function to convert a list of strings to a list of integers (to avoid excessive int() calls). - Added error handling for missing NM numbers and missing version numbers. - Parser.py: - Made a unit test for a rather complicated variant. - Config.py: - Made a unit test (will crash if no config file is found). - Db.py: - Added error handling for the get_NM_version() function. It will now return 0 if an error occurs. It will also return an integer instead of a string from now on. - Made a unit test that gets the username / db information from the config file and tests all query functions (will crash if the MySQL database is not configured correctly, or if the config file does not contain the right info). - Crossmap.py: - Fixed a bug in the Crossmap module (positions shifted when the CDS starts in an other exon). - Made an extensive unit test: - Check the splice sites in c. notation of a hypothetical gene. - Check whether the splice sites are the same for the same gene in reverse orientation. - Do some conversion checking from c. to g. and vice versa in both orientations. - Do all tests above for the n. notation too. - Check whether the c. notation of the start codon does not change if an upstream exon is removed (also see the previous bug). - Removed the __STAR variable and introduced the __STOP variable. The difference is that __STOP contains the real stop position, which makes conversion to a position relative to the start codon easier. - Made star2abs(), which converts a c. position in star-notation to one that is relative to the start codon (for c2g). - Made off2int(), which converts a sign, offset pair to an integer (for c2g). - Retriever.py: - Added a unit test that retrieves an accession number (will crash if the config file does not contain the right info). 
- Mutator.py, Scheduler.py: - Added an empty unit test (tests needed). git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@8 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
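Two of the small helpers named in this commit are the conversion of a list of strings to a list of integers and off2int(), which turns a sign and offset pair into an integer. The sketch below shows one plausible reading of both; the name `to_int_list` is hypothetical, and the exact sign convention of off2int() (a "-" sign gives a negative value, anything else a positive one) is an assumption.

```python
def to_int_list(values):
    """Convert a list of numeric strings to a list of integers."""
    return [int(value) for value in values]


def off2int(sign, offset):
    """Combine a sign character and an offset into a signed integer.

    Assumed convention: "-" means an upstream (negative) offset.
    """
    return -int(offset) if sign == "-" else int(offset)
```

Converting once up front avoids scattering int() calls through the coordinate arithmetic.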
-
- Nov 11, 2009
-
-
Laros authored
- src/c2g.py - This program returns the genomic coordinates of a position in c. notation, it depends on the Config, Db Crossmap and Parser modules. - The input consists of the following variables: - LOVD_ver ; The version of LOVD, ignored for now. - build ; The build of the human genome, assumed to be hg19 for now. - accno.version ; The accession number and version on which the c. positions are defined. - var ; The variation (without an accession number). - If the accession number and version are present in the database, it returns all positions (one or two) in g. notation relative to the chromosome. - c2g.php - This is a web interface for c2g.py, it is called in the following way: c2g.php?LOVD_ver=a&build=b&acc=c.d&var=e where a, b, c, d and e are described above (note that the variation (e) should be HTML encoded). Altered (with respect to c2g.py): - Install.txt - Documented how to set up the c2g program. - Db.py - Added two new query functions: - get_NM_info ; Get exonStarts, exonEnds, cdsStart, cdsEnd and strand information, given an NM accession number. - get_NM_version ; Get the NM version, given an NM accession number. - Made a general query function, that is called by the specific query functions. - Crossmap.py - Added helper functions for the output of c2g. - c2str ; Returns a string given mainsgn, main, offsgn and offset. - c2g ; Returns a genomic position given mainsgn, main, offsgn and offset. Altered (for Mutalyzer itself): - GenRecord.py - Added default locus tag handling. - Mutator.py - Added a duplication function. - Parser.py - Added argument passing: - Substitutions: Arg1, Arg2 (Arg1>Arg2 for example). - Deletions: Arg1 (delArg1). - Ins: Arg1 (insArg1). - Retriever.py - Added code to make the cache directory if it does not exist. This eliminates the need for the clean.sh script. - Main.py (most of the new functions have to be migrated elsewhere) - Altered the roll function to be able to roll in both directions. 
- Made a palindrome snoop function that finds the smallest string that is not invariant under reverse complement; this function is also used to detect perfect palindromes. - Made PtLoc2main and PtLoc2offset, which get integers from the locations returned by the nomenclature parser. - Made Error and Warning message wrapper functions. - Made a function that tests if a string is an integer. - Made the parsing less verbose. - Extracted the positions from raw variations. - Wrote code to deal with reversed ranges. - Added error handling and warning messages for: - Substitutions. - Deletions. - Duplications. - Inversions. - Insertions. Removed: - clean.sh - It is no longer needed, since temporary files and directories are no longer kept in subversion. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@7 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
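The roll function mentioned above, which can roll in both directions, might look roughly like the sketch below. This is not the code from Main.py: the name `roll`, its signature, and the 1-based inclusive coordinates are assumptions made for illustration. It reports how far a range can be moved to the left and to the right while deleting it still gives the same sequence.

```python
def roll(sequence, first, last):
    """Return (left, right): how many positions the 1-based, inclusive
    range [first, last] can be shifted in each direction without changing
    the result of deleting it. Assumes 1 <= first <= last <= len(sequence).
    Hypothetical helper, not the original Main.py function."""
    pattern = sequence[first - 1:last]
    length = len(pattern)

    # Walk left: each step compares the base just before the range with
    # the base that would become the new last base of the shifted range.
    left = 0
    i = first - 2  # 0-based index just before the range
    while i >= 0 and sequence[i] == pattern[(length - 1 - left) % length]:
        left += 1
        i -= 1

    # Walk right: each step compares the base just after the range with
    # the base that would leave the range at its start.
    right = 0
    i = last  # 0-based index just after the range
    while i < len(sequence) and sequence[i] == pattern[right % length]:
        right += 1
        i += 1

    return left, right


def delete(sequence, first, last):
    """Delete the 1-based, inclusive range [first, last]."""
    return sequence[:first - 1] + sequence[last:]


# The GC at positions 4-5 sits in a CG repeat and can move 1 left, 3 right.
shifts = roll("ATCGCGCGTA", 4, 5)
```

Deleting the range at either extreme yields the same string as deleting it at its original position, which is why the boundaries matter when choosing one canonical position for a variant.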
-
- Oct 21, 2009
-
-
Laros authored
used in BioPython. - GenBank starts counting at 1, BioPython at 0. - GenBank uses absolute positions, BioPython uses interbase positions. Main.py still contains test cases to test the modules, so it's always under development. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@6 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
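The difference between the two conventions can be illustrated with a small conversion sketch. The function names are hypothetical and not part of the repository; the only assumption is that a GenBank range is 1-based and inclusive, while an interbase range is 0-based and half-open, which is the same convention Python slices use.

```python
def genbank_to_interbase(start, end):
    """Convert a 1-based, inclusive range to a 0-based, half-open one."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return start - 1, end


def interbase_to_genbank(start, end):
    """Convert a 0-based, half-open range to a 1-based, inclusive one.
    An empty interbase range has no inclusive equivalent."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return start + 1, end


sequence = "ACGTACGT"
ib_start, ib_end = genbank_to_interbase(3, 5)  # bases 3..5 are G, T, A
fragment = sequence[ib_start:ib_end]           # interbase maps onto a slice
```

Only the start coordinate changes in either direction; the end coordinate is numerically identical in both systems.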
-
- Oct 09, 2009
-
-
Laros authored
put it in a new module named Config. This is done because another new module named Db also needs the configuration file. The file Db.py is new and contains a function that interfaces to a MySQL database. This database is used for the mapping of NM to NP accession numbers. The file mutalyzer.conf now has two more options: a MySQL user name and a database name. Started the documentation of a fresh installation, see Install.txt for more details. In Mutator.py: Added a function that calculates the positions of splice sites after mutations. In Scheduler.py: Added some code (still commented out) that can parallelise jobs by using threads and detecting the number of processors present on the host. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@5 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
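The idea of running jobs in threads, with the number of workers taken from the processor count, can be sketched as follows. This uses the present-day Python standard library and is not the commented-out code in Scheduler.py; `run_jobs` and its parameters are hypothetical names.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_jobs(jobs, workers=None):
    """Run zero-argument callables in a thread pool and return their
    results in submission order. By default, one worker per detected
    processor is used (falling back to 1 if detection fails)."""
    if workers is None:
        workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda job: job(), jobs))


# Three trivial jobs; the default argument binds n at definition time.
results = run_jobs([lambda n=n: n * n for n in (1, 2, 3)])
```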
-
- Oct 01, 2009
-
-
Laros authored
GenRecord.py ; Provides a class whose main purpose is to extract relevant data from GenBank parse objects. The output is a nested dictionary with per-gene information: - A list of genes, indexed by gene name. - A list of locus tags. - The orientation. - An optional mRNA field. - A location. - An mRNA list. - An optional CDS list. - A location. - A CDS list. This structure may be changed in the future. Furthermore, it contains a function that does the actual splicing. Mutator.py ; Provides a class that is capable of mutating a string, while keeping track of all mutations. This way, the coordinates of the original string can be used for mutations. Scheduler.py ; Provides a class that can schedule both interactive and batch jobs. - Once every two turns, an interactive job is selected. - On the other turns, a batch queue is selected, from which a job is selected. Modified: Main.py ; Will always change, since it contains test functions and prototypes. Parser.py ; Moved all the test functions to Main.py, added better error messaging for parse errors, added comments. Made various changes to the parser itself, not worthy of further explanation since the parser is still under development. Crossmap.py ; Changed some comments, commented out a debug function and added some white space for readability. Retriever.py ; Changed some comments. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@4 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
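The goal described for Mutator.py, applying several mutations while always addressing the original coordinates, can be illustrated with the sketch below. It deliberately uses a simpler technique than shift bookkeeping: the edits are recorded and then applied from right to left, so earlier positions never move. The class layout and method names are assumptions and do not reflect the real Mutator.py; edits are assumed not to overlap.

```python
class Mutator:
    """Collect edits in original coordinates; apply them on demand.
    Illustrative only, not the class from Mutator.py."""

    def __init__(self, original):
        self.original = original
        self.edits = []  # (start, end, replacement), 0-based half-open

    def delete(self, first, last):
        """Delete original positions first..last (1-based, inclusive)."""
        self.edits.append((first - 1, last, ""))

    def insert(self, position, bases):
        """Insert bases directly after the 1-based original position."""
        self.edits.append((position, position, bases))

    def substitute(self, position, base):
        """Replace the base at the 1-based original position."""
        self.edits.append((position - 1, position, base))

    def mutated(self):
        """Apply all edits, rightmost first, and return the new string."""
        result = self.original
        for start, end, replacement in sorted(self.edits, reverse=True):
            result = result[:start] + replacement + result[end:]
        return result


mutator = Mutator("ACGTACGT")
mutator.delete(2, 3)         # remove C and G
mutator.substitute(8, "A")   # last T becomes A
mutator.insert(4, "GG")      # insert GG after the first T
outcome = mutator.mutated()
```

Because the rightmost edit is applied first, an edit never changes the index of any base to its left, so no position translation is needed.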
-
- Aug 28, 2009
-
-
Laros authored
highly flexible and understandable. The output is a hierarchical parse object which contains the relevant data. The parser accepts nested expressions. Detailed information on the parser should be provided in the future (when all the functionality is tested). In the file Main.py a couple of test functions are added (later to be exported to separate modules). - A function that finds the last occurrence of a pattern or one of its cyclic permutations within a repeating sequence. - A function that extracts relevant data from the GenBank parse object. The output is a nested dictionary with per-gene information: - A list of genes, indexed by gene name. - A list of locus tags. - The orientation. - An optional mRNA field. - A location. - An mRNA list. - An optional CDS list. - A location. - A CDS list. This structure may be changed in the future. - A function that extracts a raw list from the mRNA and CDS lists, to be used in the mapping functions (g. to c. and so on). - A function that does the actual splicing, to be used as input for mutation functions. git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@3 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
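The splicing step can be sketched as follows, assuming that the raw list is a flat list of 1-based, inclusive exon boundaries in the form [start1, end1, start2, end2, ...]. That layout and the function name `splice` are assumptions for illustration; the commit does not specify them.

```python
def splice(sequence, boundaries):
    """Concatenate the exons described by a flat list of 1-based,
    inclusive boundaries: [start1, end1, start2, end2, ...]."""
    if len(boundaries) % 2 != 0:
        raise ValueError("boundaries must contain start/end pairs")
    exons = []
    for i in range(0, len(boundaries), 2):
        start, end = boundaries[i], boundaries[i + 1]
        exons.append(sequence[start - 1:end])  # 1-based inclusive to slice
    return "".join(exons)


# Three two-base exons separated by two-base introns.
spliced = splice("AACCGGTTAA", [1, 2, 5, 6, 9, 10])
```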
-
- Jul 28, 2009
-
-
Laros authored
- Crossmap.py does the conversions g. -> c. or n. and vice versa. - Retriever.py fetches a record from the NCBI or the cache. - Main.py contains a couple of functions to test Crossmap.py and Retriever.py. - clean.sh is a script that removes cache/* and src/*.pyc. - mutalyzer.conf contains configuration variables. Make sure to call all functions from . (not ./src), otherwise relative paths can no longer be used (now used in Retriever.py and mutalyzer.conf). git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@2 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
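The "NCBI or the cache" behaviour described for Retriever.py can be sketched as a read-through cache. The names `retrieve` and `fetch`, the `.gb` file extension, and the directory layout are all hypothetical; the actual download from the NCBI is left to a caller-supplied function so that the sketch makes no claims about any network API.

```python
import os
import tempfile


def retrieve(accession, cache_dir, fetch):
    """Return the record for an accession number, reading it from
    cache_dir when present and calling fetch(accession) otherwise."""
    path = os.path.join(cache_dir, accession + ".gb")
    if os.path.exists(path):
        with open(path) as handle:
            return handle.read()
    record = fetch(accession)
    os.makedirs(cache_dir, exist_ok=True)  # create the cache on first use
    with open(path, "w") as handle:
        handle.write(record)
    return record


# Demonstration with a stand-in fetcher that records how often it is called.
calls = []

def fake_fetch(accession):
    calls.append(accession)
    return "LOCUS " + accession

with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "cache")
    first = retrieve("NM_000000.0", cache, fake_fetch)   # fetched
    second = retrieve("NM_000000.0", cache, fake_fetch)  # served from cache
```

The accession number used here is a placeholder, not a real record.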
-
Laros authored
git-svn-id: https://humgenprojects.lumc.nl/svn/mutalyzer/trunk@1 eb6bd6ab-9ccd-42b9-aceb-e2899b4a52f1
-