Cannot control build number for sliceChromosomeByGene
Created by: mutalyzerbot
Original ticket: https://humgenprojects.lumc.nl/trac/mutalyzer/ticket/137
Original date: 2013/02/27
Original reporter: I DOT F DOT A DOT C DOT Fokkema AND LUMC DOT nl
L.S.,
Regarding our earlier discussion: I realized today that the sliceChromosomeByGene function (just like the reference sequence file loader form on the website) does not let me specify which genome build the reference sequence should be taken from. I understood from you that the NCBI's method does not support this and defaults to the most recent build. This currently causes problems for LOVDs still working with hg18, and will cause problems for all LOVDs in the future, when the NCBI decides to switch to hg20.
I would like to propose the following:
- Please add an explanation about this default on the website's form and on the web service's form.
- If possible, please report this to the NCBI (I don't have the necessary info to do this myself).
- As a workaround (for human genes only) I propose the following:
  - Allow hg18, hg19 and, later, hg20 as valid values in the Organism field.
  - When one of these values is encountered, query the mapping database for the given build and the given gene.
  - Calculate min(start_position_transcripts) and max(end_position_transcripts) over all transcripts of that gene.
  - Use these two values as the boundaries to retrieve the slice.
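
To illustrate the proposed workaround, here is a minimal Python sketch of the range calculation. It is not Mutalyzer's actual code: the `TranscriptMapping` record, the `gene_slice_range` function, and the in-memory list standing in for the mapping database are all hypothetical names, and the coordinates in the usage example are made up.

```python
from typing import Iterable, NamedTuple, Tuple

# Builds accepted in the Organism field (hg20 could be added once available).
SUPPORTED_BUILDS = {"hg18", "hg19"}


class TranscriptMapping(NamedTuple):
    """Hypothetical row of the mapping database: one transcript on one build."""
    build: str
    gene: str
    chromosome: str
    start: int
    end: int


def gene_slice_range(
    mappings: Iterable[TranscriptMapping], build: str, gene: str
) -> Tuple[str, int, int]:
    """Return (chromosome, start, end) spanning all transcripts of a gene."""
    if build not in SUPPORTED_BUILDS:
        raise ValueError("unsupported build: %s" % build)

    # Select the transcripts of this gene on the requested build.
    rows = [m for m in mappings if m.build == build and m.gene == gene]
    if not rows:
        raise LookupError("no transcripts for %s on %s" % (gene, build))

    # A slice only makes sense if all transcripts lie on one chromosome.
    chromosomes = {m.chromosome for m in rows}
    if len(chromosomes) != 1:
        raise ValueError("transcripts of %s map to several chromosomes" % gene)

    # min(start_position_transcripts) and max(end_position_transcripts).
    return chromosomes.pop(), min(m.start for m in rows), max(m.end for m in rows)


# Usage with made-up coordinates for a hypothetical gene.
example = [
    TranscriptMapping("hg18", "GENE1", "chr1", 1000, 5000),
    TranscriptMapping("hg18", "GENE1", "chr1", 1200, 5600),
    TranscriptMapping("hg19", "GENE1", "chr1", 2000, 6000),
]
print(gene_slice_range(example, "hg18", "GENE1"))
```

The returned range could then be passed to whatever routine already retrieves a chromosome slice by coordinates; any flanking sequence would still need to be added on top of it.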