diff --git a/BioPython/Biopython.ipynb b/BioPython/Biopython.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..4819b54bf12e10d10d2dbf477374b05011611ba0 --- /dev/null +++ b/BioPython/Biopython.ipynb @@ -0,0 +1,861 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Biopython\n", + "\n", + "\n", + "\n", + "## A quick overview\n", + "### [Guy Allard](mailto://w.g.allard@lumc.nl)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# What is Biopython?\n", + "\n", + "## 'Python Tools for Computational Molecular Biology'\n", + "\n", + "- Fully open-source\n", + "- Actively developed\n", + "- Large community" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# What can it do?\n", + "\n", + "Modules, classes and functions for manipulating biological data\n", + "\n", + "- File parsers and writers.\n", + " - Sequence files: fasta, fastq, genbank, abi, sff, etc.\n", + " - Alignment files: clustal, emboss, phylip, nexus, etc.\n", + " - Sequence search outputs: BLAST, HMMER, BLAT, etc.\n", + " - Phylogenetic trees: newick, nexus, phyloxml, etc.\n", + " - Sequence motifs: AlignAce, TRANSFAC, etc.\n", + " - Others: PDB files, etc.\n", + "- Access to remote resources (e.g., Entrez, NCBI BLAST).\n", + "- Application wrappers.\n", + "- A simple graphing tool.\n", + "- Simple algorithms (e.g., pairwise alignment, cluster analysis).\n", + "- References such as codon tables and IUPAC sequences." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Where can I find more information?\n", + "\n", + "- [Biopython Homepage](http://biopython.org/)\n", + "- [Biopython development repository](http://github.com/biopython/biopython)\n", + "- [Biopython mailing list](http://lists.open-bio.org/pipermail/biopython/)\n", + "- [Biopython 'cookbook'](http://biopython.org/DIST/docs/tutorial/Tutorial.html) (essential reading!)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Manipulating Sequence Data\n", + "\n", + "## Bio.SeqIO\n", + "\n", + "Input and output of sequence files.\n", + "\n", + "- SeqIO.read\n", + " - Read a file containing a single sequence\n", + "- SeqIO.parse \n", + " - Iterate over all sequences in a sequence file\n", + "- SeqIO.write\n", + " - write sequences to a file" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "ID: 1\n", + "Name: 1\n", + "Description: 1\n", + "Number of features: 0\n", + "Seq('TGGAACATGTCCCGCTAGCTTCTTCTTGCTAGCAGATTTTTTCAGTTGATCGTC...TCT', SingleLetterAlphabet())\n" + ] + } + ], + "source": [ + "from Bio import SeqIO\n", + "\n", + "# read the first sequence\n", + "for record in SeqIO.parse(\"../data/records.fa\", \"fasta\"):\n", + " dna = record\n", + " break\n", + "\n", + "print dna" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "Each record is an object with several fields, including:\n", + "\n", + "- record.id\n", + " - the sequence id\n", + "- record.name\n", + " - sequence name, usually the same as the id\n", + "- record.description\n", + " - sequence description\n", + "\n", + "The actual sequence is a separate object contained within the record which can be accessed using record.seq\n", + "\n", + "The sequence has an 'alphabet' associated with it which defines which letters are allowed.\n", + "\n", + "Different alphabets are used for DNA, RNA, protein etc." + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "TGGAACATGTCCCGCTAGCTTCTTCTTGCTAGCAGATTTTTTCAGTTGATCGTCACATGCGGTAGACTACCCAAGGTGTGACTACTCGCATGCCTGATCT\n" + ] + } + ], + "source": [ + "print dna.seq" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "There are lots of built in methods that can be used to manipulate the sequence\n", + "\n", + "The sequence acts like a string in many ways" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "length: 100\n" + ] + } + ], + "source": [ + "# get the length of the sequence\n", + "print \"length: \", len(dna.seq)" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "TGGAACATGT\n" + ] + } + ], + "source": [ + "# slice and dice\n", + "print dna.seq[:10]" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "tggaacatgtcccgctagcttcttcttgctagcagattttttcagttgatcgtcacatgcggtagactacccaaggtgtgactactcgcatgcctgatct\n" + ] + } + ], + "source": [ + "# change the case\n", + "print dna.seq.lower()" + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "TGGAACATGTTGCCTGATCT\n" + ] + } + ], + "source": [ + "# concatenate the first and last 10 nucleotides\n", + "print dna.seq[:10] + dna.seq[-10:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "But also has more sequence-specific methods" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "ACCTTGTACAGGGCGATCGAAGAAGAACGATCGTCTAAAAAAGTCAACTAGCAGTGTACGCCATCTGATGGGTTCCACACTGATGAGCGTACGGACTAGA\n" + ] + } + ], + "source": [ + "# complement\n", + "print dna.seq.complement()" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "AGATCAGGCATGCGAGTAGTCACACCTTGGGTAGTCTACCGCATGTGACGATCAACTGAAAAAATCTGCTAGCAAGAAGAAGCTAGCGGGACATGTTCCA\n" + ] + } + ], + "source": [ + "# reverse complement\n", + "print dna.seq.reverse_complement()" + ] + }, + { + "cell_type": "code", + "execution_count": 53, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "UGGAACAUGUCCCGCUAGCUUCUUCUUGCUAGCAGAUUUUUUCAGUUGAUCGUCACAUGCGGUAGACUACCCAAGGUGUGACUACUCGCAUGCCUGAUCU\n" + ] + } + ], + "source": [ + "# transcribe from DNA to RNA\n", + "rna = dna.seq.transcribe()\n", + "print rna" + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "WNMSR*LLLASRFFQLIVTCGRLPKV*LLACLI\n" + ] + } + ], + "source": [ + "# Translate from nucleotide to protein\n", + "protein = dna.seq.translate()\n", + "print protein" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "Sequence records can easily be written to a file.\n", + "\n", + "Specifying the file type allows conversion between different formats.\n", + "\n", + "For example, to convert from a fastq file to fasta format:" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 66, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "records = SeqIO.parse(\"../data/easy.fastq\", \"fastq\")\n", + "SeqIO.write(records, \"tmp.fasta\", \"fasta\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Remote files\n", + "\n", + "NCBI allow for remote querying of their Entrez database, and Biopython allows us to use their services from within python.\n", + "\n", + "We can use the Entrez.efetch utility to retrieve various records from one of NCBI's databases.\n", + "\n", + "A full list of these services and their documentation can be found on the [Entrez utilities help page](https://www.ncbi.nlm.nih.gov/books/NBK25500/)" + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "metadata": { + "collapsed": true, + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "from Bio import Entrez" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "source": [ + "IMPORTANT:\n", + "\n", + "To monitor potential excessive use of their services, NCBI requests you to specify your email address with each request.\n", + "\n", + "With Biopython, you can set it once for your session like this:" + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "metadata": { + "collapsed": true, + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "Entrez.email = 'python@lumc.nl'" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "Now we can make a query of the database.\n", + "\n", + "The Entrez.efetch function returns a file-like handle that instead of pointing to a local file, points to a remote resource." + ] + }, + { + "cell_type": "code", + "execution_count": 77, + "metadata": { + "collapsed": true, + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [], + "source": [ + "efetch_handle = Entrez.efetch(db=\"nucleotide\", id=\"NM_005804\",\n", + " rettype=\"gb\", retmode=\"text\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "We can use the handle as if it were a normal file handle opened with ```open(\"filename\", \"r\")```, and read from it using SeqIO.read()" + ] + }, + { + "cell_type": "code", + "execution_count": 78, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "ID: NM_005804.3\n", + "Name: NM_005804\n", + "Description: Homo sapiens DExD-box helicase 39A (DDX39A), transcript variant 1, mRNA\n", + "Number of features: 25\n", + "/comment=REVIEWED REFSEQ: This record has been curated by NCBI staff. The\n", + "reference sequence was derived from DA432925.1, BC001009.2 and\n", + "BM792110.1.\n", + "This sequence is a reference standard in the RefSeqGene project.\n", + "On Oct 14, 2010 this sequence version replaced gi:21040370.\n", + "Summary: This gene encodes a member of the DEAD box protein family.\n", + "These proteins are characterized by the conserved motif\n", + "Asp-Glu-Ala-Asp (DEAD) and are putative RNA helicases. They are\n", + "implicated in a number of cellular processes involving alteration\n", + "of RNA secondary structure, such as translation initiation, nuclear\n", + "and mitochondrial splicing, and ribosome and spliceosome assembly.\n", + "Based on their distribution patterns, some members of the DEAD box\n", + "protein family are believed to be involved in embryogenesis,\n", + "spermatogenesis, and cellular growth and division. This gene is\n", + "thought to play a role in the prognosis of patients with\n", + "gastrointestinal stromal tumors. A pseudogene of this gene is\n", + "present on chromosome 13. Alternate splicing results in multiple\n", + "transcript variants. Additional alternatively spliced transcript\n", + "variants of this gene have been described, but their full-length\n", + "nature is not known. [provided by RefSeq, Sep 2013].\n", + "Transcript Variant: This variant (1) represents the longer\n", + "transcript.\n", + "Publication Note: This RefSeq record includes a subset of the\n", + "publications that are available for this gene. Please see the Gene\n", + "record to access additional publications.\n", + " SRR1163655.274234.1 [ECO:0000332]\n", + " SAMEA1965299, SAMEA1966682\n", + " [ECO:0000350]\n", + "COMPLETENESS: complete on the 3' end.\n", + "/source=Homo sapiens (human)\n", + "/taxonomy=['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']\n", + "/structured_comment=OrderedDict([('Evidence-Data', OrderedDict([('Transcript exon combination', 'SRR1163655.176131.1,'), ('RNAseq introns', 'mixed/partial sample support')]))])\n", + "/keywords=['RefSeq']\n", + "/references=[Reference(title='The RNA helicase DDX39B and its paralog DDX39A regulate androgen receptor splice variant AR-V7 generation', ...), Reference(title='Identification of DDX39A as a Potential Biomarker for Unfavorable Neuroblastoma Using a Proteomic Approach', ...), Reference(title='Up-regulation of DDX39 in human malignant pleural mesothelioma cell lines compared to normal pleural mesothelial cells', ...), Reference(title='The mRNA-bound proteome and its global occupancy profile on protein-coding transcripts', ...), Reference(title='Clinical proteomics identified ATP-dependent RNA helicase DDX39 as a novel biomarker to predict poor prognosis of patients with gastrointestinal stromal tumor', ...), Reference(title='The closely related RNA helicases, UAP56 and URH49, preferentially form distinct mRNA export machineries and coordinately regulate mitotic progression', ...), Reference(title='Hcc-1 is a novel component of the nuclear matrix with growth inhibitory function', ...), Reference(title='Growth-regulated expression and G0-specific turnover of the mRNA that encodes URH49, a mammalian DExH/D box protein that is highly related to the mRNA export protein UAP56', ...), Reference(title='Analysis of a high-throughput yeast two-hybrid system and its use to predict the function of intracellular proteins encoded within the human MHC class III region', ...), Reference(title='The BAT1 gene in the MHC encodes an evolutionarily conserved putative nuclear RNA helicase of the DEAD family', ...)]\n", + "/accessions=['NM_005804']\n", + "/molecule_type=mRNA\n", + "/data_file_division=PRI\n", + "/date=11-JUN-2017\n", + "/organism=Homo sapiens\n", + "/sequence_version=3\n", + "/topology=linear\n", + "Seq('AGCAGCAGCCCGACGCAAGAGGCAGGAAGCGCAGCAACTCGTGTCTGAGCGCCC...AAA', IUPACAmbiguousDNA())\n" + ] + } + ], + "source": [ + "ncbi_record = SeqIO.read(efetch_handle, 'genbank')\n", + "\n", + "print ncbi_record" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "It is also possible to query for multiple records" + ] + }, + { + "cell_type": "code", + "execution_count": 79, + "metadata": { + "collapsed": true, + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [], + "source": [ + "efetch_handle = Entrez.efetch(db=\"nucleotide\", id=[\"NM_005804\",\"NM_000967\"],\n", + " rettype=\"gb\", retmode=\"text\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "Which can then be iterated over using ```SeqIO.parse```" + ] + }, + { + "cell_type": "code", + "execution_count": 81, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "NM_005804.3 Homo sapiens DExD-box helicase 39A (DDX39A), transcript variant 1, mRNA\n", + "NM_000967.3 Homo sapiens ribosomal protein L3 (RPL3), transcript variant 1, mRNA\n" + ] + } + ], + "source": [ + "for record in SeqIO.parse(efetch_handle, 'genbank'):\n", + " print record.id, record.description" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Remote Tools\n", + "\n", + "It is possible to use Biopyton with remote tools.\n", + "\n", + "For example, we can submit a BLAST search to the NCBI service. ([Documentation here](https://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html))\n", + "\n", + "We will use qblast function in the Bio.Blast.NCBIWWW module to perform a BLAST search using the record we retrieved earlier.\n", + "\n", + "NOTE: It can take some time for the search results to become available" + ] + }, + { + "cell_type": "code", + "execution_count": 89, + "metadata": { + "collapsed": true, + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [], + "source": [ + "from Bio.Blast.NCBIWWW import qblast\n", + "blast_handle = qblast('blastn', 'refseq_mrna', ncbi_record.seq)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "We can the read from the file handle using the ```Bio.SearchIO``` module." + ] + }, + { + "cell_type": "code", + "execution_count": 90, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Program: blastn (2.7.0+)\n", + " Query: No (1558)\n", + " definition line\n", + " Target: refseq_mrna\n", + " Hits: ---- ----- ----------------------------------------------------------\n", + " # # HSP ID + description\n", + " ---- ----- ----------------------------------------------------------\n", + " 0 1 gi|308522777|ref|NM_005804.3| Homo sapiens DExD-box he...\n", + " 1 1 gi|1034056594|ref|XM_016946961.1| PREDICTED: Pan trogl...\n", + " 2 1 gi|1034130164|ref|XM_016935303.1| PREDICTED: Pan trogl...\n", + " 3 1 gi|675689963|ref|XM_003807080.2| PREDICTED: Pan panisc...\n", + " 4 1 gi|1099186172|ref|XM_004060164.2| PREDICTED: Gorilla g...\n", + " 5 1 gi|686757516|ref|XM_002828787.3| PREDICTED: Pongo abel...\n", + " 6 1 gi|795239725|ref|XM_011953207.1| PREDICTED: Colobus an...\n", + " 7 1 gi|1059109912|ref|XM_017851234.1| PREDICTED: Rhinopith...\n", + " 8 1 gi|724815869|ref|XM_010361572.1| PREDICTED: Rhinopithe...\n", + " 9 1 gi|1220191829|ref|XM_009193704.2| PREDICTED: Papio anu...\n", + " 10 1 gi|982311930|ref|XM_005588244.2| PREDICTED: Macaca fas...\n", + " 11 1 gi|967496221|ref|XM_015123082.1| PREDICTED: Macaca mul...\n", + " 12 1 gi|768000518|ref|XM_011527620.1| PREDICTED: Homo sapie...\n", + " 13 1 gi|635036575|ref|XM_007995524.1| PREDICTED: Chlorocebu...\n", + " 14 1 gi|795271240|ref|XM_011768647.1| PREDICTED: Macaca nem...\n", + " 15 1 gi|795144436|ref|XM_011981779.1| PREDICTED: Mandrillus...\n", + " 16 1 gi|1034130166|ref|XM_016935304.1| PREDICTED: Pan trogl...\n", + " 17 1 gi|1034056596|ref|XM_016946962.1| PREDICTED: Pan trogl...\n", + " 18 1 gi|795433285|ref|XM_012094155.1| PREDICTED: Cercocebus...\n", + " 19 1 gi|795433280|ref|XM_012094154.1| PREDICTED: Cercocebus...\n", + " 20 1 gi|795239720|ref|XM_011953206.1| PREDICTED: Colobus an...\n", + " 21 1 gi|1059109914|ref|XM_017851235.1| PREDICTED: Rhinopith...\n", + " 22 1 gi|1220191830|ref|XM_021930788.1| PREDICTED: Papio anu...\n", + " 23 1 gi|685606530|ref|XM_009193706.1| PREDICTED: Papio anub...\n", + " 24 1 gi|1220191832|ref|XM_017952263.2| PREDICTED: Papio anu...\n", + " 25 1 gi|982311931|ref|XM_005588245.2| PREDICTED: Macaca fas...\n", + " 26 1 gi|967496225|ref|XM_015123084.1| PREDICTED: Macaca mul...\n", + " 27 1 gi|967496223|ref|XM_015123083.1| PREDICTED: Macaca mul...\n", + " 28 1 gi|967496227|ref|XM_015123085.1| PREDICTED: Macaca mul...\n", + " 29 1 gi|795271249|ref|XM_011768705.1| PREDICTED: Macaca nem...\n", + " ~~~\n", + " 47 1 gi|826285426|ref|XM_012641398.1| PREDICTED: Propithecu...\n", + " 48 1 gi|947308602|ref|XM_006161013.2| PREDICTED: Tupaia chi...\n", + " 49 1 gi|1220191833|ref|XM_021930789.1| PREDICTED: Papio anu...\n" + ] + } + ], + "source": [ + "from Bio import SearchIO\n", + "qresult = SearchIO.read(blast_handle, 'blast-xml')\n", + "print qresult" + ] + }, + { + "cell_type": "code", + "execution_count": 92, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Query: No\n", + " definition line\n", + " Hit: gi|308522777|ref|NM_005804.3| (1558)\n", + " Homo sapiens DExD-box helicase 39A (DDX39A), transcript variant 1, mRNA\n", + " HSPs: ---- -------- --------- ------ --------------- ---------------------\n", + " # E-value Bit score Span Query range Hit range\n", + " ---- -------- --------- ------ --------------- ---------------------\n", + " 0 0 2810.93 1558 [0:1558] [0:1558]\n" + ] + } + ], + "source": [ + "print qresult[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 94, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Query: No\n", + " definition line\n", + " Hit: gi|1034056594|ref|XM_016946961.1| (1530)\n", + " PREDICTED: Pan troglodytes ATP-dependent RNA helicase DDX39A (LOC1079...\n", + " HSPs: ---- -------- --------- ------ --------------- ---------------------\n", + " # E-value Bit score Span Query range Hit range\n", + " ---- -------- --------- ------ --------------- ---------------------\n", + " 0 0 2695.52 1519 [0:1519] [11:1530]\n" + ] + } + ], + "source": [ + "print qresult[1]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# That was just an overview\n", + "\n", + "This lesson was just a small taste of what can be done with Biopython.\n", + "\n", + "I strongly recommend looking at the [Biopython 'cookbook'](http://biopython.org/DIST/docs/tutorial/Tutorial.html) to get an idea of the wide range of things that you can do with it." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "source": [ + "The lesson was based on previous material by [Wibowo Arindrarto](mailto://w.arindrarto@lumc.nl) and Martijn Vermaat.\n", + "\n", + "License: [Creative Commons Attribution 3.0 License (CC-by)](http://creativecommons.org/licenses/by/3.0)" + ] + } + ], + "metadata": { + "celltoolbar": "Slideshow", + "kernelspec": { + "display_name": "Python 2", + "language": "python", + "name": "python2" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}