"[Michiel van Galen](mailto:m.van_galen@lumc.nl), [Jeroen Laros](mailto:j.f.j.laros@lumc.nl), [Martijn Vermaat](mailto:m.vermaat.hg@lumc.nl), [Way Yi Leung](mailto:w.y.leung@lumc.nl)\n",
"\n",
"[Department of Human Genetics, Leiden University Medical Center](http://humgen.nl)\n",
"\n",
"[Sequencing Analysis Support Core, Leiden University Medical Center](http://sasc.lumc.nl)\n",
"\"High-profile journals have called for increased openness in computational sciences. Some prestigious journals, including Science, have even started to demand of authors to provide the source code for simulation software used in publications to readers upon request.\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Reproducible Research in Computational Science, Roger D. Peng, Science 334, 1226 (2011).\n",
"- Shining Light into Black Boxes, A. Morin et al., Science 336, 159-160 (2012).\n",
"- The case for open computer programs, D.C. Ince, Nature 482, 485 (2012)."
]
},
{
"cell_type": "heading",
"level": 4,
"metadata": {},
"source": [
"The cornerstone of the scientific method:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Replication & Reproduction"
]
},
{
"cell_type": "heading",
"level": 4,
"metadata": {},
"source": [
"To achieve this in scientific computing (programming):"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Any source code which generates data should be:\n",
" - Tracked!\n",
" - Backed up and secured\n",
" - Ideally published online\n",
" - Additionally: also track external software versions and settings\n",
"- Export your notebooks to PDF or HTML (nbconvert)\n",
"- Share notebooks easily with nbviewer"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Why Notebook?"
]
},
{
"cell_type": "heading",
"level": 4,
"metadata": {},
"source": [
"\"Web-based interactive computational environment where you can combine code execution, text, mathematics, plots and rich media into a single document.\""
]
},
{
"cell_type": "heading",
"level": 4,
"metadata": {},
"source": [
"\"It is based on the IPython shell, but provides a cell-based environment with great interactivity, where calculations can be organized and documented in a structured way.\""
]
},
{
"cell_type": "heading",
"level": 3,
"metadata": {},
"source": [
"<br></br><br></br><br></br><br></br>\n",
"<font color='darkgreen'>Track, store and share elegant code using IPython Notebooks & Git!</font>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's add some markdown cells to the notebook you created earlier:\n",
" - Select the top code cell\n",
" - Press ESC to go into command mode\n",
" - Press 'a'; this will add a cell above the selected cell\n",
" - Notice the focus is on the new cell\n",
" - Now press 'm' to set the cell type to 'Markdown'\n",
" - Press ENTER and add some markdown code (see the example below)\n",
" - Run the cell and see if it worked"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Example markdown code:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### My next notebook\n",
"\n",
"This notebook is my first notebook with some experimental code.\n",
"\n",
"Author: [Michiel van Galen](mailto:m.van_galen@lumc.nl)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, try to add more Python code to the notebook:\n",
" - Make sure to set the cell type to 'Code'\n",
" - Add the 'reverse' function to your notebook as shown below"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def reverse(seq):\n",
"    \"\"\"Return the sequence read from back to front.\"\"\"\n",
"    return seq[::-1]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 42
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Try using both the 'reverse' and 'translate' functions."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"reverse('AACGT')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 45,
"text": [
"'TGCAA'"
]
}
],
"prompt_number": 45
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"translate(_)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 46,
"text": [
"'ACGTT'"
]
}
],
"prompt_number": 46
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"\u00a7 Exercise : My first Notebook (3)\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A palindromic sequence is a nucleic acid sequence (DNA or RNA) that is the same whether read 5' (five-prime) to 3' (three-prime) on one strand or 5' to 3' on the complementary strand with which it forms a double helix. Palindromic sequences play an important role in molecular biology:\n",