Changes

Anvar · 6985e74b
--- a/home.md
+++ b/home.md
@@ -14,6 +14,7 @@ Leiden University Medical Center <br>
 1. [Tandem 3' UTR Analysis](#tandem-3-utr-analysis-)
 1. [Sequence Motif Analysis Relative to Acceptor and Donor Sites](#sequence-motif-analysis-relative-to-acceptor-and-donor-sites-)
 1. [RNA Binding Motif Analysis](#rNA-binding-motif-analysis-)
+1. [Open Reading Frame Prediction and Proteomics Data Analysis](#open-reading-frame-prediction-and-proteomics-data-analysis-)
 1. [Scripts]()
 1. [Reported Bugs and Fixes](#reported-bugs-and-fixes-)
 1. [Citation](#citation-)
@@ -133,6 +134,12 @@ We locally ran DREME (version 4.11.4) for each region separately and performed a

 <p align="right"> [`TOP`](#full-length-mrna-sequencing-uncovers-a-widespread-coupling-between-transcription-and-mrna-processing)</p><br>

+---
+## **Open Reading Frame Prediction and Proteomics Data Analysis** <br>
+ORF prediction was performed on the PacBio MCF-7 sequences using [ANGEL](http://www.github.com/Magdoll/ANGEL). Prediction was run on both the PacBio consensus reads and a genome-corrected version of each transcript, and whichever produced the longer ORF was chosen to represent the transcript CDS. The predicted MCF-7 ORF sequences were concatenated with Gencode version 19 and protein sequences representing common mass spectrometry (MS) contaminants, creating a customized FASTA file (i.e., the proteomics search database). The [Morpheus](http://cwenger.github.io/Morpheus/) software (version 131) was used to search the MCF-7 Thermo Raw files obtained from the [Geiger et al.](http://www.mcponline.org/content/11/3/M111.014050.long) study against the custom database. The unknown precursor charge state range was set to +2 to +4. Absolute and relative MS/MS intensity thresholds were disabled. The maximum number of MS/MS peaks was set to 400. Assign charge state was set to true. De-isotoping was disabled. The protease specificity was set to trypsin with no proline rule enabled. Up to 1 missed cleavage was allowed and N-terminal methionine truncation was variable. The fixed modification was carbamidomethylation of cysteines. The variable modification was oxidation of methionines. The precursor mass tolerance was 2.1 Daltons (monoisotopic) and the product mass tolerance was 0.025 Daltons (monoisotopic). Modified forms of the same peptide were collapsed and treated as one peptide identification for calculation of the false discovery rate (FDR). An FDR of 1% was used to filter for final peptide identifications. All identified peptides were categorized as: single-transcript if the peptide matches only one gene with one transcript; sub-gene if the peptide matches a subset of transcripts of only one gene; single-gene if the peptide matches all transcripts of only one gene; and multi-gene if the peptide matches multiple transcripts from multiple genes.<br>
+
+<p align="right"> [`TOP`](#full-length-mrna-sequencing-uncovers-a-widespread-coupling-between-transcription-and-mrna-processing)</p><br>
+
 ---
 ## **Reported Bugs and Fixes** <br>
 So far, we have not received any bug reports! In this section, we will report any future changes to the procedure or the accompanied scripts. Feel free to send in your suggestions and comments for improvement or additional features. <br>
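
The peptide categories defined in the added "Open Reading Frame Prediction and Proteomics Data Analysis" section can be read as a small decision rule. The sketch below is illustrative only and is not taken from the authors' scripts. The function name `categorize_peptide`, its two dictionary inputs, and the example gene and transcript identifiers are all hypothetical. It also assumes that "one gene with one transcript" means the matched gene has exactly one annotated transcript; the text could be read differently.

```python
from typing import Dict, Set


def categorize_peptide(
    matched: Dict[str, Set[str]],
    gene_transcripts: Dict[str, Set[str]],
) -> str:
    """Assign a peptide to one of the four categories described in the text.

    matched: gene ID -> transcripts of that gene the peptide matches (hypothetical input).
    gene_transcripts: gene ID -> all transcripts known for that gene (hypothetical input).
    """
    if not matched or any(not hits for hits in matched.values()):
        raise ValueError("peptide must match at least one transcript per listed gene")

    # Matches spread over more than one gene are always multi-gene.
    if len(matched) > 1:
        return "multi-gene"

    (gene, hits), = matched.items()
    all_tx = gene_transcripts[gene]
    if not hits <= all_tx:
        raise ValueError("matched transcripts are not all annotated for the gene")

    # Assumption: single-transcript means the gene has exactly one transcript.
    if len(all_tx) == 1:
        return "single-transcript"
    # The peptide is shared by every transcript of the gene.
    if hits == all_tx:
        return "single-gene"
    # The peptide is found in a proper subset of the gene's transcripts.
    return "sub-gene"


# Hypothetical annotation used only for illustration.
example_annotation = {
    "geneA": {"txA1"},
    "geneB": {"txB1", "txB2", "txB3"},
}
example_category = categorize_peptide({"geneB": {"txB1", "txB2"}}, example_annotation)
```

Under these assumptions, a peptide found in two of the three `geneB` transcripts is classified as sub-gene, which makes it informative about isoform usage even though it does not pinpoint a single transcript.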