Commit 0c21158a authored by Beatrice Tan

Updated README.md with correct info.

parent 26f20c5d
@@ -6,7 +6,7 @@ High-throughput technologies including WGS and WES generate a large amount of da
## Getting Started
This prioritization pipeline can be run by following several steps:
-1. Download the installer for Miniconda: https://conda.io/miniconda.html
+1. Download the installer for Miniconda: https://conda.io/miniconda.html (or Anaconda if preferred)
2. Run the Miniconda installer:
```
@@ -14,21 +14,25 @@ bash Miniconda3-latest-Linux-x86_64.sh
```
3. Clone the pipeline repository:
```
-git clone https://gitlab.com/BeatriceTan/PrioritizationCNAs.git
-cd PrioritizationCNAs
+git clone https://git.lumc.nl/bftan/CNAprioritization.git
+cd CNAprioritization
```
4. Configure your settings in config.yaml:
-- Set workdir to a directory where all output should be saved.
-- Set gisticdir to a directory where GISTIC2 should be installed or where GISTIC2 has already been installed.
-- Leave other settings as default to use test data <i>or</i> provide your own input files and customize settings:
-- Set cancer_type and date_data to download Firehose data or provide own input file.
-- Provide the remainder files related to the tumor type of the input file.
-6. Run the pipeline shell script, which will create the required conda environment and run snakemake (recommended when snakemake is not installed yet). Make sure the paths in the shell script are configured correctly.:
+- Set workdir to the directory where the output should be saved.
+- Set gisticdir to the directory where GISTIC2 should be installed or where GISTIC2 has already been installed.
+- There are three options for input data:
+- Leave all settings as default to use test dataset on SKCM tumor samples.
+- Download Firehose data for tumor type of interest:
+- Set cancer_type to tumor type abbreviation (see: https://gdac.broadinstitute.org/)
+- Set date_data to '2016_01_28' to download the latest dataset or an older dataset (see: http://gdac.broadinstitute.org/runs/info/analyses__runs_list.html)
+- Use your own dataset as input by specifying "input_file".
+- The settings for running GISTIC2.0 and for benchmarking can be modified if necessary.
+6. Run the pipeline shell script, which will create the required conda environment and run snakemake (recommended when snakemake is not installed yet):
```
bash run_pipeline.sh
```
-Or manually install the required packages and run snakemake:
+Or manually install the required packages and run snakemake (recommended when snakemake is already installed):
```
conda install -c bioconda snakemake=4.4.0
conda install -c conda-forge matplotlib-venn=0.11.5
......