# SamplesTsvToConfig

This tool creates a full sample sheet in JSON or YAML format, suitable for all our Queue pipelines, from one or more TSV files.
The tool can be called as follows:

~~~ bash
biopet tool SamplesTsvToConfig
~~~

To open the help:

~~~ bash
biopet tool SamplesTsvToConfig -h
Usage: SamplesTsvToJson [options]

  -l <value> | --log_level <value>
        Log level
  -h | --help
        Print usage
  -v | --version
        Print version
  -i <file> | --inputFiles <file>
        Input must be a tsv file, first line is seen as header and must at least have a 'sample' column, 'library' column is optional, multiple files allowed
  -t <file> | --tagFiles <file>

  -o <file> | --outputFile <file>
        When extension is .yml or .yaml output is in yaml format, otherwise in json. When not given output goes to stdout as yaml.
~~~

A user provides a tab-separated values (TSV) file with sample-specific properties, which the tool parses into JSON or YAML format.
For example, suppose a user wants to add certain properties to the description of a sample, such as the treatment a sample received. In that case, a TSV file with an extra column called `treatment` is provided.
The resulting file will also contain the `treatment` property. The order of the columns does not affect the result.

Tag files work the same way, except that their values are placed under the key `tags`.
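
As a rough illustration of the behaviour described above, the following Python sketch nests one parsed TSV row under its sample, either directly (input file) or one level deeper under `tags` (tag file). The helper name `add_properties`, the `batch` column, and the exact placement of `tags` are assumptions made for illustration; this is not the tool's actual implementation.

```python
def add_properties(config, row, as_tag=False):
    """Merge one TSV row (a dict of column -> value) into a nested config dict.

    Hypothetical helper that mirrors the description above, not Biopet's code.
    """
    sample = config.setdefault("samples", {}).setdefault(row["sample"], {})
    # Assumption: values from a tag file are nested under the key 'tags'.
    target = sample.setdefault("tags", {}) if as_tag else sample
    for column, value in row.items():
        if column != "sample":  # the 'sample' column is the key, not a property
            target[column] = value
    return config


config = {}
# Row from an input file: the property is attached to the sample itself.
add_properties(config, {"sample": "Sample_ID_1", "treatment": "heatshock"})
# Row from a tag file ('batch' is a made-up column): stored under 'tags'.
add_properties(config, {"batch": "A", "sample": "Sample_ID_1"}, as_tag=True)
print(config)
```

Because each row is handled as a mapping from column name to value, the position of a column in the file plays no role in the result.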

#### Sample definition

To produce the output shown in the Example section below, provide two TSV files as follows:

----

| sample        | library | bam         |
| -------       | ------- | ---------   |
|Sample_ID_1    |Lib_ID_1 |MyFirst.bam  |
|Sample_ID_2    |Lib_ID_2 |MySecond.bam |

----

#### Library definition

The second TSV file can contain as many properties as you like. Possible options include gender, age and family.
Basically anything you want to pass to your pipeline is possible.

----

| sample      | treatment |
| ----------- | --------- |
| Sample_ID_1 | heatshock |
| Sample_ID_2 | heatshock |

#### Example

###### YAML

~~~ yaml
samples:
  Sample_ID_1:
    treatment: heatshock
    libraries:
      Lib_ID_1:
        bam: MyFirst.bam
  Sample_ID_2:
    treatment: heatshock
    libraries:
      Lib_ID_2:
        bam: MySecond.bam
~~~

###### JSON

~~~ json
{
  "samples" : {
    "Sample_ID_1" : {
      "treatment" : "heatshock",
      "libraries" : {
        "Lib_ID_1" : {
          "bam" : "MyFirst.bam"
        }
      }
    },
    "Sample_ID_2" : {
      "treatment" : "heatshock",
      "libraries" : {
        "Lib_ID_2" : {
          "bam" : "MySecond.bam"
        }
      }
    }
  }
}
~~~
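
To make the mapping from the two TSV files to this nested structure concrete, here is a minimal Python sketch. It assumes that columns from a row with a `library` value are attached to that library, and that all other columns are attached to the sample, as the example suggests. The function name `tsv_to_config` and the inline TSV strings are illustrative only; the real tool may handle edge cases differently.

```python
import csv
import io
import json

# The two example tables from above, written as tab-separated text.
SAMPLES_TSV = (
    "sample\tlibrary\tbam\n"
    "Sample_ID_1\tLib_ID_1\tMyFirst.bam\n"
    "Sample_ID_2\tLib_ID_2\tMySecond.bam\n"
)
TREATMENT_TSV = (
    "sample\ttreatment\n"
    "Sample_ID_1\theatshock\n"
    "Sample_ID_2\theatshock\n"
)


def tsv_to_config(*tsv_texts):
    """Merge TSV texts into a nested samples/libraries dict (illustrative sketch)."""
    config = {"samples": {}}
    for text in tsv_texts:
        # The first line is the header; DictReader keys each row by column name.
        for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
            sample = config["samples"].setdefault(row["sample"], {})
            target = sample
            if row.get("library"):
                # Assumption: with a library given, properties belong to that library.
                libraries = sample.setdefault("libraries", {})
                target = libraries.setdefault(row["library"], {})
            for column, value in row.items():
                if column not in ("sample", "library"):
                    target[column] = value
    return config


config = tsv_to_config(SAMPLES_TSV, TREATMENT_TSV)
print(json.dumps(config, indent=2))
```

The printed JSON has the same nesting as the example above, although key order and whitespace may differ.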