annotate maaslin-4450aa4ecc84/doc/Merge_Metadata_Read_Me.txt @ 1:a87d5a5f2776
Uploaded the version running on the prod server
author | george-weingart |
---|---|
date | Sun, 08 Feb 2015 23:08:38 -0500 |
I. Quick start

The merge_metadata.py script is included in the MaAsLin package to help add metadata to OTU tables (or to any tab-delimited file in which the columns are samples). This script was used to make the maaslin_demo.pcl file found in this project.

The generic command to run merge_metadata.py is:
python merge_metadata.py input_metadata_file < input_measurements_file > output_pcl_file

Examples of the expected files are found in this project in the directory maaslin/input/for_merge_metadata
An example of how to run the command on the example files is as follows (when in the maaslin folder in a terminal):
python src/merge_metadata.py input/for_merge_metadata/maaslin_demo_metadata.metadata < input/for_merge_metadata/maaslin_demo_measurements.pcl > input/maaslin_demo.pcl

II. Script overview
merge_metadata.py takes a tab-delimited metadata file and adds it to an OTU table. Both files have expected formats, given below. Additionally, if a pipe-delimited consensus lineage is given in the OTU IDs (for instance, for the genus Bifidobacterium: Bacteria|Actinobacteria|Actinobacteria|Bifidobacteriales|Bifidobacteriaceae|Bifidobacterium), the abundances of OTUs that share a higher-level clade are added together, generating all higher-level clade information captured in the OTU data*. This hierarchy is then normalized using the same hierarchical structure. As a result, after running the script a sample will sum to more than 1, typically somewhere around 6, depending on whether your data is originally at the genus, species, or another level of resolution. All terminal OTUs (the original OTUs) in a sample should sum to 1.

*To help combat multiple comparisons, additional clades are added only if they add information to the data set. For example, if you have an OTU Bacteria|Actinobacteria|Actinobacteria|Bifidobacteriales|Bifidobacteriaceae|Bifidobacterium and no other related OTUs until Bacteria|Actinobacteria|Actinobacteria|Bifidobacteriales, then Bacteria|Actinobacteria|Actinobacteria|Bifidobacteriales|Bifidobacteriaceae will not be added to the data set, because it would be no different from the already existing and more specific Bacteria|Actinobacteria|Actinobacteria|Bifidobacteriales|Bifidobacteriaceae|Bifidobacterium OTU. Clades at and above Bacteria|Actinobacteria|Actinobacteria|Bifidobacteriales will be included depending on whether there are other OTUs to add to them at those clade levels.
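
The clade summing and the "only if it adds information" rule described above can be illustrated with a short Python sketch. This is not the code of merge_metadata.py; it is one plausible reading of the rule, under these assumptions: a parent clade is kept only when at least two different child branches feed into it, the data is a single sample with a positive total, and no OTU lineage is a prefix of another. The function name add_parent_clades and the example counts are made up for illustration.

```python
from collections import defaultdict


def add_parent_clades(otu_counts):
    """Return normalized abundances for the OTUs plus informative parent clades.

    otu_counts maps a pipe-delimited lineage to that OTU's count in one sample.
    Assumes the counts sum to a positive number.
    """
    clade_sums = defaultdict(float)  # summed counts for every proper prefix
    children = defaultdict(set)      # direct child branches seen under each prefix
    for lineage, count in otu_counts.items():
        levels = lineage.split("|")
        for depth in range(1, len(levels)):
            parent = "|".join(levels[:depth])
            clade_sums[parent] += count
            children[parent].add(levels[depth])

    # The original (terminal) OTUs define the total, so they sum to 1.
    total = sum(otu_counts.values())
    result = {lineage: count / total for lineage, count in otu_counts.items()}
    for clade, clade_sum in clade_sums.items():
        # A parent with a single child branch duplicates that child: skip it.
        # Rows that are already original OTUs are left untouched.
        if len(children[clade]) > 1 and clade not in otu_counts:
            result[clade] = clade_sum / total
    return result


# Hypothetical counts for one sample.
sample = {
    "Bacteria|Actinobacteria|Actinobacteria|Bifidobacteriales|Bifidobacteriaceae|Bifidobacterium": 2,
    "Bacteria|Actinobacteria|Actinobacteria|Actinomycetales|Actinomycetaceae|Actinomyces": 1,
    "Bacteria|Firmicutes|Clostridia|Clostridiales|Lachnospiraceae|Roseburia": 1,
}
merged = add_parent_clades(sample)
for clade, value in merged.items():
    print(clade, value)
print(sum(merged.values()))  # 2.75
```

In this example only two parent clades are added: Bacteria (1.0) and the class-level Bacteria|Actinobacteria|Actinobacteria (0.75). The Bifidobacteriaceae and Bifidobacteriales clades are skipped because each would be identical to the Bifidobacterium row, so the sample sums to 2.75 while the three original OTUs still sum to 1.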

III. Description of input files

Metadata file:
Please make the file as follows:
1. Tab delimited.
2. Rows are samples, columns are metadata.
3. Sample IDs in the metadata file should match the sample IDs in the OTU table.
4. Use NA for values that are not recorded.
5. An example file is found at input/for_merge_metadata/maaslin_demo_metadata.metadata
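
A minimal Python sketch of reading a file laid out this way may help when checking your own metadata. It assumes, beyond what is stated above, that the first row is a header and that the first column holds the sample IDs. The function name read_metadata and the column names in the example are invented for illustration.

```python
import csv
import io


def read_metadata(handle):
    """Parse a tab-delimited metadata table into {sample_id: {column: value}}.

    Assumes the first row holds column names and the first column holds
    sample IDs. Cells equal to "NA" are returned as None.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(
                f"row for {row[0]!r} has {len(row)} fields, expected {len(header)}"
            )
        table[row[0]] = {
            name: (None if value == "NA" else value)
            for name, value in zip(header[1:], row[1:])
        }
    return table


# Hypothetical two-sample file; "age" and "sex" are made-up columns.
example = "sample\tage\tsex\nS1\t34\tF\nS2\tNA\tM\n"
metadata = read_metadata(io.StringIO(example))
print(metadata["S2"])  # {'age': None, 'sex': 'M'}
```

To check a real file, pass an open file handle instead of the io.StringIO object.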

OTU table:
Please make the file as follows:
1. Tab delimited.
2. Rows are OTUs, columns are samples (note that this is transposed in comparison to the metadata file).
3. If a consensus lineage is included in the OTU name, use pipes as the delimiter.
4. An example file is found at input/for_merge_metadata/maaslin_demo_measurements.pcl
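
Before merging, it can be useful to confirm that the OTU table follows these rules. The Python sketch below is only an illustration, under the assumptions that the first row is a header whose first cell labels the OTU column and whose remaining cells are sample IDs. The semicolon check is a simple heuristic for lineages written with a delimiter other than the pipe. The function name check_otu_table and the example data are hypothetical.

```python
import io


def check_otu_table(handle, metadata_sample_ids):
    """Return (missing_samples, bad_otu_ids) for a tab-delimited OTU table.

    missing_samples: sample IDs in the table header that are absent from
    metadata_sample_ids. bad_otu_ids: OTU IDs containing a semicolon, which
    suggests a lineage that is not pipe-delimited (heuristic only).
    """
    header = handle.readline().rstrip("\r\n").split("\t")
    known = set(metadata_sample_ids)
    missing = [sample for sample in header[1:] if sample not in known]

    bad_ids = []
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        otu_id = line.split("\t", 1)[0]
        if ";" in otu_id:
            bad_ids.append(otu_id)
    return missing, bad_ids


# Hypothetical table with three samples and two OTUs.
table = (
    "ID\tS1\tS2\tS3\n"
    "Bacteria|Firmicutes\t0.6\t0.5\t0.1\n"
    "Bacteria;Actinobacteria\t0.4\t0.5\t0.9\n"
)
missing, bad_ids = check_otu_table(io.StringIO(table), ["S1", "S2"])
print(missing)  # ['S3']
print(bad_ids)  # ['Bacteria;Actinobacteria']
```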