Alpha Diversity Analysis

A. Alpha Rarefaction




There exists a workflow script that runs the commands needed to generate alpha rarefaction plots, performing all the necessary steps for calculating, processing and plotting the data. The script is composed of 4 different steps/scripts that are run in order to produce the final figure (a parameters file can be used to customize each of these steps):

  1. (http:///scripts/single_rarefaction.html)
  2. (http:///scripts/alpha_rarefaction.html)
  3. (http:///scripts/alpha_rarefaction.html)
  4. (http:///scripts/make_rarefaction_plots.html)


--otu_table_fp | -i

Input OTU table in .biom format

--output_dir | -o

The name and location of the output folder. The folder will contain a (.html) file which is an interactive document for viewing your data. Additionally there is a folder with (.png) files of the rarefaction graphs to use for presenting the data.

--mapping_fp | -m

The mapping file that corresponds to the input OTU table.

--tree_fp | -t

The phylogenetic tree file, which ends in (.tre). If you do not include it, you will get an error unless you have customized a parameters file to exclude PD Whole Tree.

--max_rare_depth | -e

The upper limit of rarefaction depths. Use the command biom summarize-table to determine the mean/median/minimum depth for all samples. To include all samples, choose the minimum sequencing depth.

--parameter_fp | -p (optional)

The parameters file used to customize the workflow.

Command \
-i otu_table.biom \
-o alpha_output_folder \
-m mapping_file.txt \
-t rep_tree.tre
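To choose a value for --max_rare_depth (-e) as described above, the minimum depth can be read from the text summary that biom summarize-table writes. The following is a minimal sketch, not part of QIIME: `min_depth` is a hypothetical helper, the file name otu_table_summary.txt is an assumption, and the parsing assumes the summary contains a line of the form " Min: 146.0" in its counts-per-sample section.

```shell
# Produce the summary first (requires the biom command-line tool):
#   biom summarize-table -i otu_table.biom -o otu_table_summary.txt

# min_depth is a hypothetical helper (not a QIIME or biom command).
# It prints the first "Min:" value found in the summary as an integer.
min_depth() {
    awk '/^[[:space:]]*Min:/ { printf "%d\n", $2; exit }' "$1"
}

# Print the minimum depth if the summary exists; use it as the -e value.
if [ -f otu_table_summary.txt ]; then
    min_depth otu_table_summary.txt
fi
```

Choosing the minimum keeps every sample in the rarefaction; a higher value drops samples that fall below it.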

Note: If you want to include more metrics in your analysis, you must provide a parameters file. By default, the workflow only estimates alpha diversity using the following metrics: Observed Species, Chao1 and Phylogenetic Diversity (PD) Whole Tree. The command below generates a new parameters file that adds more metrics to the analysis. After creating this text file, you must pass it to the command with the -p option. See this link for more detailed information about parameters:

# This command will generate a new text file in the current working directory.
echo "alpha_diversity:metrics observed_otus,shannon,simpson,PD_whole_tree,chao1" >> alpha_parameters.txt
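Because `>>` appends, running the command above more than once leaves duplicate lines in the parameters file. A minimal sketch of one way to avoid that is shown here; `add_param_line` is a hypothetical helper, and the metric names are assumed to match the collated file names used later in this guide (for example PD_whole_tree.txt and observed_otus.txt).

```shell
# add_param_line is a hypothetical helper (not part of QIIME).
# It appends a line to a parameters file only if that exact line is absent.
add_param_line() {
    file="$1"
    line="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

PARAMS_FILE=alpha_parameters.txt
# Metric names are an assumption; adjust them to your installed version.
METRICS_LINE="alpha_diversity:metrics observed_otus,shannon,simpson,PD_whole_tree,chao1"

add_param_line "$PARAMS_FILE" "$METRICS_LINE"
cat "$PARAMS_FILE"
```

Running this block repeatedly leaves a single metrics line in the file, which can then be passed with -p.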

B. Calculating Alpha Diversity Significance



After generating the plots from the workflow, you can run statistics on each metric. Using the files generated by the workflow, you can generate p-values for each alpha diversity metric. The command must be run on each metric separately: the values for each metric are stored in individual text files, so the command must be run on each text file to generate the p-values for any given comparison.


--alpha_diversity_fp | -i

Input alpha diversity metric file. If the rarefaction workflow was run, the file will be located in the alpha_div_collated/ folder. If it was not run, use the output file from running the alpha diversity steps individually.

--output_dir | -o

The name and location of the output folder.

--mapping_fp | -m

The mapping file that corresponds to the input alpha diversity estimates.

--categories | -c

The column variable name for the statistical comparison. If you have 3 different groups within the selected column variable, the script will perform statistics on all pairwise combinations of groups.

--test_type | -t

The test type used for the comparison. The default is the nonparametric test, which performs a t-test with Monte Carlo simulations.


# PD Whole Tree Significance Calculation \
-i alpha_output/alpha_div_collated/PD_whole_tree.txt \
-o alpha_pdwholetree_stats \
-m mapping_file.txt \
-t nonparametric \
-c SampleType

# Chao1 Significance Calculation \
-i alpha_output/alpha_div_collated/chao1.txt \
-o alpha_chao1_stats \
-m mapping_file.txt \
-t nonparametric \
-c SampleType

# Observed OTUs Significance Calculation \
-i alpha_output/alpha_div_collated/observed_otus.txt \
-o alpha_observed_otus_stats \
-m mapping_file.txt \
-t nonparametric \
-c SampleType
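Since the same command is repeated for each metric file, the three calls above can be sketched as a loop. This is only an illustration: `SIGNIFICANCE_CMD` is a placeholder for the significance script used in the commands above, `compare_metric` and `DRY_RUN` are hypothetical names, and the output folder pattern alpha_<metric>_stats is an assumption that differs slightly from the folder names used above. With `DRY_RUN=1` (the default here) the commands are printed rather than run.

```shell
# Placeholder: set SIGNIFICANCE_CMD to the significance script you use.
SIGNIFICANCE_CMD="${SIGNIFICANCE_CMD:-significance_script_placeholder}"
# DRY_RUN=1 prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# compare_metric is a hypothetical helper: one call per metric file.
compare_metric() {
    metric="$1"
    set -- "$SIGNIFICANCE_CMD" \
        -i "alpha_output/alpha_div_collated/${metric}.txt" \
        -o "alpha_${metric}_stats" \
        -m mapping_file.txt \
        -t nonparametric \
        -c SampleType
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

for metric in PD_whole_tree chao1 observed_otus; do
    compare_metric "$metric"
done
```

Checking the printed commands before setting `DRY_RUN=0` makes it easy to confirm that each input file and output folder is the one intended.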

C. Add Alpha Metric to Mapping File



Adding an alpha diversity metric to your mapping file can be very useful. This script takes one of the alpha diversity output files and adds the metric value as a column to a given mapping file. Once the metric is available in the mapping file, you can use it as a continuous variable when performing correlation testing, beta diversity or other statistical analyses. Because there is a single measurement for each sample in the .biom file, you can relate alpha diversity to many different aspects of your data.


--alpha_fps | -i

Estimated alpha diversity metrics. If the rarefaction workflow was run, the file will be located in the alpha_div_collated/ folder. If it was not run, use the output text file from running the alpha diversity steps individually.

--output_mapping_fp | -o

New mapping file name with alpha metrics added as new columns.

--mapping_fp | -m

The mapping file to which the estimated alpha metrics will be added.

Command \
-i alpha_div_collated/PD_whole_tree.txt \
-m mapping_file.txt \
-o mapping_file_with_alpha.txt
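After the command has run, it can be useful to confirm which columns were added. The sketch below compares the header rows of the original and new mapping files; `header_fields` and `new_columns` are hypothetical helpers, and the approach assumes both files are tab-delimited with the column names on the first line.

```shell
# header_fields prints one column name per line from a tab-delimited file.
header_fields() {
    head -n 1 "$1" | tr '\t' '\n'
}

# new_columns prints column names present in the second file but not the first.
new_columns() {
    old_cols=$(mktemp) || return 1
    new_cols=$(mktemp) || return 1
    header_fields "$1" | sort > "$old_cols"
    header_fields "$2" | sort > "$new_cols"
    comm -13 "$old_cols" "$new_cols"
    rm -f "$old_cols" "$new_cols"
}

# Only run the comparison when both files are present.
if [ -f mapping_file.txt ] && [ -f mapping_file_with_alpha.txt ]; then
    new_columns mapping_file.txt mapping_file_with_alpha.txt
fi
```

The names printed are the columns you can then select as continuous variables in later analyses.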

D. Exporting Data for use with PRISM

By default, the script generates its plots with the Python package matplotlib; these plots cannot be customized or edited in Illustrator. If you own the program PRISM (http:///scientific-software/prism/), you can have full customization over the rarefaction plots. Additionally, you can generate longitudinal line/boxplot figures by copying raw alpha diversity measurements into PRISM. Below we walk through the steps of exporting alpha rarefaction data into PRISM.

Step 1: Run alpha rarefaction script with custom parameters
# Run alpha diversity \
-i otu_table.biom \
-o alpha_output_folder \
-m mapping_file.txt \
-t rep_tree.tre \
-p parameters.txt
Step 2: Find raw data table text file
alpha_output_folder/
└── alpha_rarefaction_plots/
    └── average_tables/
        └── observed_otus_Treatment.txt
Step 3: Copy into PRISM 6
Step 4: Customize!
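To see every raw data table available for Step 2 at once, the average_tables folder can be listed from the command line. This is a small sketch assuming the folder layout shown in Step 2; `list_average_tables` is a hypothetical helper, and alpha_output_folder is the output folder name used in Step 1.

```shell
# list_average_tables is a hypothetical helper (not part of QIIME).
# It prints the .txt tables under <output folder>/alpha_rarefaction_plots/average_tables.
list_average_tables() {
    tables_dir="${1:-alpha_output_folder}/alpha_rarefaction_plots/average_tables"
    if [ -d "$tables_dir" ]; then
        find "$tables_dir" -type f -name '*.txt' | sort
    fi
}

list_average_tables alpha_output_folder
```

Each listed file can be opened in a text editor or spreadsheet and its contents copied into PRISM.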
