Filtering Taxa from OTU-Table
Introduction
The second most common type of OTU-Table filtering is filtering taxa or OTUs from the OTU-Table. Depending upon the question at hand, you may want to look only at the relative abundances of a particular Genus or Phylum. To do this, you must first create a new table that includes only the taxonomy of interest.
1. Filtering at the OTU-Level
http://qiime.org/scripts/filter_otus_from_otu_table.html
Description
You may want to filter OTUs out of your OTU table instead of samples. This is useful when you have too many OTUs, or when you want a stricter analysis that focuses on the prominent taxa and excludes the rare, low-abundance taxa. The threshold to filter by will vary with the spread of your data, but it is crucial to record the threshold you chose.
Parameters
--input_fp | -i
Input OTU table in .biom format
--output_fp | -o
The name of the output filtered biom file
--min_count_fraction
The minimum fraction of the total observation count in the table that an OTU must reach to be retained.
Note about threshold:
This number is a fraction, not a percent. If you specify 0.0001, this will retain all OTUs that have at least 0.01% total abundance in the table. If you want to retain OTUs with at least 1% total abundance, you must specify 0.01.
- 0.00001 : 0.001%
- 0.0001 : 0.01%
- 0.001 : 0.1%
- 0.01 : 1%
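As a quick check on the conversions listed above, the sketch below turns a fraction into its percent equivalent. The helper name fraction_to_percent is hypothetical and not part of QIIME; it only assumes a POSIX shell with awk available.

```shell
# Hypothetical helper (not a QIIME command): convert a
# --min_count_fraction value into the equivalent percentage.
fraction_to_percent() {
  awk -v f="$1" 'BEGIN { printf "%g%%\n", f * 100 }'
}

# Print the percent equivalent of each threshold listed above.
for f in 0.00001 0.0001 0.001 0.01; do
  echo "$f : $(fraction_to_percent "$f")"
done
```

Running a check like this before filtering can help avoid passing a percent (for example 1) where a fraction (0.01) is expected.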
Command
filter_otus_from_otu_table.py \
-i otu_table.biom \
-o otu_table_n01.biom \
--min_count_fraction 0.0001
2. Filtering at the Taxa-Level
http://qiime.org/scripts/filter_taxa_from_otu_table.html
Description
There may be instances when you would like to analyze your data at one particular taxon or taxonomic level. This can be achieved by using the command filter_taxa_from_otu_table.py. This command allows you to specify a particular taxonomic name/level as an identifier to keep in or remove from the OTU table.
Note:
You must specify the level prefix before the taxonomic name (e.g., p__Firmicutes). This is due to the Greengenes naming convention and the way that full names are stored in the OTU table.
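To illustrate the naming convention, the sketch below builds an identifier from a rank and a name. The helper gg_taxon is hypothetical and not part of QIIME; the prefixes follow the Greengenes-style pattern of a one-letter rank code followed by two underscores.

```shell
# Hypothetical helper (not a QIIME command): build a Greengenes-style
# identifier such as p__Firmicutes from a rank and a taxon name.
gg_taxon() {
  case "$1" in
    kingdom) prefix="k__" ;;
    phylum)  prefix="p__" ;;
    class)   prefix="c__" ;;
    order)   prefix="o__" ;;
    family)  prefix="f__" ;;
    genus)   prefix="g__" ;;
    species) prefix="s__" ;;
    *) echo "unknown rank: $1" >&2; return 1 ;;
  esac
  printf '%s%s\n' "$prefix" "$2"
}

gg_taxon phylum Firmicutes
gg_taxon class Clostridia
```

The resulting strings are the kind of value passed to the -p and -n flags.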
Parameters
--input_otu_table_fp | -i
Input OTU table in .biom format
--output_otu_table_fp | -o
The name of the output filtered biom file
--negative_taxa | -n
The names of the groups you want to REMOVE (separated by commas).
--positive_taxa | -p
The names of the groups you want to KEEP (separated by commas).
2a. Positive Filtering of Taxa
This command will keep only the OTUs assigned to the phyla Bacteroidetes and Firmicutes.
filter_taxa_from_otu_table.py \
-i otu_table.biom \
-o otu_table_only_bacteroidetes_firmicutes.biom \
-p p__Bacteroidetes,p__Firmicutes
2b. Negative Filtering of Taxa
This command will remove the OTUs assigned to the phyla Bacteroidetes and Firmicutes and keep everything else.
filter_taxa_from_otu_table.py \
-i otu_table.biom \
-o otu_table_no_bacteroidetes_firmicutes.biom \
-n p__Bacteroidetes,p__Firmicutes
2c. Advanced Filtering of Taxa
You can combine the -p and -n flags to keep some taxa while removing others. The command below will retain all Firmicutes taxa except for the Clostridia class.
filter_taxa_from_otu_table.py \
-i otu_table.biom \
-o otu_table_all_firmicutes_no_clostridia.biom \
-p p__Firmicutes \
-n c__Clostridia
3. Split an OTU-Table by Phylogenetic Level
http://qiime.org/scripts/split_otu_table_by_taxonomy.html
Description
If you want separate biom files, one for each taxon at a particular level, you can run the command split_otu_table_by_taxonomy.py. This command takes an input biom file and a chosen taxonomic level (e.g., 1 = kingdom, 2 = phylum, 3 = class, 4 = order, 5 = family, 6 = genus, 7 = species) and generates new OTU tables, each corresponding to a different taxon at that level.
For example, if you want a separate OTU table for each Phylum in your data, you can specify -L 2 and the output folder will contain one OTU table per Phylum.
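The level numbering can be sketched as a small lookup. The helper rank_to_level is hypothetical and not part of QIIME; the example only prints the command it would build, so it can be run without QIIME installed. The file and folder names are the placeholders used elsewhere on this page.

```shell
# Hypothetical helper (not a QIIME command): map a rank name to the
# number expected by the -L flag, using the numbering described above.
rank_to_level() {
  case "$1" in
    kingdom) echo 1 ;;
    phylum)  echo 2 ;;
    class)   echo 3 ;;
    order)   echo 4 ;;
    family)  echo 5 ;;
    genus)   echo 6 ;;
    species) echo 7 ;;
    *) echo "unknown rank: $1" >&2; return 1 ;;
  esac
}

# Build (but do not run) the split command for the phylum level.
level=$(rank_to_level phylum)
echo "split_otu_table_by_taxonomy.py -i otu_table.biom -o otu_table_by_level${level} -L ${level}"
```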
Parameters
--input_fp | -i
Input OTU table in .biom format
--output_dir | -o
The name and location of the folder to store all the output biom files, which correspond to different taxa at a particular level.
--level | -L
The taxonomic level at which to split the OTU table (e.g., 2 = phylum, 3 = class).
Command
split_otu_table_by_taxonomy.py \
-i otu_table.biom \
-o otu_table_by_level3 \
-L 3