Pathway Analysis with HUMAnN2
http://huttenhower.sph.harvard.edu/humann2
Description
HUMAnN2 is a program released by the Huttenhower group that calculates microbial gene and pathway abundances from whole-genome shotgun sequencing data.
This tool can also be used on legacy PICRUSt data to reconstruct KEGG pathways (ko) from KEGG genes (KO). First we need to install HUMAnN2 and download HUMAnN (the previous version) before processing our 16S data.
Step 1. Install HUMAnN2 using pip.
pip install humann2
Step 2. Download the HUMAnN package and decompress it. You will only need one file from this folder (humann-0.99/data/keggc). Run the commands below, or manually download and decompress: https://bitbucket.org/biobakery/humann/downloads/humann-v0.99.tar.gz
# Download to HOME folder
wget -P ~/ https://bitbucket.org/biobakery/humann/downloads/humann-v0.99.tar.gz
# Decompress the archive into the HOME folder
tar -zxvf ~/humann-v0.99.tar.gz -C ~/
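Before moving on, it can be worth confirming that the one file needed from the archive is where the later humann2 command expects it. The following is a minimal sketch, assuming the archive unpacked to ~/humann-0.99; `require_file` is a hypothetical helper written for this check, not part of HUMAnN or HUMAnN2.

```shell
# Hypothetical helper: succeed only if the given path is a non-empty regular file.
require_file() {
    if [ -f "$1" ] && [ -s "$1" ]; then
        echo "found: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Assumed location of the legacy KEGG pathway file; adjust if tar unpacked elsewhere.
require_file "$HOME/humann-0.99/data/keggc" || echo "check where the archive was unpacked"
```

If the file is reported missing, look for the unpacked folder in the directory where tar was run.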
Before running HUMAnN2, make new folders to store intermediate and final files.
mkdir split_files # Store intermediate files
mkdir humann2_out # Store files generated by HUMAnN2
mkdir humann2_tables # Store final tables
Step 3. Split the PICRUSt metagenomic predictions table (.biom) into individual biom files.
humann2_split_table \
-i metagenomic_predictions.biom \
-o split_files
Step 4. Loop over each of the files and run HUMAnN2 to produce pathway abundance files
for biom in split_files/*.biom
do
humann2 \
--input "$biom" \
--output humann2_out \
--pathways-database ~/humann-0.99/data/keggc
done
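After the loop finishes, a quick check that every split file produced a pathway abundance file can catch samples that failed silently. This is a sketch only: it assumes HUMAnN2 names its output <sample>_pathabundance.tsv, with <sample> taken from the input file name, and `list_missing_outputs` is a hypothetical helper, not a HUMAnN2 command.

```shell
# Hypothetical helper: print the sample name of every split file that has no
# matching, non-empty pathway abundance output.
# Assumption: outputs are named <sample>_pathabundance.tsv.
list_missing_outputs() {
    in_dir=$1
    out_dir=$2
    for biom in "$in_dir"/*.biom; do
        [ -e "$biom" ] || continue    # the glob matched nothing
        sample=$(basename "$biom" .biom)
        if [ ! -s "$out_dir/${sample}_pathabundance.tsv" ]; then
            echo "$sample"
        fi
    done
}

# Prints nothing when every sample has an output file.
list_missing_outputs split_files humann2_out
```

Any sample names printed here can be re-run individually before joining the tables.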
Step 5. Join all the HUMAnN2 output tables together into one table per output type. Two files are created: a pathway abundance table and a pathway coverage table. For further analysis, only the pathway abundance table will be used.
# Join files of pathway abundance
humann2_join_tables \
--input humann2_out/ \
--output humann2_tables/humann2_pathabundance.txt \
--file_name pathabundance
# Join files of pathway coverage
humann2_join_tables \
--input humann2_out/ \
--output humann2_tables/humann2_pathcoverage.txt \
--file_name pathcoverage
Step 6. By default, HUMAnN2 adds _Abundance to each of the sample names in the resulting table. These suffixes can affect how the abundance table is matched to the sample IDs in the mapping file metadata. You can manually edit the file, or run the command below in Terminal to remove the suffix.
sed 's/_Abundance//g' humann2_tables/humann2_pathabundance.txt > humann2_tables/humann2_pathabundance_fixed.txt
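To see what the substitution does, the same sed expression can be tried on a small made-up table first. The sketch below is illustrative only: the sample names, the `# Pathway` header label, and the values are assumptions, and the files are written to a temporary folder.

```shell
# Toy two-sample table standing in for the joined output (all values are made up).
toy_dir=$(mktemp -d)
printf '# Pathway\tsampleA_Abundance\tsampleB_Abundance\nko00010\t10\t20\n' > "$toy_dir/toy.txt"

# Same substitution as the command above, applied to the toy table.
sed 's/_Abundance//g' "$toy_dir/toy.txt" > "$toy_dir/toy_fixed.txt"

# Show the header line before and after.
head -n 1 "$toy_dir/toy.txt"
head -n 1 "$toy_dir/toy_fixed.txt"
```

Note that the pattern removes every occurrence of _Abundance in the file, so a sample ID that itself contains that text would also be altered.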
Step 7. Add full names to the KEGG pathway identifiers
humann2_rename_table \
-i humann2_tables/humann2_pathabundance_fixed.txt \
-o humann2_tables/humann2_pathabundance_named.txt \
-n kegg-pathway \
--simplify
Step 8. Normalize the table into relative abundances for use with statistical programs.
humann2_renorm_table \
-i humann2_tables/humann2_pathabundance_named.txt \
-o humann2_tables/humann2_pathabundance_named_relab.txt \
--units relab
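As a sanity check on the normalized table, the values in each sample column can be summed; for relative abundances the community-level rows would be expected to add up to roughly 1 per sample. The following is a sketch under that assumption, using a hypothetical helper `column_sums` that is not part of HUMAnN2. It assumes a tab-separated table with one header line and skips any stratified rows (feature names containing "|").

```shell
# Hypothetical helper: print the sum of each sample column in a tab-separated
# table, skipping the header line and rows whose feature name contains "|".
column_sums() {
    awk -F '\t' '
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; ncol = NF; next }
        index($1, "|") > 0 { next }
        { for (i = 2; i <= NF; i++) total[i] += $i }
        END { for (i = 2; i <= ncol; i++) printf "%s\t%.4f\n", name[i], total[i] }
    ' "$1"
}

# Run on the normalized table if it exists.
table=humann2_tables/humann2_pathabundance_named_relab.txt
if [ -f "$table" ]; then
    column_sums "$table"
fi
```

A column whose sum is far from 1 suggests that the table was not normalized or that rows were lost in an earlier step.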
Step 9. Analyze with programs such as STAMP, PRISM or LEfSe.