Preparing a Mapping File
http://qiime.org/documentation/file_formats.html
Description
The setup of your mapping file is crucial to how easily you can perform analyses, because most commands rely on the metadata it contains. The mapping file can always be modified as long as the SampleIDs do not change. An example mapping file and its metadata are shown below; you can also view the file within the tutorial data. Essentially, the mapping file gives meaning to the sample IDs of the count data. The OTU table only contains the number of counts each sample has per OTU; it does not contain any information regarding treatment group or study timepoint. The sample mapping file allows a researcher to provide as many columns as needed to describe each sample in different ways.
The table always includes:
#SampleID
: A unique identifier for a particular sample. The column name always begins with a #.
BarcodeSequence
: The unique 12-nucleotide barcode sequence assigned to the particular sample during library preparation.
LinkerPrimerSequence
: The nucleotide sequence adjacent to each barcode, meant for use during sequencing.
Description
: The final column in the mapping file, which can hold any information about a sample.
#SampleID | BarcodeSequence | LinkerPrimerSequence | SampleType | Description |
---|---|---|---|---|
L1S8 | ATCGATCGATCG | CCGGACTAC | gut | 1_Fece_10_28_2008 |
L1S140 | ATCGATCGATCC | CCGGACTAC | gut | 2_Fece_10_28_2008 |
L1S57 | ATCGATCGATCA | CCGGACTAC | gut | 1_Fece_1_20_2009 |
L1S208 | ATCGATCGATCT | CCGGACTAC | gut | 2_Fece_1_20_2009 |
L1S76 | ATCGATCGATAT | CCGGACTAC | gut | 1_Fece_2_17_2009 |
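The table above is drawn with pipes for display only; on disk, a QIIME mapping file is plain tab-delimited text. As a minimal sketch, the commands below write the first two example rows to a file and read back the first and last column names from the header. The file name example_map.txt is an arbitrary choice for illustration, not a name QIIME requires.

```shell
# Hypothetical file name, chosen only for this example.
map_file="example_map.txt"

# Write the header line and two sample rows, separated by tabs.
printf '#SampleID\tBarcodeSequence\tLinkerPrimerSequence\tSampleType\tDescription\n' > "$map_file"
printf 'L1S8\tATCGATCGATCG\tCCGGACTAC\tgut\t1_Fece_10_28_2008\n' >> "$map_file"
printf 'L1S140\tATCGATCGATCC\tCCGGACTAC\tgut\t2_Fece_10_28_2008\n' >> "$map_file"

# Pull the first and last column names out of the header line.
header=$(head -n 1 "$map_file")
first=$(printf '%s\n' "$header" | cut -f 1)
last=$(printf '%s\n' "$header" | awk -F '\t' '{ print $NF }')

echo "first column: $first"
echo "last column:  $last"
```

Extra metadata columns such as SampleType sit between LinkerPrimerSequence and Description, so Description stays last.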
Option 1: Validate a mapping file using QIIME
Before starting any data processing or analysis, you must first be sure that your sample metadata is correctly formatted and free of errors.
The script validate_mapping_file.py is meant to ensure that your mapping file contains nothing that may cause downstream errors, such as duplicate SampleIDs or special characters within the column headers. The output of the command is an interactive HTML file displaying any errors found.
# Check for errors in mapping file
validate_mapping_file.py \
-m mapping_file.txt \
-o validate_map/
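For a quick look at one of the problems mentioned above, duplicate SampleIDs, standard shell tools are enough. The sketch below is not a replacement for validate_mapping_file.py and checks only this single condition. It builds a small test file with a deliberately repeated ID; the file name dup_check_map.txt and the Description values are made up for illustration.

```shell
# Hypothetical test file with a deliberately duplicated SampleID (L1S8).
map_file="dup_check_map.txt"
printf '#SampleID\tBarcodeSequence\tLinkerPrimerSequence\tDescription\n' > "$map_file"
printf 'L1S8\tATCGATCGATCG\tCCGGACTAC\tsample_a\n' >> "$map_file"
printf 'L1S8\tATCGATCGATCC\tCCGGACTAC\tsample_b\n' >> "$map_file"
printf 'L1S57\tATCGATCGATCA\tCCGGACTAC\tsample_c\n' >> "$map_file"

# Drop lines starting with # (the header), keep column 1,
# then list every SampleID that occurs more than once.
dup_ids=$(grep -v '^#' "$map_file" | cut -f 1 | sort | uniq -d)

if [ -n "$dup_ids" ]; then
  echo "Duplicate SampleIDs: $dup_ids"
else
  echo "No duplicate SampleIDs found"
fi
```

To run the same check on a real mapping file, point map_file at it and skip the printf lines.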
Option 2: Keemei tool
The second option for verifying a mapping file is Keemei, a tool developed for creating QIIME-valid mapping files directly in Google Sheets. Hosting the mapping file on Google Sheets allows for greater collaboration on sample metadata as well as protection against data loss. Once the mapping file is ready for analysis, you can use the command load_remote_mapping_file.py (http://qiime.org/scripts/load_remote_mapping_file.html) to download the table to your computer for use with QIIME.
load_remote_mapping_file.py \
--spreadsheet_key 0AnzomiBiZW0ddDVrdENlNG5lTWpBTm5kNjRGbjVpQmc \
--worksheet_name Fasting_Map \
-o example2_map.txt