Preparing Sequences
Description
There are many ways in which you may receive your 16S rRNA gene sequencing MiSeq data. Depending on the facility and its chosen methods, you may receive the raw data as:
- 3 FASTQ files which correspond to the forward, reverse and index reads.
- Individual FASTQ files for every sample that you had included with a unique barcode.
- Or a variation of the two.
In either case, the data must first be joined (if paired-end), filtered and de-multiplexed before picking OTUs and assigning microbial taxonomy. The next two sections will demonstrate how to process the two main types of data that you may receive from the sequencing facility, but be aware that some of the default settings may not apply to your data, depending upon:
- Whether Golay barcodes were used.
- Whether the barcodes are 12 bp in length.
- Whether the reads are 150 bp in length.
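One way to check the last two points before running anything is to tally the sequence lengths in a FASTQ file. The following is a minimal sketch rather than part of the workflow itself: the function name `read_length_counts` and the `max_reads` parameter are hypothetical, and it assumes standard 4-line FASTQ records (no wrapped sequence lines), optionally gzip-compressed.

```python
import gzip
from collections import Counter
from itertools import islice


def read_length_counts(fastq_path, max_reads=1000):
    """Count sequence lengths in the first max_reads records of a FASTQ file.

    Assumes each record is exactly 4 lines (header, sequence, '+', quality).
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        # The sequence is the 2nd line of every 4-line record.
        for line in islice(handle, 1, max_reads * 4, 4):
            counts[len(line.strip())] += 1
    return counts
```

Run on the index file, a result dominated by a single key of 12 would suggest 12 bp barcodes. Run on the forward or reverse file, the keys show whether the reads are 150 bp, 250 bp or something else.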
Note about default filenames:
There are typically 3 files generated from a MiSeq run. The file with _R1_ in its name contains the forward reads, _R3_ contains the reverse reads and _R2_ contains the barcode reads. It is important to make sure you don't mix up these files. To verify that they are labelled correctly, you can check the size of the _R2_ file. It should be much smaller than the _R1_ and _R3_ files, because it contains only the barcode for each sequence rather than 150 bp or 250 bp reads.
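The size check described above can be scripted. This is a rough sketch under stated assumptions: the helper name `check_index_is_smallest` is hypothetical, and it relies only on the _R1_, _R2_ and _R3_ tokens appearing in the filenames as described in this note.

```python
import os


def check_index_is_smallest(paths):
    """Return (ok, sizes): ok is True when the _R2_ file is smaller than both
    the _R1_ and _R3_ files; sizes maps each role to its size in bytes."""
    roles = {}
    for path in paths:
        name = os.path.basename(path)
        for tag in ("_R1_", "_R2_", "_R3_"):
            if tag in name:
                roles[tag.strip("_")] = path
    missing = {"R1", "R2", "R3"} - roles.keys()
    if missing:
        raise ValueError("missing files for: %s" % sorted(missing))
    sizes = {role: os.path.getsize(p) for role, p in roles.items()}
    ok = sizes["R2"] < sizes["R1"] and sizes["R2"] < sizes["R3"]
    return ok, sizes
```

If `ok` comes back as `False`, the files may be mislabelled or the facility may use a different naming scheme, so it is worth inspecting the reads directly before continuing.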