README
------

Files posted April 2, 2015

The data contained represent all of the 16S samples processed by the American Gut Project to date. These data are demultiplexed sequences, originally sourced from the following EBI accessions:

ERP003819
ERP003822
ERP003820
ERP003821
ERP005367
ERP005366
ERP005361
ERP005362
ERP005651
ERP005821
ERP005949
ERP006349
ERP008512
ERP008604
ERP008617
ERP009750

The sample IDs are inconsistent between the EBI accessions, so some post-processing was performed. Specifically, all sample IDs were coerced into the following format: . This was done to ensure that the samples are unique within the project.

Files
-----

The following files are available. The sample IDs are consistent across the sequence files, and are assured to be unique within the American Gut Project.

- AG.txt
    - Full mapping file
- merged_sequences_full_length.fna.gz
    - This file contains all of the demultiplexed sequence data from EBI.
- merged_sequences_full_length-debloomed.fna.gz
    - This file contains the demultiplexed sequence data from EBI, minus the sequences recruited by the filter for blooms (the filter was applied only to fecal samples).
- merged_sequences_100nt-debloomed.fna.gz
    - This file contains the demultiplexed sequence data from EBI, trimmed to 100nt, and minus the sequences recruited by the filter for blooms (the filter was applied only to fecal samples).
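
As a rough illustration of what "trimmed to 100nt" involves, the following Python sketch reads FASTA records and truncates each sequence to at most 100 nucleotides. It is not the code used to generate the files listed here. The function names are hypothetical, and keeping reads shorter than 100nt unchanged (rather than discarding them) is an assumption, since this README does not say how short reads were handled.

```python
import gzip


def open_fasta_gz(path):
    """Open a gzip-compressed FASTA file (e.g. a .fna.gz file) in text mode."""
    return gzip.open(path, "rt")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines.

    Sequences wrapped over several lines are joined into one string.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def trim_records(records, length=100):
    """Truncate each sequence to at most `length` nucleotides.

    Assumption: reads shorter than `length` are kept as they are.
    """
    for header, seq in records:
        yield header, seq[:length]


# Example usage, assuming the file has been downloaded to the working directory:
#
# with open_fasta_gz("merged_sequences_full_length-debloomed.fna.gz") as handle:
#     for header, seq in trim_records(read_fasta(handle)):
#         print(">" + header)
#         print(seq)
```

Both helpers are generators, so the sequence file is streamed one record at a time and never has to fit in memory.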