Originally posted by litali
When you work with a single sample in QIIME without a mapping file, all reads in the file are treated as one individual sample. A mapping file with barcodes is what assigns the reads in your *.fa or *.sff file to their respective samples. Obviously, a single sample that was not part of a multiplexed run does not need to be split with split_libraries.py. However, you still need to supply a mapping file with a barcode sequence. You can use a short string or sequence (generally a 6-mer or 12-mer) placed at the start of every sequence in the file, and list that string as the barcode sequence against your sample ID in the mapping file.

Remember that the mapping file has a standard format, which includes the columns #SampleID, BarcodeSequence, LinkerPrimerSequence and Description. The LinkerPrimerSequence can be left empty, assuming your reads were already preprocessed for quality and primer removal. When running split_libraries.py you then have to bypass the LinkerPrimerSequence check with an optional flag (I think it is -p; check the QIIME tutorial), otherwise the command will not run. The output is a file in which the barcode sequence has been removed from all your reads and every read is assigned to the single sample.
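In case it helps, here is a rough Python sketch of the preparation step described above: putting the same barcode string at the start of every sequence in a FASTA file and building a matching one-row mapping file. The function names, the barcode "AAAAAA", the sample ID "Sample1" and the description are all made up for illustration. This is plain Python and not part of QIIME, and it assumes a tab-separated mapping file with the four columns listed above and an empty LinkerPrimerSequence field.

```python
def prepend_barcode(fasta_text, barcode):
    """Return FASTA text with `barcode` added to the start of every sequence.

    Only the first sequence line after each '>' header gets the barcode,
    so wrapped (multi-line) sequences are handled too.
    """
    out = []
    first_line_of_record = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out.append(line)
            first_line_of_record = True
        elif line.strip():
            if first_line_of_record:
                out.append(barcode + line)
                first_line_of_record = False
            else:
                out.append(line)
    return "\n".join(out) + "\n"


def mapping_file_text(sample_id, barcode, description):
    """Build a one-sample, tab-separated mapping file as a string.

    The LinkerPrimerSequence field is deliberately left empty (assumption:
    primers were already removed from the reads).
    """
    header = "\t".join(
        ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence", "Description"]
    )
    row = "\t".join([sample_id, barcode, "", description])
    return header + "\n" + row + "\n"


# Hypothetical example values, not taken from any real dataset.
example_fasta = ">read1\nACGT\nTTGG\n>read2\nGGCC\n"
tagged_fasta = prepend_barcode(example_fasta, "AAAAAA")
mapping_text = mapping_file_text("Sample1", "AAAAAA", "single_sample")
```

You would then write `tagged_fasta` and `mapping_text` to files and pass those to the demultiplexing step. A 6-character barcode is used here only as an example; use whatever length you declare to QIIME.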
The BIOM file that you intend to work with can only be used in conjunction with the mapping file, so you have to provide one. Otherwise you can only generate the phylogenetic tree file and will not be able to proceed further.
I have personally tested this method with Sanger-sequenced data of 1 kb to 1.5 kb, and it works. Hope this helps you.