I am using the "path" export option in MEGAN5 to output the full taxonomy of some bacterial sequencing reads. The problem with this export option is that the number of fields varies from line to line. For example, one line contains 9 columns while another contains only 7:
Bacteria;Actinobacteria<phylum>;Actinobacteria;Actinobacteridae;Actinomycetales;Propionibacterineae;Propionibacteriaceae;Propionibacterium;Propionibacterium acnes
Bacteria; Proteobacteria; Betaproteobacteria; Burkholderiales; Comamonadaceae; Delftia;Delftia sp. Cs1-4
The program simply outputs the lineage information from NCBI. Some bacteria have class, subclass, order, suborder, family and subfamily, whereas others have only class, order and family. Eventually I would like to compare about 400 samples at different levels (order, family, etc.), but I cannot parse the file by pulling out a fixed column, because the rank found in a given column varies from line to line. Is there an output option in MEGAN5 that would restrict the path to a fixed set of levels?
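To make the problem concrete, here is a minimal Python sketch using the two example lines. It shows that the same column index lands on different ranks, and what a name-based lookup would have to do instead. The `RANK_OF` table and the helper names `split_path` and `taxon_at_rank` are hypothetical and typed by hand for these two lines only; a real table would have to be built from the NCBI taxonomy, and none of this is a MEGAN5 feature.

```python
import re

# The two example lines from the "path" export.
lines = [
    "Bacteria;Actinobacteria<phylum>;Actinobacteria;Actinobacteridae;"
    "Actinomycetales;Propionibacterineae;Propionibacteriaceae;"
    "Propionibacterium;Propionibacterium acnes",
    "Bacteria; Proteobacteria; Betaproteobacteria; Burkholderiales; "
    "Comamonadaceae; Delftia;Delftia sp. Cs1-4",
]

# HYPOTHETICAL lookup, filled in by hand for these two lines only.
# A real table would need to come from the NCBI taxonomy.
RANK_OF = {
    "Actinomycetales": "order",
    "Burkholderiales": "order",
    "Propionibacteriaceae": "family",
    "Comamonadaceae": "family",
}


def split_path(line):
    """Split on ';', drop '<...>' annotations such as '<phylum>', trim spaces."""
    return [re.sub(r"<[^>]*>", "", field).strip() for field in line.split(";")]


def taxon_at_rank(line, rank):
    """Return the first name in the path whose looked-up rank matches, else None."""
    for name in split_path(line):
        if RANK_OF.get(name) == rank:
            return name
    return None


for line in lines:
    fields = split_path(line)
    # The fourth column (index 3) is a subclass on one line and an order on the other.
    print(len(fields), fields[3], taxon_at_rank(line, "order"), taxon_at_rank(line, "family"))
```

Looking names up by rank avoids the shifting columns, but it only works if every name in the export can be matched to a rank, which is why an export option with fixed levels would be much simpler.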
Thank you