**yueluo** · 03-31-2014, 09:21 PM

A common aligner such as bowtie2 can output 'mapped' reads.
If you want reads mapped to different mitochondrial genomes to go to separate files, you can either:
1) Make separate indexes for each mitochondrial genome, then run a mapping process separately. This way you get a separate set of reads that align to different genomes.
2) Brian Bushnell recently announced a set of tools including BBMap and BBSplit. If they work like he says, then you should be able to get what you want with just a single run. I can't be sure though, since I haven't tested these tools...
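
A minimal sketch of option 1, assuming single-end reads and bowtie2 on the PATH. The helper name `bowtie2_plan` and all file names are hypothetical; it only prints the commands so they can be checked before running them.

```shell
# Print (not run) the bowtie2 commands for option 1:
# one index and one mapping run per genome.
bowtie2_plan() {
  reads=$1; shift
  for fa in "$@"; do
    name=$(basename "$fa" .fa)
    echo "bowtie2-build $fa $name"
    # --al writes the unpaired reads that aligned at least once
    echo "bowtie2 -x $name -U $reads --al mapped_$name.fq -S $name.sam"
  done
}

bowtie2_plan reads.fq x.fa y.fa
```

Note that a read similar to several genomes will end up in more than one mapped_*.fq file with this approach, since each run is independent.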

**maubp** · 04-01-2014, 12:23 AM

You could try MITObim (mitochondrial baiting and iterative mapping):

https://github.com/chrishah/MITObim

http://dx.doi.org/10.1093/nar/gkt371

**sphil** · 04-01-2014, 01:06 AM

> Originally posted by yueluo
>
> 1) Make separate indexes for each mitochondrial genome, then run a mapping process separately. This way you get a separate set of reads that align to different genomes.

Since mitochondrial genomes aren't that large, imo he could also 'cat' all the MT genome FASTA files he wants into one reference and map against that. The mapping location tells you which genome each read maps to, so you get the same information in a single mapping run. You need to cat the files first, though.
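
To illustrate how the mapping location identifies the genome, here is a rough sketch that tallies alignments per reference from an aligner's SAM output. It assumes the sequence names in the concatenated FASTA are enough to tell the genomes apart; the function name `count_by_ref` is made up for this example.

```shell
# Tally alignment records per reference from SAM on stdin.
# Column 3 (RNAME) is the reference sequence name; "*" means unmapped.
count_by_ref() {
  awk -F '\t' '!/^@/ && $3 != "*" { n[$3]++ }
               END { for (r in n) print r "\t" n[r] }' | sort
}
```

Usage would look like `count_by_ref < mapped.sam` (mapped.sam being a hypothetical file). It counts SAM records, so paired reads and secondary alignments each add to the total.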

**bioman1** · 04-01-2014, 06:43 AM

Thanks maubp, this tool seems interesting to try out.
@sphil: do you think combining all plant mitochondrial genomes (93 genomes) into one MTgenome.fa using 'cat' and mapping reads to it is the right approach? If it works, it will be a good idea.

I am also aiming to separate chloroplast reads using a similar approach. Are there any tools like MITObim available for chloroplasts?

**JackieBadger** · 04-01-2014, 06:48 AM

If you have a good reference, you can map directly to it. I have done this with Newbler before and quickly got good mtDNA genomes out of total DNA pools.

**sphil** · 04-01-2014, 08:36 AM

> Originally posted by bioman1
>
> Thanks maubp, this tool seems interesting to try out.
> @sphil: do you think combining all plant mitochondrial genomes (93 genomes) into one MTgenome.fa using 'cat' and mapping reads to it is the right approach? If it works, it will be a good idea.
>
> I am also aiming to separate chloroplast reads using a similar approach. Are there any tools like MITObim available for chloroplasts?

I actually don't see any problem with that. However, I have no clue how much sequence similarity there is between those MT genomes. But anyway, I would definitely give it a try!

**Brian Bushnell** · 04-01-2014, 09:20 AM

> Originally posted by yueluo
>
> 2) Brian Bushnell recently announced a set of tools including BBMap and BBSplit. If they work like he says, then you should be able to get what you want with just a single run. I can't be sure though, since I haven't tested these tools...

That's correct, BBSplit was designed specifically for this scenario.

bbsplit.sh ref=x.fa,y.fa,z.fa in=reads.fq basename=o%.fq

That will create 3 output files, "ox.fq", "oy.fq", and "oz.fq". The reads that map best to x.fa will go to ox.fq, and so forth.

Some reads can map ambiguously to multiple references if the sequences are highly conserved; you can control that with the "ambig2" flag. "ambig2=best" will assign the read to the first best-matching reference; "toss" will discard ambiguous reads; "all" will send the read to the output for every reference to which it maps; and "split" will put the ambiguous reads in a separate file (e.g. AMBIGUOUS_ox.fq) for each reference to which it maps.
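
The four modes can be pictured with a toy model. This is only a sketch of the behaviour described above, not BBSplit code: `route_ambiguous` is a hypothetical helper that takes a mode and the references a read maps to equally well, and prints the output file(s) the read would land in with basename=o%.fq.

```shell
# Toy model of the four ambig2 modes as described in the post above.
route_ambiguous() {
  mode=$1; shift
  case $mode in
    best)  echo "o$1.fq" ;;                                      # first best match only
    toss)  ;;                                                    # read is discarded
    all)   for r in "$@"; do echo "o$r.fq"; done ;;              # every matching reference
    split) for r in "$@"; do echo "AMBIGUOUS_o$r.fq"; done ;;    # separate ambiguous files
    *)     echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}
```

For example, `route_ambiguous split x y` prints AMBIGUOUS_ox.fq and AMBIGUOUS_oy.fq, while `route_ambiguous toss x y` prints nothing.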

**bioman1** · 04-01-2014, 06:07 PM

@Brian Bushnell, I think BBSplit will be helpful in my work. Thanks.
Regarding the command line:

bbsplit.sh ref=x.fa,y.fa,z.fa in=reads.fq basename=o%.fq

For me, ref will be 93 genomes. Do I need to list them individually, like ref= r1.fa, r2.fa, r3.fa..r93.fa, or can I combine all reference genomes into one file using 'cat' as sphil suggested?

I will be using paired-end reads; do I need to interleave them into a single file? Will the best-match output (e.g. ox.fq) be a single file? Can I use these reads for further de novo assembly?

**Brian Bushnell** · 04-01-2014, 07:53 PM

> Originally posted by bioman1
>
> @Brian Bushnell, I think BBSplit will be helpful in my work. Thanks.
> Regarding the command line:
>
> bbsplit.sh ref=x.fa,y.fa,z.fa in=reads.fq basename=o%.fq
>
> For me, ref will be 93 genomes. Do I need to list them individually, like ref= r1.fa, r2.fa, r3.fa..r93.fa, or can I combine all reference genomes into one file using 'cat' as sphil suggested?
>
> I will be using paired-end reads; do I need to interleave them into a single file? Will the best-match output (e.g. ox.fq) be a single file? Can I use these reads for further de novo assembly?

bioman,

If your reads are paired in two files, you should use "in1=reads1.fq in2=reads2.fq". For paired input, the output will be interleaved, but the BBTools package includes another program, "reformat.sh", which can quickly interleave, de-interleave, change quality offset or file format, and do various other things like subsampling and trimming.

To interleave:
reformat.sh in1=read1.fq in2=read2.fq out=reads.fq

To de-interleave:
reformat.sh in=reads.fq out1=read1.fq out2=read2.fq

By default, if there is a single input file, BBMap will autodetect whether the reads are interleaved or single-ended based on the read names, assuming normal HiSeq/MiSeq naming conventions. You can override this with the "interleaved=t" or "interleaved=f" flag. Flags can be given in any order.
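
As an illustration of what name-based detection can look like, here is a rough heuristic. It is not BBMap's actual implementation; it only assumes the two common Illumina styles, names ending in /1 and /2, or a " 1:" / " 2:" field after the read name. `looks_interleaved` is a made-up name.

```shell
# Guess from the first two FASTQ records on stdin whether a file is interleaved.
looks_interleaved() {
  head -n 8 | awk '
    # Which mate does this header claim to be (0 = no mate marker)?
    function mate(h) {
      if (h ~ /\/1$/ || h ~ / 1:/) return 1
      if (h ~ /\/2$/ || h ~ / 2:/) return 2
      return 0
    }
    # Header with the mate marker stripped off.
    function base(h) { sub(/\/[12]$/, "", h); sub(/ [12]:.*$/, "", h); return h }
    NR == 1 { a = $0 }
    NR == 5 { b = $0 }
    END {
      if (mate(a) == 1 && mate(b) == 2 && base(a) == base(b)) { print "interleaved" }
      else { print "single" }
    }'
}
```

Something like `looks_interleaved < reads.fq` gives a quick sanity check before deciding whether to set "interleaved=t" or "interleaved=f" explicitly.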

If you have any trouble with RAM when running bbsplit, you can set the amount of memory to use with a flag such as "-Xmx1g" (1 GB of RAM, for example). This should be set to about 85% of physical memory, but don't bother unless you encounter a problem.
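
The 85% figure can be worked out with a small helper. This is a sketch only: `xmx_flag` is a hypothetical function that takes total physical memory in kB and prints a flag in megabytes.

```shell
# Turn total physical memory (in kB) into a -Xmx flag at about 85%.
xmx_flag() {
  echo "-Xmx$(( $1 * 85 / 100 / 1024 ))m"
}

# On Linux the total can be read from /proc/meminfo (assumes a Linux host):
# xmx_flag "$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)"
```

For a 16 GB machine (16777216 kB) this gives -Xmx13926m.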

-Brian

P.S. I forgot to answer your question...

No, you MUST specify "ref=r1.fa,r2.fa,r3.fa..r93.fa" (without any spaces). If you use cat, it will not work.
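
With 93 files, typing the list by hand is error-prone. A small sketch for building it follows; the helper name `join_refs` and the refs/ directory are assumptions for the example.

```shell
# Join file names with commas and no spaces, the form ref= expects.
# The body runs in a subshell so the IFS change does not leak out.
join_refs() (
  IFS=,
  printf '%s\n' "$*"
)

# Hypothetical layout: all reference FASTA files sit in ./refs
# bbsplit.sh ref="$(join_refs refs/*.fa)" in1=reads1.fq in2=reads2.fq basename=o%.fq
```

This breaks if a file name itself contains a comma or a space, so keep the reference names simple.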

Extracting mitochondrial reads
