**Torst** · 10-12-2012, 10:21 PM

Doesn't it just mean that the read comes from a region of the genome that is repeated? So it aligns to all copies of the repeat?

**fanx** · 10-12-2012, 10:27 PM

Thanks. In fact, I know the problem is with the reads rather than the reference genome. I believe most reads are chimeric.

**Torst** · 10-13-2012, 07:24 PM

UCHIME claims to be able to do some de novo chimera detection, or use bits of your known reference:

http://bioinformatics.oxfordjournals.org/content/27/16/2194.abstract

**fanx** · 10-13-2012, 09:59 PM

Torst, thanks for your point. I searched before posting and found UCHIME. I didn't investigate it in detail because I guess UCHIME may need high coverage, as with PCR amplicons. However, my RNA-Seq data is from human blood, which is complicated by RNA from human genes, viruses, fungi, etc., at low coverage.

One approach I thought about is to split each read into 2 parts and then see if the parts become single-mapped reads. This split can be done easily with current scripts, but I worry about potential loss of information. To my knowledge, there is really no way to handle this issue.

I am using 454 and I think I should switch to Illumina. Short reads might be helpful.
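The read-splitting idea above could be sketched roughly as follows. This is only an illustration, not a tested pipeline step: `split_read` and its `min_len` cutoff are hypothetical, and the midpoint is an arbitrary choice, since a real chimera junction can fall anywhere in the read.

```python
from typing import List, Tuple

# (name, sequence, quality string)
FastqRecord = Tuple[str, str, str]


def split_read(name: str, seq: str, qual: str, min_len: int = 30) -> List[FastqRecord]:
    """Split one read at its midpoint into two separately named halves.

    Halves shorter than min_len (an assumed cutoff) are dropped, because
    very short fragments tend to map ambiguously.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    mid = len(seq) // 2
    left = (f"{name}/part1", seq[:mid], qual[:mid])
    right = (f"{name}/part2", seq[mid:], qual[mid:])
    return [part for part in (left, right) if len(part[1]) >= min_len]
```

Each half would then be aligned on its own. If both halves map uniquely but to different places, that supports a chimera; the information lost is mainly the link between the two halves and the mappability of the shorter fragments.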

**bernardo_bello** · 10-22-2012, 09:57 AM

We have recently sequenced a bacterial transcriptome with a 316 chip from Ion Torrent (1.5 million sequences). After filtering low-quality data and trimming adapters, we noticed that only 51.33% of the sequences mapped to the reference genome. Looking at the unmapped sequences, we can see that most of them are chimeric transcripts, so they are impossible to map and also bias the results. Many of the unmapped sequences also lack homology in 20% of the starting sequence.

I would like to know your opinion about it.
Should I move to 454 or Illumina? Our Sequencing Department has no idea why we have so many chimeras.

Thank you, Bernardo

**fanx** · 10-22-2012, 10:09 AM

To answer your question, I need to know: 1) Is there any amplification step prior to sequencing? 2) Which aligner did you use for mapping?

**bernardo_bello** · 10-22-2012, 10:17 AM

Hello fanx,

Thank you for your response.

1) Yes, there is a PCR step. We used the whole transcriptome procedure described here: 'Ion Total RNA-Seq Kit v2'

2) I used BWA for Illumina in Galaxy with default settings.

**fanx** · 10-22-2012, 10:49 AM

1, If there is a PCR step, chimeric reads are not unexpected. Many polymerases, especially those assuming high fidelity, have strand displacement activity. The occurrence of chimeric reads depends on both polymerases and protocols.

2, Some chimeric reads may not be authentic ones. In this situation, I usually increase mapping stringency and found many of "chimeric" reads became single-hit ones.

3, For the remaining "true" chimeric reads, there are 2 ways to go. One is just to discard them (as in many previous publications, where this issue is largely ignored). If your data has a profound depth, I don't think this will affect/bias your final result. The other way is to extract these chimeric reads only and do some trimming, re-align them to see whether they become single-hit reads, and finally combine all single-hit reads for downstream analysis.

4, Finally, I assume you have already done quality control prior to the alignment.

**bernardo_bello** · 10-22-2012, 11:06 AM

>1, If there is a PCR step, chimeric reads are not unexpected. Many polymerases, especially those assuming high fidelity, have strand displacement activity. The occurrence of chimeric reads depends on both polymerases and protocols.

Ok, I think I'm wasting money.

>2, Some chimeric reads may not be authentic ones. In this situation, I usually increase mapping stringency and found many of "chimeric" reads became single-hit ones.

I'm mapping only >Q20 reads. I've seen where and how they are mapping, and they have 100% similarity in both hits. Sometimes there are three hits for one read.

>If your data has a profound depth, I don't think this will affect/bias your final result.

I have 800,000 sequences mapped to a 2 Mb prokaryotic genome. My mean read length is about 150 bp.

>The other way is to extract these chimeric reads only and do some trimming

I would like to finish my PhD, trimming is not feasible! So many chimeras.

>4, Finally, I assume you have already done quality control prior to the alignment.

Of course, only >Q20 and trimming low quality 3' region.

Thank you, Bernardo
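As a rough check on the depth question, the figures quoted in this post (800,000 mapped reads, about 150 bp mean length, a 2 Mb genome) can be plugged into the usual mean-depth estimate. This is a back-of-the-envelope sketch: `mean_depth` is a hypothetical helper, and it assumes reads are spread evenly over the genome, which transcriptome data is not, since highly expressed genes take most of the reads.

```python
def mean_depth(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Average number of reads covering each base, assuming even coverage."""
    return n_reads * mean_read_len / genome_size


# Figures from the post: 800,000 mapped reads, ~150 bp each, 2 Mb genome.
depth = mean_depth(800_000, 150, 2_000_000)
print(depth)  # 60.0
```

So the mapped reads alone give roughly 60x on average, with the caveat that weakly expressed genes will sit far below that figure.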

**bernardo_bello** · 10-22-2012, 11:13 AM

Sorry, I forgot to say that in the end I have 25% of sequences mapping (of 3 million). My reference sequence is a draft genome in 47 segments.

**bernardo_bello** · 10-22-2012, 11:48 AM

FASTQC of unmapped reads

If you give me your email, I can send you the FastQC output for the unmapped reads to get your opinion.

Bernardo

**jshaik** · 06-03-2014, 07:57 AM

aligner from sanger

This aligner seems to address the issue of chimeric reads: http://www.sanger.ac.uk/resources/software/smalt/
I personally haven't tried it yet but will try it next time I need to align something.

**bernardo_bello** · 06-03-2014, 10:09 PM

>This aligner seems to address the issue of chimeric reads: http://www.sanger.ac.uk/resources/software/smalt/
>I personally haven't tried it yet but will try it next time I need to align something.

Thanks. It seems there is no associated publication for SMALT. Is there one?

**jshaik** · 06-04-2014, 06:14 PM

No, there is no publication associated with it. But people have compared it with other aligners in their work.


Chimeric reads
