**dpryan** · 11-13-2013, 07:57 AM

Try blasting a few of the reads that don't map and see what pops up. It's unusual in this day and age to see someone mapping a fasta file. Is this old data?

**morning latte** · 11-13-2013, 08:01 AM

Thanks for the quick response.
No, it is not old data. It was generated on an Illumina HiSeq. I converted the fastq files to fasta files to reduce the size of the dataset. It sounds like the quality information in fastq files affects bowtie mapping? In that case, should I use the fastq files? Thanks again.

**dpryan** · 11-13-2013, 08:03 AM

Yes, use the fastq files. Also, use the quality information to perform quality trimming (trim adapters while you're doing that). Have a look at trimmomatic or trim_galore if you're unfamiliar with trimming reads.

**morning latte** · 11-13-2013, 08:08 AM

Thanks.

I used the quality information for quality trimming. Before assembly, I converted the fastq files to fasta files. Could you explain a bit why fastq files are more useful than fasta files for bowtie mapping? In the meantime, let me try mapping with the fastq files.

**dpryan** · 11-13-2013, 08:17 AM

Ah, so you're trying to map against a reference that you assembled with these same reads? I'm surprised that that yielded such a poor result.

Fastq files are generally more useful for mapping as the base-calling quality can be used to moderate mismatch penalties. Suppose you have a read with two equally good alignments, each having a single mismatch, but at different positions. If alignment #1 has the mismatch at a base with a high Phred score while alignment #2 has its mismatch at a base with a low Phred score, then alignment #2 is more likely to be correct (since the mismatch base was more likely to have been miscalled). It's not unusual for there to be slight quality dips near the end of reads, so utilising this information helps deal with that. Having said that, I wouldn't hold my breath that this will help in your situation. Perhaps something went amiss with your assembly. If there's a related organism, you might try mapping against that just for comparison. If the alignments are better, then the assembly is probably just bad.
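To make the Phred argument above concrete, here is a minimal Python sketch. It assumes Phred+33 quality encoding and uses the standard relation between a Phred score Q and the miscall probability, P = 10^(-Q/10). The function names and the example quality string are made up for illustration, and this is not how bowtie scores alignments internally; it only shows why a mismatch at a low-quality base is the more plausible one.

```python
def decode_phred33(quality_string):
    """Convert a Phred+33 encoded quality string into integer Phred scores."""
    return [ord(char) - 33 for char in quality_string]


def phred_to_error_prob(q):
    """Probability that a base with Phred score q was miscalled."""
    return 10 ** (-q / 10)


def more_plausible_alignment(quals, mismatch_pos_1, mismatch_pos_2):
    """Return 1 or 2: the alignment whose single mismatch falls on the base
    more likely to be a sequencing error (hypothetical helper, illustration only)."""
    p1 = phred_to_error_prob(quals[mismatch_pos_1])
    p2 = phred_to_error_prob(quals[mismatch_pos_2])
    return 1 if p1 > p2 else 2


# Hypothetical 10-base read: 'I' encodes Q40, '#' encodes Q2.
quals = decode_phred33("IIIIIIII#I")

# Alignment #1 mismatches at position 2 (high quality),
# alignment #2 mismatches at position 8 (low quality).
best = more_plausible_alignment(quals, mismatch_pos_1=2, mismatch_pos_2=8)
print("Miscall probability at position 2:", phred_to_error_prob(quals[2]))
print("Miscall probability at position 8:", phred_to_error_prob(quals[8]))
print("More plausible alignment: #%d" % best)
```

With a fasta file there are no quality values, so both alignments in this example would look equally good to the aligner.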

**swbarnes2** · 11-13-2013, 03:49 PM

Yeah, everyone keeps the fastqs as fastqs; people don't throw away the quality data. Once upon a time, the SRA actually wanted the four-channel quality data, not just the single quality number.

So stop converting them.

But note that most aligners will work with gzipped fastqs, so just keep them compressed. Also note that your .bam contains all the information that your original fastq has (or, with the bowtie default, the mapped .bam plus the unmapped .bam), so once you make those .bams, you can archive the original fastq if you want.
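If you need to look at the compressed reads yourself, you don't have to decompress them to disk first. A minimal Python sketch using only the standard library is below. It assumes simple four-line fastq records (no wrapped sequence lines); the function name `iter_fastq` and the demo file are hypothetical.

```python
import gzip
import os
import tempfile


def iter_fastq(path):
    """Yield (name, sequence, quality) tuples from a plain or gzipped fastq file.

    Assumes four lines per record: header, sequence, '+' separator, qualities.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file
            seq = handle.readline().rstrip("\n")
            handle.readline()  # skip the '+' separator line
            qual = handle.readline().rstrip("\n")
            yield header[1:], seq, qual  # drop the leading '@'


# Small demonstration on a throwaway gzipped file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.fastq.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\nII#I\n")
    records = list(iter_fastq(demo_path))

print(records)
```

The quality strings stay attached to each read, which is exactly what is lost when converting to fasta.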

**morning latte** · 11-16-2013, 05:30 PM

Sorry for the late reply. I really appreciate your advice.


bowtie mapping
