  • bowtie mapping

    Hello,

    This is my first time using bowtie, and I am having trouble running it and interpreting the output.

    Below is the command I used.
    bowtie index -f fastafile > output

    Fewer than 1% of the reads have at least one reported alignment. Please let me know if there is any way I can improve this. Thanks.

  • #2
    Try blasting a few of the reads that don't map and see what pops up. It's unusual in this day and age to see someone mapping a fasta file; is this old data?


    • #3
      Thanks for the quick response.
      No, it is not old data. It was generated on an Illumina HiSeq. I converted the fastq files to fasta to reduce the size of the dataset. It sounds like the quality information in fastq files affects bowtie mapping? In that case, should I use the fastq files? Thanks again.


      • #4
        Yes, use the fastq files. Also, use the quality information to perform quality trimming (trim adapters while you're doing that). Have a look at trimmomatic or trim_galore if you're unfamiliar with trimming reads.


        • #5
          Thanks.

          I used the quality information for quality trimming. Before assembly, I converted the fastq files to fasta files. Could you explain a bit why fastq files are more useful than fasta files for bowtie mapping? In the meantime, let me try mapping with the fastq files.


          • #6
            Ah, so you're trying to map against a reference that you assembled with these same reads? I'm surprised that that yielded such a poor result.

            Fastq files are generally more useful for mapping as the base-calling quality can be used to moderate mismatch penalties. Suppose you have a read with two equally good alignments, each having a single mismatch, but at different positions. If alignment #1 has the mismatch at a base with a high Phred score while alignment #2 has its mismatch at a base with a low Phred score, then alignment #2 is more likely to be correct (since the mismatch base was more likely to have been miscalled). It's not unusual for there to be slight quality dips near the end of reads, so utilising this information helps deal with that. Having said that, I wouldn't hold my breath that this will help in your situation. Perhaps something went amiss with your assembly. If there's a related organism, you might try mapping against that just for comparison. If the alignments are better, then the assembly is probably just bad.
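
The two-alignment scenario above can be sketched in a few lines of Python. This is a toy illustration only, not bowtie's actual scoring code: the function names, the example sequences, and the rule "penalty = sum of Phred scores at mismatched positions" are all assumptions made for the example, and a Phred+33 quality encoding is assumed.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_string]


def mismatch_penalty(read, ref_window, quals):
    """Toy penalty (an assumption, not bowtie's code): sum the Phred
    scores of the read bases that disagree with the reference window."""
    return sum(q for base, ref, q in zip(read, ref_window, quals) if base != ref)


# Hypothetical read whose last base has a low quality score.
read = "ACGTACGT"
quals = phred_scores("IIIIIII#")  # 'I' decodes to Q40, '#' decodes to Q2

# Two hypothetical reference windows, each differing from the read at one position.
candidate_1 = "ACCTACGT"  # mismatch at a high-quality base (Q40)
candidate_2 = "ACGTACGA"  # mismatch at a low-quality base (Q2)

penalty_1 = mismatch_penalty(read, candidate_1, quals)
penalty_2 = mismatch_penalty(read, candidate_2, quals)

# The lower penalty marks the alignment whose mismatch is more plausibly
# a base-calling error rather than a true difference.
best = "candidate_2" if penalty_2 < penalty_1 else "candidate_1"
print(penalty_1, penalty_2, best)
```

With a fasta input there are no quality scores, so both candidates would look equally good and the aligner would have no basis for preferring one.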


            • #7
              Yeah, everyone keeps the fastqs as fastqs, people don't throw away the quality data. Once upon a time, the SRA actually wanted the four channel quality data, not just the single quality number.

              So stop converting them.

              Note that most aligners will work with gzipped fastqs, so just keep them compressed. Also note that your .bam contains all the information that your original fastq has (or, with the bowtie default, the mapped .bam plus the unmapped .bam), so once you make those .bams, you can archive the original fastq if you want.
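
If you want to inspect compressed reads yourself without unpacking them, a minimal Python sketch along these lines should work. It assumes simple four-line FASTQ records; `read_fastq` is a hypothetical helper written for this example, and the two reads are made-up data compressed in memory so the snippet runs on its own.

```python
import gzip
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples from a text handle.

    Assumes strict four-line FASTQ records (no wrapped sequences).
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality


# Made-up two-read FASTQ, gzip-compressed in memory for the demonstration.
raw = b"@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nII#I\n"
compressed = gzip.compress(raw)

# gzip.open also accepts a path such as "reads.fastq.gz" (hypothetical name).
with gzip.open(io.BytesIO(compressed), "rt") as handle:
    records = list(read_fastq(handle))

for name, sequence, quality in records:
    print(name, sequence, quality)
```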


              • #8
                Sorry for the late reply. I really appreciate your advice.