  • bowtie map only first 50 nt of reads

    Hi all,

    My Data:
    Illumina whole genome data from ~160 individuals. Reads were initially 100 nt, but after quality trimming I have a distribution of read lengths from 50 to 100 nt.

    My Problem:
    I am interested in regions of the genome that are not annotated. The loci are very small (~80 bp). Because some of my reads are longer than my reference loci, I cannot map them against the reference. My solution was to map only the first 50 nt of each read. However, I can't figure out an efficient way to do this. Bowtie1 has options to trim a fixed number of bases from the 5' or 3' end of each read (-5/--trim5 <int> and -3/--trim3 <int>), but instead of trimming a fixed number of bases off one side, I'd rather just use the first 50 nt of each read.

    Is there a way to do this using bowtie? I'm trying to avoid going back to the raw data...

    Thanks,

    John

  • #2
    John,

    If you map them with BBMap, they can go off the ends of the scaffolds without a problem.

    bbmap.sh in=reads.fq ref=loci.fa out=mapped.sam minratio=0.5

    The "minratio" flag specifies what fraction of a read must be inside the scaffold. You could also, of course, just grab the loci with more padding on the sides...

    Well, there's also a tool in that package that can make the reads 50bp:

    reformat.sh in=reads.fq out=trimmed.fq ftr=49


    That will trim the right side of reads down to length 50 (ftr is a 0-based position, so ftr=49 keeps bases 0 through 49). But that seems like a waste of data, and it will make mapping less precise, so I don't recommend it. If I were you, I WOULD go back to the original data and map it untrimmed using BBMap.
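
For anyone without BBTools at hand, the same "keep only the first 50 nt" step can be sketched in a few lines of Python. This is a minimal illustration, not part of bowtie or BBTools: the function name `truncate_fastq` and the `keep` parameter are made up here, and the code assumes plain 4-line FASTQ records (no wrapped sequence lines, no compression).

```python
import io


def truncate_fastq(in_handle, out_handle, keep=50):
    """Copy 4-line FASTQ records, cutting sequence and qualities to `keep` bases.

    Reads shorter than `keep` are written unchanged. Returns the record count.
    Assumes each record is exactly four lines (hypothetical helper, not a
    BBTools or bowtie function).
    """
    n = 0
    while True:
        header = in_handle.readline()
        if not header:
            break  # end of file
        seq = in_handle.readline().rstrip("\n")
        plus = in_handle.readline().rstrip("\n")
        qual = in_handle.readline().rstrip("\n")
        out_handle.write(header.rstrip("\n") + "\n")
        out_handle.write(seq[:keep] + "\n")
        out_handle.write(plus + "\n")
        out_handle.write(qual[:keep] + "\n")  # keep qualities in sync with bases
        n += 1
    return n


# Small in-memory demonstration: one 60-nt read and one 4-nt read.
demo_in = io.StringIO(
    "@r1\n" + "A" * 60 + "\n+\n" + "I" * 60 + "\n"
    "@r2\nACGT\n+\nIIII\n"
)
demo_out = io.StringIO()
n_records = truncate_fastq(demo_in, demo_out, keep=50)
```

With real data the two handles would be ordinary files opened with `open(...)`, and the output file could then be passed to bowtie as usual.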


    • #3
      Brian, thanks for the heads up about BBMap! The "minratio" flag sounds like a very nice alternative to arbitrary trimming. I think I'll try both methods and see how they compare.

      I was asking if this could be accomplished in bowtie specifically, because I've used it quite extensively to measure other elements in the genome and I'm afraid if I switch to a different mapping program I will have to redo all of the analysis.


      • #4
        If you're locked in to bowtie, then just use reformat.sh to first trim reads to the fixed length, then map with bowtie. Of course even trimmed, some of them will still hang off the sides, which may bias the analysis depending on what you're trying to measure. For example, a shorter scaffold will end up with lower coverage.
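
The coverage bias mentioned above can be made concrete with a little arithmetic. The sketch below is a simplified model, not a measurement: it assumes reads of one fixed length, exactly one read starting at every possible position, and that only reads lying entirely inside the locus align (as in end-to-end mapping). The function names are hypothetical.

```python
def contained_placements(locus_len, read_len):
    """Number of start positions at which a read fits entirely inside the locus."""
    return max(0, locus_len - read_len + 1)


def mean_depth(locus_len, read_len):
    """Mean per-base depth if one read starts at every fully contained position."""
    return contained_placements(locus_len, read_len) * read_len / locus_len


# Compare loci of different lengths for 50-nt reads.
for locus_len in (60, 80, 100):
    print(locus_len,
          contained_placements(locus_len, 50),
          round(mean_depth(locus_len, 50), 2))
```

Under these assumptions an 80 bp locus admits 31 placements of a 50-nt read, while a 60 bp locus admits only 11, so the shorter locus collects fewer reads per base even at identical true abundance.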


        • #5
          If you're willing to use bowtie2 rather than bowtie1, then just use --local alignment. Otherwise, end-to-end (global) alignment will never work when your reads are longer than the regions you map against (unless you pad the regions, of course).
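
Padding the loci, as suggested in this thread, amounts to widening each interval before extracting the reference sequence. The sketch below assumes the loci have known coordinates on a larger assembled sequence and uses 0-based, half-open (BED-style) intervals; `pad_interval` and the padding of 50 are illustrative choices, not anything prescribed by bowtie or BBMap.

```python
def pad_interval(start, end, pad, seq_len):
    """Widen a 0-based, half-open interval by `pad` on each side.

    The result is clamped to [0, seq_len] so it never runs off the
    chromosome or scaffold. Hypothetical helper for illustration only.
    """
    if not 0 <= start <= end <= seq_len:
        raise ValueError("interval must satisfy 0 <= start <= end <= seq_len")
    return max(0, start - pad), min(seq_len, end + pad)


# An ~80 bp locus in the middle of a 5000 bp scaffold, padded by 50 bp per side.
padded = pad_interval(1000, 1080, 50, 5000)

# A locus near the scaffold start: the left side is clamped at 0.
padded_edge = pad_interval(20, 100, 50, 5000)
```

The padded intervals can then be used to extract the reference sequences, so that reads overlapping the edge of a locus still fit end to end.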
