
  • #16
    Thanks a bunch.


    Originally posted by Tengfei Liu
    You can use cutadapt to trim adapter bases from both the 5' and 3' ends; fastx_clipper can only trim the 3' end. With cutadapt, you must first run cutadapt -g, and then run cutadapt -a on the processed output. If you use -g and -a at the same time, it will only cut one end.
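To make the order of operations concrete, here is a minimal Python sketch of the two-pass idea: remove the 5' adapter first, then remove the 3' adapter from the already-processed read. It uses exact string matching only, which is a simplifying assumption; cutadapt itself does error-tolerant matching. The function names and the adapter sequences are made up for illustration.

```python
def trim_5prime(read: str, adapter: str) -> str:
    """Drop the adapter and everything before it (5' side), if present."""
    idx = read.find(adapter)
    return read[idx + len(adapter):] if idx != -1 else read


def trim_3prime(read: str, adapter: str) -> str:
    """Drop the adapter and everything after it (3' side), if present."""
    idx = read.find(adapter)
    return read[:idx] if idx != -1 else read


def trim_both(read: str, adapter_5p: str, adapter_3p: str) -> str:
    """Pass 1 removes the 5' adapter; pass 2 removes the 3' adapter."""
    return trim_3prime(trim_5prime(read, adapter_5p), adapter_3p)


# Hypothetical adapters flanking a made-up insert sequence.
read = "AAGGTT" + "ACGTACGTAC" + "CCTTGG" + "TTTT"
print(trim_both(read, "AAGGTT", "CCTTGG"))
```

A read that contains neither adapter is returned unchanged.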



    • #17
      I did it the same way.

      Thanks for the feedback.

      Originally posted by Michael.Ante
      I always use the fastx_trimmer; you can use the -f and -l options to set the first and the last base to be kept.



      • #18
        So, just following up on this topic, which has been incredibly helpful. We shouldn't trim the first bases at the 5' end, and should instead run the de novo assembly on the untrimmed reads, correct?

        Thanks!



        • #19
          It depends on the library prep. Illumina fragment libraries typically have adapters on the right (3') end, so if you trimmed everything to the left of the adapter, you'd lose all of your genomic sequence. For long mate pair libraries, the answer depends on the protocol.



          • #20
            Thanks for your reply, Brian.
            I have Illumina 100 bp paired-end mRNA reads. I have already removed the adapters, but I still see the same high variation in GC% at the 5' end. The library was prepared with the TruSeq mRNA prep, which is why I am guessing my dataset has the same 5' end bias described earlier. Any thoughts?
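For context, the per-position GC% that this kind of plot summarizes can be computed along the following lines. This is only a sketch in plain Python over a list of read strings; the function name and the toy reads are hypothetical, and it is not how any particular QC tool is implemented.

```python
from typing import List


def gc_by_position(reads: List[str]) -> List[float]:
    """Return the GC fraction at each read position across all reads."""
    longest = max((len(r) for r in reads), default=0)
    fractions = []
    for pos in range(longest):
        # Only reads long enough to cover this position contribute.
        bases = [r[pos].upper() for r in reads if len(r) > pos]
        # Ignore N and other ambiguous calls.
        called = [b for b in bases if b in "ACGT"]
        gc = sum(b in "GC" for b in called)
        fractions.append(gc / len(called) if called else 0.0)
    return fractions


# Toy reads with a GC-rich start, mimicking a 5' composition bias.
reads = ["GCATAT", "GGATTA", "CCTAAT", "GCTTAA"]
for pos, frac in enumerate(gc_by_position(reads), start=1):
    print(pos, round(frac, 2))
```

A skewed profile like this describes base composition only; it says nothing by itself about whether those bases were called incorrectly.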



            • #21
              BBDuk can trim a set number of bases on the left or right side of a read. However, some library-prep protocols are biased, especially near the read start, and thus produce suspicious base-frequency histograms even though the base calls are correct. So, before you trim, I suggest you map the reads to a reference (even the lowest-quality assembly is OK) to determine whether there is actually a higher error rate in the first X bases of the read. If not, then you should not trim them.

              With an assembly, you can check this as follows (ref= points to the assembly; assembly.fa is a placeholder name):

              bbmap.sh in=reads.fq ref=assembly.fa mhist=mhist.txt qhist=qhist.txt

              This will give you histograms of the average qualities by read position, and match/substitution/insertion/deletion/N rates by read position. That will allow you to determine whether the stated read quality is accurate, and thus whether you need to trim the ends of reads.
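The decision described above can be sketched in Python as a comparison between the mean error rate of the first X positions and the mean over the rest of the read. The sketch assumes the per-position error rates have already been extracted from the histogram output into a plain list; the file's column layout is not reproduced here. The function names and the 2x threshold are arbitrary assumptions for illustration, not a recommendation from BBMap.

```python
from typing import List, Tuple


def head_vs_rest(error_rates: List[float], first_x: int) -> Tuple[float, float]:
    """Mean error rate of the first `first_x` positions and of the remainder."""
    if not 0 < first_x < len(error_rates):
        raise ValueError("first_x must be between 1 and len(error_rates) - 1")
    head = error_rates[:first_x]
    rest = error_rates[first_x:]
    return sum(head) / len(head), sum(rest) / len(rest)


def should_trim(error_rates: List[float], first_x: int, fold: float = 2.0) -> bool:
    """True if the read start is clearly worse than the rest (assumed threshold)."""
    head, rest = head_vs_rest(error_rates, first_x)
    return head > fold * rest


# Made-up per-position error rates (fraction of mismatching bases).
flat_profile = [0.004, 0.005, 0.004, 0.005, 0.004, 0.005]
bad_start = [0.050, 0.040, 0.005, 0.004, 0.005, 0.004]
print(should_trim(flat_profile, first_x=2))
print(should_trim(bad_start, first_x=2))
```

With a flat profile the start of the read is no worse than the rest, so nothing would be trimmed; only the second profile would justify it.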

              If you want to trim a set number of bases on each side, you can use BBDuk's "ftl" (force-trim left) and "ftr" (force-trim right) flags to set the limits of where to trim.
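As a rough illustration of what force-trimming does to a read, here is a small Python sketch. It assumes the limits are 0-based positions, with everything left of ftl and right of ftr removed and 0 meaning "off"; that is my reading of the BBDuk usage text, so check it against your version. The function itself is hypothetical and is not part of BBTools.

```python
from typing import Tuple


def force_trim(seq: str, qual: str, ftl: int = 0, ftr: int = 0) -> Tuple[str, str]:
    """Keep 0-based positions ftl..ftr inclusive; 0 disables that side."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings must have equal length")
    end = ftr + 1 if ftr > 0 else len(seq)
    # Bases and qualities are sliced together so they stay in sync.
    return seq[ftl:end], qual[ftl:end]


# Made-up 12 bp read with two poor bases at each end.
seq = "NNACGTACGTTT"
qual = "##IIIIIIII##"
print(force_trim(seq, qual, ftl=2, ftr=9))
```

Trimming the sequence and quality strings with the same slice matters: a FASTQ record whose two strings differ in length is invalid.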



              • #22
                The fragmentation sites may be biased, depending on how fragmentation was done. Try mapping with BBMap and using the 'mhist' output, which shows the error rate by read position. If the error rate on the 5' end is not much higher than anywhere else, there's no need to trim it.



                • #23
                  Use Trimmomatic



                  • #24
                    I use TrimGalore to trim adapters, and the FastQC results also show a bias at the 5' end, with several k-mers showing up there.
