
  • adapter trimming and length/otogenics

    I have received some ChIP-seq data from the company otogenics. They provided two FASTQ files; based on the FastQC reports, one appears to contain the adapter sequence and one does not. In addition, the read length in these FastQC reports does not change with the trimming of the adapter. I am wondering how this is possible and what program they may have used to trim the adapter. Any help would be greatly appreciated.

    Thanks
    Leanne

  • #2
    You can't trim adapter sequence without changing the read length, but you can either throw away reads that contain adapter sequence or (theoretically) produce reads that never had adapter sequence in the first place. Also, if you have, for example, a fragment library and a long mate-pair library, they may have different adapters.
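
To make the difference between trimming and discarding concrete, here is a minimal Python sketch. It is not what any particular trimming program does internally: the adapter string is an assumption (the start of a commonly used Illumina adapter), the two reads are made up, and `trim_adapter` and `discard_adapter_reads` are hypothetical helper names. It only matches exact adapter copies and ignores sequencing errors.

```python
# Assumed adapter: the start of a commonly used Illumina adapter. Replace it
# with the adapter your sequencing provider actually used.
ADAPTER = "AGATCGGAAGAGC"


def trim_adapter(read, adapter=ADAPTER, min_overlap=5):
    """Remove a 3' adapter (a full copy, or a partial copy at the read end)."""
    # Case 1: the whole adapter occurs somewhere in the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends with the first k bases of the adapter.
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read  # no adapter found, so the read is unchanged


def discard_adapter_reads(reads, adapter=ADAPTER, min_overlap=5):
    """Alternative: drop reads with adapter, so survivors keep full length."""
    return [r for r in reads if trim_adapter(r, adapter, min_overlap) == r]


# Made-up reads: the first ends with 9 bases of the assumed adapter.
reads = ["ACGTACGTACGTAGATCGGAA", "ACGTACGTACGTACGTACGTA"]
trimmed = [trim_adapter(r) for r in reads]
kept = discard_adapter_reads(reads)
print([len(r) for r in trimmed], len(kept))
```

After trimming, the read lengths differ from each other, which is what a read-length plot would show. After discarding, every remaining read still has the original length, but the read count drops.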


    • #3
      I have used fastx_toolkit for trimming adapters: http://hannonlab.cshl.edu/fastx_toolkit/
      It works well. For trimming adapters, you should use its FASTA/Q Clipper tool.


      • #4
        What is the length of your reads, and do all the reads in each file have the same length according to the fastqc reports?


        • #5
          mastal,
          the length of my reads is 100 b, and all the reads have the same length in both FastQC reports, before and after trimming.


          • #6
            What do the two FASTQ files represent: before and after trimming, two different samples, or R1 and R2 of paired-end reads?


            • #7
              Before and after trimming of one sample, with single-end reads.


              • #8
                How many reads in each sample?


                • #9
                  7811028 reads


                  • #10
                    Could you post the first few lines of each file?

                    It seems impossible that the reads would all have the same length after trimming as before if anything was actually trimmed, and if nothing was trimmed, you would expect the fastqc report to still show the adapter sequence in the over-represented reads.
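
A quick way to check this independently of the FastQC report is to tabulate read lengths straight from the FASTQ file. The sketch below assumes plain-text, uncompressed FASTQ with exactly four lines per record; `read_length_counts` is a hypothetical helper name and the two demo records are made up.

```python
import io
from collections import Counter


def read_length_counts(handle):
    """Count how many reads have each length in a 4-line-per-record FASTQ."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of every record is the sequence
            counts[len(line.rstrip("\r\n"))] += 1
    return counts


# Made-up demo input; for a real file, pass an open file handle instead.
demo = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nACGTA\n+\nIIIII\n"
)
length_counts = read_length_counts(demo)
print(dict(length_counts))
```

If every read in both files reports the same single length, nothing was trimmed. A trimmed file normally shows a spread of lengths up to the original read length.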


                    • #11
                      from the first file with the adaptor
                      @HWI-ST1129:515:H8V3LADXX:1:1101:2601:1960 1:N:0:CCGTCC
                      NAGAAATTTGGAAAATCAAATGCTTGAAGTAAGAGGACGATATTAAAACTTTTGTAACCAGAGACTACTTTAAGAAAAATCTGCTACTACTTTAACAAAG
                      +
                      #1DFFFHHHHHJJJJJJJJJJJJJJIJHHIIIEIHIHHGHIJJIJJJJJJJJIIJGJJJJJJJJJJJHHHHHHHFFFFEEDEEEEDDDDDEDDDDDDD
                      @HWI-ST1129:515:H8V3LADXX:1:1101:4187:1965 1:N:0:CCGTCC
                      NACAGAGCCTCGCTCTGTCTCCCAGGCTGGATGGAGTGCAGTGGCGCGATGTTGGCTCACTTCAAGCTCCGCGTCCTGTGTTCATGCCATTCTTCTGCCT
                      +
                      #1=DFFFFHHHHHJJJJJJJJJJJJJJJJJHIJJJJHIJIJGHJJJJJJJJJJJHHHHHFFFFFFEEEEEDDDDDDDDCDDDDEEDDDDDDEEEDDDDDD
                      @HWI-ST1129:515:H8V3LADXX:1:1101:4380:1977 1:N:0:CCGTCC
                      NTCCTCCCAAGAGAGATAGAGGAAGGAAAGGGAGAGATGGGACCACCACAGTGAGCAAATGGATCAGATTATTACTCTAAAATGTTCTTTTAGATCGGAA


                      From the second fastq file without the adaptor
                      ACAATGACACTTAGCATTTACTGTGTTAGTTAACATTTAGCAGATCTTTGTTAAAGTAGTAGCAGATTTTTCTTAAAGTAGTCTCTGGTTACAAAAGTTT
                      +
                      CC@FFFFFHHHHHJJJJJJIJJIFHIIJJHIIJJJJIJJJIJIJJJJJIJIJJJJJBGIIGIJJJJJIJJJJIJJJJJCHIEIIIIJH?HEHHFDFFCEE
                      @HWI-ST1129:515:H8V3LADXX:1:1101:4187:1965 2:N:0:CCGTCC
                      ATTAGCCGGGCATAGTGGCAGGGGCCTGTAGTCCCAGCTACTCGGTAGGCTGAGGCAGAAGAATGGCATGAACACAGGGCGCGGGGCTTGAAGTGAGCCA
                      +
                      @@@DDDDDHHHHHIIHIIIIII0??D1)9990?DH#################################################################
                      @HWI-ST1129:515:H8V3LADXX:1:1101:4380:1977 2:N:0:CCGTCC
                      AAAAGAACATTTTAGAGTAATAATCTGATCCATTTGCTCACTGTGGTGGTCCCATCTCTCCCTTTCCTTCCTCTATCTCTCTTGGGAGGAAAGATCGGAA


                      • #12
                        Looks like you have paired-end reads from the same sample.

                        This part of the header - 2:N:0:CCGTCC - tells you it's the second read of a pair.

                        So if your reads are all the same length, that would suggest that they haven't been trimmed.
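
The header check above can be sketched in Python. This assumes the Casava 1.8+ style header seen in the posted reads, where the part after the space is `read:is_filtered:control_bits:index`; `parse_illumina_header` is a hypothetical helper name, and older header styles (for example names ending in `/1` or `/2`) would need different handling.

```python
def parse_illumina_header(header):
    """Split a Casava 1.8+ style FASTQ header line into named fields."""
    # The read name comes before the first space, the comment after it.
    name, _, comment = header.strip().lstrip("@").partition(" ")
    read_number, is_filtered, control_bits, index = comment.split(":")
    return {
        "name": name,
        "read_number": int(read_number),   # 1 = first read, 2 = second read
        "is_filtered": is_filtered == "Y",  # Y means the read failed the filter
        "control_bits": int(control_bits),
        "index": index,
    }


r1 = parse_illumina_header("@HWI-ST1129:515:H8V3LADXX:1:1101:4187:1965 1:N:0:CCGTCC")
r2 = parse_illumina_header("@HWI-ST1129:515:H8V3LADXX:1:1101:4187:1965 2:N:0:CCGTCC")
# Same read name with read numbers 1 and 2 means the records are mates.
print(r1["name"] == r2["name"], r1["read_number"], r2["read_number"])
```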


                        • #13
                          Ahh, OK. Sorry that I didn't pick up on that; I am very new to sequence analysis.

                          One more question, if you don't mind:
                          Why wouldn't the second file have an over-represented sequence (or an adapter)?

                          Thanks Again!


                          • #14
                            The sequences in the R2 file will be different from the sequences in the R1 file.

                            It's possible that other over-represented sequences are more abundant in the R2 file, so the adapter sequence doesn't make it into the top over-represented sequences.

                            Is it the FastQC report that has flagged the sequences as adapter sequences?
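
One way to test this directly, rather than relying on the over-represented sequences table, is to count how many reads in each file contain the start of the adapter. This is a rough sketch: the adapter string is an assumption (the start of a commonly used Illumina adapter), `adapter_fraction` is a hypothetical helper name, the example reads are made up, and only exact matches are counted.

```python
def adapter_fraction(sequences, adapter="AGATCGGAAGAGC", seed_len=9):
    """Fraction of reads containing the first seed_len bases of the adapter."""
    seed = adapter[:seed_len]  # a short seed also catches partial adapters
    total = 0
    hits = 0
    for seq in sequences:
        total += 1
        if seed in seq:
            hits += 1
    return hits / total if total else 0.0


# Made-up reads standing in for the sequence lines of the R1 and R2 files.
r1_reads = ["ACGTAGATCGGAAGG", "ACGTACGTACGTACG"]
r2_reads = ["TTTTGGGGCCCCAAA", "GGGGCCCCAAAATTT"]
print(adapter_fraction(r1_reads), adapter_fraction(r2_reads))
```

Comparing the two fractions shows whether adapter really is rarer in R2 or is simply being outranked by other sequences in the report.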


                            • #15
                              Yes, it was the FastQC report that flagged the sequence as an adapter.
