
  • How come cutadapt and trim_galore don't work at all for me??

    I tried to run cutadapt in the terminal, but I always keep getting the same error.

    I tried to use trim_galore on my data, but the same thing happened. I keep getting an error message saying "Illegal division by zero" or something along those lines. I eventually found out that trim_galore needs cutadapt, but I can't even get cutadapt to work.

    I asked someone higher up for help and I was told to just make a python script because it would be easier to remove the adapter sequences than trying to get the software to work.

    My questions:

    1. How come cutadapt and trim_galore are impossible for me to run on linux terminal?

    2. What other software could I try? I am trying to clean the data in 2 paired-end FASTQ files after looking at the FastQC reports.

    I just don't know how I am supposed to come up with a python script that removes adapter sequences when I don't even know what an adapter sequence is.

    I'm pretty much lost and I don't know how to get past the "cleaning stage". I'm sure it will get much more difficult during the "mapping stage" and the "analysis stage", but honestly I am completely lost.

    It's very difficult to keep going at it when the software I'm supposed to use doesn't even work.
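For readers unsure what "removing an adapter" means in practice: an adapter is a short synthetic sequence ligated to each DNA fragment during library preparation, and when the fragment is shorter than the read, the read runs into the adapter at its 3' end. The sketch below is a minimal illustration of the idea, not a replacement for cutadapt. It assumes exact matches only (no mismatches or indels), and the adapter string is an assumption: it is the common Illumina adapter prefix, which may not be the adapter used for this library.

```python
# Assumption: common Illumina adapter prefix; the real adapter depends on the kit.
ADAPTER = "AGATCGGAAGAGC"


def trim_adapter(seq, qual, adapter=ADAPTER, min_overlap=3):
    """Cut a 3' adapter (exact match only) and everything after it.

    The quality string is cut at the same position so the two stay in sync.
    """
    # Case 1: the complete adapter occurs somewhere inside the read.
    pos = seq.find(adapter)
    if pos == -1:
        # Case 2: the read ends partway through the adapter, so only a
        # prefix of the adapter is visible at the very end of the read.
        for k in range(len(adapter) - 1, min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                pos = len(seq) - k
                break
    if pos == -1:
        return seq, qual  # no adapter found, read is returned unchanged
    return seq[:pos], qual[:pos]


# Toy read (made up for illustration): 12 genomic bases, adapter, then junk.
read = "ACGTACGTACGT" + ADAPTER + "TTTT"
quality = "I" * len(read)
trimmed_seq, trimmed_qual = trim_adapter(read, quality)
print(trimmed_seq, len(trimmed_qual))
```

Real trimmers also tolerate sequencing errors inside the adapter, which is the main reason to prefer an existing tool over a hand-written script.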

  • #2
    Maybe if you gave some more details we could help. What is the exact command that you give and what is the error message? Maybe paste in the first few lines of the FASTQ files as well to give us a sense of what they look like.

    • #3
      Originally posted by kopi-o
      Maybe if you gave some more details we could help. What is the exact command that you give and what is the error message? Maybe paste in the first few lines of the FASTQ files as well to give us a sense of what they look like.
      @m00532:8:000000000-a17vf:1:1101:16018:1456 1:n:0:1
      nccctaaatgcaaatcgccggatcagtacggcgacaactgcgaagtgtgcggcgccacctacagcccgacggaactgatcgatccgaagtcnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnngagcacttcttcttcgatctgccggccttcagcgaaatgctgcaggcgtgga
      +
      #5<???bbddddddddgggggghhiihiihhhhhhhhiiiihhhhfhihihdhhhhhhhhhhhhggggggggg?bgeggggeggggegggg#####################################008ccgggggggggggggggggea8agggeggggggaegggggggcgggg8?
      @m00532:8:000000000-a17vf:1:1101:14786:1471 1:n:0:1
      naactgataggtccagccgcccgggcaggcgcccagatcgaccgcgtgcatgccgctggccaggcgttcgtcccactcgtcggcggggatgnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnggctcggcgcatcggccgggaagcgcagacgcggaatgcccatgtagaacggcgagttgttgttgctgtaggagtaaccgacatagcagcagacccgcgcgatgaaaaacacatgcaccaaccggc
      +
      #55<??bbd?b?bdddffffffhhfhh?chhhhhhiiiiihhhhhhhhhifhhhhhhhhhh?ddfed=ddeee:ddefffeeeeb??ddae#####################################0008a48).4').0a4?28>28.*'.''.82;;84>81:?ceca*:?aee)88'44'0?:c0*0*1:?0**0::**1*a''''.1***0**0*08*'''.'.'8***108aa8:**01:8*.)'..
      @m00532:8:000000000-a17vf:1:1101:15617:1555 1:n:0:1
      gatatcggctttatggatcacgaattccgcatgttgcagcagggcatcccgaagaccgagactttctgcaccatcaccgacagcctgttgatggcgcgtcgcctgttcccgagcaagcgcaacaacctcgatgccttgtgcagccgttatgagatagataacagcaagcgtacgctgcacggcgcattgctcgatgccgagatcctggcggaagtctatctggcgatgaccggcggtcagaactcgatagc
      +
      ?????b?bdddddd?dgggggghhiiiiihhhhhiiiiihihighiifcfhhhbhiihhhhhiiiiiiiihihihhhhhhhh7fhhhhhfggddfegggdggggggggggggggge?eegggggggggg?cgggegggggggegggeegcgggcegegeggggceegceeeegg>ggagee8<agggggggggggceg2'.<cgcec:c22'88:?eecgcecc8<ac:?c>28ad<')0****1)**0)0
      @m00532:8:000000000-a17vf:1:1101:14715:1593 1:n:0:1
      tatctcttgcctcatcagcacgctgatggtgctggtggtagtcggctatacctcacactattttcatcgcgagcgccgcatcgccggcaagcccgacgacccggaggacaatccatgatccatgaaatctggtggtcgttgccgctgacgctggcgggctattttggcgcgcgctggctggcgcgcaagctggatatgccgctgctcaacccgcttctggagtacatggcagagcacatcacgctgctgcttgcc
      +
      ?,??<<bbbbbbbbbbfc;cfchhhcbhhhgdeg=ef:efff0>7dfhdgcefgfdhhhfdghhffeagd7cehc@c7<c@cde:5abeeee;a<b<8@e82;?;;;eeeee::a/?ee?a?aaaa:ceea**08a:;4aaae????a/?aee?e??'.'8ae::/'0.)'4'4.'.8:?aa;a?2';;**:*8?a:**:;?8''*//0/*88;'))0*0*0/**0::ac*8)*********)*)).*::*****
      @m00532:8:000000000-a17vf:1:1101:17132:1624 1:n:0:1
      cctcatggctgggcggtagatgtgccggcgatgttgatggccgcgcgcggccaaccgatggcggcgatgttccacaaccagcgcaatcagctgatcgactacctgtggaatgccgtgcgtcgtaaattcggcggccgcatgcacgcgcgcaatgacggcatcaaaccgtttatcagttcggttcggcagggatattggggttactatctgcccgatcaggatcacggcgctgagcacagtgagattgtcgacttc
      +
      ?????bbbbbdbdd+@cce@ffcffhd+>cehdcedghbeafccaeccechaaffeebedeedeb?d78?ceeeeeeeeeeeddddeeeeeeeecaee>a?eeecaeeeeeeecda?e;a28>eeeeee)*'48;'8>de?ee:?88?>;>??cee?8>;?c:?cec?08ceeeeccee8;?ae88?>2'4):c0:c**44>ceeeeeeec??eddd?aaeeee:8a;>8>8a*0*0::0*??*00:aaac>88*
      @m00532:8:000000000-a17vf:1:1101:14360:1640 1:n:0:1
      gcaggaaatcttcgctcaaaaacgtcgccatctcgggctccttacttgccgtgcggcacggggttcgaatcacaaaagttatcacaccaattttaactggcgtccagcagatttttcccgatgtctgccgttcatcctctctccacgccctttttcgctcaatttaccggcaaaaatgccctatttcaccggccacagtatcgctcaggcgaataattccttttgtgatatcactcaacttttaaacctgtat
      +
      ??????bbdddddbbacdffffhcffehhhhidhfhhehhihhiiiiafhce=ecbheeeheheefehhhhhhhhhfhfdfffffffefdeedeffffdeeeeebefeffeffffffffeedaeefae??eefffffffeffffff?eed>eeeee>?eedeffffff).4;8ceeffffffaeeff?c?8'8;>a?8:ccaeaef>?:a>;?2caefecee:a?e?:aeeeffee:??:??aecaeaec:?a
      @m00532:8:000000000-a17vf:1:1101:13497:1698 1:n:0:1
      gttgctgctcgacggcggctatgccgcagcaacggtggacgcggtggcgaaacgcgccggggtggcgaaaaaaaccctgtatcgttttgccgctaatcgcgacgaactggtcgcgcaggcggtgagcggctggaccgaggcgtttcagtcggccttcgcccaggatgccgcgcaacgggcggcggtggcgccgctgttgggaaaagggctgcaggccatcgcgcagcaggtattgagcgccgaggcggtgggga
      +
      ???a?bbbddddddddggggggiiihhhhhhhhhhcfhhfcehhehhdhhhggfeegggegg>?degggggggge?egggggggggggggggggggggggggggggdggggggggggggggg8>eg:8><>dgeegggdd<8)48cegeggg8cee?ccegggggee?ed2<8<dg>dg>dg<'8''0*4c<8>><8c:c'.08?cceccc8**000?c*8ca2><>??8*11:*:c:8>4'4'''4...8<2.
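The records above follow the usual four-line FASTQ layout: a header starting with `@`, the sequence, a `+` separator, and a quality string of the same length as the sequence. As a rough sketch of how such a file could be inspected from Python, the snippet below reads records and reports the read length and the number of `N` calls per read. The two records in `toy` are made up for illustration and are not taken from the data above; the function names are hypothetical.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header)
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch in: " + header)
        yield header[1:], seq, qual


def summarize(handle):
    """Return (name, read length, number of N calls) for every record."""
    return [(name, len(seq), seq.upper().count("N"))
            for name, seq, _ in read_fastq(handle)]


# Toy input standing in for a real file opened with open("reads.fastq").
toy = io.StringIO(
    "@read1 1:N:0:1\nNACGTNNNACGT\n+\n#5<??#####II\n"
    "@read2 1:N:0:1\nACGTACGT\n+\nIIIIIIII\n"
)
summary = summarize(toy)
for row in summary:
    print(row)
```

A long run of `N` with `#` qualities, as in the first two reads posted above, would show up here as a high N count for those reads.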

      • #4
        This is what I typed in the command line.

        python setup.py build (this part worked)

        python setup.py install (this is where I got the error: "error: could not create '/Library/Python/2.7/site-packages/cutadapt': Permission denied")

        • #5
          I have just installed fastx_toolkit.

          Will it be enough for me to go through each file of each paired-end set, use fastx_clipper with the overrepresented sequence found by FastQC as the adapter sequence, then use fastx_trimmer to trim one or both ends of the reads where the base quality scores are bad?

          And then use BWA to align and finally analyze?

          • #6
            OK. So the problem is a bit more upstream than I thought as you haven't successfully installed Cutadapt yet. The problem is simply that it wants to write to a directory that you don't have permission to write to. If you are on a Mac where you have superuser/administrator privileges, you can try

            sudo python setup.py install

            (this will ask for your password)
            Otherwise, get someone who does have administrator privileges to do it. There are other ways to install it as well such as pip and easy_install.

            This is really more of a Linux issue than a Cutadapt issue.
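To see why the install fails without administrator rights, one can check from Python which directory the system-wide install targets and whether the current user may write to it. This is only a diagnostic sketch using the standard library; the function name is hypothetical. If the system directory is not writable, a per-user install (for example `pip install --user cutadapt`, or `python setup.py install --user`) is one alternative to `sudo`, assuming pip or setuptools is available on the machine.

```python
import os
import site
import sysconfig


def describe_install_targets():
    """Report the system and per-user package directories for this Python."""
    system_dir = sysconfig.get_paths()["purelib"]  # where a plain install writes
    user_dir = site.getusersitepackages()          # where a --user install writes
    return {
        "system_site_packages": system_dir,
        "system_writable": os.access(system_dir, os.W_OK),
        "user_site_packages": user_dir,
    }


info = describe_install_targets()
for key, value in info.items():
    print(key, "=", value)
```

A `system_writable` value of `False` corresponds to the "Permission denied" error reported earlier in the thread.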

            • #7
              Originally posted by kopi-o
              OK. So the problem is a bit more upstream than I thought as you haven't successfully installed Cutadapt yet. The problem is simply that it wants to write to a directory that you don't have permission to write to. If you are on a Mac where you have superuser/administrator privileges, you can try

              sudo python setup.py install

              (this will ask for your password)
              Otherwise, get someone who does have administrator privileges to do it. There are other ways to install it as well such as pip and easy_install.

              This is really more of a Linux issue than a Cutadapt issue.
              Ahh ok, this worked, thank you. So does this look like a good workflow for analyzing the sequences of 20 specimens and then mapping them against the ancestral specimen?

              1. Use cutadapt to remove the adapter sequence from each paired-end file (42 FASTQ files in total, for 1 ancestor and 20 specimens).

              2. Use fastx_trimmer to trim low-quality bases from either end of each read.

              3. Run fastqc on each fastq file to make sure the data is good.

              4. Use BWA to align sequences.

              5. Analyze.
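The steps above can be sketched as a small script that builds the command line for each sample, which avoids typing 42 file names by hand. Everything about naming here is an assumption: the sample names, the `_R1`/`_R2` file suffixes, the reference file `ancestor_ref.fa`, and the adapter string are hypothetical placeholders. The sketch also swaps cutadapt's own `-q` quality trimming in for the separate fastx_trimmer step, which is a simplification and not what the numbered list proposes. The commands are only printed, not run.

```python
# Assumption: common Illumina adapter prefix, used for both reads of a pair.
ADAPTER = "AGATCGGAAGAGC"


def sample_names(n_specimens=20):
    """Hypothetical sample names: one ancestor plus the evolved specimens."""
    return ["ancestor"] + ["specimen%02d" % i for i in range(1, n_specimens + 1)]


def build_commands(sample, adapter=ADAPTER, reference="ancestor_ref.fa"):
    """Return the cutadapt, fastqc, and bwa mem command lines for one sample."""
    r1, r2 = sample + "_R1.fastq", sample + "_R2.fastq"
    t1, t2 = sample + "_R1.trimmed.fastq", sample + "_R2.trimmed.fastq"
    # -a/-A: 3' adapter for read 1 / read 2; -q: quality cutoff;
    # -o/-p: trimmed output for read 1 / read 2.
    cutadapt = ["cutadapt", "-a", adapter, "-A", adapter, "-q", "20",
                "-o", t1, "-p", t2, r1, r2]
    fastqc = ["fastqc", t1, t2]
    # bwa mem writes SAM to stdout; the reference must be indexed first
    # with `bwa index`.
    bwa = ["bwa", "mem", reference, t1, t2]
    return [cutadapt, fastqc, bwa]


samples = sample_names()
all_commands = {name: build_commands(name) for name in samples}
for name in samples[:2]:
    for command in all_commands[name]:
        print(" ".join(command))
```

Printing the commands first makes it easy to check file names before handing each list to `subprocess.run`.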

              • #8
                Have you first analyzed your data with FastQC and based on those results decided that you do indeed need to trim the data?

                Adapter contamination should not be a big problem in good quality libraries (unless you have short inserts/adapter dimers etc).

                It looks like you have 250 bp reads, so you should look into using BWA-MEM (a newer algorithm in BWA) for alignment.
