
  • #76
    Low-quality 3' ends can be trimmed with the -q option, but it does not remove a fixed number of bases (see --help), and it does not work for 5' ends. In any case, cutadapt always converts csfasta/qual files to FASTQ format.
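To illustrate why a quality cutoff does not remove a fixed number of bases, here is a minimal sketch of a running-sum (BWA-style) 3' quality trimmer. It is an assumption that the -q option works along these lines; the function name and the example quality values are made up for illustration, so check the cutadapt documentation for the actual behaviour.

```python
def quality_trim_index(qualities, cutoff):
    """Return the index at which a read would be cut at its 3' end.

    Hypothetical helper: a BWA-style running-sum sketch, not cutadapt's code.
    """
    running_sum = 0
    best_sum = 0
    cut = len(qualities)  # default: keep the whole read
    # Walk from the 3' end towards the 5' end.
    for i in reversed(range(len(qualities))):
        running_sum += cutoff - qualities[i]
        if running_sum < 0:
            break  # good bases now outweigh the bad ones; stop looking
        if running_sum > best_sum:
            best_sum = running_sum
            cut = i
    return cut


# Made-up quality profiles, all of length 10.
reads = {
    "steep quality drop": [42, 40, 26, 27, 8, 7, 11, 4, 2, 3],
    "good until the last base": [40, 40, 40, 40, 40, 40, 40, 40, 40, 2],
    "good throughout": [40] * 10,
}
for label, quals in reads.items():
    cut = quality_trim_index(quals, cutoff=10)
    print(f"{label}: keep {cut} of {len(quals)} bases")
```

With the same cutoff, each of the three reads loses a different number of bases, which is the point made above.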



    • #77
      OK, good to know. Thanks, mmartin!



      • #78
        Hi Marcel,

        First of all, I meant to say thanks very much for Cutadapt; it is such a neat and useful tool! I was just trying to trim a FastQ file in which lines 1 and 3 do not have the same description:
        Code:
        @1_X:42118728-42118811_R1
        ATTTTGAAATTATATATTTATTAAATAATAAAAATAAATTGAGAAGGTTATTTATGTATAGATAATGTATAGATATAAATATAGATCTAGATATTTTTAG
        +1_R1
        HHHHHHHHHHHHHHHHHHHHHHHHHHGGGGGGGGGGGFFFFFFFEEEEEDDDDCCCBBBAAA@@@??>>==<<;;::99877655432210/..-,+*)(
        This produces the following message:

        Error: At line 3: Two sequence descriptions are given in the FASTQ file, but they don't match ('1_X:42118728-42118811_R1' != '1_R1') perhaps you should try the 'sra-fastq' format?

        Is there a reason that lines 1 and 3 have to be identical? Isn't line 3 often just a '+' in order to save space? (btw 'sra-fastq' won't work in this case, either).

        Thanks,
        Felix



        • #79
          Hello Felix,

          Originally posted by fkrueger View Post
          Is there a reason that lines 1 and 3 have to be identical? Isn't line 3 often just a '+' in order to save space?
          Yes, having an empty description is allowed, too. The message may be a bit misleading. I’ve now changed it to: “The second sequence description must be either empty or equal to the first description.”
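The rule in the reworded message can be sketched in a few lines of Python. This is only an illustration of the rule as stated; the function name is hypothetical and this is not the code in cutadapt/seqio.py.

```python
def second_description_ok(header_line, plus_line):
    """Check that the '+' line is empty or repeats the '@' description.

    Hypothetical helper written for this example only.
    """
    if not header_line.startswith("@") or not plus_line.startswith("+"):
        raise ValueError("not a FASTQ header/separator pair")
    first = header_line[1:].rstrip("\n")
    second = plus_line[1:].rstrip("\n")
    return second == "" or second == first


header = "@1_X:42118728-42118811_R1"
for plus in ("+", "+1_X:42118728-42118811_R1", "+1_R1"):
    print(repr(plus), second_description_ok(header, plus))
```

Only the last case, which is the one from the record posted above, fails the check.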

          Both http://dx.doi.org/10.1093/nar/gkp1137 and http://maq.sourceforge.net/fastq.shtml , which are the closest things to a FASTQ format specification, state that the second description must be the same as the first or be omitted. In short, cutadapt is correct in telling you that the file is broken. There is no option in cutadapt yet to ignore this error, but you can simply remove the check from the source code. In cutadapt/seqio.py, search for these lines (should be around line 318):
          Code:
          self.twoheaders = True
          if not line[1:] == name:
          and change them to:
          Code:
          self.twoheaders = False
          if False:
          With these changes, cutadapt discards the second description, and your output FASTQ will not contain it.
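An alternative that avoids patching the source is to clean the input file first. The sketch below reduces the third line of every record to a bare '+'. It assumes strict four-line records (no wrapped sequences); the function name is hypothetical, and the record is the one posted above with the sequence and qualities shortened.

```python
import io


def strip_second_descriptions(lines):
    """Yield FASTQ lines with the third line of each record reduced to '+'.

    Hypothetical helper; assumes exactly four lines per record.
    """
    for number, line in enumerate(lines):
        if number % 4 == 2:
            if not line.startswith("+"):
                raise ValueError(f"line {number + 1}: expected '+', got {line!r}")
            yield "+\n"
        else:
            yield line


# Shortened version of the record from the post above.
record = (
    "@1_X:42118728-42118811_R1\n"
    "ATTTTGAAAT\n"
    "+1_R1\n"
    "HHHHHHHHHH\n"
)
cleaned = "".join(strip_second_descriptions(io.StringIO(record)))
print(cleaned, end="")
```

For a real file, the same generator can be fed an open file handle instead of the io.StringIO object.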

          btw 'sra-fastq' won't work in this case, either
          That part of the message is actually incorrect and I’ve already removed it in cutadapt 1.2.



          • #80
            Thanks for clearing this up so quickly. The output is actually from our FastQ simulator Sherman, so I might just change it in there so that it doesn't clash with a (near enough) FastQ format specification.

            Cheers,
            Felix



            • #81
              Great, that’s the best solution I think.



              • #82
                Alright, done. After all, it's a good way to quickly simulate adapter-contaminated reads and then clean them up with Cutadapt straight away... :P



                • #83
                  Dear mmartin,

                  I am dealing with paired-end Illumina data and I want to cut primers from both sides. When I use a single command to cut both at the same time, cutadapt only trims one side. I have tried several options, but without success so far. If I run the commands separately for each adapter, it cuts perfectly. What am I doing wrong, and how can I do both steps at once? Thanks a lot!



                  • #84
                    Trimming two adapters simultaneously can be achieved by using the --times 2 option. This is not 100% the same as specifying an "adapter pair", which is not implemented, yet, see http://code.google.com/p/cutadapt/issues/detail?id=34 . In practice, using --times 2 should give very similar results.
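The idea of repeating the adapter search can be sketched as follows. This is a deliberately simplified model and not cutadapt's implementation: it uses exact string matching instead of error-tolerant alignment, it picks the longest matching adapter in each round, and all names and sequences are made up for illustration.

```python
def trim_once(read, front_adapters, back_adapters):
    """Remove at most one adapter; return the trimmed read, or None if none matched."""
    best = None  # (match_length, trimmed_read)
    for adapter in front_adapters:
        pos = read.find(adapter)
        if pos != -1 and (best is None or len(adapter) > best[0]):
            # 5' adapter: drop the adapter and everything before it.
            best = (len(adapter), read[pos + len(adapter):])
    for adapter in back_adapters:
        pos = read.find(adapter)
        if pos != -1 and (best is None or len(adapter) > best[0]):
            # 3' adapter: drop the adapter and everything after it.
            best = (len(adapter), read[:pos])
    return None if best is None else best[1]


def trim(read, front_adapters, back_adapters, times=1):
    """Repeat the single-adapter search up to `times` rounds."""
    for _ in range(times):
        trimmed = trim_once(read, front_adapters, back_adapters)
        if trimmed is None:
            break
        read = trimmed
    return read


# Made-up primers flanking a made-up insert.
read = "ACGTACGT" + "TTTTGGGGCCCC" + "TGCATGCA"
print(trim(read, ["ACGTACGT"], ["TGCATGCA"], times=1))
print(trim(read, ["ACGTACGT"], ["TGCATGCA"], times=2))
```

In this model a single round removes only one of the two primers, which matches the behaviour described in the question; a second round removes the other.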



                    • #85
                      Originally posted by mmartin View Post
                      Trimming two adapters simultaneously can be achieved by using the --times 2 option. This is not 100% the same as specifying an "adapter pair", which is not implemented, yet, see http://code.google.com/p/cutadapt/issues/detail?id=34 . In practice, using --times 2 should give very similar results.
                      Thank you very much! I will try this on my samples!



                      • #86
                        Hi, mmartin,

                        I am trying to use cutadapt to remove adapters. It is somewhat annoying that it always throws errors when I try to install cutadapt.
                        The problem seems to be related to the C compiler. So I removed the old ones and reinstalled the Windows SDKs; the version I have now is Microsoft Windows SDK for Windows 7 and .NET Framework 3.5 SP1.
                        The computer runs Windows 7 with Python 2.7.

                        However, it still throws errors after I type 'python setup.py build' in the SDK shell, as shown at the end of this post. Could you give me some hints?

                        >python setup.py build

                        running build
                        running build_py
                        running build_ext
                        building 'cutadapt.calign' extension
                        creating build\temp.win-amd64-2.7
                        creating build\temp.win-amd64-2.7\Release
                        creating build\temp.win-amd64-2.7\Release\cutadapt
                        C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\Bin\amd64\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG "-IC:\Program Files\Python\include" "-IC:\Program Files\Python\PC" /Tccutadapt/calignmodule.c /Fobuild\temp.win-amd64-2.7\Release\cutadapt/calignmodule.obj
                        calignmodule.c
                        cutadapt/calignmodule.c(48) : warning C4005: 'max' : macro redefinition
                                C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\Include\stdlib.h(849) : see previous definition of 'max'
                        cutadapt/calignmodule.c(49) : warning C4005: 'min' : macro redefinition
                                C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\Include\stdlib.h(850) : see previous definition of 'min'
                        cutadapt/calignmodule.c(120) : error C2143: syntax error : missing ';' before '<class-head>'
                        cutadapt/calignmodule.c(122) : error C2143: syntax error : missing ';' before 'type'
                        cutadapt/calignmodule.c(123) : error C2143: syntax error : missing ';' before 'type'
                        cutadapt/calignmodule.c(127) : error C2143: syntax error : missing '{' before '*'
                        cutadapt/calignmodule.c(128) : error C2373: 'column' : redefinition; different type modifiers
                                cutadapt/calignmodule.c(127) : see declaration of 'column'
                        cutadapt/calignmodule.c(128) : error C2059: syntax error : ')'
                        cutadapt/calignmodule.c(129) : error C2059: syntax error : 'if'
                        cutadapt/calignmodule.c(135) : error C2059: syntax error : 'for'
                        cutadapt/calignmodule.c(135) : error C2143: syntax error : missing '{' before '<='
                        cutadapt/calignmodule.c(135) : error C2059: syntax error : '<='
                        cutadapt/calignmodule.c(135) : error C2059: syntax error : '++'
                        cutadapt/calignmodule.c(135) : error C2059: syntax error : ')'
                        cutadapt/calignmodule.c(141) : error C2065: 'm' : undeclared identifier
                        cutadapt/calignmodule.c(141) : error C2099: initializer is not a constant
                        cutadapt/calignmodule.c(143) : error C2065: 'm' : undeclared identifier
                        cutadapt/calignmodule.c(143) : error C2224: left of '.cost' must have struct/union type
                        cutadapt/calignmodule.c(145) : error C2065: 'm' : undeclared identifier
                        cutadapt/calignmodule.c(145) : error C2224: left of '.origin' must have struct/union type
                        cutadapt/calignmodule.c(148) : error C2065: 'error_rate' : undeclared identifier
                        cutadapt/calignmodule.c(148) : error C2065: 'm' : undeclared identifier
                        cutadapt/calignmodule.c(148) : error C2099: initializer is not a constant
                        cutadapt/calignmodule.c(149) : error C2099: initializer is not a constant
                        cutadapt/calignmodule.c(150) : error C2059: syntax error : 'if'
                        cutadapt/calignmodule.c(154) : error C2059: syntax error : 'for'
                        cutadapt/calignmodule.c(154) : error C2143: syntax error : missing '{' before '<='
                        cutadapt/calignmodule.c(154) : error C2059: syntax error : '<='
                        cutadapt/calignmodule.c(154) : error C2059: syntax error : '++'
                        cutadapt/calignmodule.c(154) : error C2059: syntax error : ')'
                        cutadapt/calignmodule.c(229) : error C2059: syntax error : 'if'
                        cutadapt/calignmodule.c(246) : error C2371: 'free' : redefinition; different basic types
                                C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\Include\stdlib.h(600) : see declaration of 'free'
                        cutadapt/calignmodule.c(248) : error C2059: syntax error : 'if'
                        cutadapt/calignmodule.c(251) : error C2059: syntax error : 'else'
                        cutadapt/calignmodule.c(257) : error C2099: initializer is not a constant
                        cutadapt/calignmodule.c(258) : error C2059: syntax error : 'return'
                        cutadapt/calignmodule.c(259) : error C2059: syntax error : '}'
                        cutadapt/calignmodule.c(297) : error C2065: 'methods' : undeclared identifier
                        cutadapt/calignmodule.c(297) : warning C4047: 'function' : 'PyMethodDef *' differs in levels of indirection from 'int'
                        cutadapt/calignmodule.c(297) : warning C4024: 'Py_InitModule4_64' : different types for formal and actual parameter 2
                        error: command 'cl.exe' failed with exit status 2



                        • #87
                          Originally posted by ZoeG View Post
                          I am trying to use cutadapt to remove adapters. It is somewhat annoying that it always throws errors when I try to install cutadapt.
                          The problem seems to be related to the C compiler. So I removed the old ones and reinstalled the Windows SDKs; the version I have now is Microsoft Windows SDK for Windows 7 and .NET Framework 3.5 SP1.
                          I guess that you need to compile cutadapt under MinGW (http://sourceforge.net/projects/mingw/), or even Cygwin.
                          You'll be lucky if you compile it with the TDM-GCC compiler.
                          It can be a real pain in the, hmmm, head, so I guess it would be much simpler to install Ubuntu 12.04.3 LTS 64-bit. That is what I ended up with (a dual-boot system, with Ubuntu for all bioinformatics needs).
                          Last edited by lokapal; 09-22-2013, 06:33 AM.



                          • #88
                            Cutadapt 1.4 released

                            Hi, I've just released cutadapt 1.4, which comes with some speed improvements, bug fixes and also new features. See the detailed changelog at http://code.google.com/p/cutadapt/ .



                            • #89
                              Originally posted by mmartin View Post
                              Hi, I've just released cutadapt 1.4, which comes with some speed improvements, bug fixes and also new features. See the detailed changelog at http://code.google.com/p/cutadapt/ .
                              congratulations!
                              Thank you, will try ~



                              • #90
                                Trimming Multiple Adapter Sequences

                                Hello. Thank you in advance.

                                I need to trim the 3' adapter from read 1, together with any sequence that follows it.

                                Looking at cutadapt --help, the -a option seems to be the right fit.

                                According to the Illumina website, there is a list of roughly 27 indexed adapter sequences for RNA-seq.

                                Is it true that I need to specify all of the adapters on the command line for cutadapt to trim them all? Or do I just use the universal adapter as the input for cutadapt?

                                I would just like some guidance on how to trim the ChIP-seq / RNA-seq adapter sequences given below.

                                From the Illumina website:
                                TruSeq Universal Adapter
                                5’ AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
                                TruSeq Adapter, Index 1
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 2
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 3
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 4
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCAATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 5
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 6
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 7
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 8
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACACTTGAATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 9
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 10
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 11
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCTACATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 12
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTTGTAATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 13
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 14
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCGTATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 15
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACATGTCAGAATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 16
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACCCGTCCCGATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 18
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTCCGCACATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 19
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTGAAACGATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 20
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTGGCCTTATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 21
                                5’ GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTTTCGGAATCTCGTATGCCGTCTTCTGCTTG
                                TruSeq Adapter, Index 22
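One observation about the list above: the indexed adapters begin with the same bases and differ only further along. A short Python sketch can extract that shared prefix. It uses just the first three sequences, copied from the list; whether trimming with only the shared prefix is sufficient is not answered in this thread and should be treated as an assumption to verify.

```python
import os

# First three indexed adapters, copied from the list above.
index_adapters = {
    "Index 1": "GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG",
    "Index 2": "GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTG",
    "Index 3": "GATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCATCTCGTATGCCGTCTTCTGCTTG",
}

# os.path.commonprefix compares character by character, so it works on any strings.
shared_prefix = os.path.commonprefix(list(index_adapters.values()))
print(shared_prefix, len(shared_prefix))

# Show where each adapter starts to differ from the others.
for name, sequence in index_adapters.items():
    print(name, sequence[len(shared_prefix):])
```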
