  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #31
    My question was about the version of unix/linux you are using. Are you the "administrator" or is there someone you can ask help from?

    What program from the fastx_toolkit are you most interested in? There are other options that may allow you to move on with your analysis. We can offer suggestions once we know what you are trying to do.


    • Nanu
      Member
      • Sep 2014
      • 30

      #32
      I am the administrator, and I need to learn RNA-seq data analysis. I am new to NGS analysis. I first ran FastQC and got failures for k-mer content and duplication level, plus a warning for per base sequence content. So I tried to trim the sequences. I don't know whether I am on the right track. Can you suggest the steps to follow?


      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #33
        Originally posted by Nanu
        I am the administrator, and I need to learn RNA-seq data analysis. I am new to NGS analysis. I first ran FastQC and got failures for k-mer content and duplication level, plus a warning for per base sequence content. So I tried to trim the sequences. I don't know whether I am on the right track. Can you suggest the steps to follow?
        "Failure" on some aspects of FastQC does not immediately indicate that you have a bad dataset. In some cases duplication of regions is expected/normal.

        If you are looking to trim your sequences, then try BBDuk. It is simple to use and does not need any compilation/installation. Alignment (BBMap, TopHat, or STAR; only one aligner is needed) followed by DESeq2 (an R package) is a fairly standard path for RNA-seq data analysis. Sequences, aligner indexes, and annotations are available for several model organisms here: http://support.illumina.com/sequenci...e/igenome.html

        There is a nice tutorial for TopHat here: http://www.nature.com/nprot/journal/....2012.016.html. Brian has examples of BBMap usage in this thread: http://seqanswers.com/forums/showthread.php?t=41057. The DESeq2 vignette provides an excellent introduction: http://www.bioconductor.org/packages...c/beginner.pdf. The Subread package includes featureCounts (http://bioinf.wehi.edu.au/subread-package/), which you will need for counting.
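
The order of steps described above (trim, align, count, then DESeq2 in R) can be sketched as a dry run. This is only an illustration: every file name is a hypothetical placeholder, the BBDuk settings are the ones quoted elsewhere in this thread, and the script prints the commands instead of running them.

```shell
# Dry run: print each step of a trim -> align -> count workflow.
# All file names below are hypothetical placeholders.
run() { echo "+ $*"; }   # prints only; replace the body with "$@" to execute

reads=sample.fastq
trimmed=sample_trim.fastq
index=genome             # Bowtie 2 index prefix, not a file name
gtf=genes.gtf

# 1. Adapter trimming with BBDuk
run bbduk.sh in="$reads" out="$trimmed" ref=truseq.fa ktrim=r k=25 mink=12 hdist=1
# 2. Alignment with TopHat (writes accepted_hits.bam into the -o directory)
run tophat2 -o tophat_out -G "$gtf" "$index" "$trimmed"
# 3. Read counting with featureCounts
run featureCounts -a "$gtf" -o counts.txt tophat_out/accepted_hits.bam
# counts.txt is then read into R for DESeq2.
```

Only one aligner is needed, so the TopHat line could be swapped for BBMap or STAR without changing the overall shape.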
        Last edited by GenoMax; 10-01-2014, 06:06 AM.


        • Michael Love
          Senior Member
          • Jul 2013
          • 333

          #34
          Thanks GenoMax. Just a warning: on October 14, 2014, the Beginner's vignette PDF will move to a Bioconductor workflow hosted here:

          The Bioconductor project aims to develop and share open source software for precise and repeatable analysis of biological data. We foster an inclusive and collaborative community of developers and data scientists.


          Called something like "RNA-Seq at the gene level"

          Writing it up as a workflow allows us to explore other downstream analyses using other packages and not worry about build/check timing.


          • Nanu
            Member
            • Sep 2014
            • 30

            #35
            Thanks GenoMax,

            I tried the NGS QC Toolkit for quality control. Please guide me on how to convert a .fna file to a fastq file. Meanwhile, I am also installing TopHat. I installed Boost and got a message reporting 11 failures and 8 skipped, with the rest reported as upgraded. What should I do in each case?


            • Brian Bushnell
              Super Moderator
              • Jan 2014
              • 2709

              #36
              Nanu,

              You can convert *.fna (another name for fasta) to *.fastq with reformat, but note that the resulting output will not have valid scores. Why are you attempting to do that?

              And FYI, BBMap is substantially easier to install than Tophat; you just unzip it.
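
To illustrate the point about scores: a FASTA record carries no qualities, so any FASTA-to-FASTQ conversion has to invent them. Below is a minimal awk sketch, assuming one sequence line per record and an arbitrary placeholder quality character `I`. It is not how reformat.sh is implemented, only a picture of what "no valid scores" means.

```shell
# Convert FASTA on stdin to FASTQ on stdout with a made-up quality string.
# Assumptions: each sequence sits on a single line; "I" (Phred 40 in
# Phred+33 encoding) is an arbitrary placeholder, not a measured quality.
fasta_to_fake_fastq() {
  awk '
    /^>/ { print "@" substr($0, 2); next }   # header: ">" becomes "@"
    {
      q = $0
      gsub(/./, "I", q)                      # one placeholder score per base
      print $0; print "+"; print q
    }'
}

printf '>read1\nACGT\n' | fasta_to_fake_fastq
```

Any quality-based trimming run on such a file would be acting on invented numbers, which is why the real .qual file matters if one exists.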


              • GenoMax
                Senior Member
                • Feb 2008
                • 7142

                #37
                Originally posted by Nanu
                Thanks GenoMax,

                I tried the NGS QC Toolkit for quality control. Please guide me on how to convert a .fna file to a fastq file. Meanwhile, I am also installing TopHat. I installed Boost and got a message reporting 11 failures and 8 skipped, with the rest reported as upgraded. What should I do in each case?
                You should get pre-compiled binaries for TopHat. That would simplify things. Binaries are available from the same page where you downloaded the source code.


                • Nanu
                  Member
                  • Sep 2014
                  • 30

                  #38
                  Bushnell,

                  I need to do sequence-based trimming of adapters. The trimmer scripts need .fastq format.


                  • Nanu
                    Member
                    • Sep 2014
                    • 30

                    #39
                    OK, I will try BBMap as well.


                    • Nanu
                      Member
                      • Sep 2014
                      • 30

                      #40
                      I need to do sequence-based trimming of adapters, and the trimmer scripts need .fastq format. Is there any way to trim the .fna and .qual files without converting them? If conversion is needed, please guide me through that as well.


                      • Brian Bushnell
                        Super Moderator
                        • Jan 2014
                        • 2709

                        #41
                        Nanu,

                        Reformat can change fasta + qual into fastq, like this:

                        reformat.sh in=reads.fna qfin=reads.qual out=reads.fastq

                        BBDuk can directly trim the fasta (fna) files, or do both at the same time, for example:

                        bbduk.sh in=reads.fna qfin=reads.qual out=reads.fastq ktrim=r k=25 mink=12 hdist=1 ref=truseq.fa


                        Adapter files (truseq and nextera) are included with the BBTools package.
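
For readers wondering what merging fasta + qual actually involves, here is a rough awk sketch of the idea. It assumes one sequence line and one line of space-separated numeric scores per record, identical headers in both files, and Phred+33 output encoding. The function name is made up for illustration, and this is not the reformat.sh implementation.

```shell
# Merge a FASTA file and its .qual file into FASTQ on stdout.
# Usage: fasta_qual_to_fastq reads.fna reads.qual > reads.fastq
# Assumptions: single-line records, matching headers, Phred+33 output.
fasta_qual_to_fastq() {
  awk '
    NR == FNR {                       # first file read: the .qual file
      if (/^>/) { name = substr($0, 2); next }
      s = ""
      for (i = 1; i <= NF; i++)       # numeric score -> ASCII (score + 33)
        s = s sprintf("%c", $i + 33)
      qual[name] = s
      next
    }
    /^>/ { name = substr($0, 2); next }   # second file: the FASTA
    { print "@" name; print $0; print "+"; print qual[name] }
  ' "$2" "$1"
}
```

A score of 40 becomes `I` and a score of 2 becomes `#`, so the quality line ends up exactly as long as the sequence line.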


                        • Nanu
                          Member
                          • Sep 2014
                          • 30

                          #42
                          I would like to thank everybody; with your help I completed the previous steps. Now I need more help. I have indexed the reference genome with bowtie2-build. Now I am running TopHat 2.0.13 as follows:
                          ./tophat2 /home/me/Downloads/bowtie2-2.2.3/*.bt2/ /home/me/Downloads/bin/Sample_L1_R1_trim.fastq

                          Then I got the following error:
                          [2014-10-10 10:51:30] Beginning TopHat run (v2.0.13)
                          -----------------------------------------------
                          [2014-10-10 10:51:30] Checking for Bowtie
                          Bowtie version: 2.1.0.0
                          [2014-10-10 10:51:30] Checking for Bowtie index files (genome)..
                          Error: Could not find Bowtie 2 index files (/home/himanshu/Downloads/bowtie2-2.2.3/*.bt2/.*.bt2)


                          • GenoMax
                            Senior Member
                            • Feb 2008
                            • 7142

                            #43
                            Please go through the command-line examples of how to run a typical TopHat analysis. Although this article is for TopHat (v.1.0), the basic principles are the same for TopHat v.2 (http://www.nature.com/nprot/journal/....2012.016.html).

                            Hint: On your tophat command line, provide the path to the "prefix" of your index files, i.e., if your index files are named human*.bt2, then you need to provide only the "human" part to the command (with the full path, in case the files are not in the current directory). The prefix is a positional argument, not an option.
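
A small sketch of the hint above, assuming the index was built with the hypothetical prefix "genome" (so bowtie2-build produced genome.1.bt2, genome.rev.1.bt2, and so on). The helper name is made up for illustration, and the final command is printed rather than executed.

```shell
# Derive the Bowtie 2 index prefix from one of the index files.
# Bowtie 2 names its files <prefix>.1.bt2 ... <prefix>.rev.2.bt2.
bt2_prefix() {
  # Strip a trailing ".rev.N.bt2" or ".N.bt2" from the given path.
  printf '%s\n' "$1" | sed -E 's/(\.rev)?\.[0-9]+\.bt2$//'
}

# Hypothetical layout: index files named genome.*.bt2 in this directory.
index_file=/home/me/Downloads/bowtie2-2.2.3/genome.1.bt2
reads=/home/me/Downloads/bin/Sample_L1_R1_trim.fastq

prefix=$(bt2_prefix "$index_file")
# Print the command rather than running it, so the sketch has no side effects.
echo "tophat2 $prefix $reads"
```

The prefix goes in as the first positional argument, followed by the reads file; a wildcard such as *.bt2 or a bare directory path is not a prefix.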


                            • Nanu
                              Member
                              • Sep 2014
                              • 30

                              #44
                              Dear GenoMax,
                              When I used --prefix to give the path, it showed the following:
                              ./tophat --prefix= /home/me/Downloads/bowtie2-2.2.3/bowtei/ /home/me/Downloads/bin/Sample_L1_R1_trim.fastq
                              tophat: option --prefix not recognized
                              for detailed help see http://tophat.cbcb.umd.edu/manual.html


                              What should I do then?


                              • GenoMax
                                Senior Member
                                • Feb 2008
                                • 7142

                                #45
                                Originally posted by Nanu
                                When I use the command reformat.sh in bbtools package I am getting the following error::
                                java -ea -Xmx200m -cp /home/himanshu/Downloads/me2/bbmap/current/ jgi.ReformatReads -in=reads.fna qfin=reads.qual out=reads.fasta
                                Executing jgi.ReformatReads [-in=reads.fna, qfin=reads.qual, out=reads.fasta]

                                Input is being processed as unpaired
                                Exception in thread "Thread-1" java.lang.AssertionError
                                at stream.FastaQualReadInputStream3.makeRead(FastaQualReadInputStream3.java:257)
                                at stream.FastaQualReadInputStream3.toReadList(FastaQualReadInputStream3.java:147)
                                at stream.FastaQualReadInputStream3.toReads(FastaQualReadInputStream3.java:113)
                                at stream.FastaQualReadInputStream3.fillBuffer(FastaQualReadInputStream3.java:97)
                                at stream.FastaQualReadInputStream3.hasMore(FastaQualReadInputStream3.java:56)
                                at stream.ConcurrentGenericReadInputStream$ReadThread.readLists(ConcurrentGenericReadInputStream.java:745)
                                at stream.ConcurrentGenericReadInputStream$ReadThread.run(ConcurrentGenericReadInputStream.java:737)

                                Please help me
                                Himanshu: Don't post questions in a thread that is originally about a completely different topic. This is not going to help you get an answer since your post will not be visible to someone who can answer the question.

                                e.g. This question would be more appropriate in the BBTools thread (search the forum and find the thread).

