
  • #16
    The verbose option crashes the terminal every time I use it (after a few hours, but before the mapping is done), so I don't know about that.

    The FastQC plots of k-mer content and overrepresented sequences look rather odd, but as this is a transcriptome it is unclear to me what the expectation should be. From what I can tell by googling around, what I have is not unusual for a transcriptome. I did not use barcodes for this dataset.

    I have tried doing some additional quality filtering to see if that makes a difference. I will report back on that.

    Just mapping the reads that do map and completing the pipeline does produce contigs that blast appropriately.



    • #17
      Additional QC and the addition of the -v 3 option resulted in 0.034% of the reads mapping in bowtie, so I still have no idea what's wrong.



      • #18
        Originally posted by sasignor
        Just mapping the reads that do map and completing the pipeline does produce contigs that blast appropriately.
        Well, that's not the point - obviously these came from the right species. I rather meant to take the reads that do not map (or all of them, for simplicity) and feed them through e.g. Trinity. Then take some of the contigs and blast them against nt to see what species you have sequenced. If you do it for, let's say, 10 million reads, this goes really fast and should give you an idea whether contamination is a problem...
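A minimal sketch of pulling a fixed number of reads out of a FASTQ stream before handing them to an assembler, assuming the usual four-lines-per-record layout. The function name `head_fastq` is made up for illustration and is not part of Trinity or any other tool. Taking the first N records is the simplest choice, although it is not a random sample.

```python
from itertools import islice


def head_fastq(lines, n_reads):
    """Yield the lines of the first n_reads FASTQ records.

    Assumes the conventional layout of 4 lines per record
    (header, sequence, '+', qualities).
    """
    yield from islice(lines, 4 * n_reads)


# Tiny in-memory example; with real data, pass an open file handle instead.
example = [
    "@r1\n", "ACGT\n", "+\n", "IIII\n",
    "@r2\n", "TTTT\n", "+\n", "IIII\n",
]
subset = list(head_fastq(example, 1))
```

With a real file, something like `out.writelines(head_fastq(handle, 10_000_000))` would write the first 10 million reads to a new file.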



        • #19
          This is the output of FastQC for one of the files I am using - the other is comparable. Again - it does look unusual for a genome but as far as I can tell not for a transcriptome, and not for transcriptomes I have successfully aligned in the past.
          Attached Files



          • #20
            Lots of poly-T/A in there.

            Try using bowtie2 (not bowtie) in '--local' mode.



            • #21
              Yeah, lots of poly-T - my transcriptome data doesn't look like that, for sure. The weird GC-content can't be biological IMHO - everything up to base ~55 looks crazy.
              Did you look hard for adaptors? Perhaps you have a lot of weirdly ligated fragments in there, or the sequencing run had problems - talk to your provider.
              A shot in the dark may be to try to clip the sequences up to 55 (cut out 55-95 or so) and try to map that...
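A rough sketch of that clipping step, reading the suggestion as "keep roughly bases 55-95 and drop the rest". The helper name `clip_record` and the tuple layout are assumptions made for illustration. Positions in the post are 1-based, so bases 55 to 95 correspond to the Python slice `[54:95]`.

```python
def clip_record(record, start, end):
    """Keep only bases start..end (0-based, end exclusive) of one FASTQ record.

    `record` is a (header, sequence, plus_line, qualities) tuple; the
    sequence and quality strings are sliced identically so they stay
    the same length.
    """
    header, seq, plus, qual = record
    return (header, seq[start:end], plus, qual[start:end])


# Example: a 100 bp read clipped to 1-based positions 55-95.
read = ("@r1", "A" * 54 + "C" * 41 + "G" * 5, "+", "I" * 100)
clipped = clip_record(read, 54, 95)
```

Dedicated trimmers do the same thing faster; this only shows which coordinates are being kept.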



              • #22
                Hello sasignor,
                I am wondering whether you diagnosed the problem.
                I am facing a similar issue and would love to hear about your progress.
                Thanks



                • #23
                  Originally posted by sasignor
                  I am attempting to align a transcriptome sequenced with HiSeq to a reference using bowtie. The parameters I am using are:

                  bowtie -S -p 2 reference -q --phred64-quals

                  And none of the reads align. They also do not align if I do not include the quality parameter, or any modification such as the --ff suggested in related postings.

                  I have checked for adapter contamination and found very little. The reads were cleaned using ngs backbone, although I am not using that pipeline for anything downstream of cleaning. Some have also reported that their reads do not align in paired-end mode but do in single-end mode; my reads do not align in either case. Around 30% of the reads align in BWA.

                  Does anyone have any idea why this is the case?

                  Thanks!
                  Sarah
                  Hi Sarah,

                  Your reads look as if they were produced with CASAVA v1.8, which reports Phred+33 Q-scores (Illumina 1.9/Sanger). If that is the case, removing the --phred64-quals option from the bowtie command may do the trick (Phred+33 is the default).
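One way to check the encoding before rerunning is to look at the range of ASCII codes in the quality lines. This is a heuristic sketch only: the function name `guess_phred_offset` and the exact cut-offs are assumptions, not something bowtie provides. Characters below ASCII 59 can only occur in Phred+33 data, while characters above 'J' (ASCII 74) point to Phred+64.

```python
def guess_phred_offset(quality_strings):
    """Guess the Phred offset (33 or 64) from FASTQ quality strings.

    Returns None when the observed characters fit both encodings
    or when no quality characters were supplied.
    """
    codes = [ord(ch) for qual in quality_strings for ch in qual]
    if not codes:
        return None
    if min(codes) < 59:   # too low to be Phred+64 (or Solexa+64)
        return 33
    if max(codes) > 74:   # above 'J', unusual for Illumina Phred+33
        return 64
    return None           # ambiguous range, inspect more reads


offset_a = guess_phred_offset(["IIII##"])    # '#' is ASCII 35
offset_b = guess_phred_offset(["hhhhdddd"])  # 'h' is ASCII 104
```

Running this over the quality lines of a few thousand reads is usually enough; an ambiguous result means the sample contained only mid-range characters.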

                  Cheers,

                  Fernando



                  • #24
                    It looks to me as though the first 10 bases of all your reads are similar, if not identical. Have you checked the overrepresented sequences output from FastQC? It may also tell you what kind of contamination (e.g. adaptors) the overrepresented sequences come from.

                    Try cleaning up reads before aligning them with bowtie, e.g. clip adaptors, trim low-quality bases, trim polyA tails, remove reads with low-complexity regions, etc. Prinseq or seqclean can do this job.
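To make the poly-A part of that cleanup concrete, here is a small sketch that strips a trailing poly-A run and a leading poly-T run from one read. It is not how Prinseq or seqclean work internally; the function name `trim_poly_at` and the default minimum run length of 10 are assumptions chosen for illustration.

```python
import re


def trim_poly_at(seq, qual, min_run=10):
    """Remove a 3' poly-A run and a 5' poly-T run of at least min_run bases.

    The quality string is sliced with the same coordinates so that it
    stays aligned with the trimmed sequence.
    """
    tail = re.search(r"A{%d,}$" % min_run, seq)
    end = tail.start() if tail else len(seq)
    head = re.match(r"T{%d,}" % min_run, seq[:end])
    start = head.end() if head else 0
    return seq[start:end], qual[start:end]


read_seq = "T" * 12 + "ACGTACGT" + "A" * 15
read_qual = "I" * len(read_seq)
trimmed_seq, trimmed_qual = trim_poly_at(read_seq, read_qual)
```

Runs shorter than `min_run` are left alone, so a read that merely ends in a few A's is not shortened.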

                    Sunny
                    Last edited by Sun-SEQ; 08-01-2012, 12:44 AM.



                    • #25
                      It might also be worth checking with your sequencing provider to make sure they sent you the right dataset.
