  • Trouble with box-plot of quality data in SOLID

    Hi everyone,
    I used SOLiD2std.pl to convert the csfasta and qual files to standard fastq files.
    I then ran FastQC to view boxplots of the quality data and got the attached results.
    Quality appears to drop every 5 nt, as if primer 5 of the procedure had failed.
    Has anyone seen this type of quality plot before?
    I am new to SOLiD data; I have worked with Illumina and 454 before.
    Thanks in advance for any guidance.
    Cheers,
    maximo
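
One way to check the "every 5 nt" impression numerically, rather than by eye from the boxplot, is to average the quality at each read position and then group positions by their remainder modulo 5. The sketch below is only an illustration: it assumes four-line FASTQ records and Phred+33 quality encoding (the encoding written by SOLiD2std.pl is not stated in this thread), and the function names are hypothetical.

```python
from collections import defaultdict


def mean_quality_by_position(fastq_lines, offset=33):
    """Return the mean Phred score at each read position.

    Assumes four-line FASTQ records and a Phred+`offset` encoding;
    both are assumptions, so adjust them to match your files.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for i, line in enumerate(fastq_lines):
        if i % 4 != 3:  # only the 4th line of each record holds qualities
            continue
        for pos, ch in enumerate(line.rstrip("\n")):
            totals[pos] += ord(ch) - offset
            counts[pos] += 1
    return [totals[p] / counts[p] for p in sorted(counts)]


def mean_quality_by_phase(position_means, period=5):
    """Average the per-position means within each (position % period) group."""
    phases = [[] for _ in range(period)]
    for pos, q in enumerate(position_means):
        phases[pos % period].append(q)
    return [sum(v) / len(v) if v else None for v in phases]


if __name__ == "__main__":
    # Tiny made-up record: every fifth quality character is lower.
    demo = ["@read1\n", "ACGTACGTAC\n", "+\n", "IIII5IIII5\n"]
    per_position = mean_quality_by_position(demo)
    print(mean_quality_by_phase(per_position))
```

If one of the five phases is clearly lower than the others across the whole read, that would be consistent with a single underperforming ligation round, though it does not prove it.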

  • #2
    You cannot convert colorspace to base space directly. See here why:
    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc
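
To illustrate the point about direct conversion: in the SOLiD two-base encoding each color describes the transition between two adjacent bases, so decoding is a chain that starts at the known primer base. A single wrong color therefore changes every base after it. The sketch below is a minimal illustration with a hypothetical function name, not the logic of SOLiD2std.pl.

```python
# Color code: 0 = same base, 1 = A<->C / G<->T, 2 = A<->G / C<->T, 3 = A<->T / C<->G.
# TRANSITIONS[previous_base][color] gives the next base.
TRANSITIONS = {
    "A": "ACGT",
    "C": "CATG",
    "G": "GTAC",
    "T": "TGCA",
}


def decode_colorspace(read):
    """Naively decode a csfasta read such as 'T0120' into bases.

    The first character is the known primer base and is not emitted.
    A '.' (missing color call) makes the rest of the read undecodable,
    so the remainder is filled with N.
    """
    prev = read[0]
    bases = []
    for color in read[1:]:
        if color not in "0123":
            bases.append("N" * (len(read) - 1 - len(bases)))
            break
        prev = TRANSITIONS[prev][int(color)]
        bases.append(prev)
    return "".join(bases)


if __name__ == "__main__":
    print(decode_colorspace("T0120"))  # correct colors
    print(decode_colorspace("T0320"))  # second color miscalled
```

In this made-up example the two reads differ in one color, yet the decoded sequences differ at every base from the miscalled color onward. That is why color errors are best handled by a colorspace-aware aligner rather than by converting first.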



    • #3
      An alternative route to using FastQC, which is very nice, is to align first and use the resulting BAM file as input.

      Another program I have looked at for duplicate analysis is prinseq, which also offers graphical output.



      • #4
        Did you remove adapter sequences or trim your reads?
        What about the other FastQC results: did they pass or fail?
        Krishna



        • #5
          Originally posted by Krish_143:
          Did you remove adapter sequences or trim your reads?
          What about the other FastQC results: did they pass or fail?
          Some modules passed and others failed.
          I did NOT remove adapters. I am not sure whether that was done before I received the csfasta and qual files.
          Since this is a direct conversion from color space to base space, it is understandable that the results may be confusing. (In all, about 45% of reads mapped to a reference with color-space novoAlign, and only 15% using color-space TopHat/Bowtie.)
          Any thoughts?
          PASS Basic Statistics Corrida_4_FC_1_01_01CVATFR001_F3.fastq
          FAIL Per base sequence quality Corrida_4_FC_1_01_01CVATFR001_F3.fastq
          WARN Per sequence quality scores Corrida_4_FC_1_01_01CVATFR001_F3.fastq
          WARN Per base sequence content Corrida_4_FC_1_01_01CVATFR001_F3.fastq
          FAIL Per base GC content Corrida_4_FC_1_01_01CVATFR001_F3.fastq
          FAIL Per sequence GC content Corrida_4_FC_1_01_01CVATFR001_F3.fastq
          WARN Per base N content Corrida_4_FC_1_01_01CVATFR001_F3.fastq
          PASS Sequence Length Distribution Corrida_4_FC_1_01_01CVATFR001_F3.fastq
          PASS Sequence Duplication Levels Corrida_4_FC_1_01_01CVATFR001_F3.fastq
          PASS Overrepresented sequences Corrida_4_FC_1_01_01CVATFR001_F3.fastq
          FAIL Kmer Content Corrida_4_FC_1_01_01CVATFR001_F3.fastq
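
For anyone comparing several runs, the module statuses listed above can be tallied with a few lines of Python. This is a rough sketch with a hypothetical function name; it assumes the input is the text of FastQC's tab-separated summary.txt (status, module name, file name), whereas the listing pasted above has lost its tabs.

```python
from collections import Counter


def tally_fastqc_summary(text):
    """Count PASS/WARN/FAIL statuses and collect the failed module names.

    Expects tab-separated lines of the form: status<TAB>module<TAB>filename.
    Lines without at least two fields are skipped.
    """
    counts = Counter()
    failed = []
    for line in text.splitlines():
        parts = line.strip().split("\t")
        if len(parts) < 2:
            continue
        status, module = parts[0], parts[1]
        counts[status] += 1
        if status == "FAIL":
            failed.append(module)
    return counts, failed


if __name__ == "__main__":
    demo = (
        "PASS\tBasic Statistics\texample.fastq\n"
        "FAIL\tPer base sequence quality\texample.fastq\n"
        "WARN\tPer base N content\texample.fastq\n"
    )
    counts, failed = tally_fastqc_summary(demo)
    print(dict(counts), failed)
```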



          • #6
            I would recommend using SAET to clear up any colour errors before alignment. Normally I get 1-5% more aligned reads using this (apparently a lot more with ECC data).

            I don't think your fastqc boxplot results are too bad, but I've seen better ones.
