  • First Call Missing on SOLiD

    Hi, I am new to sequencing, and I have SOLiD data in which every read is missing its first call. What could be the reasons, and how can this be fixed? Can it be fixed at all?
    John.

  • #2
    By first call you mean that you only have the colorspace numbers and not the lead base? E.g., do you have something like:
    021321232
    or
    G021321232
    or
    something else
    ???


    • #3
      Thanks a lot for the reply Westerman. I have something like T.0123123


      • #4
        Oh that. Just means that the quality of the call was not very good and thus is missing. You can sometimes see this in the middle of a run -- one of our recent runs had 122,577 reads of 14,250,530 with missing bases. Two points:

        1) Having all of your reads with a missing first base does indicate a problem with your sequencer. I haven't seen this problem before (we often get missing bases, but not consistently in all reads), so I cannot say what is causing it.

        2) Because of how color-space works, missing bases do not have much of an effect on the mapping of reads to the reference. I could go into details, but as long as the rest of the read is fine, mapping will be fine. Translating from color-space to base-space is not going to work, but you should not be doing that anyway. See the many posts on this forum to read up on why this is so.
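
Editor's sketch: a quick way to check how widespread missing calls are in a file, along the lines of the read counts quoted above. This assumes the usual csfasta layout ('#' comment lines, '>' header lines, then one line holding the primer base followed by the color calls, with '.' for a missing call). The function name and the toy reads are made up for illustration and are not from a real run.

```python
import io


def count_missing_calls(handle):
    """Return (total_reads, reads_with_any_missing, reads_missing_first_call)."""
    total = with_missing = first_missing = 0
    for line in handle:
        line = line.strip()
        # Skip blank lines, '#' comment lines and '>' read headers.
        if not line or line[0] in "#>":
            continue
        total += 1
        colors = line[1:]  # drop the leading primer base
        if "." in colors:
            with_missing += 1
        if colors.startswith("."):
            first_missing += 1
    return total, with_missing, first_missing


# Toy data (hypothetical read names and sequences).
example = io.StringIO(
    "# toy csfasta data\n"
    ">1_1_1_F3\nT.0123123\n"
    ">1_1_2_F3\nT2012.3123\n"
    ">1_1_3_F3\nT0123123\n"
)
print(count_missing_calls(example))  # (3, 2, 1)
```

If the third number equals the first, every read is missing its first call, which is the situation described in this thread.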


        • #5
          As for #2, now that I think of it, missing (or poor-quality) bases in normal basespace reads (e.g., from Illumina or 454) also do not have much of an effect on mapping. So color-space is not unique in that respect, except that (a) you usually have many more reads with the SOLiD and thus do not care too much about missing data and (b) color-space is indeed more robust to sequencer errors.

          Bottom line:
          1) I would ask your service provider why the first base is always missing (unless you are the service provider, in which case let us at SeqAnswers know and maybe we can delve into the problem).

          2) But even without getting an answer from the service provider, I would just go ahead with your normal mapping, SNP calling, etc. process.


          • #6
            Westerman,
            Thanks for the in-depth info. I am not a service provider. I already asked my provider about this issue and I have not got any response from them yet.
            First,
            Like you mentioned, I have the same missing call in all reads in all samples. I don't know what's going on.

            Next,
            I am using NextGene, whose first step is a conversion from CSFASTA to CSFASTA in which reads with missing calls are filtered out automatically. I also think it converts to base reads, because the tech support people told me that the missing call will change the base and eventually affect the quality of the data. So the program filters everything out, since all reads have the same missing call, and if I align the reads without filtering, the program stops with an error. I have done many projects with NextGene and never had this issue. Tech support said the same as you: they have never heard of it.

            I hope ABI has the answer for this. I am still waiting for the reply from them.


            • #7
              I do not use NextGene but they do have a nice PDF about their conversion method. It looks like they keep the file in colorspace (good) or, optionally, can convert it to basespace (bad -- as they point out). It is not obvious to me that reads have to be filtered but tech support undoubtedly knows better than I.

              Since NextGene filters out reads with unknowns in them, a possible work-around is to put a '4' in place of the period -- that is another way of signifying an 'N'. Or, perhaps more dangerously, just delete the period with the 'G' (e.g., collapse the 2nd base). That would totally throw out the idea of colorspace to basespace conversion but at least would get the sequences through NextGene to the point where NextGene can work with them via colorspace. In either case keep your original files intact. Without some level of command line ability I am not sure how you would get the modification/deletion task accomplished. If you were able to run from a Unix prompt then the conversion would be trivial.
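
Editor's sketch: the two work-arounds above might look roughly like this in Python. The function names, the `mode` parameter, and the file paths are hypothetical. Whether NextGene accepts '4' as an unknown call is an assumption taken from the post, not something verified here. The output goes to a new file so the original stays intact.

```python
def patch_missing_calls(read, mode="replace"):
    """Patch one csfasta read line such as 'T.0123123'.

    mode="replace": swap every '.' for '4' (another way of writing an unknown call).
    mode="delete":  drop a '.' that directly follows the primer base (shortens the read).
    """
    if mode == "replace":
        return read.replace(".", "4")
    if mode == "delete":
        if read[1:2] == ".":
            return read[0] + read[2:]
        return read
    raise ValueError("mode must be 'replace' or 'delete'")


def patch_csfasta(src_path, dst_path, mode="replace"):
    """Write a patched copy of src_path to dst_path; the source file is not modified."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.startswith(("#", ">")):
                dst.write(line)  # comments and read headers pass through unchanged
            else:
                dst.write(patch_missing_calls(line.rstrip("\n"), mode) + "\n")


print(patch_missing_calls("T.0123123"))                 # T40123123
print(patch_missing_calls("T.0123123", mode="delete"))  # T0123123
```

A call such as `patch_csfasta("reads.csfasta", "reads.patched.csfasta")` (placeholder file names) would then produce the modified copy to feed to the aligner.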

              -------------

              On a side note, I do like NextGene's brochure which says, " NextGENe’s Windows® based operation removes the complexity found with programs such as CLC bio, Lasergene’s SeqMan & NGEN, MAQ & SOAP, Top Hat, BWA & Bowtie. Biologists simply input the systems raw reads or SAM and BAM files, select the application and click go to perform the analysis of 2nd generation sequencing data."

              Hum. Perhaps so. Simplicity is nice. Until you run into complexity. :-(


              • #8
                Yeah, you are right. NextGene works well for people like me. But when we run into problems, there are not many solutions other than learning to code. In this case, I don't have much time to learn.

                Regarding putting a 4 or deleting the period, I am not sure that's a good idea as it changes the sequence all together. In any case, I have to wait for the reply from ABI before I can take the next step.

                Thanks for the information Westerman.


                • #9
                  Just to emphasize the point, inserting the 4 or deleting the period will change the sequence "all together" only if the reads are converted to base-space. If the reads are kept in color-space -- which they should be -- then all of the bases from #2 onward are still valid and can be used with confidence.

                  That being said, waiting for ABI or your service provider to get back to you is the safest course. Let us hope that they respond before your boss (or granting agency) starts breathing down your neck!
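
Editor's sketch: why a missing first call ruins base-space conversion but leaves the later colors usable. This assumes the standard SOLiD two-base encoding, where with A=0, C=1, G=2, T=3 each color is the XOR of two adjacent bases, so decoding walks the read one color at a time starting from the primer base. The function name is made up for illustration.

```python
BASES = "ACGT"  # A=0, C=1, G=2, T=3


def decode_colorspace(read):
    """Translate a csfasta read (primer base + colors) to base-space.

    Each base depends on the previous one, so once an unknown call
    ('.' or '4') is reached, every later base comes out as 'N'.
    """
    prev = read[0]
    out = []
    for color in read[1:]:
        if prev == "N" or color not in "0123":
            prev = "N"
        else:
            prev = BASES[BASES.index(prev) ^ int(color)]
        out.append(prev)
    return "".join(out)


print(decode_colorspace("T0123123"))   # TGATGAT
print(decode_colorspace("T.0123123"))  # NNNNNNNN
```

The color calls after the missing one ("0123123") are the same in both reads, which is why a color-space aligner can still use them even though the base-space translation is lost.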


                  • #10
                    You are right. I have to tell my boss and am not happy about that. Anyways, the csfasta file is being created again by ABI personnel. Hope this solves the problem.
                    Thanks Westerman.
