  • Titanium kit - short reads

    Hi,

    We upgraded to 454 Titanium in January 2009, and the first runs seemed fine (although we saw a small decline in average read length).

    We have now started a new Titanium sequencing kit, and all runs show massive numbers of short reads. Read lengths are spread almost evenly across the whole range.

    I have attached a histogram of the read lengths:

    [IMG]ftp://genome.imb-jena.de/pub/andpet/titanium_read_length_histo.jpg[/IMG]


    Does anyone have an idea what is causing this, or has anyone seen the same thing before?

    Thanks a lot,

    Andreas

  • #2
    Hmm,

    inserting a picture does not work with FTP... why not?

    Andreas



    • #3
      Andreas, try attaching it to the post rather than using the [IMG] tags. As far as I know, images cannot be displayed inline over FTP.



      • #4

        Okay here is the attached sequence length histogram ...

        Andreas



        • #5
          What type of sequencing was it: amplicon or shotgun?



          • #6
            Shotgun sequencing. We tried to sequence three insects and all runs showed the same read profile.

            We always measure the size distribution of our DNA before sequencing (Agilent Chip), and in all three cases the results were good (500-2000 bp).

            Andreas



            • #7
              We had the same profile when doing amplicon sequencing, with amplicons selected to be 200-300 bp (this was before we had the Titanium upgrade).

              After analysing the results further, we discovered that the profile was in fact a combination of a peak around 60 bp and the normal peak around 200-300 bp. The 60 bp peak was caused by primer dimers and priming mismatches. After removing every read with two short primer matches from the dataset, the profile looked OK.
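For anyone who wants to try the same clean-up, here is a minimal sketch of that kind of filter. It is not the script from the post above, just one way to read "two short primer matches": a read is set aside when it is short and contains at least two exact primer hits in either orientation. The function names, the `max_len` cutoff and the primer sequences in the usage note are all made up for illustration.

```python
# Complement table for plain DNA; anything outside ACGTN is left unchanged.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_hits(read, primers):
    """Count exact, non-overlapping primer occurrences in either orientation."""
    read = read.upper()
    hits = 0
    for primer in primers:
        forward = primer.upper()
        # A set avoids double counting when a primer is its own reverse complement.
        for probe in {forward, reverse_complement(forward)}:
            hits += read.count(probe)
    return hits


def split_primer_dimers(reads, primers, max_len=100):
    """Split reads into (kept, suspected_dimers).

    A read is a suspected dimer when it is at most max_len bases long
    and carries two or more primer hits. max_len is an assumed cutoff.
    """
    kept, dimers = [], []
    for read in reads:
        if len(read) <= max_len and primer_hits(read, primers) >= 2:
            dimers.append(read)
        else:
            kept.append(read)
    return kept, dimers
```

Usage would look like `kept, dimers = split_primer_dimers(reads, ["ACGTGACT", "GGATCCTA"])`, with your own primers in place of these placeholders. Exact matching is the simplest choice; real primer matches often contain sequencing errors, so an approximate matcher may be needed in practice.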

              In a later run the PCR was optimised, there were 95% fewer priming mismatches etc., and the profile was OK.

              Your graph looks about 95% the same as the one we had, so I guess something must be wrong with the emulsion PCR. Is there primer contamination in the room, or something else that would cause primer dimers to appear?

              Are you doing de novo sequencing or just a resequencing experiment to validate the technique? It might be useful to map the OK reads to an existing genome and see whether the 60 bp peak shows a different 'mapping behaviour'. We have found that downstream analysis can greatly improve the upfront library preparation. It takes some time to analyse your data, or to write some scripts to automate it, but future experiments can then be QC-ed in 1 h and you can always use the results to optimise the upfront work.



              • #8
                This looks like the profile of a poor Titanium run.

                For good Titanium shotgun runs, the peak length is usually at about 500 bp and the average length is around 400 bp. The short-length region may have a hump, but it should not have a big peak. In other words, the length distribution should consist mainly of one peak around 500 bp.

                Besides the mapping approach suggested above, I would use the quality filter metrics output by the 454 pipeline to judge the run.

                The run metrics print out lots of information; particularly useful is the breakdown of raw reads, filtered reads, dots, mixed, etc. Use sffinfo (from the 454 offline software package) to see whether the short reads were trimmed by the quality filter. Just run "sffinfo sffile" and you will get the information for all filtered reads, with the untrimmed full-length sequence included.

                Another easy way to tell whether something is wrong is to check the read count. Titanium should generate more than 1 million reads per run that pass the quality filter. If you don't get that many, then lots of raw reads were rejected (or trimmed) by the quality filter, meaning something is wrong with the library, the emPCR or some other upstream procedure.
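The rules of thumb above (1 million plus filter-passed reads, average around 400 bp, peak around 500 bp) can be turned into a quick check once you have the list of read lengths. This is only a rough sketch: the function names are invented, and the cutoffs `min_mean=350` and `min_modal_bin=400` are my own loose assumptions around the numbers quoted in this post, not official 454 thresholds.

```python
from collections import Counter
from statistics import mean


def summarize_lengths(lengths, bin_size=50):
    """Return read count, mean length and the start of the most populated length bin."""
    bins = Counter((n // bin_size) * bin_size for n in lengths)
    modal_bin = max(bins, key=bins.get)
    return {"reads": len(lengths), "mean": mean(lengths), "modal_bin": modal_bin}


def run_warnings(summary, min_reads=1_000_000, min_mean=350, min_modal_bin=400):
    """Compare a summary against assumed cutoffs and return a list of warnings."""
    warnings = []
    if summary["reads"] < min_reads:
        warnings.append("fewer filter-passed reads than expected")
    if summary["mean"] < min_mean:
        warnings.append("mean read length is low")
    if summary["modal_bin"] < min_modal_bin:
        warnings.append("main peak of the length distribution is too short")
    return warnings
```

Feed it the lengths of the filter-passed reads, for example `run_warnings(summarize_lengths(lengths))`. An empty list means the run looks roughly like the good profile described above; it says nothing about why a bad run is bad.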

                If the short reads are not the result of trimming and rejection by the quality filter, then the story may change. The sample may contain a massive number of short fragments (primer dimers etc.), and the 454 machine would favor those short fragments and sequence them in large numbers, at high quality.



                • #9
                  We are seeing some similar runs, intermittently.

                  Something else we tried, to rule out library prep issues, was to make a filter template and add <doPrimerTrimming>false</doPrimerTrimming> inside <qualityFilter>.


                  This will allow you to see the B-adaptor sequence if present. We would expect library issues to show up as B-adaptors on short sequences.

                  joa_ds, what did you do for PCR optimization?



                  • #10
                    Well, we did amplicon sequencing and had a PCR reaction prior to the emPCR, which of course amplified problems that were already there.

                    We added a purification step prior to the emPCR. I don't know the details, as I am not in the lab, but I know they are using some sort of chromatographic technique to remove short sequences 50-100 bp long. This simple technique improved our mapping efficiency from around 60% to around 95%.

                    We add MID tags, and detection of an MID is also something we use as a quality check to verify that sequences are OK. We expect very short sequences to have the reverse complement of the MID at the end of the sequence.
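A rough sketch of how such an MID check could look, assuming the MID sits at the very start of the trimmed read and that a very short insert shows up as the reverse complement of the MID at the very end. The function name, the three labels and the MID sequence in the usage note are placeholders, and matching is exact, which is stricter than you would want on real data.

```python
# Complement table for plain DNA; anything outside ACGTN is left unchanged.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def classify_by_mid(read, mid):
    """Label a read by where the MID tag is found.

    "no_mid"       : read does not start with the MID
    "read_through" : read starts with the MID and ends with its reverse
                     complement, i.e. the insert was read all the way through
    "ok"           : read starts with the MID and does not end with its
                     reverse complement
    """
    read, mid = read.upper(), mid.upper()
    if not read.startswith(mid):
        return "no_mid"
    if read.endswith(reverse_complement(mid)):
        return "read_through"
    return "ok"
```

For example, `classify_by_mid(read, "ACGTAGCTAG")` with your own MID in place of the placeholder. Counting the three labels per run, and looking at the length of the "read_through" reads, gives a quick view of how much of the short-read peak comes from inserts that were sequenced end to end.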

                    Another thing we do is map the reads and check whether the mapped sequence is just the primer sequence. Such reads appear OK at first sight, but of course something is wrong with them. After using the chromatography and doing some minor tweaking, the distribution looks like what it should be.



                    • #11
                      Following hlu's suggestion (thanks, I did not realize the sff file had the complete reads), I dug into a run that had a profile similar to andpet's.

                      In this case the length is there, it is just low quality, i.e. the untrimmed lengths have a mode right around 500.
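To put a number on "the length is there, it is just low quality", one can compare untrimmed and trimmed lengths per read. The sketch below assumes you have already pulled the number of bases and the clip points out of the sff file by whatever means you prefer. As far as I understand the SFF format, clip points are 1-based and a value of 0 means "not set", and that is what the code assumes; the function names and the 50% cutoff are mine.

```python
def trimmed_length(n_bases, clip_qual_left, clip_qual_right,
                   clip_adapter_left=0, clip_adapter_right=0):
    """Length of the read after applying quality and adapter clip points.

    Assumes 1-based, inclusive clip positions where 0 means "no clip".
    """
    first = max(1, clip_qual_left, clip_adapter_left)
    last = min(clip_qual_right or n_bases, clip_adapter_right or n_bases)
    return max(0, last - first + 1)


def heavily_trimmed_fraction(records, max_kept=0.5):
    """Fraction of reads that keep less than max_kept of their untrimmed length.

    records is an iterable of (n_bases, clip_qual_left, clip_qual_right) tuples.
    """
    records = list(records)
    heavy = sum(
        1 for n_bases, left, right in records
        if trimmed_length(n_bases, left, right) < max_kept * n_bases
    )
    return heavy / len(records)
```

If most short reads turn out to be long reads that lost more than half their bases to quality trimming, the problem is signal quality rather than short template, which matches what is described in this post.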

                      Looking at some of the trimmed sequences, they are trimmed for a reason: low quality scores are evident. The question still is, though, why do we get reads with low quality?

                      Does anybody know how to dig out the signals for a particular read? I guess I have the pixel coordinates...



                      • #12
                        Hi everybody,
                        we recently upgraded to the Titanium system. What we observed is that the breaking step of the Large Volume emPCR can be tricky.
                        Sometimes we found some oil above the bead pellet (usually more visible at the end of the procedure, when the beads are transferred to a 1.5 mL tube).
                        In the Small Volume preparation (where the breaking is performed using the filters), which we ran to titrate the library, we didn't find oil on the beads.
                        The beads from both the Small and the Large preparation were sequenced, and we obtained good results for the "Small" ones (expected number of reads, median length 450 bp) and very bad results for the "Large" ones (40-50% of the expected reads, median length 260 bp).
                        The output graph of the "oily sample" was very similar to the one obtained by Andreas, while for the other beads it was as expected.
                        As the library used for both preparations was the same, we hypothesize that the oil could hamper something during the enrichment steps or during the sequencing reaction.



                        • #13
                          Hi Bia,

                          Thanks for sharing your emPCR experience. We have a problem with a very low percentage of reads passing the filter: about 25%. One question: was the percentage of filter-passed reads different or similar between your Large and Small emPCR results?



                          • #14
                            hlu,

                            what do you mean by primer dimers in the sequencing reaction? Wouldn't these be removed by SPRI during LC?



                            • #15
                              Originally posted by boss_hoss
                              hlu,

                              what do you mean by primer dimers in the sequencing reaction? Wouldn't these be removed by SPRI during LC?

                              Sometimes the reads were very short, about 40 bp, and consisted mostly of a pair of primer sequences.

                              This also happened in the past, in the GS FLX and GS 20 days.
