  • Titanium kit - short reads

    Hi,

    we upgraded to 454 Titanium in January 2009 and the first runs seemed to be fine (although we saw a small decline in average read length).

    Now we started a new Titanium sequencing kit, and all runs show massive amounts of short reads. The read length is almost equally distributed among the reads.

    I have attached a histogram of the read lengths:

    [IMG]ftp://genome.imb-jena.de/pub/andpet/titanium_read_length_histo.jpg[/IMG]


    Does anyone have an idea about this, or has anyone experienced the same before?

    Thanks a lot,

    Andreas

  • #2
    Hmm,

    inserting a picture does not work with FTP ... why not?

    Andreas



    • #3
      Andreas, try attaching it to the post rather than using the [IMG] tags. As far as I know, one cannot display in-line images over FTP.



      • #4
        Titanium kit - short reads

        Okay, here is the attached sequence length histogram ...

        Andreas
        Attached Files



        • #5
          What type of sequencing was it? Amplicon or shotgun?



          • #6
            Shotgun sequencing. We tried to sequence three insects and all runs showed the same read profile.

            We always measure the size distribution of our DNA before sequencing (Agilent Chip) and all three cases showed good results (500 - 2000 bp).

            Andreas



            • #7
              We had the same profile doing amplicon sequencing, and the amplicons were selected to be 200-300 bp (this was before we had the Titanium upgrade).

              After further analysing the results, we discovered that the graph profile was in fact a combination of a peak around 60 bp and the normal peak around 200-300 bp. The 60 bp peak was caused by primer dimers and priming mismatches. After removing everything from the dataset that had two short priming matches, the profile looked OK.

              In a later run the PCR was optimised, there were 95% fewer priming mismatches etc., and the profile was fine.

              When I look at your graph, it looks about 95% the same as the graph we had ... so I guess there must be something wrong with the emulsion PCR? Are there primer contaminations in the room, or something else that would cause primer dimers to appear?

              Are you doing de novo sequencing, or just a resequencing experiment to validate the technique? It might be useful to map the OK reads onto an existing genome and see if the 60 bp peak has a different 'mapping behaviour'. We have found that downstream analysis can greatly inform the upfront library preparation. It takes some time to analyse your data, or to write some scripts to automate it, but future experiments can then be QC'ed in an hour, and you can always use the results to optimise the upfront work.
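
              A screen like the one described above can be sketched roughly as follows; note that the primer sequences, the 10 bp seed length, and the two-match threshold here are made-up placeholders, not the actual values from our runs:

              ```python
              # Sketch of the QC screen described above: flag reads whose sequence
              # contains two or more (possibly partial) primer matches, the typical
              # signature of primer dimers and priming mismatches.
              # FWD_PRIMER / REV_PRIMER are hypothetical placeholders.

              FWD_PRIMER = "GCCTCCCTCGCGCCA"   # placeholder forward amplicon primer
              REV_PRIMER = "GCCTTGCCAGCCCGC"   # placeholder reverse amplicon primer

              def count_primer_matches(seq, primers=(FWD_PRIMER, REV_PRIMER), seed=10):
                  """Count occurrences of the first `seed` bases of each primer."""
                  n = 0
                  for p in primers:
                      probe = p[:seed]
                      start = 0
                      while True:
                          i = seq.find(probe, start)
                          if i == -1:
                              break
                          n += 1
                          start = i + 1
                  return n

              def screen_reads(reads):
                  """Split (name, seq) pairs into (kept, suspected_dimers)."""
                  kept, dimers = [], []
                  for name, seq in reads:
                      if count_primer_matches(seq.upper()) >= 2:
                          dimers.append((name, seq))
                      else:
                          kept.append((name, seq))
                  return kept, dimers
              ```

              Tune the seed length and threshold against a mapped subset first; a seed that is too short will flag genomic reads by chance.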



              • #8
                This looks like the shape of a poor Titanium run.

                For good Titanium shotgun runs, the peak length is usually at 500 bp and the average length is around 400 bp. The short-length region may have a hump, but should not have a big peak. In other words, the length distribution should show mainly one peak, around 500 bp.

                Besides the mapping method suggested above, I would use the output of the 454 pipeline's quality-filter metrics to judge the run.

                The run metrics print out a lot of information; particularly useful is the breakdown of raw reads, filtered reads, dots, mixed, etc. Use sffinfo (from the 454 off-instrument software package) to see whether the short reads were trimmed by the quality filter. Just run "sffinfo yourfile.sff" and you will get all the filtered reads with their non-trimmed full lengths.
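
                For example, once the untrimmed (and, separately, the trimmed) reads are dumped to FASTA, comparing the two length distributions is a few lines of scripting. A rough sketch, assuming plain FASTA input and an arbitrary 100 bp "short read" cutoff:

                ```python
                # Summarise a FASTA read set: count, median length, and the
                # fraction of short reads. Run it on the untrimmed and the
                # trimmed export to see whether the quality filter created
                # the short-read peak.
                from statistics import median

                def fasta_lengths(fasta_text):
                    """Return a list of sequence lengths from FASTA-formatted text."""
                    lengths, current = [], 0
                    for line in fasta_text.splitlines():
                        if line.startswith(">"):
                            if current:
                                lengths.append(current)
                            current = 0
                        else:
                            current += len(line.strip())
                    if current:
                        lengths.append(current)
                    return lengths

                def length_summary(lengths, short_cutoff=100):
                    """Median length and fraction of reads below `short_cutoff`."""
                    short = sum(1 for n in lengths if n < short_cutoff)
                    return {"n": len(lengths),
                            "median": median(lengths),
                            "frac_short": short / len(lengths)}
                ```

                If frac_short collapses on the untrimmed set, the length was there and the filter removed it (a quality problem); if it stays high, the short fragments were in the library to begin with.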

                Another easy way to tell whether something is wrong is to check the read count. Titanium should generate over one million filter-passed reads per run. If you don't get that number of reads, then many raw reads were rejected (and trimmed) by the quality filter, meaning something is wrong with the library, emPCR, or another upstream procedure.

                If the short reads are not the result of trimming and rejection by the quality filter, then the story may change: the sample may contain massive amounts of short fragments (primer dimers etc.), and the 454 machine would favour those short fragments and sequence them in huge amounts, in a high-quality manner.



                • #9
                  We are seeing some similar runs, intermittently.

                  Something else we tried, to rule out library-prep issues, was to run a custom filter template with <doPrimerTrimming>false</doPrimerTrimming> added inside <qualityFilter>.


                  This will let you see the B-adaptor sequence if it is present. We would expect library issues to show B-adaptors on short sequences.
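
                  As a sketch of that check (the adaptor sequence below is a placeholder, not the real Roche B adaptor; substitute your kit's actual sequence, and note the 12 bp seed and 100 bp insert cutoff are arbitrary choices):

                  ```python
                  # With primer trimming disabled, the B adaptor stays on the read,
                  # so a short insert should show the adaptor close to the read start.
                  # B_ADAPTOR is a placeholder; use your kit's actual sequence.

                  B_ADAPTOR = "CTGAGACTGCCAAGGCACACAGGG"  # placeholder sequence

                  def b_adaptor_positions(reads, adaptor=B_ADAPTOR, seed=12):
                      """Map read name -> position of the adaptor seed, or -1 if absent."""
                      probe = adaptor[:seed]
                      return {name: seq.upper().find(probe) for name, seq in reads}

                  def short_insert_reads(reads, max_insert=100, adaptor=B_ADAPTOR):
                      """Names of reads where the adaptor appears within `max_insert` bases."""
                      pos = b_adaptor_positions(reads, adaptor)
                      return [name for name, p in pos.items() if 0 <= p <= max_insert]
                  ```

                  A large short_insert_reads fraction points at the library (short fragments carried into emPCR) rather than at sequencing quality.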

                  joa_ds, what did you do for PCR optimisation?



                  • #10
                    Well, we did amplicon sequencing and had a PCR reaction prior to the emPCR, which of course amplified problems that were already there.

                    We added a purification step prior to the emPCR. I don't know the details, as I am not in the lab, but I know they are using some sort of chromatographic technique to remove short sequences 50-100 bp long. This simple technique improved our mapping efficiency from around 60% to around 95%.

                    We add MID tags, and detection of an MID is also something we use as a quality check to verify that sequences are OK. We expect very short sequences to have a reverse complement of the MID at the end of the sequence.

                    Another thing we do is map the reads and check whether the mapped sequence is just the primer sequence; such reads appear OK at first sight, but something is of course wrong. After using the chromatography and doing some minor tweaking, the distribution looks like what it should be.
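
                    The MID checks above can be sketched like this (the 10-mer MID is just an example tag, and the 30 bp tail window is an arbitrary choice):

                    ```python
                    # Sketch of the MID quality check: a normal read starts with its
                    # MID; a very short insert reads through into the opposite
                    # adaptor, so the reverse complement of the MID also shows up
                    # near the 3' end.

                    _COMP = str.maketrans("ACGTN", "TGCAN")

                    def revcomp(seq):
                        """Reverse complement of an uppercase DNA sequence."""
                        return seq.translate(_COMP)[::-1]

                    def mid_checks(seq, mid, tail=30):
                        """Return (starts_with_mid, has_rc_mid_near_end)."""
                        seq = seq.upper()
                        return (seq.startswith(mid), revcomp(mid) in seq[-tail:])
                    ```

                    Reads that come back (True, True) are the read-through candidates worth excluding before mapping.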



                    • #11
                      Following hlu's suggestion (thanks, I did not realize the SFF file had the complete reads), I dug into a run that had a similar profile to andpet's.

                      In this case the length is there, it is just low quality; i.e., the untrimmed lengths have a mode right around 500.

                      Looking at some of the trimmed sequences, they are trimmed for a reason: low quality scores are evident. The question still is, though, why do we get reads with low quality?

                      Does anybody know how to dig out the signals for a particular read? I guess I have the pixel coordinates ...
                      Attached Files



                      • #12
                        Hi everybody,
                        we recently upgraded to the Titanium system. What we observed is that the breaking step of the Large Volume emPCR can be tricky.
                        Sometimes we found some oil above the bead pellet (usually more visible at the end of the procedure, when the beads are transferred to a 1.5 mL tube).
                        In the Small Volume preparation (where the breaking is performed using the filters), done to titrate the library, we didn't find oil on the beads.
                        Beads from both the Small and the Large preparation were sequenced, and we obtained good results for the "Small" beads (expected number of reads, median length 450 bp) and very bad results for the "Large" ones (40-50% of the expected reads, median length 260 bp).
                        The output graph of the "oily sample" was very similar to that obtained by Andreas, while for the other beads it was as expected.
                        As the library used for both preparations was the same, we hypothesize the oil could hamper something during the enrichment steps or during the sequencing reaction.



                        • #13
                          Hi Bia,

                          Thanks for sharing your emPCR experience. We have a problem of too low a percentage of filter-passed reads -- about 25%. One question: was the percentage of filter-passed reads different or similar between your Large and Small emPCR results?



                          • #14
                            hlu,

                            What do you mean by primer dimers in the sequencing reaction? Wouldn't these be removed by SPRI during LC?



                            • #15
                              Originally posted by boss_hoss View Post
                              hlu,

                              What do you mean by primer dimers in the sequencing reaction? Wouldn't these be removed by SPRI during LC?

                              Sometimes the content of the reads was very short, around 40 bp, mostly formed from a pair of primer sequences.

                              This happened in the past, in GS-FLX and GS-20 times.

