  • litali
    Member
    • Jul 2010
    • 78

    singletons + contigs for 454 data

    Hi,
    I would like to create one file that includes both the singletons and the contigs. Is there a way to create such a file using the 454 software, or do I need a script that extracts the singleton names from the read status file, pulls the corresponding sequences from the FASTA file, and adds them to the contigs file?
    Thanks a lot!
  • Torst
    Senior Member
    • Apr 2008
    • 275

    #2
    I don't know of any script. I am replying to remind you that 454 sometimes uses only PART of a read (the "left" end) and then lists the "right" end in 454ReadStatus.txt under the name "XXXXXX_right" with status "Singleton", so you'll need to think about what you want to do with those, e.g.:

    % grep Singleton 454ReadStatus.txt
    GHFU8EI02CHPMJ_left Singleton
    GHFU8EI02B9FPY Singleton
    GHFU8EI02CJNN4_right Singleton
    GHFU8EI02CA0E9 Singleton
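
One possible way to deal with the _left/_right entries is to strip the suffix and keep each read accession once, so the list refers to whole reads. The sketch below assumes the accession is in the first whitespace-separated column and the status in the second, as in the grep output above; `singleton_accessions` is a hypothetical helper name, not part of the 454 software, and whether stripping the suffix is what you want depends on your data.

```shell
# Hypothetical helper (not a 454 tool): print the unique accessions of all
# reads marked "Singleton", with any _left/_right suffix removed.
# Assumes column 1 = accession, column 2 = read status.
singleton_accessions() {
    awk '$2 == "Singleton" {
             sub(/_(left|right)$/, "", $1)   # drop the split/paired suffix
             print $1
         }' "$1" | LC_ALL=C sort -u          # one line per read
}
```

Usage would be along the lines of `singleton_accessions 454ReadStatus.txt > singleton_ids.txt`.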


    To get the .FASTA sequences from the .SFF file, you'll need to use "sffinfo":

    % sffinfo -seq file.sff > file.fasta
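
If you go the route of dumping every read to FASTA first, the singletons still have to be picked out of that file. A minimal sketch of that filtering step follows, assuming each FASTA header starts with `>` followed directly by the read accession and that the wanted accessions are listed one per line in a separate file; `filter_fasta_by_ids` is a hypothetical helper name.

```shell
# Hypothetical helper: print only the FASTA records whose ID (the first word
# after '>') appears in the ID file (one ID per line).
# Usage: filter_fasta_by_ids ids.txt reads.fasta
filter_fasta_by_ids() {
    awk '
        FILENAME == ARGV[1] { if (NF) wanted[$1] = 1; next }  # load the ID list
        /^>/ { id = substr($1, 2); keep = (id in wanted) }    # new record: check its ID
        keep { print }                                        # header and sequence lines
    ' "$1" "$2"
}
```

For example, `filter_fasta_by_ids singleton_ids.txt file.fasta > singletons.fasta`, where `singleton_ids.txt` is an assumed file of singleton accessions.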

    Also, if you did paired end sequencing, the 454Scaffolds.fna file does NOT CONTAIN those contigs in 454Contigs.fna which failed to scaffold.

    • westerman
      Rick Westerman
      • Jun 2008
      • 1104

      #3
      I thought that _left and _right should arise from paired end reads and not from split reads.

      Basically what I do is to:

      1) Grab the reads of choice from the 454ReadStatus.txt file and, optionally, the 454TrimStatus file

      2) Use sfffile to create a temporary sff file with just those reads

      3) Use sffinfo to extract the sequences.

      The rough steps are:

      grep -F "$(printf '\tSingleton')" 454ReadStatus.txt > /tmp/Singleton.tmp

      sfffile -o /tmp/Singleton.sff /tmp/Singleton.tmp mysff.sff

      sffinfo -s /tmp/Singleton.sff > Singleton.tfa
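
To end up with the single file asked for in the first post, the singleton FASTA then has to be joined with the contigs FASTA. A plain `cat` usually does it; the sketch below is a slightly more defensive variant that also copes with a file whose last line lacks a newline. `merge_fasta` is a hypothetical helper name, and the file names in the usage line are assumptions.

```shell
# Hypothetical helper: concatenate FASTA files to stdout.
# Passing each file through awk guarantees every line, including the last
# one, ends with a newline, so records from different files never run together.
merge_fasta() {
    for f in "$@"; do
        awk '{ print }' "$f"
    done
}
```

Something like `merge_fasta 454Contigs.fna Singleton.tfa > contigs_plus_singletons.fna` would then give one combined file.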

      • flxlex
        Moderator
        • Nov 2008
        • 412

        #4
        Originally posted by Torst:
        Also, if you did paired end sequencing, the 454Scaffolds.fna file does NOT CONTAIN those contigs in 454Contigs.fna which failed to scaffold.
        That is not entirely true: 454Scaffolds.txt contains the scaffolds (at least two contigs with gap(s)) AND all unscaffolded contigs of at least 2kb. IMO they shouldn't have done that, but rather outputted a separate unscaffolded-contig file...

        Originally posted by westerman:
        sfffile -o /tmp/Singleton.sff /tmp/Singleton.tmp mysff.sff
        I guess you mean

        Code:
        sfffile -o /tmp/Singleton.sff -i /tmp/Singleton.tmp mysff.sff
        (note the '-i')

        • Torst
          Senior Member
          • Apr 2008
          • 275

          #5
          flxlex,

          Originally posted by flxlex:
          That is not entirely true: 454Scaffolds.txt contains the scaffolds (at least two contigs with gap(s)) AND all unscaffolded contigs of at least 2kb. IMO they shouldn't have done that, but rather outputted a separate unscaffolded-contig file...
          Hmm, it appears you are correct. Thank you for replying! I had not noticed the "1 contig scaffolds" because, like you said, it is inconsistent and they get renamed to "scaffoldNNNNNN", but yes, when I examine 454Scaffolds.txt I can see many scaffolds which are made up of 1 contig only. I find it hard to accept they would use a different threshold for "contigs becoming scaffolds" and "large contigs", and NOT output the separate unscaffolded contigs file too.

          Also, you suggest the cut-off is 2kbp, but in my example 10 of the 22 contigs are between 1356bp and 1870bp, which suggests maybe the cutoff is 1kbp?
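
A quick way to check the cutoff empirically is to list the scaffolds that consist of a single contig together with that contig's length; the shortest one gives an upper bound on the threshold actually used. The sketch below assumes 454Scaffolds.txt is tab-separated in an AGP-style layout (scaffold name in column 1, component type `W` for contigs in column 5, contig name in column 6, contig start and end in columns 7 and 8); `single_contig_scaffolds` is a hypothetical helper name.

```shell
# Hypothetical helper: list scaffolds made of exactly one contig, with the
# contig name and its length, sorted by length (shortest first).
# Assumes an AGP-style, tab-separated file where contig lines have "W" in
# column 5 and gap lines have something else (e.g. "N").
single_contig_scaffolds() {
    awk -F'\t' '
        $5 == "W" { n[$1]++; contig[$1] = $6; len[$1] = $8 - $7 + 1 }
        END {
            for (s in n)
                if (n[s] == 1) print s "\t" contig[s] "\t" len[s]
        }
    ' "$1" | LC_ALL=C sort -t "$(printf '\t')" -k3,3n
}
```

Running `single_contig_scaffolds 454Scaffolds.txt | head -n 1` would show the shortest unscaffolded contig that was still written out as a scaffold.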

          Either way - thank you muchly for catching my error!

          • westerman
            Rick Westerman
            • Jun 2008
            • 1104

            #6
            Originally posted by flxlex:

            I guess you mean

            Code:
            sfffile -o /tmp/Singleton.sff -i /tmp/Singleton.tmp mysff.sff
            (note the '-i')
            Yes, that is what I get for pulling the code out of a script that I use instead of typing it in directly. Thanks for the correction.
