  • anusha
    Junior Member
    • Jan 2010
    • 6

    Trimming or filtering the data from SOLiD

    Hi ..

    I have a SOLiD csfasta file, and many of the reads contain dots. Is there any way I can trim or filter the data in Bowtie, Maq, or another tool that you could suggest?
    Here is a sample of the data.

    >1_17_228_F3
    T.1002.312..0..001.0001.2....1..1..022..0....00... 0
    >1_17_469_F3
    T.1300.333..0..020.1020.0....0..1..020..3....01... 3
    >1_17_578_F3
    T.1002.321..0..021.0230.0....1..2..122..1....11... 2
    >1_17_581_F3
    T.1002.322..0..021.0011.0....2..1..021..2....20... 1
    >1_112_1012_F3
    T.0300.3330.110021010300000000022130200.31.1.01.0. 3
    >1_112_1459_F3
    T.1002.3201.002000003212002001021200223.10.3.30.2. 3
    >1_113_39_F3
    T.1000.3330.000001000010100000121000220.10.0.01.0. 0
    >1_113_233_F3
    T.1001.3211.012000000110102021021120220.00.0.31.2. 0
    >1_113_329_F3
    T.1300.3330.011220010200000000122130200.01.0.01.0. 3
    >1_113_835_F3
    T.1300.3330.012220010200002022122230202.01.1.01.2. 3

    Thanks
    Anu
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    SOLiD raw data is usually returned more liberally than, say, Illumina data, so you will notice more reads with dots, etc. You could use an aligner that can handle the dots (such as BFAST; I am the author of BFAST). I suspect the vast majority of your reads don't have dots. As for primer sequence, programs like BFAST will align those as an insertion at the end of the read.
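
One way to check the suspicion that most reads have no dots is to count them directly. The following is a minimal sketch, not part of BFAST or any other tool mentioned here; the function names `parse_csfasta` and `dot_fraction` are made up for illustration. It assumes each read is a `>name` line followed by one line of color calls, that lines starting with `#` are comments, and that any whitespace inside a read line (as in the pasted sample) is a display artifact that can be dropped.

```python
def parse_csfasta(lines):
    """Yield (name, colors) pairs from an iterable of csfasta lines."""
    name = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        if line.startswith(">"):
            name = line[1:]
        elif name is not None:
            # Assumption: embedded whitespace is not part of the read.
            yield name, "".join(line.split())
            name = None


def dot_fraction(records):
    """Return the fraction of reads containing at least one '.' call."""
    total = 0
    with_dots = 0
    for _name, colors in records:
        total += 1
        if "." in colors:
            with_dots += 1
    return with_dots / total if total else 0.0
```

For a real file, something like `dot_fraction(parse_csfasta(open("reads.csfasta")))` would give the proportion, where `reads.csfasta` is a placeholder file name.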

    • hersh
      Junior Member
      • Feb 2009
      • 9

      #3
      You can use the csfasta quality filter, which is available on solidsoftwaretools under the de novo accessory tools.

      • av_d
        Member
        • Sep 2009
        • 12

        #4
        You can use the SOLiD™ Accuracy Enhancement Tool (SAET).

        • carmeyeii
          Senior Member
          • Mar 2011
          • 137

          #5
          SAET does not only delete the reads that fail the threshold or trim the bases that do; it also builds some sort of consensus of the reads, which you might not want if you need to know that what is there is exactly what came out of the machine.

          I recommend q-trim from the Cartwright lab; it is written in Python.
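
For readers who only want to drop or shorten reads with dots, without any consensus step, the idea can be sketched in a few lines of Python. This is not how q-trim, SAET, or the csfasta quality filter work; it is a hypothetical illustration, and the names `trim_at_first_dot` and `filter_reads` and the parameters `max_dots`, `min_len`, and `trim` are assumptions chosen for the example. Reads are taken as `(name, colors)` pairs whose first character is the primer base. Trimming is done only at the 3' end, because removing colors from the start of a color-space read would break decoding.

```python
def trim_at_first_dot(colors):
    """Cut a color-space read just before its first '.' call.

    The leading primer base (e.g. 'T') is always kept.
    """
    if not colors:
        return colors
    primer, calls = colors[0], colors[1:]
    cut = calls.find(".")
    return colors if cut == -1 else primer + calls[:cut]


def filter_reads(records, max_dots=0, min_len=25, trim=False):
    """Yield (name, colors) pairs that pass a simple dot filter.

    max_dots: maximum number of '.' calls allowed in a kept read.
    min_len:  minimum number of color calls, excluding the primer base.
    trim:     if True, truncate each read at its first '.' before filtering.
    """
    for name, colors in records:
        if trim:
            colors = trim_at_first_dot(colors)
        n_calls = max(len(colors) - 1, 0)  # exclude the primer base
        if colors.count(".") <= max_dots and n_calls >= min_len:
            yield name, colors
```

If a matching quality file accompanies the csfasta file, it would need the same reads removed and the same positions trimmed so the two stay in sync; that step is not shown here.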
