  • #16
    If you are working on Linux, you can use awk as follows:

    awk '$12 ~ /Y/{print "@"$1"_000"$2":"$3":"$4":"$5":"$6"#"$7"/"$8"\n"$9"\n+"$1"_000"$2":"$3":"$4":"$5":"$6"#"$7"/"$8"\n"$10}' s_1_export.txt > s_1_sequence.txt
    Veronica Jimenez Jacinto
    UUSM.



    • #17
      Dear All,
      I am a new user and I am analyzing Illumina NGS data. I downloaded Bowtie on a 32-bit Linux system for reference-based assembly. I successfully followed its tutorial for aligning the example data provided in the software folder, but I am stuck at the SAMtools step for visualizing the alignment. Could someone please help me get past that step? I think I did not compile SAMtools correctly. Could you please provide a ready-to-run compiled version of SAMtools for a 32-bit SUSE Linux system? I would be highly obliged. My email address for correspondence is ([email protected]).
      Thanks all, and sorry if my question is too silly, as I am a new user of Bowtie.

      Asif



      • #18
        Originally posted by kwebb
        Hi

        I'm trying to work through some of the various assembler programs before actually collecting my own Illumina data. I've found some test datasets here:



        but I'm not sure if the file formats are the same as raw data from the Genome Analyzer.

        The files are s_4_seq.txt and s_4_prb.txt and the first few lines look like this:
        s_4_seq.txt
        4 1 56 910 AACTTACAATTGAAAATATAAACTCAT
        4 1 64 716 AAGATGATTATATGTCTTCCTTTTCGA
        4 1 890 894 TCAAACCAATCAGACCTATGTTTCATA

        s_4_prb.txt
        40 -40 -40 -40 40 -40 -40 -40 -40 40 -40 -40 -40 -4
        0 -40 40 -40 -40 -40 40 40 -40 -40 -40 -40 40 -40
        -40 40 -40 -40 -40 40 -40 -40 -40 -40 -40 -40 40

        So my questions are
        1. Is this the raw data format from the machine?
        2. How do I get these files into fastq format? The maq converter and sanger perl scripts previously mentioned do not seem to work.

        Thank you!
        Hi,

        I am facing the same format in my own Illumina sequencing files as the one you have shown here. Could you please provide a Perl script for converting such data into FASTA or FASTQ format? I would be highly obliged for any guidance from your side. My email address for correspondence is (asifullah111"gmail.com).

        regards
        asif
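
For the s_4_seq.txt / s_4_prb.txt pair quoted above, a conversion could be sketched in Python as below. This is only a sketch under assumptions not confirmed in this thread: that each prb line holds four Solexa-scaled scores per cycle (in A, C, G, T order), that the quality of the called base is the largest of the four, and that a "." in the seq file marks an uncalled base. The function names are made up for illustration.

```python
import math


def solexa_to_phred(q_solexa):
    """Convert a Solexa-scaled score to a Phred score (standard log transform)."""
    return int(round(10 * math.log10(10 ** (q_solexa / 10.0) + 1)))


def seq_prb_to_fastq(seq_line, prb_line):
    """Build one Sanger-style FASTQ record from matching seq and prb lines.

    Assumes seq_line is 'lane tile x y bases' and prb_line holds
    4 integers per base (hypothetical layout, see lead-in).
    """
    lane, tile, x, y, bases = seq_line.split()
    scores = [int(v) for v in prb_line.split()]
    if len(scores) != 4 * len(bases):
        raise ValueError("expected %d scores, got %d" % (4 * len(bases), len(scores)))

    quals = []
    for i in range(len(bases)):
        # Take the best of the four per-cycle scores as the base quality.
        best = max(scores[4 * i:4 * i + 4])
        quals.append(chr(solexa_to_phred(best) + 33))  # Phred+33 encoding

    name = "_".join([lane, tile, x, y])
    bases = bases.replace(".", "N")  # assumed marker for uncalled bases
    return "@%s\n%s\n+\n%s\n" % (name, bases, "".join(quals))
```

To convert whole files, the two inputs can be read in parallel, for example with `zip(open("s_4_seq.txt"), open("s_4_prb.txt"))`, writing each returned record to the output file.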



        • #19
          Originally posted by vjimenez
          If you are working on Linux, you can use awk as follows:

          awk '$12 ~ /Y/{print "@"$1"_000"$2":"$3":"$4":"$5":"$6"#"$7"/"$8"\n"$9"\n+"$1"_000"$2":"$3":"$4":"$5":"$6"#"$7"/"$8"\n"$10}' s_1_export.txt > s_1_sequence.txt
          Just to clarify, is this meant to convert the SCARF ASCII format mentioned above? Is any quality trimming done? I ask because I got a file that was smaller than I expected: I started with a file that has 43,236,910 reads and ended up with a file that has 80,81,040 lines. Here is a sample of the input, which I take to be the same format as in the post above:
          HWI-EAS393 0031 5 1 1295 9710 0 3 AGACGTGTGTCTGAGTAAGGAACCCGCGGGGAAGGG ]PLLPU\]Z_`^`L`aL^`LYb^bbc`^^cH``TL^ c10.fa 130687332 F 3A26T3T1 70 188 128 R Y
          Last edited by husamia; 08-25-2010, 11:44 AM.
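
One quick sanity check on the size: FASTQ uses four lines per read, so if every read were kept, 43,236,910 reads would give 172,947,640 lines. A minimal Python sketch for counting records follows; the function name is hypothetical, and it assumes an unwrapped FASTQ file with exactly four lines per record.

```python
def count_fastq_records(lines):
    """Count reads in an iterable of FASTQ lines, assuming 4 lines per read."""
    total = sum(1 for _ in lines)
    if total % 4 != 0:
        raise ValueError("line count %d is not a multiple of 4" % total)
    return total // 4


# Line count expected if all 43,236,910 reads had been written out.
EXPECTED_LINES = 4 * 43236910

# Usage (file name taken from the awk command above):
# with open("s_1_sequence.txt") as handle:
#     print(count_fastq_records(handle))
```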



          • #20
            The awk line only outputs sequences with Y in the 12th (QC??) field. If you want all sequences in fastq output, you can do

            awk '{print "@"$1"_000"$2":"$3":"$4":"$5":"$6"#"$7"/"$8"\n"$9"\n+"$1"_000"$2":"$3":"$4":"$5":"$6"#"$7"/"$8"\n"$10}' s_1_export.txt > s_1_sequence.txt

            Caveat: I don't know awk, but the output seems correct.
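
The same conversion can be sketched in Python, splitting strictly on tabs. This assumes the export file is tab-separated with 22 columns and the pass-filter flag (Y/N) in the last one; that layout is an assumption, not something confirmed in this thread. If it holds, awk's default whitespace splitting would merge empty columns and shift the field numbers, so $12 may not always be the filter flag. The function name and the `only_passing` parameter are invented for illustration; the read name simply follows the awk command above, and the quality string is copied unchanged.

```python
def export_line_to_fastq(line, only_passing=False):
    """Turn one tab-separated export line into a FASTQ record.

    Returns None when only_passing is set and the last column is not 'Y'.
    Column layout is assumed (see lead-in), not taken from a specification.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("expected at least 10 tab-separated columns")

    # Assumed: the pass-filter flag is the final column of the line.
    if only_passing and fields[-1] != "Y":
        return None

    machine, run, lane, tile, x, y, index, read_no, bases, qual = fields[:10]
    # Same read name as the awk one-liner: $1"_000"$2":"$3":"$4":"$5":"$6"#"$7"/"$8
    name = "%s_000%s:%s:%s:%s:%s#%s/%s" % (
        machine, run, lane, tile, x, y, index, read_no)
    return "@%s\n%s\n+%s\n%s\n" % (name, bases, name, qual)
```

Calling it with `only_passing=True` corresponds to the filtered command from the earlier post, and the default corresponds to the unfiltered one.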



            • #21
              Originally posted by alig
              To lparsons,

              Thank you. Yes I realised that later after I'd sent my post.

              Also in case anyone else is looking to separate a fastq file into seq.fasta & qual.fasta files you actually need the other command within Maq

              fq_all2std.pl std2qual <out.prefix> <in.fastq>

              Thanks again

              alig
              Hi,

              I need to convert Illumina files into .seq and .qual files for Phrap. I am unable to find the newest version of "fq_all2std.pl" that includes the "std2qual" command. Is there any other program that would convert the Illumina quality characters into Phred qualities?

              Thanks,
              Charu
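
One possible route without fq_all2std.pl is a short Python sketch like the one below. It assumes the quality characters are Phred scores stored with an ASCII offset of 64; Sanger-style FASTQ uses an offset of 33, and older Solexa-scaled files would need an extra conversion step. The function names are hypothetical, and the output is the plain FASTA-style sequence and quality text that a .seq/.qual pair usually contains.

```python
def quality_to_phred(qual, offset=64):
    """Decode a quality string into integer scores (offset 64 is an assumption)."""
    return [ord(ch) - offset for ch in qual]


def record_to_seq_and_qual(name, bases, qual, offset=64):
    """Return (sequence_text, quality_text) for one read.

    Both are FASTA-style entries sharing the same header; the quality
    entry lists one space-separated integer per base.
    """
    scores = quality_to_phred(qual, offset)
    if len(scores) != len(bases):
        raise ValueError("sequence and quality lengths differ")
    seq_text = ">%s\n%s\n" % (name, bases)
    qual_text = ">%s\n%s\n" % (name, " ".join(str(s) for s in scores))
    return seq_text, qual_text
```

Writing the first element of each result to the .seq file and the second to the .qual file keeps the two files in the same read order.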



              • #22
                fq_all2std

                Hi,

                The "std2qual" command is part of the Perl script "fq_all2std.pl", which comes with maq-0.7.1.

                thanks

                ali
