  • albrown415
    Junior Member
    • Jun 2010
    • 8

    FASTQ to SAM conversion

    What is the best program to use for converting fastq (or eland extended) files to SAM format? Thanks!
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    Originally posted by albrown415:
    What is the best program to use for converting fastq (or eland extended) files to SAM format? Thanks!
    Do you want to align the data first, or just represent the FASTQ data in the SAM format? Is the data paired end (mate-pair)?

    • albrown415
      Junior Member
      • Jun 2010
      • 8

      #3
      You're right. This was a silly question. Most people map the reads at this point and choose an alignment program that outputs the data into the proper format. I'm working to get Bowtie running on my computer, which I believe should be able to input fastq and output SAM.

      • maubp
        Peter (Biopython etc)
        • Jul 2009
        • 1544

        #4
        If you did want to convert the FASTQ to an unaligned SAM or BAM file, try Picard's FastqToSam.

        • albrown415
          Junior Member
          • Jun 2010
          • 8

          #5
          Thanks. That's great to know.

          • ebatis2
            Junior Member
            • Jan 2012
            • 1

            #6
            Hi,

            I've received results from my first NGS run and like you posted, I'd like to convert my fastq file to a SAM file in order to upload and retrieve data from Galaxy. I'll need to map the reads and align them to the Rice genome...but is this something I could do on my MacOSX? I'm at a loss as far as how to retrieve the sequencing results! Any help would be greatly appreciated!!

            • husamia
              Member
              • Apr 2010
              • 66

              #7
              Originally posted by ebatis2:
              Hi,

              I've received results from my first NGS run and like you posted, I'd like to convert my fastq file to a SAM file in order to upload and retrieve data from Galaxy. I'll need to map the reads and align them to the Rice genome...but is this something I could do on my MacOSX? I'm at a loss as far as how to retrieve the sequencing results! Any help would be greatly appreciated!!
              FASTQ holds the raw reads with their qualities. SAM is a format that describes reads and their alignments, as hinted in the responses above. If you have only just received the raw FASTQ, you need to align the reads first, so perhaps you should be asking how to align your reads. I can't help much because I am not sure what your goal is. Galaxy has a tutorial on how to align your reads and produce a SAM file. Check it out.
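
To make the "raw reads with qualities" part concrete, here is a minimal Python sketch of reading FASTQ records. It assumes the simple four-lines-per-read layout and Phred+33 quality encoding; neither is confirmed for the files discussed in this thread, and the names `FastqRecord`, `read_fastq`, and `phred_scores` are made up for illustration. Note that nothing in a record says where the read belongs on a genome, which is the information an aligner adds when it writes SAM.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str       # read identifier, without the leading "@"
    sequence: str   # base calls
    quality: str    # one ASCII-encoded quality character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-lines-per-read FASTQ stream (assumed layout)."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("not a four-line FASTQ record: " + header.strip())
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch: " + header.strip())
        yield FastqRecord(header[1:].strip(), sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode quality characters; offset 33 is an assumption, not a given."""
    return [ord(ch) - offset for ch in quality]


# Tiny in-memory example instead of a real file.
example = io.StringIO("@read1\nACGT\n+\nII5!\n")
for record in read_fastq(example):
    print(record.name, record.sequence, phred_scores(record.quality))
```

Older Illumina and Solexa files use different quality offsets or scales, so the decoding step would need adjusting for them.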

              • amolkolte
                Junior Member
                • Dec 2012
                • 8

                #8
                Originally posted by husamia:
                FASTQ holds the raw reads with their qualities. SAM is a format that describes reads and their alignments, as hinted in the responses above. If you have only just received the raw FASTQ, you need to align the reads first, so perhaps you should be asking how to align your reads. I can't help much because I am not sure what your goal is. Galaxy has a tutorial on how to align your reads and produce a SAM file. Check it out.
                I have the raw FASTQ reads, and to perform de novo assembly with Trans-ABySS I need to supply the input as BAM or SAM. Can you please shed some light on this?

                • TheRob
                  Junior Member
                  • Feb 2010
                  • 1

                  #9
                  Hi amolkolte,

                  It is rather unusual for an assembly program to accept SAM/BAM input but not FASTQ. I suspect it accepts FASTQ, but I don't have any experience with Trans-ABySS. Anyway, the only tool I know that will do the job (short of an awk or Perl script, which can be dangerous) was mentioned above by maubp: FastqToSam.

                  Why are you using transAbyss exactly?

                  • Stroehli
                    Junior Member
                    • Jan 2011
                    • 6

                    #10
                    Hi,
                    I think you cannot run Trans-ABySS on its own. From a quick look at the manual (http://www.bcgsc.ca/downloads/trans-...v1.2.0.doc.pdf), it seems you probably need to run ABySS first (see "Data Preparation" in the Workflow on page 7). You might also have to install all the external software mentioned in "Installation, 2. External Software" (page 5). ABySS will produce contigs (.fa), and the other aligners will produce the SAM/BAM files for you, so you don't have to convert anything, if I got that right. Hope it helps.

                    Cheers,
                    Stroehli
                    MSc Bioinformatics student at the Free University Berlin, Germany

                    • amolkolte
                      Junior Member
                      • Dec 2012
                      • 8

                      #11
                      Thanks, TheRob and Stroehli!

                      TheRob - I was using Trans-ABySS for de novo assembly, since I don't have a concrete reference to begin with.

                      Stroehli - I have used ABySS to assemble the contigs. Thank you.

                      • kurban910
                        Member
                        • Jul 2014
                        • 58

                        #12
                        fastq to sam

                        I have a raw-read dataset in FASTQ format, and I want to use it to find SNPs in our transcriptome data. From what I have read, I can do this with SAMtools and SOAPsnp. Is that right? And before I use them, do I need to convert my raw reads from FASTQ to SAM format?
                        I installed Java, SAMtools, and Picard tools on Ubuntu 12.04 (I mention this because I am new to Linux, so any suggestions would be appreciated), and then ran this command in the terminal:
                        java -Xmx2g -jar FastqToSam.jar FASTQ=CD_ATGTCA_L007_R1_001.fastq.gz FASTQ2=CD_ATGTCA_L007_R2_001.fastq.gz OUTPUT=outputfile.sam PREDICTED_INSERT_SIZE=null QUALITY_FORMAT=Solexa SAMPLE_NAME=file4

                        Then I got this:
                        Error: Unable to access jarfile FastqToSam.jar

                        I do not know what is going on.
                        I guess many people here have done this before, so could anyone please share their knowledge?

                        • ajagannath.patro
                          Junior Member
                          • Jul 2014
                          • 3

                          #13
                          To access the jar, try giving the complete path to the location where it is installed. That should work.
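
A small Python sketch of that advice: resolve the jar to an absolute path, fail early if it is not there, and only then build the command. The function name `fastq_to_sam_command` and any install location you pass in are hypothetical; the `FASTQ=`, `FASTQ2=`, `OUTPUT=`, and `SAMPLE_NAME=` arguments are the ones from the command posted above. The function only builds the argument list and does not run Java.

```python
import os
from typing import List, Optional


def fastq_to_sam_command(jar_path: str, fastq1: str, output: str,
                         sample_name: str,
                         fastq2: Optional[str] = None) -> List[str]:
    """Build a FastqToSam command line using an absolute path to the jar."""
    jar = os.path.abspath(os.path.expanduser(jar_path))
    if not os.path.isfile(jar):
        # Same situation as "Unable to access jarfile", caught before Java runs.
        raise FileNotFoundError("jar not found: " + jar)
    command = ["java", "-Xmx2g", "-jar", jar, "FASTQ=" + fastq1]
    if fastq2:
        command.append("FASTQ2=" + fastq2)  # second mate file for paired reads
    command += ["OUTPUT=" + output, "SAMPLE_NAME=" + sample_name]
    return command
```

The returned list could then be handed to `subprocess.run(command, check=True)`. Java looks for a relative jar name in the current working directory, which is why the bare `FastqToSam.jar` fails when the terminal is somewhere else.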

                          • WhatsOEver
                            Senior Member
                            • Apr 2012
                            • 215

                            #14
                            Originally posted by kurban910:
                            I have a raw-read dataset in FASTQ format, and I want to use it to find SNPs in our transcriptome data. From what I have read, I can do this with SAMtools and SOAPsnp. Is that right? And before I use them, do I need to convert my raw reads from FASTQ to SAM format?
                            I installed Java, SAMtools, and Picard tools on Ubuntu 12.04 (I mention this because I am new to Linux, so any suggestions would be appreciated), and then ran this command in the terminal:
                            java -Xmx2g -jar FastqToSam.jar FASTQ=CD_ATGTCA_L007_R1_001.fastq.gz FASTQ2=CD_ATGTCA_L007_R2_001.fastq.gz OUTPUT=outputfile.sam PREDICTED_INSERT_SIZE=null QUALITY_FORMAT=Solexa SAMPLE_NAME=file4

                            Then I got this:
                            Error: Unable to access jarfile FastqToSam.jar

                            I do not know what is going on.
                            I guess many people here have done this before, so could anyone please share their knowledge?
                            SAM is the abbreviation for Sequence Alignment/Map format, which tells you that it should contain aligned/mapped reads. Though it is possible to create a kind of unmapped SAM file from FASTQ, that will not help answer your question.
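
For reference, this is roughly what such an unmapped SAM record looks like, as a Python sketch. The function name `unaligned_sam_line` is made up for illustration, and the quality string is copied as-is, which assumes the FASTQ already uses the Phred+33 encoding that SAM expects. It shows the format only and is not a replacement for FastqToSam, which also writes header and read-group information.

```python
from typing import Optional

# SAM flag bits relevant to unaligned reads (values from the SAM specification).
PAIRED = 0x1
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def unaligned_sam_line(name: str, sequence: str, quality: str,
                       mate: Optional[int] = None) -> str:
    """Format one read as an unmapped SAM record (11 mandatory fields)."""
    if mate not in (None, 1, 2):
        raise ValueError("mate must be None, 1, or 2")
    flag = UNMAPPED
    if mate is not None:
        flag |= PAIRED | MATE_UNMAPPED
        flag |= FIRST_IN_PAIR if mate == 1 else SECOND_IN_PAIR
    fields = [
        name,        # QNAME
        str(flag),   # FLAG
        "*",         # RNAME: no reference sequence
        "0",         # POS: no position
        "0",         # MAPQ
        "*",         # CIGAR: no alignment
        "*",         # RNEXT
        "0",         # PNEXT
        "0",         # TLEN
        sequence,    # SEQ
        quality,     # QUAL, assumed Phred+33 already
    ]
    return "\t".join(fields)


print(unaligned_sam_line("read1", "ACGT", "IIII"))
print(unaligned_sam_line("pair1", "ACGT", "IIII", mate=1))
print(unaligned_sam_line("pair1", "TTGA", "IIII", mate=2))
```

Every alignment field is a placeholder (`*` or `0`), so a SNP caller has nothing to work with until the reads have actually been mapped.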

                            My suggestion: Make yourself familiar with read alignment via TopHat (the software is here: http://ccb.jhu.edu/software/tophat/tutorial.shtml; the paper is here: http://www.nature.com/nprot/journal/....2012.016.html), with samtools in general (I suggest Dave Tang's brief wiki: http://davetang.org/wiki/tiki-index.php?page=SAMTools), and with samtools mpileup in particular (http://samtools.sourceforge.net/mpileup.shtml).

                            • dpryan
                              Devon Ryan
                              • Jul 2011
                              • 3478

                              #15
                              Since you mention SNP calling, you'll want to use a tool like BWA or bowtie2 rather than TopHat for alignment. Aside from that, I'm in agreement with WhatsOEver.
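
Putting the advice in this thread together, the overall order of steps might be sketched like this in Python. The function `snp_pipeline_commands` and the file names are hypothetical. The command lines assume bowtie2 and a samtools 1.x install (older samtools releases use a different `sort` syntax), and the function only builds the commands rather than running them.

```python
from typing import List


def snp_pipeline_commands(reference_fasta: str, index_prefix: str,
                          fastq1: str, fastq2: str,
                          sample: str) -> List[List[str]]:
    """Return the align, sort, and pileup commands in the order they would run."""
    sam = sample + ".sam"
    bam = sample + ".sorted.bam"
    return [
        # Build the bowtie2 index for the reference (needed once per reference).
        ["bowtie2-build", reference_fasta, index_prefix],
        # Align the paired FASTQ files; the aligner writes the SAM file itself.
        ["bowtie2", "-x", index_prefix, "-1", fastq1, "-2", fastq2, "-S", sam],
        # Coordinate-sort the alignments into BAM.
        ["samtools", "sort", "-o", bam, sam],
        # Produce a pileup against the reference (written to standard output).
        ["samtools", "mpileup", "-f", reference_fasta, bam],
    ]


for command in snp_pipeline_commands("ref.fa", "ref_index",
                                     "reads_R1.fastq.gz", "reads_R2.fastq.gz",
                                     "sample1"):
    print(" ".join(command))
```

The point of the sketch is the ordering: FASTQ goes into the aligner, and SAM/BAM comes out of it, so there is no separate FASTQ-to-SAM conversion step before SNP calling.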
