  • biouser
    Junior Member
    • Aug 2012
    • 7

    sequence alignment

    Hi all,
    I want to align some read data in FASTA format using the Bowtie short-read aligner, but before I can align the reads I need a reference sequence. I'm new to bioinformatics; I searched for information about reference sequences and didn't find anything useful on why one is needed for read alignment.
    Please help me understand that, and how I can download the required reference sequences.

    Thank you all. Justin
    Last edited by biouser; 08-14-2012, 11:46 AM.
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    You need something to align against; that is the purpose of the reference sequence. What organism are your sequencing reads from? That will pretty much answer the question of what to download.
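
To make "something to align against" concrete, here is a toy sketch. It is a brute-force exact-match search, not how Bowtie works internally (Bowtie searches a prebuilt index and tolerates mismatches), and the reference string and reads are made up for illustration.

```python
from typing import List


def find_exact_hits(reference: str, read: str) -> List[int]:
    """Return every 0-based position where `read` occurs exactly in `reference`."""
    hits = []
    start = reference.find(read)
    while start != -1:
        hits.append(start)
        # Continue one base further so overlapping occurrences are found too.
        start = reference.find(read, start + 1)
    return hits


# Made-up toy data, for illustration only.
reference = "ACGTTGCAACGTAGGC"
reads = ["ACGT", "TGCA", "GGGG"]

for read in reads:
    print(read, find_exact_hits(reference, read))
```

A read with no hits (here `GGGG`) is the toy analogue of a read that "failed to align": without a reference there would be nothing to search, and with the wrong or an incomplete reference most reads come back empty.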

    • biouser
      Junior Member
      • Aug 2012
      • 7

      #3
      I have a FASTA file containing 15 million reads, which are 454 sequences from Human HapMap, downloaded from a genomic paired-end library at NCBI.

      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
        So, you have a bunch of reads from a human and you want to know where they map. For that you would need a reference human genome sequence. You could use the one from NCBI or the 1000 Genomes Project. There are probably others; I actually don't know off-hand whether the NCBI reference differs from that of the 1000 Genomes Project, as I don't do any human sequencing.

        • biouser
          Junior Member
          • Aug 2012
          • 7

          #5
          On the NCBI FTP site there are two kinds of files: some are in .fa format and others in .rm.out.
          Which one is used for the reference sequence?
          I got the following output from Bowtie:

          # reads processed: 15281579
          # reads with at least one reported alignment: 610764 (4.00%)
          # reads that failed to align: 14670815 (96.00%)
          Reported 610764 alignments to 1 output stream(s)

          What does this report mean? Does it mean that 4% of the reads belong to chromosome 2, which I used as the reference sequence?
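
The percentages in the report are simply the read counts divided by the number of reads processed. A minimal sketch that re-derives them, assuming the summary lines always have the wording pasted above (the function name `parse_bowtie_summary` is hypothetical, not part of Bowtie):

```python
import re


def parse_bowtie_summary(text: str) -> dict:
    """Extract the three read counts from a Bowtie-style summary block."""
    patterns = {
        "processed": r"# reads processed: (\d+)",
        "aligned": r"# reads with at least one reported alignment: (\d+)",
        "failed": r"# reads that failed to align: (\d+)",
    }
    counts = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"summary line for {key!r} not found")
        counts[key] = int(match.group(1))
    return counts


# The report exactly as posted in this thread.
summary = """\
# reads processed: 15281579
# reads with at least one reported alignment: 610764 (4.00%)
# reads that failed to align: 14670815 (96.00%)
Reported 610764 alignments to 1 output stream(s)
"""

counts = parse_bowtie_summary(summary)
aligned_pct = 100 * counts["aligned"] / counts["processed"]
failed_pct = 100 * counts["failed"] / counts["processed"]
print(f"aligned: {aligned_pct:.2f}%  failed: {failed_pct:.2f}%")
# prints: aligned: 4.00%  failed: 96.00%
```

The aligned and failed counts add up to the number of reads processed, so the report describes how many reads found a place in whatever reference was indexed (chromosome 2 in this run), not where the other reads belong.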

          • dpryan
            Devon Ryan
            • Jul 2011
            • 3478

            #6
            You'll want the .fa (FASTA format) files. The .rm.out files are from RepeatMasker.

            • biouser
              Junior Member
              • Aug 2012
              • 7

              #7
              Thank you, dpryan.
              And what about my second question, the Bowtie report?

              • dpryan
                Devon Ryan
                • Jul 2011
                • 3478

                #8
                Originally posted by biouser:
                Thank you dpryan ,
                and what about second question? The Bowtie report?
                Ah, I missed that, mea culpa. It really just means that only 4% of the reads aligned. The remainder may not have aligned because (1) they didn't come from chromosome 2, (2) you didn't quality trim prior to alignment, so some reads couldn't align, or (3) there is adapter contamination that wasn't trimmed and that caused misalignment. For your real run, I would use the "cat" command to concatenate the various chromosomes into a single file, which would then be indexed and mapped against. Since you're using Bowtie, you might be able to download prebuilt indexes from the Bowtie website. That'll save you a bit of time!

                • biouser
                  Junior Member
                  • Aug 2012
                  • 7

                  #9
                  Originally posted by dpryan:
                  For your real run, I would use the "cat" command to concatenate the various chromosomes into a single file, which would then be indexed and mapped against. Since you're using Bowtie, you might be able to download prebuilt indexes from the Bowtie website. That'll save you a bit of time!
                  No problem. Actually, building an index of the reference sequence took only 3 minutes, while the alignment against it took several hours.
                  But is "cat" one of Bowtie's commands, or can this be done with other software?

                  • dpryan
                    Devon Ryan
                    • Jul 2011
                    • 3478

                    #10
                    Ah, I assumed that you're using Linux or a Mac, in which case cat is a standard shell program. If you're using Windows then I wouldn't have a clue; presumably there's something similar.
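
For anyone without `cat`, the same concatenation can be sketched in Python, which behaves the same on Windows, Linux, and macOS. This is only an illustration: the function name `concatenate_fasta` and the file names `chr1.fa`, `chr2.fa`, and `genome.fa` are hypothetical, and the demo writes tiny made-up files to a temporary directory rather than real chromosomes. Unlike plain `cat`, the sketch also adds a newline when an input file lacks a final one, so the next `>` header does not end up glued to the previous sequence line.

```python
import os
import shutil
import tempfile


def concatenate_fasta(inputs, output):
    """Append each input file to `output` in order, like `cat a.fa b.fa > out.fa`."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
                # If the file did not end with a newline, add one so the
                # next file's ">" header starts on its own line.
                if src.tell() > 0:
                    src.seek(-1, os.SEEK_END)
                    if src.read(1) != b"\n":
                        out.write(b"\n")


# Demo with tiny made-up files (hypothetical names, not real chromosomes).
with tempfile.TemporaryDirectory() as tmp:
    chr1 = os.path.join(tmp, "chr1.fa")
    chr2 = os.path.join(tmp, "chr2.fa")
    genome = os.path.join(tmp, "genome.fa")
    with open(chr1, "wb") as fh:
        fh.write(b">chr1\nACGT")  # deliberately no trailing newline
    with open(chr2, "wb") as fh:
        fh.write(b">chr2\nTTGA\n")
    concatenate_fasta([chr1, chr2], genome)
    with open(genome, "rb") as fh:
        combined = fh.read().decode("ascii")

print(combined)
```

The combined file would then be the input for index building. As far as I recall, the Bowtie manual also describes passing `bowtie-build` a comma-separated list of FASTA files, which may make the concatenation step unnecessary; check the manual for your version before relying on that.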

                    • biouser
                      Junior Member
                      • Aug 2012
                      • 7

                      #11
                      Yes! I forgot about that command. I'll certainly use it.
                      You helped me a lot, dpryan.
