  • 150 bps Read Length Issue

    I am running some experiments with Bowtie. Now I want to experiment with 150 bp reads, so I downloaded a dataset from here and converted it to FASTQ format. The FASTQ records look like this:

    @ERR103405.1 M10_151:1:2:12250:1321 length=302
    ATTTACTGCCTTGTGTCTCCAGTGCGCTGAAAATACCTTTATCTTGAAATAAGTTAACTAACTCTTGGATACCTTTAATTAATGCTGGGTTACCACCAGAAATTGTAACGTGGTTAAATAAATCGCCACCAATACGTTTTAATTCATCATAGAACAGCTGGATGTGATTATCGCTGTAGCTGGTGTGATTCTGCATTTACTTGGGATGGTAGTGCTAAAGGCGATATAAAACTCATGACCGCTGAAGAAATTTATGATGAATTAAAACGTATTGGTGGCGATTTATTTAACCACGTTACAAT
    +ERR103405.1 M10_151:1:2:12250:1321 length=302
    CCCFFFFFHHHHHHHIHJJJJJIIJJIJJIJJJIIGJJJJIIGIJJHIGIIJJIIIJIIJJIJEIJIJFIIIFJGHHGHHFFFFFFFEDCCACCDA?ABDDDDDDCDC@?<ABBBDDDDEDDDC<?B?@BDDDDDB>CC@C:>AADDCACDB@CFFFDDHHBFHEHIIIIIGJIHHEGHIIHE1C?D?GGGIIIIGIFI>BHHIJ@3CHBDGGICHGEHIIGHE>BEDEDE;ACCDDCCA?B=BBCDCCCC@@>>C@CDC>@DCDCDDD<<@?AC(2??BDBDBCDCDDCC::?881<?C>:
    NCBI describes the run as "DNA for paried end (150bp) sequencing on an illumina MiSeq", but here it looks like a 302 bp read. Can anybody explain why the record above says "length=302" when the page says it is a 150 bp read?

  • #2
    It's a paired end 151 cycle read



    • #3
      Originally posted by NextGenSeq
      It's a paired end 151 cycle read
      Thanks. But I want to give 150 bp reads as input to Bowtie, so what should I do? I searched for 150 bp and got this dataset as a result.



      • #4
        For technical reasons, the error rates are higher for the last base. Those can be removed with a variety of tools (e.g., Trimmomatic). I suggest you search the wiki.
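The trimming step mentioned above can be sketched in a few lines. This is a minimal illustration, not how Trimmomatic or any other tool implements it; the function name `trim_last_base` and the toy read are made up for the example.

```python
from typing import Tuple


def trim_last_base(seq: str, qual: str, n: int = 1) -> Tuple[str, str]:
    """Drop the final n bases of a read together with their quality scores.

    Hypothetical helper for illustration only.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if n < 0 or n > len(seq):
        raise ValueError("n must be between 0 and the read length")
    keep = len(seq) - n  # explicit length avoids the seq[:-0] pitfall when n == 0
    return seq[:keep], qual[:keep]


# Toy 5-base read whose last base has a low quality score ('#').
trimmed_seq, trimmed_qual = trim_last_base("ACGTA", "IIII#")
print(trimmed_seq, trimmed_qual)
```

Applied to a 151-base read, the same call leaves the 150 bases the library description refers to.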



        • #5
          Originally posted by HESmith
          For technical reasons, the error rates are higher for the last base. Those can be removed with a variety of tools (e.g., Trimmomatic). I suggest you search the wiki.
          Thanks. But isn't it possible to get an .sra file with 150 bp reads and feed it into Bowtie? Another point: here (http://www.ncbi.nlm.nih.gov/sra/SRX145461) it says "1 forward, 151 reverse". Can you tell me what that means?



          • #6
            Obtaining 150bp of high-quality sequence data requires 151 cycle sequencing (followed by trimming of the final low-quality base). Paired-end sequencing doubles the number of cycles: 2x151=302. SRA contains the raw (i.e., untrimmed) data.
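The 2x151=302 arithmetic suggests that a record labelled length=302 is simply the two mates stored back to back. A minimal sketch of splitting such a record, assuming both mates have the same length (151 here); `split_spot` is a hypothetical name, and in practice the SRA Toolkit's `fastq-dump --split-files` writes the mates to separate files at conversion time.

```python
from typing import Tuple

Read = Tuple[str, str]  # (sequence, quality)


def split_spot(seq: str, qual: str, read_len: int = 151) -> Tuple[Read, Read]:
    """Split a concatenated paired-end record into its two mates.

    Assumes the record holds exactly two reads of read_len bases each,
    i.e. len(seq) == 2 * read_len (2 x 151 = 302 in this thread).
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(seq) != 2 * read_len:
        raise ValueError("record length is not twice the read length")
    mate1 = (seq[:read_len], qual[:read_len])
    mate2 = (seq[read_len:], qual[read_len:])
    return mate1, mate2


# Toy record: two 3-base mates stored back to back.
mate1, mate2 = split_spot("ACGTTT", "IIIHHH", read_len=3)
print(mate1, mate2)
```

Each mate can then be trimmed from 151 to 150 bases before alignment.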



            • #7
              Originally posted by HESmith
              Obtaining 150bp of high-quality sequence data requires 151 cycle sequencing (followed by trimming of the final low-quality base). Paired-end sequencing doubles the number of cycles: 2x151=302. SRA contains the raw (i.e., untrimmed) data.
              Does a paired-end read (or "1 forward, 151 reverse") mean that the first end is taken from the DNA's forward strand and the second from its reverse strand? In other words, are they reverse complements? Sorry, I know very little about bioinformatics. Another point:

              "Obtaining 150bp of high-quality sequence data requires 151 cycle sequencing (followed by trimming of the final low-quality base)": does this mean the last base of each 151 bp read should be dropped by the tool?
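The thread does not answer the strand question directly. As general background, in standard Illumina paired-end sequencing the two mates are read from opposite ends of the same fragment and from opposite strands, so they are reverse complements of each other only where they overlap. The sketch below illustrates this with a made-up 10-base fragment and 4-base reads; `reverse_complement` is a hypothetical helper, not part of Bowtie or the SRA Toolkit.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Toy fragment (invented for this example), sequenced 4 bases from each end.
fragment = "ACGTTGCAAG"
read1 = fragment[:4]                      # read from the forward strand
read2 = reverse_complement(fragment)[:4]  # read from the opposite strand

# Reverse-complementing mate 2 recovers the far end of the fragment.
print(read1, read2, reverse_complement(read2) == fragment[-4:])
```

Paired-end aligners expect the mates in this opposing orientation by default, so mate 2 normally does not need to be reverse complemented by hand before alignment.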



              • #8
                The answers to your questions can be found by searching the forum.
