  • Bowtie1's --12 option in Bowtie2

    Hi all,

    I would like to ask whether Bowtie2 can accept reads in the "1-read-per-line" tab-delimited format (the input selected with --12 in Bowtie1)?

    Thanks for the reply.

  • #2
    [FWIW, if it's in a single line, then it's not a fastq formatted file]

    I can't see any '--12' option in the help output of bowtie (could you point to where the bowtie manual talks about this option?), but note that both bowtie and bowtie2 support the 'raw' single-line file input type, and bowtie2 also accepts Illumina's qseq format:
    Code:
    Bowtie(v1):
    Input:
      -q                 query input files are FASTQ .fq/.fastq (default)
      -f                 query input files are (multi-)FASTA .fa/.mfa
      -r                 query input files are raw one-sequence-per-line
    Bowtie2:
    Input:
      -q                 query input files are FASTQ .fq/.fastq (default)
      --qseq             query input files are in Illumina's qseq format
      -f                 query input files are (multi-)FASTA .fa/.mfa
      -r                 query input files are raw one-sequence-per-line
    The raw format mentioned in that help doesn't seem to be what you want. The bowtie manual states "one sequence per line, without quality values or names. All quality values are assumed to be 40 on the Phred quality scale." No tabs to be found.
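To make the "assumed to be 40" part concrete, here is a minimal Python sketch of what a raw one-sequence-per-line file would look like if written out as FASTQ. The function name `raw_to_fastq`, the generated read names (`read1`, `read2`, ...) and the Phred+33 encoding (where quality 40 is the character `I`) are assumptions made for illustration; this is not something bowtie itself does.

```python
def raw_to_fastq(sequences, prefix="read"):
    """Turn raw sequences (one per line) into FASTQ text.

    Assumptions (not from the bowtie manual): read names are generated
    as prefix + counter, and qualities use the Phred+33 encoding.
    """
    records = []
    counter = 0
    for line in sequences:
        seq = line.strip()
        if not seq:
            continue  # skip blank lines
        counter += 1
        # Phred 40 in Phred+33 encoding is chr(40 + 33), i.e. 'I'
        quality = chr(40 + 33) * len(seq)
        records.append(f"@{prefix}{counter}\n{seq}\n+\n{quality}\n")
    return "".join(records)
```

Every base gets the same quality character, which is why the raw format carries no tabs, names, or quality strings of its own.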


    • #3
      Thanks gringer!

      Sorry for my unclear description.
      --12 is not an option flag as such. As mentioned in the Bowtie1 manual, it is one of the choices for read file input.

      Usage:
      bowtie [options]* <ebwt> {-1 <m1> -2 <m2> | --12 <r> | <s>} [<hit>]

      <r>
      Comma-separated list of files containing a mix of unpaired and paired-end reads in Tab-delimited format. Tab-delimited format is a 1-read-per-line format where unpaired reads consist of a read name, sequence and quality string each separated by tabs. A paired-end read consists of a read name, sequence of the #1 mate, quality values of the #1 mate, sequence of the #2 mate, and quality values of the #2 mate separated by tabs.


      • #4
        Huh, not sure how I missed that. I'm guessing based on a search through the bowtie2 manual for the word 'tab' that this option doesn't exist in bowtie2.

        FWIW, you should be able to convert a single-end tab-separated file to fastq with a quick awk script:
        Code:
        $ cat test.seq 
        Read1   ACATCAGGT       AFFABABAA
        Read2   ATTACAGAA       DEADBEEFA
        $ awk -F '\t' '{print "@" $1 "\n" $2 "\n+\n" $3}' test.seq
        @Read1
        ACATCAGGT
        +
        AFFABABAA
        @Read2
        ATTACAGAA
        +
        DEADBEEFA
        $ awk -F '\t' '{print "@" $1 "\n" $2 "\n+\n" $3}' test.seq > test.fastq
        Mixed reads would be a bit more complicated, and are probably better output to separate files, so I'd fall back to writing a Perl (or Python) script to do that because I'm not so comfortable with awk.
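For the mixed case, a rough Python sketch along those lines might look like the following. It relies only on the field counts given in the Bowtie1 manual's description (3 tab-separated fields for an unpaired read, 5 for a pair). The function names, the `/1` and `/2` name suffixes, and the three-way split into unpaired, mate 1 and mate 2 output are illustrative assumptions, not part of bowtie.

```python
def fastq_record(name, seq, qual):
    """Format one read as a four-line FASTQ record."""
    return f"@{name}\n{seq}\n+\n{qual}\n"


def split_tab_reads(lines):
    """Split tab-delimited reads into (unpaired, mate1, mate2) FASTQ text.

    3 fields: name, sequence, quality (unpaired read)
    5 fields: name, seq #1, qual #1, seq #2, qual #2 (paired-end read)
    The "/1" and "/2" suffixes are a naming assumption for this sketch.
    """
    unpaired, mate1, mate2 = [], [], []
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore empty lines
        fields = line.split("\t")
        if len(fields) == 3:
            name, seq, qual = fields
            unpaired.append(fastq_record(name, seq, qual))
        elif len(fields) == 5:
            name, seq1, qual1, seq2, qual2 = fields
            mate1.append(fastq_record(name + "/1", seq1, qual1))
            mate2.append(fastq_record(name + "/2", seq2, qual2))
        else:
            raise ValueError(
                f"line {line_no}: expected 3 or 5 tab-separated fields, "
                f"got {len(fields)}"
            )
    return "".join(unpaired), "".join(mate1), "".join(mate2)
```

Opening the input with `open("test.seq")` and passing the file object to `split_tab_reads` would give three strings that can each be written to its own file; the mate 1 and mate 2 outputs stay in the same order, which paired-end aligners generally expect.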
        Last edited by gringer; 10-20-2013, 08:01 PM.


        • #5
          Thanks, gringer, for the information and advice.
