  • thh32
    Member
    • Feb 2014
    • 60

    BWA with multi read files

    Hi, I have never used BWA before, so I was wondering if anyone knows how I can use it to map my reads to my reference genome. My data are as follows:

    L001_R1.fasta
    L001_R2.fasta
    L002_R1.fasta
    L002_R2.fasta
    etc
    There are 12 files in total and as you can see they are paired reads.
    So does anyone have any idea?
    Any help would be much appreciated.

    Thanks,
    Tom
  • thh32
    Member
    • Feb 2014
    • 60

    #2
    Also, I have indexed my reference genomes (I have 93 genomes together in a single FASTA file) and was given 5 files:

    Combined_genomes_index_bwa.amb
    Combined_genomes_index_bwa.ann
    Combined_genomes_index_bwa.bwt
    Combined_genomes_index_bwa.pac
    Combined_genomes_index_bwa.sa

    How do I use these for my mapping? I want to use bwa mem because the read lengths range from 40 to 140, but the manual doesn't state how to input these index files:

    bwa mem [-aCHMpP] [-t nThreads] [-k minSeedLen] [-w bandWidth] [-d zDropoff] [-r seedSplitRatio] [-c maxOcc] [-A matchScore] [-B mmPenalty] [-O gapOpenPen] [-E gapExtPen] [-L clipPen] [-U unpairPen] [-R RGline] [-v verboseLevel] db.prefix reads.fq [mates.fq]

    Again, any help would be great.

    • bruce01
      Senior Member
      • Mar 2011
      • 160

      #3
      From the manual:

      Code:
      bwa mem ref.fa read1.fq read2.fq > aln-pe.sam
      Just make sure your fasta file is in the same dir as your index files and there shouldn't be any problems.

      As to your first question, I presume the above command shows how to use PE data?

      • mastal
        Senior Member
        • Mar 2009
        • 666

        #4
        Use the base name of the index files as 'db.prefix', so you would use
        'Combined_genomes_index_bwa' as the name of the index (and provide the complete path to the index files if necessary).
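As a minimal sketch of the point above, the helper below checks that the five index files listed in post #2 all exist for a given base name before that base name is passed to bwa mem as `db.prefix`. The function name `check_bwa_index` is hypothetical, and the output name `L001.sam` in the commented usage is only an illustration.

```shell
# Hypothetical helper: succeed only if every index file for the given base name exists.
# The five extensions are the ones listed in post #2 of this thread.
check_bwa_index() {
    prefix=$1
    for ext in amb ann bwt pac sa; do
        if [ ! -f "${prefix}.${ext}" ]; then
            echo "missing index file: ${prefix}.${ext}" >&2
            return 1
        fi
    done
}

# Usage (left commented out so the sketch runs without bwa installed):
# check_bwa_index Combined_genomes_index_bwa &&
#     bwa mem Combined_genomes_index_bwa L001_R1.fasta L001_R2.fasta > L001.sam
```

If the index files live elsewhere, the same base name with its path in front would be passed in both places.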

        The online manual doesn't seem to specify what to do if your reads are in more than 1 file. Try listing all the R1 files separated by commas but no spaces, followed by the R2 files, again separated by commas but no spaces.

        Code:
         L1_R1.fastq,L2_R1.fastq,L3_R1.fastq... L1_R2.fastq,L2_R2.fastq,L3_R2.fastq...
        If that doesn't work you could combine all your R1 files together and all your R2 files.
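The fallback of combining the files could look roughly like the sketch below. It assumes the naming pattern from the first post (`L001_R1.fasta`, `L001_R2.fasta`, and so on) and that every R1 file has a matching R2 file; the function name `combine_lanes` and the output prefix `combined` are hypothetical. The R2 name is derived from each R1 name so that both combined files keep the lanes, and therefore the mates, in the same order.

```shell
# Hypothetical helper: concatenate all lanes into <out>_R1.fasta and <out>_R2.fasta.
# Assumes files named <lane>_R1.fasta / <lane>_R2.fasta in the current directory,
# and an output prefix that does not itself start with "L".
combine_lanes() {
    out_prefix=$1
    : > "${out_prefix}_R1.fasta"   # create or empty the output files
    : > "${out_prefix}_R2.fasta"
    for r1 in L*_R1.fasta; do
        r2="${r1%_R1.fasta}_R2.fasta"
        if [ ! -f "$r1" ] || [ ! -f "$r2" ]; then
            echo "missing read file for pair: $r1 / $r2" >&2
            return 1
        fi
        cat "$r1" >> "${out_prefix}_R1.fasta"
        cat "$r2" >> "${out_prefix}_R2.fasta"
    done
}

# Usage (left commented out so the sketch runs without bwa installed):
# combine_lanes combined &&
#     bwa mem Combined_genomes_index_bwa combined_R1.fasta combined_R2.fasta > combined.sam
```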

        • thh32
          Member
          • Feb 2014
          • 60

          #5
          bruce01, I tried your method and it didn't work at all: there was no error message, just a list of the commands. I will try the method you suggest now, mastal.
          Last edited by thh32; 02-26-2014, 05:45 AM.

          • thh32
            Member
            • Feb 2014
            • 60

            #6
            Right, using mastal's method this error occurred: [E::main_mem] fail to open file ' LIST OF ALL FILES'

            It appears that if I leave spaces between the files, it ignores the whole thing and prints a list of the commands, whereas if I remove the spaces and put in commas, it sees the list as one long file name, which it cannot locate.

            • bruce01
              Senior Member
              • Mar 2011
              • 160

              #7
              OK, can you tell me the exact command you used, and whether you want to align all your read files at the same time (i.e. L001+L002+...+L00n), or individually (i.e. each lane such as L001 makes its own SAM file)?
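For the "individually" option, one possible sketch is a loop that builds one bwa mem command per lane. It only prints the commands (a dry run), so they can be inspected before being piped to `sh`. The function name `per_lane_commands` is hypothetical, and the file naming is assumed from the first post.

```shell
# Hypothetical helper: print one "bwa mem" command per lane, one SAM file each.
# Assumes files named <lane>_R1.fasta / <lane>_R2.fasta in the current directory.
per_lane_commands() {
    index=$1
    for r1 in L*_R1.fasta; do
        lane="${r1%_R1.fasta}"
        printf 'bwa mem %s %s %s > %s.sam\n' \
            "$index" "$r1" "${lane}_R2.fasta" "$lane"
    done
}

# Usage: review the commands first, then run them.
# per_lane_commands Combined_genomes_index_bwa
# per_lane_commands Combined_genomes_index_bwa | sh
```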
