
  • BWA with multi read files

    Hi, I have never used BWA before, so I was wondering if anyone knows how I can use BWA to map my reads to my reference genome. My data is as follows:

    L001_R1.fasta
    L001_R2.fasta
    L002_R1.fasta
    L002_R2.fasta
    etc
    There are 12 files in total and, as you can see, they are paired reads.
    So does anyone have any idea?
    Any help would be much appreciated.

    Thanks,
    Tom

  • #2
    Also, I have indexed my reference genomes (93 genomes together in a single FASTA file) and been provided with 5 files:

    Combined_genomes_index_bwa.amb
    Combined_genomes_index_bwa.ann
    Combined_genomes_index_bwa.bwt
    Combined_genomes_index_bwa.pac
    Combined_genomes_index_bwa.sa

    How do I use these for my mapping? I want to use bwa mem because the read lengths range from 40 to 140, but the manual doesn't state how to input these index files:

    bwa mem [-aCHMpP] [-t nThreads] [-k minSeedLen] [-w bandWidth] [-d zDropoff] [-r seedSplitRatio] [-c maxOcc] [-A matchScore] [-B mmPenalty] [-O gapOpenPen] [-E gapExtPen] [-L clipPen] [-U unpairPen] [-R RGline] [-v verboseLevel] db.prefix reads.fq [mates.fq]

    Again, any help would be great.



    • #3
      From the manual:

      Code:
      bwa mem ref.fa read1.fq read2.fq > aln-pe.sam
      Just make sure your fasta file is in the same dir as your index files and there shouldn't be any problems.

      As to your first question, I presume the above command shows how to use paired-end (PE) data?



      • #4
        Use the base name of the index files as 'db.prefix', so you would use
        'Combined_genomes_index_bwa' as the name of the index (and provide the complete path to the index files if necessary).
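
As a minimal sketch of the point above: the index prefix is the part of the file name shared by the five index files. The helper name `check_bwa_index` is hypothetical (it is not part of BWA); it only confirms that all five files exist for a given prefix before you pass that prefix to bwa mem.

```shell
#!/bin/sh
# Hypothetical helper (not part of BWA): confirm that all five index files
# exist for a given prefix. Returns 0 if complete, 1 otherwise; any missing
# file is reported on stderr.
check_bwa_index() {
    prefix=$1
    status=0
    for ext in amb ann bwt pac sa; do
        if [ ! -e "$prefix.$ext" ]; then
            echo "missing: $prefix.$ext" >&2
            status=1
        fi
    done
    return "$status"
}

# Example usage (left commented out; it needs bwa installed and the real files):
#   check_bwa_index Combined_genomes_index_bwa &&
#       bwa mem Combined_genomes_index_bwa L001_R1.fasta L001_R2.fasta > L001.sam
```

If the index files live in another directory, the prefix would include that path, for example `/path/to/Combined_genomes_index_bwa` (an illustrative path, not one from this thread).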

        The online manual doesn't seem to specify what to do if your reads are in more than one file. Try listing all the R1 files separated by commas but no spaces, followed by a space and then the R2 files, again separated by commas but no spaces.

        Code:
         L1_R1.fastq,L2_R1.fastq,L3_R1.fastq... L1_R2.fastq,L2_R2.fastq,L3_R2.fastq...
        If that doesn't work, you could concatenate all your R1 files into one file and all your R2 files into another.
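
The fallback of combining the files could be sketched roughly as follows. The function name `merge_lanes` and the output prefix are hypothetical, and the sketch assumes every lane has both an R1 and an R2 file named as in the first post, so that both merged files end up in the same lane order and mates stay aligned.

```shell
#!/bin/sh
# Hypothetical helper: concatenate all lane files in a directory into one
# R1 file and one R2 file.
#   $1 = directory holding L*_R1.fasta / L*_R2.fasta
#   $2 = output prefix (should not itself match L*_R1.fasta in that directory)
merge_lanes() {
    dir=$1
    out=$2
    # The shell expands globs in sorted order, so R1 and R2 are merged
    # in the same lane order as long as each lane has both files.
    cat "$dir"/L*_R1.fasta > "${out}_R1.fasta"
    cat "$dir"/L*_R2.fasta > "${out}_R2.fasta"
}

# Example usage (commented out; the bwa step needs bwa installed):
#   merge_lanes . all
#   bwa mem Combined_genomes_index_bwa all_R1.fasta all_R2.fasta > all.sam
```

Merging loses the per-lane grouping, so this only fits if a single combined SAM file is what you want.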



        • #5
          bruce01, I have tried your method and it just didn't work at all: no error message, just a list of the commands. I will try the method mastal suggests now.
          Last edited by thh32; 02-26-2014, 05:45 AM.



          • #6
            Right, using mastal's method, this error occurred: [E::main_mem] fail to open file ' LIST OF ALL FILES'

            It appears that if I leave spaces between the files, it ignores the whole thing and prints a list of the commands; however, if I remove the spaces and use commas, it sees the list as one long file name, which it cannot locate.



            • #7
              OK, can you tell me the exact command you used, and whether you want to align all your FASTQ files at the same time (i.e. L001+L002+...+L00n) or individually (i.e. L001 makes a single SAM file)?
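
For the "individually" option, one possible approach is a loop that pairs each R1 file with its R2 mate and produces one SAM file per lane. The sketch below is a dry run: the hypothetical function `print_lane_commands` only prints the bwa mem commands so they can be checked before anything is executed. It assumes the file naming from the first post (`L001_R1.fasta`, `L001_R2.fasta`, and so on).

```shell
#!/bin/sh
# Hypothetical helper: print one bwa mem command per lane (dry run only,
# nothing is executed).
#   $1 = index prefix, e.g. Combined_genomes_index_bwa
#   $2 = directory holding the L*_R1.fasta / L*_R2.fasta files
print_lane_commands() {
    index=$1
    dir=$2
    for r1 in "$dir"/L*_R1.fasta; do
        # Skip the literal pattern if no file matched
        [ -e "$r1" ] || continue
        # Derive the mate file and the lane name from the R1 file name
        r2=${r1%_R1.fasta}_R2.fasta
        lane=$(basename "$r1" _R1.fasta)
        echo "bwa mem $index $r1 $r2 > $lane.sam"
    done
}

# Example usage (commented out):
#   print_lane_commands Combined_genomes_index_bwa .
```

Once the printed commands look right, they could be run by piping the output to `sh`, provided none of the paths contain spaces.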
