  • Only about 20% of reads are mapped using bfast

    Hi all,

    I used bfast+bwa. For the F3 reads, only about 20% of the reads are mapped by bfast match. Is this number good enough?

    Below is the output from bfast match:
    ************************************************************
    Printing Program Parameters:
    programMode: [ExecuteProgram]
    fastaFileName: /share/data9/genomes/human_all.fasta
    mainIndexes [Auto-recognizing]
    secondaryIndexes [Not Using]
    readsFileName: Pla0000325656_1_PE_HS26621_1_F3_.15.fastq
    offsets: [Using All]
    loadAllIndexes: [Not Using]
    compression: [Not Using]
    space: [Color Space]
    startReadNum: 1
    endReadNum: 2147483647
    keySize: [Not Using]
    maxKeyMatches: 8
    maxNumMatches: 384
    whichStrand: [Both Strands]
    numThreads: 8
    queueLength: 250000
    tmpDir: /share/data11/solid/tmp/
    timing: [Using]
    ************************************************************
    Searching for main indexes...
    Found 4 index (4 total files).
    Not using secondary indexes.
    ************************************************************
    Reading in reference genome from /share/data9/genomes/human_all.fasta.cs.brg.
    In total read 25 contigs for a total of 3080436051 bases
    ************************************************************
    Reading Pla0000325656_1_PE_HS26621_1_F3_.15.fastq into a temp file.
    Will process 1000000 reads.
    ************************************************************
    Searching index file 1/4 (index #1, bin #1)...
    Reading index from /share/data9/genomes/human_all.fasta.cs.1.1.bif.
    Read index from /share/data9/genomes/human_all.fasta.cs.1.1.bif.
    Reads processed: 1000000
    Cleaning up index.
    Searching index file 1/4 (index #1, bin #1) complete...
    Found 185475 matches.
    ************************************************************
    Searching index file 2/4 (index #2, bin #1)...
    Reading index from /share/data9/genomes/human_all.fasta.cs.2.1.bif.
    Read index from /share/data9/genomes/human_all.fasta.cs.2.1.bif.
    Reads processed: 1000000
    Cleaning up index.
    Searching index file 2/4 (index #2, bin #1) complete...
    Found 230886 matches.

    ...
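The "about 20%" figure can be read off the log above by dividing each "Found N matches" count by the "Will process N reads" count. Below is a minimal Python sketch of that arithmetic. It assumes the log lines have exactly the wording shown in this post and that each "Found ... matches" count is a number of reads with at least one hit for that index file (the log itself does not say so). The function name `match_rates` and the variable `sample_log` are made up for illustration.

```python
import re

def match_rates(log_text):
    """Return {index_file_number: matched_fraction} parsed from bfast match log text."""
    total = None      # number of reads, from "Will process N reads."
    current = None    # index file currently being reported
    rates = {}
    for raw in log_text.splitlines():
        line = raw.strip()
        m = re.match(r"Will process (\d+) reads\.", line)
        if m:
            total = int(m.group(1))
            continue
        m = re.match(r"Searching index file (\d+)/\d+", line)
        if m:
            current = int(m.group(1))
            continue
        m = re.match(r"Found (\d+) matches\.", line)
        if m and current is not None and total:
            rates[current] = int(m.group(1)) / total
            current = None
    return rates

# Lines copied from the log in this post (only the ones the parser needs).
sample_log = """
Will process 1000000 reads.
Searching index file 1/4 (index #1, bin #1) complete...
Found 185475 matches.
Searching index file 2/4 (index #2, bin #1) complete...
Found 230886 matches.
"""

rates = match_rates(sample_log)
for idx, frac in sorted(rates.items()):
    print(f"index file {idx}: {frac:.1%}")
# index file 1: 18.5%
# index file 2: 23.1%
```

These are per-index figures. The overall fraction of reads with a candidate location depends on how much the hits from the different indexes overlap, which this log excerpt does not show.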

  • #2
    Good enough for what? We need to know your experimental design. E.g., transcriptome project? whole genome resequencing? environmental sampling?

    Depending on the project I've had people who were ecstatic that we obtained 3% mapping and people who were not happy with 75% mapping.

    If I was to make a wild guess, since you are working with human and SOLiD data, I would say that 20% is too low. But we do need to know what you are trying to accomplish before anything definitive can be said.



    • #3
      Thanks. The reads are from human genome sequencing. I checked the data and found that the poor mapping rate was caused by its high error rate. With better-quality data, the mapping rate is more than 65%.




      • #4
        Why are you using only 4 indexes instead of the recommended 10? If your bad colors are periodic (e.g., every 5th color is missing), you may want to design a new index that compensates for this to get decent mapping rates.
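One way to check whether bad colors are periodic is to count, for each position in the read, how often the color call is missing. The Python sketch below does that. It assumes a four-line-per-record color-space FASTQ, a leading adaptor base (e.g. `T`) before the colors, and `.` as the symbol for a missing color call; verify these against your own file. The names `fastq_sequences` and `missing_color_profile` are made up for illustration and are not part of bfast.

```python
from collections import Counter

def fastq_sequences(path):
    """Yield the sequence line of each record, assuming strict 4-line FASTQ records."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:
                yield line.strip()

def missing_color_profile(sequences):
    """Return {position: fraction of reads with '.' at that color position}."""
    missing = Counter()
    n_reads = 0
    for seq in sequences:
        seq = seq.strip()
        if not seq:
            continue
        n_reads += 1
        colors = seq[1:]  # drop the leading adaptor base (assumption)
        for pos, color in enumerate(colors, start=1):
            if color == ".":
                missing[pos] += 1
    if n_reads == 0:
        return {}
    return {pos: missing[pos] / n_reads for pos in sorted(missing)}

# Toy reads (made up) in which every 5th color is missing.
toy_reads = ["T0120.3012.", "T3321.0012."]
profile = missing_color_profile(toy_reads)
print(profile)
# {5: 1.0, 10: 1.0}
```

For a real file, something like `missing_color_profile(fastq_sequences("reads.fastq"))` would give the same kind of profile (the file name here is a placeholder). Peaks at regularly spaced positions would suggest the periodic pattern described above.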



        • #5
          Originally posted by westerman
          Good enough for what? We need to know your experimental design. E.g., transcriptome project? whole genome resequencing? environmental sampling?

          Depending on the project I've had people who were ecstatic that we obtained 3% mapping and people who were not happy with 75% mapping.

          If I was to make a wild guess, since you are working with human and SOLiD data, I would say that 20% is too low. But we do need to know what you are trying to accomplish before anything definitive can be said.

          You couldn't have said it better. There is a huge disparity between biologists' 'ideals' and the realities of mapping throughput.
