  • converting paired-end (PE) bam file to single-end (SE) fastq

    Hi:
    While working with COAD TCGA BAM files, I find it very annoying to find the PE reads. These files are mashed up and not consistent.
    For example:
    1. Read lengths are not consistent: some reads are 34 bases long, others 76.
    2. Many reads are missing their mate.

    I want to identify novel splicing differences; however, the TCGA BAM files are mapped to known transcripts (known exon pairings from a GTF of known isoforms), which limits the discovery of novel isoforms.

    I decided to convert the BAM files to fastq and realign to the full genome.

    While doing this, because so many mates are missing from the BAM files, I converted them to single-end fastq.

    Any ideas on whether converting a paired-end BAM to single-end fastq poses any problems in principle?

    thanks
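
To put numbers on the two problems listed above (mixed read lengths and missing mates), something like the following could be run over SAM-format text, for example the output of `samtools view`. This is only a minimal sketch: the function name `summarize_sam` and the demo records are made up for illustration, and it assumes standard SAM columns and FLAG bits.

```python
from collections import Counter, defaultdict

# SAM FLAG bits used below (from the SAM specification).
PAIRED, READ1, READ2 = 0x1, 0x40, 0x80
SECONDARY, SUPPLEMENTARY = 0x100, 0x800


def summarize_sam(lines):
    """Tally read lengths and list paired read names that lack one mate.

    `lines` is any iterable of SAM text lines; header lines are skipped.
    """
    lengths = Counter()
    mates_seen = defaultdict(set)
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or header
        fields = line.rstrip("\n").split("\t")
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read once, via its primary record
        if seq != "*":
            lengths[len(seq)] += 1
        if flag & PAIRED:
            if flag & READ1:
                mates_seen[name].add("R1")
            if flag & READ2:
                mates_seen[name].add("R2")
    # A paired read name seen with only one mate is an orphan.
    orphans = sorted(n for n, m in mates_seen.items() if len(m) < 2)
    return lengths, orphans


# Hypothetical demo records: r1 has both mates, r2 has only its first mate.
demo = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
    "r1\t147\tchr1\t200\t60\t4M\t=\t100\t-104\tACGT\tIIII",
    "r2\t73\tchr1\t300\t60\t6M\t*\t0\t0\tACGTAC\tIIIIII",
]
lengths, orphans = summarize_sam(demo)
print(dict(lengths), orphans)
```

On a real file, the length tally shows how mixed the 34/76 populations are, and the orphan list shows how many reads would have to be treated as single-end in any case.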

  • #2
    Yes, you can expect your mapping efficiency to decrease a bit, since with paired-end reads one mate can act as an anchor to rescue the other. Further, it's much easier to find isoforms with paired-end reads, since you're then not relying solely on alignments that span a splice junction.

    • #3
      Yes, that's a disadvantage, I agree.

      Unfortunately, the BAM file does not have enough PE reads.

      When I used bamtofastq to produce PE fastq files, interestingly, I obtained 0 fastq reads.

      • #4
        Adrian,

        You can try running repair.sh to split the file into paired and unpaired reads, and then map twice, once for the paired and once for the unpaired, and then merge the bam files. That will allow maximal use of the available information.
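
The idea of splitting into paired and unpaired reads can be sketched in a few lines of Python. This is not how repair.sh is implemented, just a small in-memory illustration; the function name `split_pairs` is hypothetical, and it assumes mates are named with `/1` and `/2` suffixes.

```python
def split_pairs(records):
    """Split (name, seq, qual) records into mate pairs and singletons.

    Assumes mate names end in "/1" or "/2"; anything else is a singleton.
    """
    pending = {}  # base name -> record still waiting for its mate
    pairs, singles = [], []
    for rec in records:
        name = rec[0]
        base, _, mate = name.rpartition("/")
        if not base or mate not in ("1", "2"):
            singles.append(rec)
            continue
        other = pending.get(base)
        if other is not None and other[0] != name:
            # Found the opposite mate: emit the pair with /1 first.
            del pending[base]
            r1, r2 = sorted([other, rec], key=lambda r: r[0])
            pairs.append((r1, r2))
        else:
            if other is not None:
                singles.append(other)  # same mate seen twice
            pending[base] = rec
    singles.extend(pending.values())  # reads whose mate never appeared
    return pairs, singles


# Hypothetical demo: only read "a" has both mates.
records = [
    ("a/1", "ACGT", "IIII"),
    ("b/1", "GGCC", "IIII"),
    ("a/2", "TTAA", "IIII"),
    ("c/2", "CCGG", "IIII"),
]
pairs, singles = split_pairs(records)
print(len(pairs), [r[0] for r in singles])
```

The pairs would then go to the paired-end mapping run and the singletons to the single-end run, with the two resulting bam files merged afterwards.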
