samtools merge

  • samtools merge

    Hello everybody,

    I am running a bowtie alignment for Drosophila.
    After bowtie has finished, I pipe the unmapped reads to tophat to see whether I can map some more reads onto the same reference genome.

    I then convert the SAM output from bowtie into BAM; tophat produces a BAM file automatically.
    After both runs have finished, I would like to combine the two BAM files with the samtools merge command:

    Code:
    samtools merge -h dilptotal.sam dilptotal_2.bam dilptotal_bowtie.bam dilp_tophat.bam
    but I keep getting this error message:
    Code:
    [bam_merge_core] different target sequence name: 'YHet' != '2L' in file 'dilp_tophat.bam'
    I don't exactly understand what this error means.

    I used the same reference genome for both runs. Both contain the chromosomes '2L' and 'YHet'.
    YHet is the heterochromatin part of the Y chromosome. It occurs 4 times in the sorted bowtie BAM file but over 4500 times in the sorted tophat BAM file.
    I have many millions of '2L' reads in both files.

    Why does it have this problem? Is it because my tophat output file has no header listing the chromosomes (@SQ)?

    Can I set tophat to write a header in the SAM or BAM files?

    Thanks for any advice,

    Assa

  • #2
    Make sure that the SQ lines in the header are the same (use samtools view -H). You will also need to sort them before merging (samtools sort).
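
One way to act on this advice is to dump both headers and compare their @SQ lines, in order, before merging. The sketch below assumes the headers were already written to text files with samtools view -H; the helper name same_sq and the header file names are hypothetical and not part of samtools.

```shell
# Hypothetical helper (not part of samtools): succeeds only if two header
# dumps contain identical @SQ lines in the same order.
# Two dumps without any @SQ lines also compare as equal.
# The dumps are assumed to come from, e.g.:
#   samtools view -H dilptotal_bowtie.bam > bowtie.header.txt
#   samtools view -H dilp_tophat.bam      > tophat.header.txt
same_sq() {
    sq_a=$(grep '^@SQ' "$1")
    sq_b=$(grep '^@SQ' "$2")
    [ "$sq_a" = "$sq_b" ]
}

# Example call (file names are assumptions):
#   same_sq bowtie.header.txt tophat.header.txt && echo "@SQ lines match"
```

The comparison is order-sensitive on purpose: the error message in the first post names two different sequences at the same position in the reference list, which suggests that order matters to the merge.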


    • #3
      Originally posted by nilshomer:
      Make sure that the SQ lines in the header are the same (use samtools view -H). You will also need to sort them before merging (samtools sort).
      This is exactly my problem. tophat produces no header in the bam file.

      Can I change the setting so that tophat will create a header?
      Is there a header in the (temporary) sam files from tophat?

      Is it enough just to copy paste the header from the bam file from bowtie into the one from tophat?
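
Pasting a header onto a BAM file only helps if the reference order stored inside that file already matches it, because BAM records refer to their reference sequence by index rather than by name. A more cautious route, sketched below, goes through SAM text, where every alignment line carries its reference name. The helper name with_header and all intermediate file names are hypothetical, and the samtools steps in the comments are an assumed way of combining the pieces, not something confirmed in this thread.

```shell
# Hypothetical helper: print the header lines of SAM file $1, followed by
# the alignment lines of SAM file $2 (any header in $2 is dropped).
with_header() {
    grep '^@' "$1"
    grep -v '^@' "$2"
}

# Assumed usage, not taken from the thread:
#   samtools view -H dilptotal_bowtie.bam > bowtie.header.sam
#   samtools view dilp_tophat.bam         > tophat.noheader.sam
#   with_header bowtie.header.sam tophat.noheader.sam > tophat.fixed.sam
# tophat.fixed.sam would then be converted back to BAM and sorted again
# before merging, since the reference order may have changed.
```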


      • #4
        Originally posted by frymor:
        This is exactly my problem. tophat produces no header in the bam file.

        Can I change the setting so that tophat will create a header?
        Is there a header in the (temporary) sam files from tophat?

        Is it enough just to copy paste the header from the bam file from bowtie into the one from tophat?
        I'd love to know the answers to this too!

        cheers


        • #5
          I have a similar problem:

          I downloaded SAM files from a recently published study.
          Each SAM file contains the alignments of the reads to a single chromosome (hg19).
          I want to merge the alignments into one file.
          The header of every SAM file contains only the @SQ line of its own chromosome.

          For example:
          in chrY.sam
          @SQ SN:chrY LN:59373566
          in chrM.sam
          @SQ SN:chrM LN:16571

          I used:
          samtools view -T /data/pipeline_in/Genomes/Human_GRCh37/all.fa -Sb chrY.sam | samtools sort - chrY.sam.sorted
          samtools view -T /data/pipeline_in/Genomes/Human_GRCh37/all.fa -Sb chrM.sam | samtools sort - chrM.sam.sorted

          Then in order to merge them:
          samtools merge out chrM.sam.sorted.bam chrY.sam.sorted.bam

          I got this error:
          [bam_merge_core] different target sequence name: 'chrM' != 'chrY' in file 'chrY.sam.sorted.bam'

          What do I need to do?
          From searching the net I got some clues that this error is connected to the header.
          Do I need to replace the headers of the original SAM files?
          Where can I find a proper example of a header?

          Thanks in advance,
          Oz Solomon
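
A header that is common to all per-chromosome files can be derived from the reference index instead of being written by hand. The sketch below assumes a FASTA index (.fai, as written by samtools faidx) whose first two tab-separated columns are the sequence name and its length; the helper name sq_from_fai and the output file name are hypothetical.

```shell
# Hypothetical helper: turn a FASTA index (.fai) into @SQ header lines.
# Column 1 of the .fai is the sequence name, column 2 its length.
sq_from_fai() {
    awk -F '\t' '{ printf "@SQ\tSN:%s\tLN:%s\n", $1, $2 }' "$1"
}

# Assumed usage (paths are illustrative):
#   samtools faidx all.fa                     # writes all.fa.fai
#   sq_from_fai all.fa.fai > common.header.sam
# Each per-chromosome SAM file would then get these same @SQ lines in
# place of its single-chromosome header before conversion, sorting and merging.
```

With identical @SQ lines in every input, the reference lists that samtools merge compares should agree, which is the condition the error message above complains about.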
