
**Apexy** · 11-16-2012, 05:42 AM

Hi Meher,
That may need a small script. Please paste the first sequence from both files so that we can see how best to help you.

**meher** · 11-16-2012, 05:54 AM

Originally posted by Apexy

Hi Meher,
That may need a small script. Please paste the first sequence from both files so that we can see how best to help you.

Hi,

I can provide a few lines, but before that I have a question.

Does it make any difference to the final alignment result if we tag the reads and perform a single alignment to generate one BAM file, compared with aligning each sample independently and merging the two BAM files into a single BAM file?

Would there be any bias in the alignment if we choose one method over the other?

**kmcarr** · 11-16-2012, 05:59 AM

A better way is to perform the alignments separately, assigning unique read group IDs (some aligners, e.g. bowtie, will add read group IDs during alignment), and then merge the BAM files before proceeding to variant detection. Pay attention to the header attached to the merged output: every read group ID present in the file must be referenced in the header. samtools merge does not handle this automatically; you have to supply a properly formatted header. I'm not sure whether Picard MergeSamFiles properly merges the header or not.
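The header check described above can be sketched in a few lines of Python. This is only an illustration working on SAM text (for example the output of `samtools view -h`); the function names and the example records are made up for this sketch and are not part of samtools or Picard.

```python
def declared_read_groups(header_lines):
    """Return the read group IDs declared on @RG header lines."""
    ids = set()
    for line in header_lines:
        if line.startswith("@RG"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("ID:"):
                    ids.add(field[3:])
    return ids


def used_read_groups(record_lines):
    """Return the read group IDs referenced by RG:Z: tags on alignment records."""
    ids = set()
    for line in record_lines:
        # Optional tags start at the 12th tab-separated column of a SAM record.
        for tag in line.rstrip("\n").split("\t")[11:]:
            if tag.startswith("RG:Z:"):
                ids.add(tag[5:])
    return ids


def missing_read_groups(sam_lines):
    """Return read group IDs used by reads but absent from the header."""
    header = [line for line in sam_lines if line.startswith("@")]
    records = [line for line in sam_lines if not line.startswith("@")]
    return used_read_groups(records) - declared_read_groups(header)


# Hypothetical merged file: sample2 reads are present, but only sample1 is declared.
example = [
    "@HD\tVN:1.4\tSO:coordinate",
    "@RG\tID:sample1\tSM:sample1",
    "r1\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII\tRG:Z:sample1",
    "r2\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII\tRG:Z:sample2",
]
print(missing_read_groups(example))  # {'sample2'}
```

An empty result means every read group used by a read is declared in the header; anything else needs an `@RG` line added before variant calling.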

But I do wonder why you want to do this. GATK does not require merged BAM files; from the GATK Best Practices document:

Because the GATK can dynamically merge BAM files, it isn't critical to have merged files by lane into sample bams, or even samples bams into cohort bams.

**Apexy** · 11-16-2012, 06:04 AM

Hello Meher,
I do not think it matters, as long as the insert size in both samples is expected to be the same. At least with bowtie, all you need is to tell it which file is which (with -1 and -2). However, you must pay particular attention to the header information when merging. There is an extensive manual here
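As a rough sketch of the per-sample alignment, the snippet below only builds the command lines and does not run anything. It assumes bowtie2 (its `-x`, `-1`, `-2`, `--rg-id`, `--rg` and `-S` options); the index name `ref_index`, the sample2 file names and the output names are placeholders, not something taken from this thread.

```python
import shlex


def bowtie2_command(index, mate1, mate2, sample, out_sam):
    """Build a bowtie2 argument list for one paired-end sample with its own read group."""
    return [
        "bowtie2",
        "-x", index,             # basename of the reference index (placeholder)
        "-1", mate1,             # file with the forward mates
        "-2", mate2,             # file with the reverse mates
        "--rg-id", sample,       # read group ID written to @RG and to each read's RG:Z: tag
        "--rg", f"SM:{sample}",  # extra @RG field carrying the sample name
        "-S", out_sam,           # SAM output file
    ]


# One command per sample; the sample2 file names are assumed for illustration.
commands = [
    bowtie2_command("ref_index", "sample1_1.fastq", "sample1_2.fastq", "sample1", "sample1.sam"),
    bowtie2_command("ref_index", "sample2_1.fastq", "sample2_2.fastq", "sample2", "sample2.sam"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

Each list could be passed to `subprocess.run` once the placeholder names are replaced with real paths.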

**meher** · 11-16-2012, 06:13 AM

Originally posted by kmcarr

A better way is to perform the alignments separately, assigning unique read group IDs (some aligners, e.g. bowtie, will add read group IDs during alignment), and then merge the BAM files before proceeding to variant detection. Pay attention to the header attached to the merged output: every read group ID present in the file must be referenced in the header. samtools merge does not handle this automatically; you have to supply a properly formatted header. I'm not sure whether Picard MergeSamFiles properly merges the header or not.

But I do wonder why you want to do this. GATK does not require merged BAM files; from the GATK Best Practices document:

Yes, merging the BAMs is not required. What I actually want to accomplish is to detect the variants from the two samples in a single VCF file and infer the depth of each variant in both samples (i.e. if a variant has depth 100, I would like to find out how many of the reads came from each sample). Performing multi-sample variant calling on the two BAM files using GATK will accomplish this.
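To illustrate the per-sample breakdown: a multi-sample VCF has one genotype column per sample, and the depth is usually available there as the `DP` key listed in the FORMAT column. The sketch below reads that value from a single record; the header and record are invented for the example and the function name is hypothetical.

```python
def per_sample_depth(header_line, record_line):
    """Map each sample name to its FORMAT/DP value for one VCF record."""
    # Sample names start at the 10th column of the #CHROM header line.
    samples = header_line.rstrip("\n").split("\t")[9:]
    fields = record_line.rstrip("\n").split("\t")
    keys = fields[8].split(":")  # FORMAT column, e.g. GT:AD:DP
    dp_index = keys.index("DP")
    depths = {}
    for name, column in zip(samples, fields[9:]):
        values = column.split(":")
        # Trailing FORMAT values may be dropped; treat missing or "." as unknown.
        raw = values[dp_index] if dp_index < len(values) else "."
        depths[name] = None if raw == "." else int(raw)
    return depths


# Made-up record: total depth 100, split 60/40 between the two samples.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\tsample2"
record = "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=100\tGT:AD:DP\t0/1:30,30:60\t0/1:25,15:40"
print(per_sample_depth(header, record))  # {'sample1': 60, 'sample2': 40}
```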

But I would really like to know whether there could be any bias in doing it as described above, compared with doing a single alignment by tagging the reads before alignment and then performing variant calling.

Which of these would avoid any biases, if any are expected to be present?

**meher** · 11-16-2012, 07:20 AM

Originally posted by Apexy

Hello Meher,
I do not think it matters, as long as the insert size in both samples is expected to be the same. At least with bowtie, all you need is to tell it which file is which (with -1 and -2). However, you must pay particular attention to the header information when merging. There is an extensive manual here

Hi, anyway, these are the first few lines:
sample1_1.fastq

```
@HWI-ST188:1:1101:1225:2112#0/1
AGANAGTAAGTAAAATCTATTATGATATTCTTATAAAGAAAAGCCCACTTTTGAAGATTTCAGAAGTGCTTCTAAAGGAGGTAGCGCGGCATAATACTGGG
+
Z^_BS\ccgg`eghhhhhhhhhhhhhhhhhhhhhhhhgggdcfhhhhhhhhhdhghhfhbghhff]]egfdghf]cdgfbdTZacebbababb_bb]`cb`
@HWI-ST188:1:1101:1221:2160#0/1
TTCNAATAAAATAAATAAAAGATGAGATGAATATTCATTTTGACTTCATTTTCTACTTTTTTTTCAGAATACTTAAAGTTTGAGAGAAATGTGAGACAACT
+
__bBS`ccggcggiihhfghicghhiieghihehihfibghifhehhffhiiiiffghiiiiihdggg_b`bddbbcbabdd`_`bc``Y_bbZ_T_^BBB
```

sample1_2.fastq

```
@HWI-ST188:1:1101:1225:2112#0/2
ATGAATCAGATTGAAAATGCAAACTGTGACATGAGGCAGAGGCATTTATTTTATTTNGTGGGGAATCGGGAAAGGAAATTGCTAGGTTTCTGCAGCCCCAG
+
bbbeeeeegffgcgifhhihihiiif`agh`ghifhhhhiiihhcffXagXcce_cBL[Z_eaghfeedcS\^`dcbZZZ`b`bY^T]_bb]RGYba^[^_
@HWI-ST188:1:1101:1221:2160#0/2
TTAAATCTTAAAAGTGTATGTAAAAATGTTCAAAATATTAGTTTTCTTTAAATTTTNGTAGAAAAGGCATTATCTTCACATTAAGTGACATGAGATAACGC
+
bbbeeeeegggfghQbK`hhbigiiieh[ddgdgfhbgfffS^fddgiiidXaeSXBOO^eg`efbghfYHWbee_cffgccV`g]b_gHZZZZ^Y_bBBB
```
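The read names in the two files above differ only in the trailing `/1` and `/2`, which suggests they are the two mates of the same pairs. A small sketch of that check follows, assuming plain four-line FASTQ records; the function names are hypothetical and the sequences are shortened to placeholders.

```python
from itertools import islice


def read_names(lines):
    """Yield the read name of each FASTQ record, assuming four lines per record."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            return
        # Only the first line of each record is a name; a quality line may also start with '@'.
        yield record[0].rstrip("\n")[1:]


def mates_match(name1, name2):
    """True when the two names differ only in the trailing /1 and /2 mate suffix."""
    return (
        name1.endswith("/1")
        and name2.endswith("/2")
        and name1[:-2] == name2[:-2]
    )


# Read names copied from the post; sequences and qualities are placeholders.
fq1 = [
    "@HWI-ST188:1:1101:1225:2112#0/1", "ACGT", "+", "IIII",
    "@HWI-ST188:1:1101:1221:2160#0/1", "ACGT", "+", "IIII",
]
fq2 = [
    "@HWI-ST188:1:1101:1225:2112#0/2", "ACGT", "+", "IIII",
    "@HWI-ST188:1:1101:1221:2160#0/2", "ACGT", "+", "IIII",
]
paired = all(mates_match(a, b) for a, b in zip(read_names(fq1), read_names(fq2)))
print(paired)  # True
```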

**Apexy** · 12-10-2012, 02:45 AM

Hello,

Something must have made me forget to reply to this. If you want to merge two FASTQ files, use the attached script. However, I cannot relate the information (two samples) to the sequences you provided. If you have two paired-end samples, then you should in fact have four files. I think you have just provided the forward reads from sample1_1.fastq and the reverse reads from sample1_2.fastq.

Attached Files

merge.fastq.pl (581 Bytes, 27 views)
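The contents of the attached merge.fastq.pl are not shown in the thread, so the following is not that script. It is just one way the tag-and-merge idea could look in Python, assuming four-line FASTQ records; the tag format (`sample:` prefixed to the read name) and the function names are choices made for this sketch.

```python
def tag_fastq(lines, tag):
    """Yield FASTQ lines with `tag` prepended to every read name."""
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 0:
            # First line of each four-line record: '@name' becomes '@tag:name'.
            if not line.startswith("@"):
                raise ValueError(f"line {i + 1} is not a FASTQ header: {line!r}")
            yield f"@{tag}:{line[1:]}"
        else:
            yield line


def merge_tagged(files_by_tag):
    """Concatenate several FASTQ line sequences, tagging each with its dictionary key."""
    for tag, lines in files_by_tag.items():
        yield from tag_fastq(lines, tag)


# Toy records: the same read name occurs in both samples and stays distinguishable.
sample1 = ["@r1/1", "ACGT", "+", "IIII"]
sample2 = ["@r1/1", "TTTT", "+", "IIII"]
merged = list(merge_tagged({"sample1": sample1, "sample2": sample2}))
print(merged[0], merged[4])  # @sample1:r1/1 @sample2:r1/1
```

For paired-end data the forward and reverse files of a sample would need the same tag and the same record order, so that mates still share a name up to the `/1` and `/2` suffix.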

How to tag reads for alignment
