Hi,
Is there a way to check the integrity of a SAM file? I generated one after mapping with BWA, but I am having problems rebuilding a BAM file from it.
In fact, when I use
$ samtools view -F12 -Sb /input_file.sam > new_bams/output_file.bam
the BAM that is created is empty (4 kb, from a 113 Mb SAM).
I have tried many different approaches, so I now wonder whether the problem lies in the SAM file itself.
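As a rough idea of what such an integrity check could look like, here is a minimal Python sketch. It only tests a few structural rules of the SAM format (at least 11 tab-separated fields per alignment line, integer FLAG/POS/MAPQ, SEQ and QUAL of equal length) and is not a full validator. The function name check_sam_lines and the demo records are made up for illustration, and it assumes real tab-separated input.

```python
def check_sam_lines(lines):
    """Return a list of (line_number, message) for basic structural problems."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 11:
            problems.append((number, "fewer than 11 fields"))
            continue
        flag, pos, mapq = fields[1], fields[3], fields[4]
        if not (flag.isdigit() and pos.isdigit() and mapq.isdigit()):
            problems.append((number, "FLAG, POS or MAPQ is not a non-negative integer"))
        seq, qual = fields[9], fields[10]
        if seq != "*" and qual != "*" and len(seq) != len(qual):
            problems.append((number, "SEQ and QUAL lengths differ"))
    return problems


# Hypothetical demo records (not taken from the real file).
demo = [
    "@SQ\tSN:chr_demo\tLN:1000",
    "read1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "read2\t141\t*\t0\t0\t*\t*\t0\t0\tACGT\tIII",
    "read3\t77\t*\t0",
]
for line_number, message in check_sam_lines(demo):
    print(line_number, message)
```

An open file handle can be passed instead of the demo list, since the function only iterates over lines.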
Info:
-The starting BAM files I used are paired-end, about 50 Mb and 500000 reads.
-The FASTQ creation goes well: it generates 2 files, one for each read of the pair, with the same number of reads as the BAM and a size of 53 Mb.
-The sai files generated by the BWA mapping step are around 1 Mb.
-The new SAM file is generated with this command:
$ bwa sampe /ref_genome.fa /bwa_read1.sai /bwa_read2.sai /read1.fastq /read2.fastq > /output.sam
-The header and the first lines of the SAM are these:
@SQ SN:gi|9633069|ref|NC_000898.1| LN:162114
@SQ SN:gi|224020395|ref|NC_001664.2| LN:159322
@SQ SN:gi|82503188|ref|NC_007605.1| LN:171823
@SQ SN:gi|139424470|ref|NC_009334.1| LN:172764
@PG ID:bwa PN:bwa VN:0.6.1-r104
SRR360611.2626045 77 * 0 0 * * 0 0 GATGGATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTTCGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGAC 7:C;7@?9C7A@@C@D@A>/;CCGBBADBBBBBEAEE9GHEEFHGKGKJIJKHIIIJKIFJIIGJCC?JKKLKKKJKHJKKJCKCKKJKKKIGHG>EEDB
SRR360611.2626045 141 * 0 0 * * 0 0 TGGTTCCTACTTCAGGGCCATAAAGCCTAAATAGCCCACACGTTCCCCTTAAATAAGACATCACGATGGATCACAGGTCTATCACCCTATTAACCACTCA 8DDEEGHIGGGGHIIEIIJIHHJJJJJKIJJGDJGKKIFHCCIIJKKKKIJKADJIGJFKI>JIAJGGEDB&:0EB>@7C8B6=;.>FACACB:@E=G@:
SRR360611.18087377 77 * 0 0 * * 0 0 ATGGATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTTCGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACG <<???@A@ACC?<<C<B??D?GFBDEBAECBEEFAIBEKDJJHJIJHGIJKGIFIJKIF@IIIHCJKIJKKLKKGIJJKKJDJCKIKLIJIHHH?FDDB;
SRR360611.18087377 141 * 0 0 * * 0 0 GGAAGCTTTCTGTTGGCTCACATTTGGTTTATTGATGTAATGTATTGATGCTTCCCATAACGCCCTAAGTTCACACATCAACTGCAACTCCAAAGCCACC 8ECFGGFFFHIHF<ADIJIGDHHHBIIGFF@C?G@CE6BEGGEB>GDEEGCF>HK<F?<F&653&=2:?5?6/5=0B><:GCAGBEB+GF?D:@BB=B;@
SRR360611.8887247 77 * 0 0 * * 0 0 ATGGATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCATTTGGTATTTTCGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACG 5<;:=@A@ACA?=<C>FA:ABBFBDEB<BCBBECAIBHKGJHF?FJKGIJKIGIIJKIGIGIGHAF<IJKKLKKHIIGKKJDKCKKKLKKJHHH?FEDA;
-I am working in the Mac Terminal (OS X 10.5 or OS X 10.6) or on a Linux-based cluster.
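For reference, the FLAG values in the records above (77 and 141) can be decoded against the bit definitions in the SAM specification, and compared with the -F12 option, which makes samtools view skip any record that has one of those bits set. The sketch below does that decoding in Python; the helper names decode_flag and kept_by_exclude_mask are invented for illustration. It only covers the few records shown here, so it says nothing certain about the rest of the file.

```python
# FLAG bit meanings as defined in the SAM specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "fails quality checks",
    0x400: "duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """List the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def kept_by_exclude_mask(flag, mask):
    """Mimic 'samtools view -F mask': keep a record only if no mask bit is set."""
    return (flag & mask) == 0


for flag in (77, 141):
    print(flag, decode_flag(flag), "kept by -F12:", kept_by_exclude_mask(flag, 12))
```

Both 77 and 141 include the "read unmapped" (4) and "mate unmapped" (8) bits, so records like the ones listed would be dropped by -F12. If most of the file looks like this, a nearly empty BAM would be the expected result rather than a sign of a corrupt SAM.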
Thanks in advance!