Hi all,
I am using bowtie2 and have recently run into a problem with mapping. I downloaded yeast data from http://www.ebi.ac.uk/ena/data/view/ERX010187. The data look good in FastQC, but the alignment is poor (0% of pairs align concordantly, 30.22% overall) and the final bcftools step reports no records at all. Can anyone help me with this?
Best Regards
Zillur
Zillur-Rahman:1:28:2015 ZILLURRAHMAN$ ls
ERR029138_1.fastq ERR029138_1.fastq.gz ERR029138_2.fastq ERR029138_2.fastq.gz
Zillur-Rahman:1:28:2015 ZILLURRAHMAN$ bowtie2 -x ../genome -1 err029138_1.fastq -2 err029138_2.fastq -S eg.sam
30909942 reads; of these:
30909942 (100.00%) were paired; of these:
30909942 (100.00%) aligned concordantly 0 times
0 (0.00%) aligned concordantly exactly 1 time
0 (0.00%) aligned concordantly >1 times
----
30909942 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
30909942 pairs aligned 0 times concordantly or discordantly; of these:
61819884 mates make up the pairs; of these:
43136222 (69.78%) aligned 0 times
557574 (0.90%) aligned exactly 1 time
18126088 (29.32%) aligned >1 times
30.22% overall alignment rate
Zillur-Rahman:1:28:2015 ZILLURRAHMAN$ samtools view -bS eg.sam > eg.bam
[W::sam_hdr_parse] duplicated sequence 'I'
[W::sam_hdr_parse] duplicated sequence 'VI'
[W::sam_hdr_parse] duplicated sequence 'III'
[W::sam_hdr_parse] duplicated sequence 'IX'
[W::sam_hdr_parse] duplicated sequence 'VIII'
[W::sam_hdr_parse] duplicated sequence 'V'
[W::sam_hdr_parse] duplicated sequence 'XI'
[W::sam_hdr_parse] duplicated sequence 'X'
[W::sam_hdr_parse] duplicated sequence 'XIV'
[W::sam_hdr_parse] duplicated sequence 'II'
[W::sam_hdr_parse] duplicated sequence 'XIII'
[W::sam_hdr_parse] duplicated sequence 'XVI'
[W::sam_hdr_parse] duplicated sequence 'XII'
[W::sam_hdr_parse] duplicated sequence 'VII'
[W::sam_hdr_parse] duplicated sequence 'XV'
[W::sam_hdr_parse] duplicated sequence 'IV'
Zillur-Rahman:1:28:2015 ZILLURRAHMAN$ samtools sort eg.bam eg.sorted
[bam_sort_core] merging from 26 files...
Zillur-Rahman:1:28:2015 ZILLURRAHMAN$ samtools mpileup -uf ../genome.fa eg.sorted.bam | bcftools view -o - > eg.raw.bcf
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
Zillur-Rahman:1:28:2015 ZILLURRAHMAN$ bcftools stats eg.raw.bcf
# This file was produced by bcftools stats (1.1+htslib-1.1) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats eg.raw.bcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 eg.raw.bcf
# SN, Summary numbers:
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 0
SN 0 number of SNPs: 0
SN 0 number of MNPs: 0
SN 0 number of indels: 0
SN 0 number of others: 0
SN 0 number of multiallelic sites: 0
SN 0 number of multiallelic SNP sites: 0
# TSTV, transitions/transversions:
# TSTV [2]id [3]ts [4]tv [5]ts/tv [6]ts (1st ALT) [7]tv (1st ALT) [8]ts/tv (1st ALT)
TSTV 0 0 0 0.00 0 0 0.00
# Sis, Singleton stats:
# SiS [2]id [3]allele count [4]number of SNPs [5]number of transitions [6]number of transversions [7]number of indels [8]repeat-consistent [9]repeat-inconsistent [10]not applicable
SiS 0 1 0 0 0 0 0 0 0
# AF, Stats by non-reference allele frequency:
# AF [2]id [3]allele frequency [4]number of SNPs [5]number of transitions [6]number of transversions [7]number of indels [8]repeat-consistent [9]repeat-inconsistent [10]not applicable
# QUAL, Stats by quality:
# QUAL [2]id [3]Quality [4]number of SNPs [5]number of transitions (1st ALT) [6]number of transversions (1st ALT) [7]number of indels
# IDD, InDel distribution:
# IDD [2]id [3]length (deletions negative) [4]count
# ST, Substitution types:
# ST [2]id [3]type [4]count
ST 0 A>C 0
ST 0 A>G 0
ST 0 A>T 0
ST 0 C>A 0
ST 0 C>G 0
ST 0 C>T 0
ST 0 G>A 0
ST 0 G>C 0
ST 0 G>T 0
ST 0 T>A 0
ST 0 T>C 0
ST 0 T>G 0
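The samtools warnings in the transcript above say that sequence names such as 'I' and 'VI' occur more than once in the header of eg.sam. A minimal sketch for listing which `@SQ` names are repeated, so they can be compared against the reference used to build the index. The function name `duplicated_sq_names` is hypothetical and only for illustration; the script assumes an uncompressed SAM file such as eg.sam is passed as the first argument.

```python
from collections import Counter
import sys


def duplicated_sq_names(sam_lines):
    """Return the sorted sequence names that occur in more than one @SQ line."""
    counts = Counter()
    for line in sam_lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment record
        if line.startswith("@SQ"):
            # @SQ fields are tab-separated TAG:VALUE pairs; SN holds the name
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    counts[field[3:]] += 1
    return sorted(name for name, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Assumed usage: python check_sq.py eg.sam
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for name in duplicated_sq_names(handle):
                print(name)
```

If this prints the same sixteen chromosome names as the warnings, the duplication is already present in the SAM header written by bowtie2, which would point at the index or the reference FASTA rather than at the samtools conversion step.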