**GenoMax** · 12-11-2014, 10:23 AM

You analyze sequence data by mapping it against a reference genome sequence (in this case yeast).

Sequence data = ERR000004.fastq (you already have it).

You know where the yeast genome index to search against is located.

After the bowtie2 search, the resulting SAM file has information about where each read aligns to the reference genome.

**zillur** · 12-11-2014, 11:02 AM

Thank you very much for your kind reply. But there is a problem when I use only one file. Maybe I am missing something.

Best Regards
Zillur

Zillur-Rahman:saccharomyces ZILLURRAHMAN$ ls
ERR000004.fastq
ERR000004.fastq.bz2
ERR000004_1.fastq
ERR000004_1.fastq.bz2
ERR000004_2.fastq
ERR000004_2.fastq.bz2
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.I.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.II.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.III.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.IV.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.IX.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.Mito.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.V.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.VI.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.VII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.VIII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.X.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XI.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XIII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XIV.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XV 2.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XV.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XVI.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.genome.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna.toplevel.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.I.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.II.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.III.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.IV.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.IX.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.Mito.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.V.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.VI.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.VII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.VIII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.X.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XI.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XIII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XIV.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XV.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XVI.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.genome.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.toplevel.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.I.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.II.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.III.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.IV.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.IX.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.Mito.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.V.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.VI.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.VII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.VIII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.X.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XI.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XIII.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XIV.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XV.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XVI.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.genome.fa
Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.toplevel.fa
Saccharomyces_cerevisiae_Ensembl_R64-1-1
eg21.sam
genome.1.bt2
genome.2.bt2
genome.3.bt2
genome.4.bt2
genome.fa
genome.fa.fai
genome.rev.1.bt2
genome.rev.2.bt2
Zillur-Rahman:saccharomyces ZILLURRAHMAN$ bowtie2 -x genome -1 err000004.fastq -S eg21.sam
Bowtie 2 version 2.2.3 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
Usage:
bowtie2 [options]* -x <bt2-idx> {-1 <m1> -2 <m2> | -U <r>} [-S <sam>]

<bt2-idx> Index filename prefix (minus trailing .X.bt2).
NOTE: Bowtie 1 and Bowtie 2 indexes are not compatible.
<m1> Files with #1 mates, paired with files in <m2>.
Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
<m2> Files with #2 mates, paired with files in <m1>.
Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
<r> Files with unpaired reads.
Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
<sam> File for SAM output (default: stdout)

<m1>, <m2>, <r> can be comma-separated lists (no whitespace) and can be
specified many times. E.g. '-U file1.fq,file2.fq -U file3.fq'.

Options (defaults in parentheses):

Input:
-q query input files are FASTQ .fq/.fastq (default)
--qseq query input files are in Illumina's qseq format
-f query input files are (multi-)FASTA .fa/.mfa
-r query input files are raw one-sequence-per-line
-c <m1>, <m2>, <r> are sequences themselves, not files
-s/--skip <int> skip the first <int> reads/pairs in the input (none)
-u/--upto <int> stop after first <int> reads/pairs (no limit)
-5/--trim5 <int> trim <int> bases from 5'/left end of reads (0)
-3/--trim3 <int> trim <int> bases from 3'/right end of reads (0)
--phred33 qualities are Phred+33 (default)
--phred64 qualities are Phred+64
--int-quals qualities encoded as space-delimited integers

Presets: Same as:
For --end-to-end:
--very-fast -D 5 -R 1 -N 0 -L 22 -i S,0,2.50
--fast -D 10 -R 2 -N 0 -L 22 -i S,0,2.50
--sensitive -D 15 -R 2 -N 0 -L 22 -i S,1,1.15 (default)
--very-sensitive -D 20 -R 3 -N 0 -L 20 -i S,1,0.50

For --local:
--very-fast-local -D 5 -R 1 -N 0 -L 25 -i S,1,2.00
--fast-local -D 10 -R 2 -N 0 -L 22 -i S,1,1.75
--sensitive-local -D 15 -R 2 -N 0 -L 20 -i S,1,0.75 (default)
--very-sensitive-local -D 20 -R 3 -N 0 -L 20 -i S,1,0.50

Alignment:
-N <int> max # mismatches in seed alignment; can be 0 or 1 (0)
-L <int> length of seed substrings; must be >3, <32 (22)
-i <func> interval between seed substrings w/r/t read len (S,1,1.15)
--n-ceil <func> func for max # non-A/C/G/Ts permitted in aln (L,0,0.15)
--dpad <int> include <int> extra ref chars on sides of DP table (15)
--gbar <int> disallow gaps within <int> nucs of read extremes (4)
--ignore-quals treat all quality values as 30 on Phred scale (off)
--nofw do not align forward (original) version of read (off)
--norc do not align reverse-complement version of read (off)
--no-1mm-upfront do not allow 1 mismatch alignments before attempting to
scan for the optimal seeded alignments
--end-to-end entire read must align; no clipping (on)
OR
--local local alignment; ends might be soft clipped (off)

Scoring:
--ma <int> match bonus (0 for --end-to-end, 2 for --local)
--mp <int> max penalty for mismatch; lower qual = lower penalty (6)
--np <int> penalty for non-A/C/G/Ts in read/ref (1)
--rdg <int>,<int> read gap open, extend penalties (5,3)
--rfg <int>,<int> reference gap open, extend penalties (5,3)
--score-min <func> min acceptable alignment score w/r/t read length
(G,20,8 for local, L,-0.6,-0.6 for end-to-end)

Reporting:
(default) look for multiple alignments, report best, with MAPQ
OR
-k <int> report up to <int> alns per read; MAPQ not meaningful
OR
-a/--all report all alignments; very slow, MAPQ not meaningful

Effort:
-D <int> give up extending after <int> failed extends in a row (15)
-R <int> for reads w/ repetitive seeds, try <int> sets of seeds (2)

Paired-end:
-I/--minins <int> minimum fragment length (0)
-X/--maxins <int> maximum fragment length (500)
--fr/--rf/--ff -1, -2 mates align fw/rev, rev/fw, fw/fw (--fr)
--no-mixed suppress unpaired alignments for paired reads
--no-discordant suppress discordant alignments for paired reads
--no-dovetail not concordant when mates extend past each other
--no-contain not concordant when one mate alignment contains other
--no-overlap not concordant when mates overlap at all

Output:
-t/--time print wall-clock time taken by search phases
--un <path> write unpaired reads that didn't align to <path>
--al <path> write unpaired reads that aligned at least once to <path>
--un-conc <path> write pairs that didn't align concordantly to <path>
--al-conc <path> write pairs that aligned concordantly at least once to <path>
(Note: for --un, --al, --un-conc, or --al-conc, add '-gz' to the option name, e.g.
--un-gz <path>, to gzip compress output, or add '-bz2' to bzip2 compress output.)
--quiet print nothing to stderr except serious errors
--met-file <path> send metrics to file at <path> (off)
--met-stderr send metrics to stderr (off)
--met <int> report internal counters & metrics every <int> secs (1)
--no-head supppress header lines, i.e. lines starting with @
--no-sq supppress @SQ header lines
--rg-id <text> set read group id, reflected in @RG line and RG:Z: opt field
--rg <text> add <text> ("lab:value") to @RG line of SAM header.
Note: @RG line only printed when --rg-id is set.
--omit-sec-seq put '*' in SEQ and QUAL fields for secondary alignments.

Performance:
-p/--threads <int> number of alignment threads to launch (1)
--reorder force SAM output order to match order of input reads
--mm use memory-mapped I/O for index; many 'bowtie's can share

Other:
--qc-filter filter out reads that are bad according to QSEQ filter
--seed <int> seed for random number generator (0)
--non-deterministic seed rand. gen. arbitrarily instead of using read attributes
--version print version information and quit
-h/--help print this usage message
***
Error: Must specify at least one read input with -U/-1/-2
(ERR): bowtie2-align exited with value 1
Zillur-Rahman:saccharomyces ZILLURRAHMAN$
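
The usage text above gives the read inputs as `{-1 <m1> -2 <m2> | -U <r>}`, and the final error asks for `-U/-1/-2`. That suggests a single FASTQ file of unpaired reads is passed with `-U`, while `-1` is meant to be used together with `-2`. A minimal sketch of that choice follows. The helper name `bt2_read_args` is hypothetical, and the script only prints the command instead of running bowtie2.

```shell
# Hypothetical helper: pick bowtie2's read-input flags from the number of FASTQ files.
#   one file  -> unpaired reads, passed with -U
#   two files -> mate pairs, passed with -1 and -2
bt2_read_args() {
  case $# in
    1) printf '%s %s\n' '-U' "$1" ;;
    2) printf '%s %s %s %s\n' '-1' "$1" '-2' "$2" ;;
    *) echo "expected 1 or 2 FASTQ files, got $#" >&2; return 1 ;;
  esac
}

# Print (not run) the command for the single-end file used in this thread.
echo "bowtie2 -x genome $(bt2_read_args ERR000004.fastq) -S eg21.sam"
```

For the single file in this thread the printed command is `bowtie2 -x genome -U ERR000004.fastq -S eg21.sam`.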

**zillur** · 12-11-2014, 11:43 AM

Sorry for the inconvenience. I think it is working now. Thank you very much.

Best Regards
Zillur

Zillur-Rahman:saccharomyces ZILLURRAHMAN$ samtools mpileup -uf genome.fa eg21.sorted.bam | bcftools view -o - > eg21.raw.bcf
[fai_load] fail to open FASTA file.
[vcf.c:1224 vcf_hdr_read] Could not read the header
Failed to open or the file not indexed: -
Zillur-Rahman:saccharomyces ZILLURRAHMAN$ samtools mpileup -uf genome2.fa eg21.sorted.bam | bcftools view -o - > eg21.raw.bcf
[fai_load] build FASTA index.
[fai_build] fail to open the FASTA file genome2.fa
[fai_load] fail to open FASTA index.
[vcf.c:1224 vcf_hdr_read] Could not read the header
Failed to open or the file not indexed: -
Zillur-Rahman:saccharomyces ZILLURRAHMAN$ samtools mpileup -uf genome2.fa eg21.sorted.bam | bcftools view -o - > eg21.raw.bcf
[fai_load] build FASTA index.
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
Zillur-Rahman:saccharomyces ZILLURRAHMAN$ bcftools stats eg21.raw.bcf
# This file was produced by bcftools stats (1.1+htslib-1.1) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats eg21.raw.bcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 eg21.raw.bcf
# SN, Summary numbers:
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 12116097
SN 0 number of SNPs: 160017
SN 0 number of MNPs: 0
SN 0 number of indels: 2273
SN 0 number of others: 0
SN 0 number of multiallelic sites: 160125
SN 0 number of multiallelic SNP sites: 160017
# TSTV, transitions/transversions:
# TSTV [2]id [3]ts [4]tv [5]ts/tv [6]ts (1st ALT) [7]tv (1st ALT) [8]ts/tv (1st ALT)
TSTV 0 67484 97365 0.69 65833 94184 0.70
# Sis, Singleton stats:
# SiS [2]id [3]allele count [4]number of SNPs [5]number of transitions [6]number of transversions [7]number of indels [8]repeat-consistent [9]repeat-inconsistent [10]not applicable
SiS 0 1 164849 67484 97365 2388 0 0 2388
# AF, Stats by non-reference allele frequency:
# AF [2]id [3]allele frequency [4]number of SNPs [5]number of transitions [6]number of transversions [7]number of indels [8]repeat-consistent [9]repeat-inconsistent [10]not applicable
AF 0 0.000000 164849 67484 97365 2388 0 0 2388
# QUAL, Stats by quality:
# QUAL [2]id [3]Quality [4]number of SNPs [5]number of transitions (1st ALT) [6]number of transversions (1st ALT) [7]number of indels
QUAL 0 0 160017 65833 94184 2273
# IDD, InDel distribution:
# IDD [2]id [3]length (deletions negative) [4]count
IDD 0 -4 3
IDD 0 -3 30
IDD 0 -2 130
IDD 0 -1 1263
IDD 0 1 733
IDD 0 2 118
IDD 0 3 83
IDD 0 4 28
# ST, Substitution types:
# ST [2]id [3]type [4]count
ST 0 A>C 15925
ST 0 A>G 22401
ST 0 A>T 19385
ST 0 C>A 7557
ST 0 C>G 5817
ST 0 C>T 11438
ST 0 G>A 11423
ST 0 G>C 5621
ST 0 G>T 7058
ST 0 T>A 19939
ST 0 T>C 22222
ST 0 T>G 16063
Zillur-Rahman:saccharomyces ZILLURRAHMAN$
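
In the report above, the number of records (12116097) is close to the size of the yeast genome and far larger than the SNP and indel counts. That is what one would expect if the file still holds a record for nearly every covered reference position instead of only called variant sites, although the thread does not confirm this. To compare the numbers quickly, the `SN` lines can be pulled out of the report. The sketch below assumes the report is tab-separated, as `bcftools stats` writes it, and uses three lines copied from the output above as sample input. The function name `sn_value` is hypothetical.

```shell
# Hypothetical helper: print the value of one "SN" summary key from a
# bcftools stats report read on standard input (fields are tab-separated).
sn_value() {
  awk -F '\t' -v key="$1" '$1 == "SN" && $3 == (key ":") { print $4 }'
}

# Sample input: three SN lines copied from the report in this thread.
report=$(printf 'SN\t0\tnumber of records:\t12116097\nSN\t0\tnumber of SNPs:\t160017\nSN\t0\tnumber of indels:\t2273\n')

records=$(printf '%s\n' "$report" | sn_value "number of records")
snps=$(printf '%s\n' "$report" | sn_value "number of SNPs")
echo "records=$records snps=$snps"
```

On a real file the same helper could be fed directly, for example `bcftools stats eg21.raw.bcf | sn_value "number of records"`.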

**zillur** · 12-12-2014, 04:54 AM

Hi,
Thank you very much for your kind help all the way. My mentor told me to build the index on my own. Can you give me some suggestions about how to build an index for yeast?

Best Regards
Zillur

**GenoMax** · 12-12-2014, 05:23 AM

Many people on this forum can tell you how to create indexes, but then you would not be learning the basics (as your mentor must intend you to do).

Search around this forum/bowtie website and then ask specific questions, if you hit a problem you can't solve.

**zillur** · 12-12-2014, 08:41 AM

Thank you very much for your kind suggestions. I am trying, using this line:

Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-build -f chromosome.1.fa,chromosome.2.fa,chromosome.3.fa,chromosome.4.fa,chromosome.5.fa,chromosome.6.fa,chromosome.7.fa,chromosome.8.fa,chromosome.9.fa,chromosome.10.fa,chromosome.11.fa,chromosome.12.fa,chromosome.13.fa,chromosome.14.fa,chromosome.15.fa,chromosome.16.fa index

which gave me these files:

Zillur-Rahman:index_err000004 ZILLURRAHMAN$ ls
chromosome.1.fa chromosome.15.fa chromosome.6.fa index.2.bt2
chromosome.10.fa chromosome.16.fa chromosome.7.fa index.3.bt2
chromosome.11.fa chromosome.2.fa chromosome.8.fa index.4.bt2
chromosome.12.fa chromosome.3.fa chromosome.9.fa index.rev.1.bt2
chromosome.13.fa chromosome.4.fa index.rev.2.bt2
chromosome.14.fa chromosome.5.fa index.1.bt2

But when I try to check the index:

Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a index
No index name given!
Bowtie 2 version 2.2.3 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
Usage: bowtie2-inspect [options]* <bt2_base>
<bt2_base> bt2 filename minus trailing .1.bt2/.2.bt2

By default, prints FASTA records of the indexed nucleotide sequences to
standard out. With -n, just prints names. With -s, just prints a summary of
the index parameters and sequences. With -e, preserves colors if applicable.

Options:
--large-index force inspection of the 'large' index, even if a
'small' one is present.
-a/--across <int> Number of characters across in FASTA output (default: 60)
-n/--names Print reference sequence names only
-s/--summary Print summary incl. ref names, lengths, index properties
-e/--bt2-ref Reconstruct reference from .bt2 (slow, preserves colors)
-v/--verbose Verbose output (for debugging)
-h/--help print detailed description of tool and its options
--help print this usage message
Zillur-Rahman:index_err000004 ZILLURRAHMAN$

Could you please give me some hints about what I am missing?

Best Regards
Zillur

**GenoMax** · 12-12-2014, 09:16 AM

Single Genome sequence file = All chromosomes together.

What do you think you need to do in this case?
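
One way to read this hint: the index can be built from a single FASTA file that holds every chromosome, one record after another. A minimal sketch of assembling such a file with `cat` follows. The sequences are made-up toy data written to a scratch directory, and the name `all_chromosomes.fa` is an assumption chosen for illustration.

```shell
# Work in a scratch directory so nothing in the real project is touched.
workdir=$(mktemp -d)

# Toy stand-ins for per-chromosome FASTA files (sequences are made up).
printf '>I\nACGTACGT\n'   > "$workdir/chromosome.1.fa"
printf '>II\nGGGTTTAA\n'  > "$workdir/chromosome.2.fa"
printf '>III\nCCCAAATT\n' > "$workdir/chromosome.3.fa"

# Concatenate them into one multi-FASTA file; each record keeps its own header.
cat "$workdir/chromosome.1.fa" "$workdir/chromosome.2.fa" "$workdir/chromosome.3.fa" \
  > "$workdir/all_chromosomes.fa"

# Count the records: one ">" header per chromosome is expected.
n_records=$(grep -c '^>' "$workdir/all_chromosomes.fa")
echo "records in all_chromosomes.fa: $n_records"
```

The combined file could then be the only input to the build step, for example `bowtie2-build -f all_chromosomes.fa yeast_index`, where `yeast_index` is an arbitrary prefix.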

**zillur** · 12-12-2014, 09:37 AM

Thank you very much. I have a file genome.fa, but.....

Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-build -f chromosome.1.fa,chromosome.2.fa,chromosome.3.fa,chromosome.4.fa,chromosome.5.fa,chromosome.6.fa,chromosome.7.fa,chromosome.8.fa,chromosome.9.fa,chromosome.10.fa,chromosome.11.fa,chromosome.12.fa,chromosome.13.fa,chromosome.14.fa,chromosome.15.fa,chromosome.16.fa,genome.fa genome

Zillur-Rahman:index_err000004 ZILLURRAHMAN$ ls
chromosome.1.fa chromosome.2.fa genome.1.bt2 index.2.bt2
chromosome.10.fa chromosome.3.fa genome.2.bt2 index.3.bt2
chromosome.11.fa chromosome.4.fa genome.3.bt2 index.4.bt2
chromosome.12.fa chromosome.5.fa genome.4.bt2 index.rev.1.bt2
chromosome.13.fa chromosome.6.fa genome.fa index.rev.2.bt2
chromosome.14.fa chromosome.7.fa genome.rev.1.bt2
chromosome.15.fa chromosome.8.fa genome.rev.2.bt2
chromosome.16.fa chromosome.9.fa index.1.bt2
Zillur-Rahman:index_err000004 ZILLURRAHMAN$ rm index.1.bt2 index.2.bt2 index.3.bt2 index.4.bt2 index.rev.1.bt2 index.rev.2.bt2
Zillur-Rahman:index_err000004 ZILLURRAHMAN$ ls
chromosome.1.fa chromosome.15.fa chromosome.6.fa genome.3.bt2
chromosome.10.fa chromosome.16.fa chromosome.7.fa genome.4.bt2
chromosome.11.fa chromosome.2.fa chromosome.8.fa genome.fa
chromosome.12.fa chromosome.3.fa chromosome.9.fa genome.rev.1.bt2
chromosome.13.fa chromosome.4.fa genome.1.bt2 genome.rev.2.bt2
chromosome.14.fa chromosome.5.fa genome.2.bt2
Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a genome
No index name given!
Bowtie 2 version 2.2.3 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
Usage: bowtie2-inspect [options]* <bt2_base>
<bt2_base> bt2 filename minus trailing .1.bt2/.2.bt2

By default, prints FASTA records of the indexed nucleotide sequences to
standard out. With -n, just prints names. With -s, just prints a summary of
the index parameters and sequences. With -e, preserves colors if applicable.

Options:
--large-index force inspection of the 'large' index, even if a
'small' one is present.
-a/--across <int> Number of characters across in FASTA output (default: 60)
-n/--names Print reference sequence names only
-s/--summary Print summary incl. ref names, lengths, index properties
-e/--bt2-ref Reconstruct reference from .bt2 (slow, preserves colors)
-v/--verbose Verbose output (for debugging)
-h/--help print detailed description of tool and its options
--help print this usage message
Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a genome

**GenoMax** · 12-12-2014, 09:50 AM

Originally posted by zillur:

Thank you very much. I have a file genome.fa, but.....

If genome.fa has all the chromosomes, then why are you using the individual chromosome files to build the index?
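
The bowtie2-build command in the previous post lists the sixteen chromosome files and also genome.fa. If genome.fa already contains those chromosomes, every sequence would presumably go into the index twice. A small check for repeated sequence names before building is sketched below, using toy files in a scratch directory. The function name `dup_fasta_names` is hypothetical.

```shell
# Hypothetical helper: print sequence names that occur more than once
# across the given FASTA files (name = header text up to the first space).
dup_fasta_names() {
  cat "$@" | grep '^>' | sed 's/^>//' | cut -d ' ' -f 1 | sort | uniq -d
}

tmp=$(mktemp -d)
printf '>I\nACGT\n'  > "$tmp/chromosome.1.fa"
printf '>II\nGGCC\n' > "$tmp/chromosome.2.fa"
# Toy "genome.fa" that already holds both chromosomes.
cat "$tmp/chromosome.1.fa" "$tmp/chromosome.2.fa" > "$tmp/genome.fa"

# Passing the chromosome files together with genome.fa repeats every name.
dups=$(dup_fasta_names "$tmp/chromosome.1.fa" "$tmp/chromosome.2.fa" "$tmp/genome.fa")
echo "duplicated names:" $dups
```

An empty result, as for `dup_fasta_names genome.fa` on its own in this toy setup, means each sequence name appears only once in the input.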

**zillur** · 12-12-2014, 09:58 AM

Thank you very much. I used only genome.fa, but I still get "No index name given". How can I give an index name?

Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-build -f genome.fa index

Zillur-Rahman:index_err000004 ZILLURRAHMAN$ ls
chromosome.1.fa chromosome.2.fa genome.1.bt2 index.2.bt2
chromosome.10.fa chromosome.3.fa genome.2.bt2 index.3.bt2
chromosome.11.fa chromosome.4.fa genome.3.bt2 index.4.bt2
chromosome.12.fa chromosome.5.fa genome.4.bt2 index.rev.1.bt2
chromosome.13.fa chromosome.6.fa genome.fa index.rev.2.bt2
chromosome.14.fa chromosome.7.fa genome.rev.1.bt2
chromosome.15.fa chromosome.8.fa genome.rev.2.bt2
chromosome.16.fa chromosome.9.fa index.1.bt2
Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a genome
No index name given!
Bowtie 2 version 2.2.3 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
Usage: bowtie2-inspect [options]* <bt2_base>
<bt2_base> bt2 filename minus trailing .1.bt2/.2.bt2

By default, prints FASTA records of the indexed nucleotide sequences to
standard out. With -n, just prints names. With -s, just prints a summary of
the index parameters and sequences. With -e, preserves colors if applicable.

Options:
--large-index force inspection of the 'large' index, even if a
'small' one is present.
-a/--across <int> Number of characters across in FASTA output (default: 60)
-n/--names Print reference sequence names only
-s/--summary Print summary incl. ref names, lengths, index properties
-e/--bt2-ref Reconstruct reference from .bt2 (slow, preserves colors)
-v/--verbose Verbose output (for debugging)
-h/--help print detailed description of tool and its options
--help print this usage message
Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a index
No index name given!
Bowtie 2 version 2.2.3 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
Usage: bowtie2-inspect [options]* <bt2_base>
<bt2_base> bt2 filename minus trailing .1.bt2/.2.bt2

By default, prints FASTA records of the indexed nucleotide sequences to
standard out. With -n, just prints names. With -s, just prints a summary of
the index parameters and sequences. With -e, preserves colors if applicable.

Options:
--large-index force inspection of the 'large' index, even if a
'small' one is present.
-a/--across <int> Number of characters across in FASTA output (default: 60)
-n/--names Print reference sequence names only
-s/--summary Print summary incl. ref names, lengths, index properties
-e/--bt2-ref Reconstruct reference from .bt2 (slow, preserves colors)
-v/--verbose Verbose output (for debugging)
-h/--help print detailed description of tool and its options
--help print this usage message
Zillur-Rahman:index_err000004 ZILLURRAHMAN$

**GenoMax** · 12-12-2014, 10:35 AM

You need to provide a name that will be used as the prefix for all index-related files. This can be anything, but you would want to use something (e.g. MyYeast) that will make sense afterwards.

**zillur** · 12-12-2014, 10:44 AM

Thank you very much. I think it's working now.

Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a index genome

It gave a single file

ATTAGTGTATTGGATTCGACAAGAGGCAAGCAAGGGAGCCAAGTTTTCCGCATGTCTGGAAGGCAGATCAAAGAGTTGTATTATAAAGTATGGAGCAACTTGCGTGAATCGAAGACAGAGGTGCTGCAGTACTTTTTGAACTGGGACGAGAAAAAGTGCCGGGAAGAATGGGAGGCAAAAGACGATACGGTCTTTGTGGAAGCGCTCGAGAAAGTTGGAGTTTTTCAGCGTTTGCGTTCCATGACGAGCGCTGGACTGCAGGGTCCGCAGTACGTCAAGCTGCAGTTTAGCAGGCATCATCGACAGTTGAGGAGCAGATATGAATTAAGTCTAGGAATGCACTTGCGAGATCAGCTTGCGCTGGGAGTTACCCCATCTAAAGTGCCGCATTGGACGGCATTCCTGTCGATGCTGATAGGGCTGTTCTACAATAAAACATTTCGGCAGAAACTGGAATATCTTTTGGAGCAGATTTCGGAGGTGTGGTTGTTACCACATTGGCTTGATTTGGCAAACGTTGAAGTTCTCGCTGCAGATAACACGAGGGTACCGCTGTACATGCTGATGGTAGCGGTTCACAAAGAGCTGGATAGCGATGATGTTCCAGACGGTAGATTTGATATAATATTACTATGTAGAGATTCGAGCAGAGAAGTTGGAGAGTGAAGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTAAGAAANNNNNNNNNCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGG

....and many more.......
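
The usage text printed earlier lists `-a/--across <int>`, so `-a` expects a number after it. That would explain the behaviour in this thread: in `bowtie2-inspect -a index` the word `index` is probably consumed as the value of `-a`, leaving no index name, while in `bowtie2-inspect -a index genome` the word `genome` is left over and used as the index. The toy parser below only mimics that pattern with the shell's `getopts`. It is not bowtie2's actual argument handling, and `toy_inspect` is a hypothetical name.

```shell
# Toy model of a command whose -a option takes a value, like "-a/--across <int>".
# It reports what was taken as the -a value and what is left as the index name.
toy_inspect() {
  OPTIND=1
  across=""
  base=""
  while getopts "a:ns" opt; do
    case $opt in
      a) across=$OPTARG ;;   # the word right after -a is swallowed here
      n|s) ;;                # flags that take no value
      *) return 2 ;;
    esac
  done
  shift $((OPTIND - 1))
  base=${1:-}
  if [ -z "$base" ]; then
    echo "No index name given! (-a took '$across')"
    return 1
  fi
  echo "index=$base across=$across"
}

toy_inspect -a index || true   # prints: No index name given! (-a took 'index')
toy_inspect -a index genome    # prints: index=genome across=index
toy_inspect -s genome          # prints: index=genome across=
```

For checking a freshly built index, the `-s` (summary) or `-n` (names only) options from the usage text, for example `bowtie2-inspect -s genome`, avoid the value-taking option altogether.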

**zillur** · 12-18-2014, 08:54 AM

Hi,
Thank you very much for your kind help. I need to try mapping paired reads to the genome instead of single reads. Can you give me some hints about where I can download yeast FASTQ files for paired-read mapping? Currently I am using http://www.ebi.ac.uk/ena/data/view/ERP000001
but there are only single reads.

Best Regards
Zillur

**GenoMax** · 12-18-2014, 03:07 PM

http://sra.dnanexus.com/?result_type=Study&show=&q=paired+end+yeast

You should post these kinds of questions in a new thread. This is no longer related to the parent thread you are posting in.

**zillur** · 01-27-2015, 06:37 AM

Hi,
Sorry to disturb you again. I am facing a problem again. Could you please give me some hints about what I am doing wrong? I have downloaded data from http://www.ebi.ac.uk/ena/data/search...genome+pair+en
Maybe there is a problem in creating the eg.raw.bcf file, but I have done it many times with the same protocol.

Best Regards
Zillur

Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ bowtie2 -x ../genome -1 err029139_1.fastq -2 err029139_2.fastq -S eg.sam
30723562 reads; of these:
30723562 (100.00%) were paired; of these:
30723562 (100.00%) aligned concordantly 0 times
0 (0.00%) aligned concordantly exactly 1 time
0 (0.00%) aligned concordantly >1 times
----
30723562 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
30723562 pairs aligned 0 times concordantly or discordantly; of these:
61447124 mates make up the pairs; of these:
44425291 (72.30%) aligned 0 times
531380 (0.86%) aligned exactly 1 time
16490453 (26.84%) aligned >1 times
27.70% overall alignment rate
Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools view -bS eg.sam > eg.bam
[W::sam_hdr_parse] duplicated sequence 'I'
[W::sam_hdr_parse] duplicated sequence 'VI'
[W::sam_hdr_parse] duplicated sequence 'III'
[W::sam_hdr_parse] duplicated sequence 'IX'
[W::sam_hdr_parse] duplicated sequence 'VIII'
[W::sam_hdr_parse] duplicated sequence 'V'
[W::sam_hdr_parse] duplicated sequence 'XI'
[W::sam_hdr_parse] duplicated sequence 'X'
[W::sam_hdr_parse] duplicated sequence 'XIV'
[W::sam_hdr_parse] duplicated sequence 'II'
[W::sam_hdr_parse] duplicated sequence 'XIII'
[W::sam_hdr_parse] duplicated sequence 'XVI'
[W::sam_hdr_parse] duplicated sequence 'XII'
[W::sam_hdr_parse] duplicated sequence 'VII'
[W::sam_hdr_parse] duplicated sequence 'XV'
[W::sam_hdr_parse] duplicated sequence 'IV'
Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools sort eg.bam eg.sorted
[bam_sort_core] merging from 26 files...
Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools mpileup -uf../genome eg.sorted.bam | bcftools view -o - > eg.raw.bcf
[fai_load] build FASTA index.
[fai_build] fail to open the FASTA file ../genome
[fai_load] fail to open FASTA index.
[vcf.c:1224 vcf_hdr_read] Could not read the header
Failed to open or the file not indexed: -
Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools mpileup -uf../genome.fa eg.sorted.bam | bcftools view -o - > eg.raw.bcf
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ bcftools stats eg.raw.bcf
# This file was produced by bcftools stats (1.1+htslib-1.1) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats eg.raw.bcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 eg.raw.bcf
# SN, Summary numbers:
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 0
SN 0 number of SNPs: 0
SN 0 number of MNPs: 0
SN 0 number of indels: 0
SN 0 number of others: 0
SN 0 number of multiallelic sites: 0
SN 0 number of multiallelic SNP sites: 0
# TSTV, transitions/transversions:
# TSTV [2]id [3]ts [4]tv [5]ts/tv [6]ts (1st ALT) [7]tv (1st ALT) [8]ts/tv (1st ALT)
TSTV 0 0 0 0.00 0 0 0.00
# Sis, Singleton stats:
# SiS [2]id [3]allele count [4]number of SNPs [5]number of transitions [6]number of transversions [7]number of indels [8]repeat-consistent [9]repeat-inconsistent [10]not applicable
SiS 0 1 0 0 0 0 0 0 0
# AF, Stats by non-reference allele frequency:
# AF [2]id [3]allele frequency [4]number of SNPs [5]number of transitions [6]number of transversions [7]number of indels [8]repeat-consistent [9]repeat-inconsistent [10]not applicable
# QUAL, Stats by quality:
# QUAL [2]id [3]Quality [4]number of SNPs [5]number of transitions (1st ALT) [6]number of transversions (1st ALT) [7]number of indels
# IDD, InDel distribution:
# IDD [2]id [3]length (deletions negative) [4]count
# ST, Substitution types:
# ST [2]id [3]type [4]count
ST 0 A>C 0
ST 0 A>G 0
ST 0 A>T 0
ST 0 C>A 0
ST 0 C>G 0
ST 0 C>T 0
ST 0 G>A 0
ST 0 G>C 0
ST 0 G>T 0
ST 0 T>A 0
ST 0 T>C 0
ST 0 T>G 0
Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools mpileup -uf ../genome.fa eg.sorted.bam | bcftools view -o - > eg.raw.bcf
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools mpileup -uf genome.fa eg.sorted.bam | bcftools view -o u -v -c > eg.raw.bcf
[fai_load] build FASTA index.
[fai_build] fail to open the FASTA file genome.fa
[fai_load] fail to open FASTA index.
[vcf.c:1224 vcf_hdr_read] Could not read the header
Failed to open or the file not indexed: -
Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools mpileup -uf ../genome.fa eg.sorted.bam | bcftools view -o u -v -c > eg.raw.bcf
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
[E::-c] unknown type
Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ ls
ERR029139_1.fastq ERR029139_2.fastq eg.bam eg.sam
ERR029139_1.fastq.gz ERR029139_2.fastq.gz eg.raw.bcf eg.sorted.bam
Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ bcftools stats eg.raw.bcf
[vcf.c:1224 vcf_hdr_read] Could not read the header
Could not read the file or the file is not indexed: eg.raw.bcf
Zillur-Rahman:1:26:2015 ZILLURRAHMAN$
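
Two things stand out in the transcript above. First, the `[fai_build] fail to open the FASTA file` messages appear whenever the path after `-f` does not point at an existing FASTA file (`../genome`, `genome.fa`) and go away with `../genome.fa`. Second, `[E::-c] unknown type` suggests that this bcftools (1.1) reads `-c` as the argument of `-v`. In the 1.x releases, variant calling is done by the separate `bcftools call` subcommand. The sketch below checks the inputs first and then pipes the mpileup output into `bcftools call`. The function name `call_variants` is hypothetical, and the choice of `-m` (multiallelic caller) with `-v` (variant sites only) is an assumption about what is wanted here, not something stated in the thread.

```shell
# Hypothetical wrapper: verify that the inputs exist, then call variants.
# Usage: call_variants <reference.fa> <sorted.bam> <output.bcf>
call_variants() {
  ref=$1
  bam=$2
  out=$3
  if [ ! -r "$ref" ]; then
    echo "cannot read reference FASTA: $ref" >&2
    return 1
  fi
  if [ ! -r "$bam" ]; then
    echo "cannot read BAM file: $bam" >&2
    return 1
  fi
  # mpileup -u: uncompressed BCF on stdout; -f: reference FASTA.
  # call -m: multiallelic caller; -v: variant sites only; -Ob: write BCF.
  samtools mpileup -uf "$ref" "$bam" | bcftools call -mv -Ob -o "$out"
}

# With a wrong reference path the wrapper stops before samtools is started.
call_variants genome.fa.missing eg.sorted.bam eg.raw.bcf || echo "inputs not ready"
```

With the files from this thread the call would look like `call_variants ../genome.fa eg.sorted.bam eg.raw.bcf`, followed by `bcftools stats eg.raw.bcf`. Separately, the `duplicated sequence` warnings from `samtools view` are what one would expect if the `../genome` index had been built from the per-chromosome files and genome.fa together. Rebuilding the index from a single FASTA file would be worth trying before rerunning the alignment.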
