
  • You analyze sequence data by mapping it against a reference genome sequence (in this case yeast).

    Sequence data = ERR000004.fastq (you already have it).

    You know where the yeast genome index to search against is located.

    After the bowtie2 search, the resulting SAM file records where each read aligns to the reference genome.
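As a rough illustration of that last point: in a SAM file, column 1 is the read name, column 3 the reference sequence name, and column 4 the 1-based leftmost position. The sketch below pulls those three columns out for mapped reads. The function name `sam_positions` and the two sample records are made up for illustration; they are not output from ERR000004.

```shell
# Print read name, reference name, and 1-based position for mapped reads.
# Header lines (starting with @) and reads with no reference name (*) are skipped.
sam_positions() {
    awk -F'\t' '!/^@/ && $3 != "*" { print $1 "\t" $3 "\t" $4 }' "$@"
}

# Tiny made-up SAM input: one header line, one mapped read, one unmapped read.
printf '@HD\tVN:1.0\nread1\t0\tI\t1500\t42\t4M\t*\t0\t0\tACGT\tIIII\nread2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n' | sam_positions
```

With the sample input, only `read1` is printed, together with chromosome `I` and position `1500`. On a real run the function would be called as `sam_positions eg21.sam`.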



    • Thank you very much for your kind reply. But when I use only one file there is a problem. Maybe I am missing something.

      Best Regards
      Zillur

      Zillur-Rahman:saccharomyces ZILLURRAHMAN$ ls
      ERR000004.fastq
      ERR000004.fastq.bz2
      ERR000004_1.fastq
      ERR000004_1.fastq.bz2
      ERR000004_2.fastq
      ERR000004_2.fastq.bz2
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.I.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.II.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.III.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.IV.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.IX.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.Mito.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.V.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.VI.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.VII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.VIII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.X.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XI.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XIII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XIV.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XV 2.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XV.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.chromosome.XVI.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.genome.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna.toplevel.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.I.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.II.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.III.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.IV.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.IX.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.Mito.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.V.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.VI.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.VII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.VIII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.X.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XI.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XIII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XIV.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XV.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.chromosome.XVI.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.genome.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_rm.toplevel.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.I.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.II.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.III.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.IV.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.IX.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.Mito.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.V.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.VI.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.VII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.VIII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.X.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XI.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XIII.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XIV.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XV.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.chromosome.XVI.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.genome.fa
      Saccharomyces_cerevisiae.R64-1-1.24.dna_sm.toplevel.fa
      Saccharomyces_cerevisiae_Ensembl_R64-1-1
      eg21.sam
      genome.1.bt2
      genome.2.bt2
      genome.3.bt2
      genome.4.bt2
      genome.fa
      genome.fa.fai
      genome.rev.1.bt2
      genome.rev.2.bt2
      Zillur-Rahman:saccharomyces ZILLURRAHMAN$ bowtie2 -x genome -1 err000004.fastq -S eg21.sam
      Bowtie 2 version 2.2.3 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
      Usage:
      bowtie2 [options]* -x <bt2-idx> {-1 <m1> -2 <m2> | -U <r>} [-S <sam>]

      <bt2-idx> Index filename prefix (minus trailing .X.bt2).
      NOTE: Bowtie 1 and Bowtie 2 indexes are not compatible.
      <m1> Files with #1 mates, paired with files in <m2>.
      Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
      <m2> Files with #2 mates, paired with files in <m1>.
      Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
      <r> Files with unpaired reads.
      Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
      <sam> File for SAM output (default: stdout)

      <m1>, <m2>, <r> can be comma-separated lists (no whitespace) and can be
      specified many times. E.g. '-U file1.fq,file2.fq -U file3.fq'.

      Options (defaults in parentheses):

      Input:
      -q query input files are FASTQ .fq/.fastq (default)
      --qseq query input files are in Illumina's qseq format
      -f query input files are (multi-)FASTA .fa/.mfa
      -r query input files are raw one-sequence-per-line
      -c <m1>, <m2>, <r> are sequences themselves, not files
      -s/--skip <int> skip the first <int> reads/pairs in the input (none)
      -u/--upto <int> stop after first <int> reads/pairs (no limit)
      -5/--trim5 <int> trim <int> bases from 5'/left end of reads (0)
      -3/--trim3 <int> trim <int> bases from 3'/right end of reads (0)
      --phred33 qualities are Phred+33 (default)
      --phred64 qualities are Phred+64
      --int-quals qualities encoded as space-delimited integers

      Presets: Same as:
      For --end-to-end:
      --very-fast -D 5 -R 1 -N 0 -L 22 -i S,0,2.50
      --fast -D 10 -R 2 -N 0 -L 22 -i S,0,2.50
      --sensitive -D 15 -R 2 -N 0 -L 22 -i S,1,1.15 (default)
      --very-sensitive -D 20 -R 3 -N 0 -L 20 -i S,1,0.50

      For --local:
      --very-fast-local -D 5 -R 1 -N 0 -L 25 -i S,1,2.00
      --fast-local -D 10 -R 2 -N 0 -L 22 -i S,1,1.75
      --sensitive-local -D 15 -R 2 -N 0 -L 20 -i S,1,0.75 (default)
      --very-sensitive-local -D 20 -R 3 -N 0 -L 20 -i S,1,0.50

      Alignment:
      -N <int> max # mismatches in seed alignment; can be 0 or 1 (0)
      -L <int> length of seed substrings; must be >3, <32 (22)
      -i <func> interval between seed substrings w/r/t read len (S,1,1.15)
      --n-ceil <func> func for max # non-A/C/G/Ts permitted in aln (L,0,0.15)
      --dpad <int> include <int> extra ref chars on sides of DP table (15)
      --gbar <int> disallow gaps within <int> nucs of read extremes (4)
      --ignore-quals treat all quality values as 30 on Phred scale (off)
      --nofw do not align forward (original) version of read (off)
      --norc do not align reverse-complement version of read (off)
      --no-1mm-upfront do not allow 1 mismatch alignments before attempting to
      scan for the optimal seeded alignments
      --end-to-end entire read must align; no clipping (on)
      OR
      --local local alignment; ends might be soft clipped (off)

      Scoring:
      --ma <int> match bonus (0 for --end-to-end, 2 for --local)
      --mp <int> max penalty for mismatch; lower qual = lower penalty (6)
      --np <int> penalty for non-A/C/G/Ts in read/ref (1)
      --rdg <int>,<int> read gap open, extend penalties (5,3)
      --rfg <int>,<int> reference gap open, extend penalties (5,3)
      --score-min <func> min acceptable alignment score w/r/t read length
      (G,20,8 for local, L,-0.6,-0.6 for end-to-end)

      Reporting:
      (default) look for multiple alignments, report best, with MAPQ
      OR
      -k <int> report up to <int> alns per read; MAPQ not meaningful
      OR
      -a/--all report all alignments; very slow, MAPQ not meaningful

      Effort:
      -D <int> give up extending after <int> failed extends in a row (15)
      -R <int> for reads w/ repetitive seeds, try <int> sets of seeds (2)

      Paired-end:
      -I/--minins <int> minimum fragment length (0)
      -X/--maxins <int> maximum fragment length (500)
      --fr/--rf/--ff -1, -2 mates align fw/rev, rev/fw, fw/fw (--fr)
      --no-mixed suppress unpaired alignments for paired reads
      --no-discordant suppress discordant alignments for paired reads
      --no-dovetail not concordant when mates extend past each other
      --no-contain not concordant when one mate alignment contains other
      --no-overlap not concordant when mates overlap at all

      Output:
      -t/--time print wall-clock time taken by search phases
      --un <path> write unpaired reads that didn't align to <path>
      --al <path> write unpaired reads that aligned at least once to <path>
      --un-conc <path> write pairs that didn't align concordantly to <path>
      --al-conc <path> write pairs that aligned concordantly at least once to <path>
      (Note: for --un, --al, --un-conc, or --al-conc, add '-gz' to the option name, e.g.
      --un-gz <path>, to gzip compress output, or add '-bz2' to bzip2 compress output.)
      --quiet print nothing to stderr except serious errors
      --met-file <path> send metrics to file at <path> (off)
      --met-stderr send metrics to stderr (off)
      --met <int> report internal counters & metrics every <int> secs (1)
      --no-head supppress header lines, i.e. lines starting with @
      --no-sq supppress @SQ header lines
      --rg-id <text> set read group id, reflected in @RG line and RG:Z: opt field
      --rg <text> add <text> ("lab:value") to @RG line of SAM header.
      Note: @RG line only printed when --rg-id is set.
      --omit-sec-seq put '*' in SEQ and QUAL fields for secondary alignments.

      Performance:
      -p/--threads <int> number of alignment threads to launch (1)
      --reorder force SAM output order to match order of input reads
      --mm use memory-mapped I/O for index; many 'bowtie's can share

      Other:
      --qc-filter filter out reads that are bad according to QSEQ filter
      --seed <int> seed for random number generator (0)
      --non-deterministic seed rand. gen. arbitrarily instead of using read attributes
      --version print version information and quit
      -h/--help print this usage message
      ***
      Error: Must specify at least one read input with -U/-1/-2
      (ERR): bowtie2-align exited with value 1
      Zillur-Rahman:saccharomyces ZILLURRAHMAN$
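The usage line in the output above, `{-1 <m1> -2 <m2> | -U <r>}`, suggests the likely cause: `-1` names the first-mate file of a pair and is only accepted together with `-2`, while a single file of unpaired reads is passed with `-U`. A minimal sketch of that choice follows; the helper name `read_input_args` is hypothetical. Note also that the directory listing shows `ERR000004.fastq` in upper case, which matters on a case-sensitive file system.

```shell
# Print the Bowtie 2 read-input arguments for one (unpaired) or two (paired) FASTQ files.
read_input_args() {
    if [ "$#" -eq 1 ]; then
        printf -- '-U %s\n' "$1"
    elif [ "$#" -eq 2 ]; then
        printf -- '-1 %s -2 %s\n' "$1" "$2"
    else
        echo "usage: read_input_args <reads.fq> [<mates2.fq>]" >&2
        return 1
    fi
}

read_input_args ERR000004.fastq
read_input_args ERR000004_1.fastq ERR000004_2.fastq
```

For the single-file case this would give a command along the lines of `bowtie2 -x genome -U ERR000004.fastq -S eg21.sam`.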



      • Sorry for the inconvenience. I think it is working now. Thank you very much.

        Best Regards
        Zillur

        Zillur-Rahman:saccharomyces ZILLURRAHMAN$ samtools mpileup -uf genome.fa eg21.sorted.bam | bcftools view -o - > eg21.raw.bcf
        [fai_load] fail to open FASTA file.
        [vcf.c:1224 vcf_hdr_read] Could not read the header
        Failed to open or the file not indexed: -
        Zillur-Rahman:saccharomyces ZILLURRAHMAN$ samtools mpileup -uf genome2.fa eg21.sorted.bam | bcftools view -o - > eg21.raw.bcf
        [fai_load] build FASTA index.
        [fai_build] fail to open the FASTA file genome2.fa
        [fai_load] fail to open FASTA index.
        [vcf.c:1224 vcf_hdr_read] Could not read the header
        Failed to open or the file not indexed: -
        Zillur-Rahman:saccharomyces ZILLURRAHMAN$ samtools mpileup -uf genome2.fa eg21.sorted.bam | bcftools view -o - > eg21.raw.bcf
        [fai_load] build FASTA index.
        [mpileup] 1 samples in 1 input files
        <mpileup> Set max per-file depth to 8000
        Zillur-Rahman:saccharomyces ZILLURRAHMAN$ bcftools stats eg21.raw.bcf
        # This file was produced by bcftools stats (1.1+htslib-1.1) and can be plotted using plot-vcfstats.
        # The command line was: bcftools stats eg21.raw.bcf
        #
        # Definition of sets:
        # ID [2]id [3]tab-separated file names
        ID 0 eg21.raw.bcf
        # SN, Summary numbers:
        # SN [2]id [3]key [4]value
        SN 0 number of samples: 1
        SN 0 number of records: 12116097
        SN 0 number of SNPs: 160017
        SN 0 number of MNPs: 0
        SN 0 number of indels: 2273
        SN 0 number of others: 0
        SN 0 number of multiallelic sites: 160125
        SN 0 number of multiallelic SNP sites: 160017
        # TSTV, transitions/transversions:
        # TSTV [2]id [3]ts [4]tv [5]ts/tv [6]ts (1st ALT) [7]tv (1st ALT) [8]ts/tv (1st ALT)
        TSTV 0 67484 97365 0.69 65833 94184 0.70
        # Sis, Singleton stats:
        # SiS [2]id [3]allele count [4]number of SNPs [5]number of transitions [6]number of transversions [7]number of indels [8]repeat-consistent [9]repeat-inconsistent [10]not applicable
        SiS 0 1 164849 67484 97365 2388 0 0 2388
        # AF, Stats by non-reference allele frequency:
        # AF [2]id [3]allele frequency [4]number of SNPs [5]number of transitions [6]number of transversions [7]number of indels [8]repeat-consistent [9]repeat-inconsistent [10]not applicable
        AF 0 0.000000 164849 67484 97365 2388 0 0 2388
        # QUAL, Stats by quality:
        # QUAL [2]id [3]Quality [4]number of SNPs [5]number of transitions (1st ALT) [6]number of transversions (1st ALT) [7]number of indels
        QUAL 0 0 160017 65833 94184 2273
        # IDD, InDel distribution:
        # IDD [2]id [3]length (deletions negative) [4]count
        IDD 0 -4 3
        IDD 0 -3 30
        IDD 0 -2 130
        IDD 0 -1 1263
        IDD 0 1 733
        IDD 0 2 118
        IDD 0 3 83
        IDD 0 4 28
        # ST, Substitution types:
        # ST [2]id [3]type [4]count
        ST 0 A>C 15925
        ST 0 A>G 22401
        ST 0 A>T 19385
        ST 0 C>A 7557
        ST 0 C>G 5817
        ST 0 C>T 11438
        ST 0 G>A 11423
        ST 0 G>C 5621
        ST 0 G>T 7058
        ST 0 T>A 19939
        ST 0 T>C 22222
        ST 0 T>G 16063
        Zillur-Rahman:saccharomyces ZILLURRAHMAN$
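A cautious note on the statistics above: the file holds 12116097 records, which looks like roughly one record per reference position, and every SNP is also counted as multiallelic. That pattern would be consistent with a BCF that still contains raw mpileup likelihoods rather than called variants. In the 1.x releases of samtools/bcftools, calling is done by `bcftools call`, whereas `bcftools view` only converts and filters. The sketch below prints one possible pipeline; the helper name `variant_call_pipeline` and the output name `eg21.calls.bcf` are assumptions, and the file names are not quoted, so paths with spaces would need extra care.

```shell
# Print a samtools/bcftools 1.x variant-calling pipeline for one sorted BAM.
# Flags used: -m multiallelic caller, -v variant sites only, -Ob compressed BCF output.
variant_call_pipeline() {
    local ref="$1" bam="$2" out="$3"
    printf 'samtools mpileup -ugf %s %s | bcftools call -mv -Ob -o %s\n' "$ref" "$bam" "$out"
}

variant_call_pipeline genome.fa eg21.sorted.bam eg21.calls.bcf
```

The printed line can be reviewed and then run by hand; `bcftools stats` on the result should then report only the called variant sites.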



        • Hi,
          Thank you very much for your kind help all along. My mentor told me to build the index on my own. Can you give me some suggestions on how to build an index for yeast?

          Best Regards
          Zillur



          • Many people on this forum can tell you how to create indexes, but then you would not be learning the basics (as your mentor must intend).

            Search around this forum and the bowtie website, and then ask specific questions if you hit a problem you can't solve.



            • Thank you very much for your kind suggestions. I am trying, using this line:

              Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-build -f chromosome.1.fa,chromosome.2.fa,chromosome.3.fa,chromosome.4.fa,chromosome.5.fa,chromosome.6.fa,chromosome.7.fa,chromosome.8.fa,chromosome.9.fa,chromosome.10.fa,chromosome.11.fa,chromosome.12.fa,chromosome.13.fa,chromosome.14.fa,chromosome.15.fa,chromosome.16.fa index

              This gave me these files:

              Zillur-Rahman:index_err000004 ZILLURRAHMAN$ ls
              chromosome.1.fa chromosome.15.fa chromosome.6.fa index.2.bt2
              chromosome.10.fa chromosome.16.fa chromosome.7.fa index.3.bt2
              chromosome.11.fa chromosome.2.fa chromosome.8.fa index.4.bt2
              chromosome.12.fa chromosome.3.fa chromosome.9.fa index.rev.1.bt2
              chromosome.13.fa chromosome.4.fa index.rev.2.bt2
              chromosome.14.fa chromosome.5.fa index.1.bt2

              But when I try to check it:

              Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a index
              No index name given!
              Bowtie 2 version 2.2.3 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
              Usage: bowtie2-inspect [options]* <bt2_base>
              <bt2_base> bt2 filename minus trailing .1.bt2/.2.bt2

              By default, prints FASTA records of the indexed nucleotide sequences to
              standard out. With -n, just prints names. With -s, just prints a summary of
              the index parameters and sequences. With -e, preserves colors if applicable.

              Options:
              --large-index force inspection of the 'large' index, even if a
              'small' one is present.
              -a/--across <int> Number of characters across in FASTA output (default: 60)
              -n/--names Print reference sequence names only
              -s/--summary Print summary incl. ref names, lengths, index properties
              -e/--bt2-ref Reconstruct reference from .bt2 (slow, preserves colors)
              -v/--verbose Verbose output (for debugging)
              -h/--help print detailed description of tool and its options
              --help print this usage message
              Zillur-Rahman:index_err000004 ZILLURRAHMAN$

              Would you please give me some hints about what I am missing?

              Best Regards
              Zillur



              • Single Genome sequence file = All chromosomes together.

                What do you think you need to do in this case?
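One way to read this hint is that the per-chromosome FASTA files can be concatenated into a single multi-FASTA file, which then becomes the only input to `bowtie2-build`. A minimal sketch using a temporary directory and two made-up stand-in records (the real chromosome files are far larger):

```shell
# Concatenate per-chromosome FASTA files into one multi-FASTA genome file.
workdir="$(mktemp -d)"

# Made-up stand-ins for chromosome.1.fa, chromosome.2.fa, ...
printf '>I\nACGTACGT\n'  > "$workdir/chromosome.1.fa"
printf '>II\nTTGGCCAA\n' > "$workdir/chromosome.2.fa"

# The glob expands in lexical order (1, 10, 11, ... 2), which only affects record order.
cat "$workdir"/chromosome.*.fa > "$workdir/genome.fa"

# Count header lines: there should be one per chromosome.
grep -c '^>' "$workdir/genome.fa"
```

The merged file would then be indexed on its own, for example with `bowtie2-build genome.fa genome`.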



                • Thank you very much. I have a file genome.fa, but.....

                  Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-build -f chromosome.1.fa,chromosome.2.fa,chromosome.3.fa,chromosome.4.fa,chromosome.5.fa,chromosome.6.fa,chromosome.7.fa,chromosome.8.fa,chromosome.9.fa,chromosome.10.fa,chromosome.11.fa,chromosome.12.fa,chromosome.13.fa,chromosome.14.fa,chromosome.15.fa,chromosome.16.fa,genome.fa genome

                  Zillur-Rahman:index_err000004 ZILLURRAHMAN$ ls
                  chromosome.1.fa chromosome.2.fa genome.1.bt2 index.2.bt2
                  chromosome.10.fa chromosome.3.fa genome.2.bt2 index.3.bt2
                  chromosome.11.fa chromosome.4.fa genome.3.bt2 index.4.bt2
                  chromosome.12.fa chromosome.5.fa genome.4.bt2 index.rev.1.bt2
                  chromosome.13.fa chromosome.6.fa genome.fa index.rev.2.bt2
                  chromosome.14.fa chromosome.7.fa genome.rev.1.bt2
                  chromosome.15.fa chromosome.8.fa genome.rev.2.bt2
                  chromosome.16.fa chromosome.9.fa index.1.bt2
                  Zillur-Rahman:index_err000004 ZILLURRAHMAN$ rm index.1.bt2 index.2.bt2 index.3.bt2 index.4.bt2 index.rev.1.bt2 index.rev.2.bt2
                  Zillur-Rahman:index_err000004 ZILLURRAHMAN$ ls
                  chromosome.1.fa chromosome.15.fa chromosome.6.fa genome.3.bt2
                  chromosome.10.fa chromosome.16.fa chromosome.7.fa genome.4.bt2
                  chromosome.11.fa chromosome.2.fa chromosome.8.fa genome.fa
                  chromosome.12.fa chromosome.3.fa chromosome.9.fa genome.rev.1.bt2
                  chromosome.13.fa chromosome.4.fa genome.1.bt2 genome.rev.2.bt2
                  chromosome.14.fa chromosome.5.fa genome.2.bt2
                  Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a genome
                  No index name given!
                  Bowtie 2 version 2.2.3 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
                  Usage: bowtie2-inspect [options]* <bt2_base>
                  <bt2_base> bt2 filename minus trailing .1.bt2/.2.bt2

                  By default, prints FASTA records of the indexed nucleotide sequences to
                  standard out. With -n, just prints names. With -s, just prints a summary of
                  the index parameters and sequences. With -e, preserves colors if applicable.

                  Options:
                  --large-index force inspection of the 'large' index, even if a
                  'small' one is present.
                  -a/--across <int> Number of characters across in FASTA output (default: 60)
                  -n/--names Print reference sequence names only
                  -s/--summary Print summary incl. ref names, lengths, index properties
                  -e/--bt2-ref Reconstruct reference from .bt2 (slow, preserves colors)
                  -v/--verbose Verbose output (for debugging)
                  -h/--help print detailed description of tool and its options
                  --help print this usage message
                  Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a genome



                  • Originally posted by zillur:
                    Thank you very much. I have a file genome.fa, but.....

                    If genome.fa has all chromosomes, then why are you using individual chromosome files to build the index?
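If genome.fa already contains every chromosome, passing it together with the individual chromosome files would presumably put each sequence into the index twice. A quick way to check for that before building is to look for repeated sequence names across the input files. A minimal sketch; the function name `duplicate_fasta_names` and the tiny sample files are made up for illustration:

```shell
# List FASTA sequence names that occur more than once across the given files.
# The name is taken as the header text up to the first whitespace.
duplicate_fasta_names() {
    grep -h '^>' "$@" | sed 's/^>//; s/[[:space:]].*$//' | sort | uniq -d
}

tmp="$(mktemp -d)"
printf '>I\nACGT\n'            > "$tmp/chromosome.1.fa"
printf '>II\nTTGG\n'           > "$tmp/chromosome.2.fa"
printf '>I\nACGT\n>II\nTTGG\n' > "$tmp/genome.fa"

# Both names are reported because genome.fa repeats the chromosome files.
duplicate_fasta_names "$tmp"/chromosome.*.fa "$tmp/genome.fa"
```

An empty result would mean every sequence name is unique across the inputs.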



                    • Thank you very much. I used only genome.fa, but I still get "No index name given!" How can I give an index name?

                      Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-build -f genome.fa index



                      Zillur-Rahman:index_err000004 ZILLURRAHMAN$ ls
                      chromosome.1.fa chromosome.2.fa genome.1.bt2 index.2.bt2
                      chromosome.10.fa chromosome.3.fa genome.2.bt2 index.3.bt2
                      chromosome.11.fa chromosome.4.fa genome.3.bt2 index.4.bt2
                      chromosome.12.fa chromosome.5.fa genome.4.bt2 index.rev.1.bt2
                      chromosome.13.fa chromosome.6.fa genome.fa index.rev.2.bt2
                      chromosome.14.fa chromosome.7.fa genome.rev.1.bt2
                      chromosome.15.fa chromosome.8.fa genome.rev.2.bt2
                      chromosome.16.fa chromosome.9.fa index.1.bt2
                      Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a genome
                      No index name given!
                      Bowtie 2 version 2.2.3 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
                      Usage: bowtie2-inspect [options]* <bt2_base>
                      <bt2_base> bt2 filename minus trailing .1.bt2/.2.bt2

                      By default, prints FASTA records of the indexed nucleotide sequences to
                      standard out. With -n, just prints names. With -s, just prints a summary of
                      the index parameters and sequences. With -e, preserves colors if applicable.

                      Options:
                      --large-index force inspection of the 'large' index, even if a
                      'small' one is present.
                      -a/--across <int> Number of characters across in FASTA output (default: 60)
                      -n/--names Print reference sequence names only
                      -s/--summary Print summary incl. ref names, lengths, index properties
                      -e/--bt2-ref Reconstruct reference from .bt2 (slow, preserves colors)
                      -v/--verbose Verbose output (for debugging)
                      -h/--help print detailed description of tool and its options
                      --help print this usage message
                      Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a index
                      No index name given!
                      Bowtie 2 version 2.2.3 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
                      Usage: bowtie2-inspect [options]* <bt2_base>
                      <bt2_base> bt2 filename minus trailing .1.bt2/.2.bt2

                      By default, prints FASTA records of the indexed nucleotide sequences to
                      standard out. With -n, just prints names. With -s, just prints a summary of
                      the index parameters and sequences. With -e, preserves colors if applicable.

                      Options:
                      --large-index force inspection of the 'large' index, even if a
                      'small' one is present.
                      -a/--across <int> Number of characters across in FASTA output (default: 60)
                      -n/--names Print reference sequence names only
                      -s/--summary Print summary incl. ref names, lengths, index properties
                      -e/--bt2-ref Reconstruct reference from .bt2 (slow, preserves colors)
                      -v/--verbose Verbose output (for debugging)
                      -h/--help print detailed description of tool and its options
                      --help print this usage message
                      Zillur-Rahman:index_err000004 ZILLURRAHMAN$



                      • You need to provide a name that will be used as the prefix for all index-related files. This can be anything, but you would want to use something (e.g. MyYeast) that will still make sense afterwards.
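To make the prefix idea concrete: the directory listings earlier in the thread show that each index consists of six files named after its prefix (`genome.1.bt2` through `genome.rev.2.bt2`). So a build such as `bowtie2-build genome.fa MyYeast` would be expected to write the files listed by the sketch below. The helper name `index_files` is hypothetical.

```shell
# List the six files that make up a small Bowtie 2 index for a given prefix.
index_files() {
    local prefix="$1" part
    for part in 1 2 3 4 rev.1 rev.2; do
        printf '%s.%s.bt2\n' "$prefix" "$part"
    done
}

index_files MyYeast
```

The same prefix, without any `.bt2` suffix, is what `bowtie2 -x` and `bowtie2-inspect` take as the index name.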



                        • Thank you very much. I think it's working now.

                          Zillur-Rahman:index_err000004 ZILLURRAHMAN$ bowtie2-inspect -a index genome

                          It gave a single file

                          ATTAGTGTATTGGATTCGACAAGAGGCAAGCAAGGGAGCCAAGTTTTCCGCATGTCTGGAAGGCAGATCAAAGAGTTGTATTATAAAGTATGGAGCAACTTGCGTGAATCGAAGACAGAGGTGCTGCAGTACTTTTTGAACTGGGACGAGAAAAAGTGCCGGGAAGAATGGGAGGCAAAAGACGATACGGTCTTTGTGGAAGCGCTCGAGAAAGTTGGAGTTTTTCAGCGTTTGCGTTCCATGACGAGCGCTGGACTGCAGGGTCCGCAGTACGTCAAGCTGCAGTTTAGCAGGCATCATCGACAGTTGAGGAGCAGATATGAATTAAGTCTAGGAATGCACTTGCGAGATCAGCTTGCGCTGGGAGTTACCCCATCTAAAGTGCCGCATTGGACGGCATTCCTGTCGATGCTGATAGGGCTGTTCTACAATAAAACATTTCGGCAGAAACTGGAATATCTTTTGGAGCAGATTTCGGAGGTGTGGTTGTTACCACATTGGCTTGATTTGGCAAACGTTGAAGTTCTCGCTGCAGATAACACGAGGGTACCGCTGTACATGCTGATGGTAGCGGTTCACAAAGAGCTGGATAGCGATGATGTTCCAGACGGTAGATTTGATATAATATTACTATGTAGAGATTCGAGCAGAGAAGTTGGAGAGTGAAGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTAAGAAANNNNNNNNNCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGG

                          ....and many more.......
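A possible explanation for the earlier "No index name given!" messages: the usage text lists `-a/--across <int>`, so `-a` expects a value. In `bowtie2-inspect -a index` the word `index` would then be consumed as that value, leaving no index name, while in `bowtie2-inspect -a index genome` the name `genome` is left over and the command runs. The toy parser below only imitates that behaviour with bash `getopts`; it is not bowtie2-inspect's real code, and `parse_inspect_args` is a made-up name.

```shell
# Toy option parser (NOT bowtie2-inspect's real code) in which -a requires a value.
parse_inspect_args() {
    local OPTIND=1 opt across=60
    while getopts "a:ns" opt; do
        case "$opt" in
            a) across="$OPTARG" ;;   # -a swallows the next word as its value
            n|s) ;;                  # flags that take no value
            *) return 2 ;;
        esac
    done
    shift $((OPTIND - 1))
    if [ "$#" -eq 0 ]; then
        echo "across=$across index=<none>"
    else
        echo "across=$across index=$1"
    fi
}

parse_inspect_args -a index          # prints: across=index index=<none>
parse_inspect_args -a index genome   # prints: across=index index=genome
parse_inspect_args -n genome         # prints: across=60 index=genome
```

If that reading is right, `bowtie2-inspect -n genome` (names only) or `bowtie2-inspect -s genome` (summary) would be a simpler way to check the index than `-a`.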



                          • Hi,
                            Thank you very much for your kind help. I need to try mapping paired reads to the genome instead of single reads. Can you give me some hints about where I can download yeast FASTQ files for paired-read mapping? Currently I am using http://www.ebi.ac.uk/ena/data/view/ERP000001
                            But there are only single reads.

                            Best Regards
                            Zillur



                            • Originally posted by zillur:
                              Hi,
                              Thank you very much for your kind help. I need to try mapping paired reads to the genome instead of single reads. Can you give me some hints about where I can download yeast FASTQ files for paired-read mapping? Currently I am using http://www.ebi.ac.uk/ena/data/view/ERP000001
                              But there are only single reads.

                              Best Regards
                              Zillur


                              You should post these kinds of questions in a new thread. This is no longer related to the parent thread you are posting in.



                              • Hi,
                                Sorry to disturb you again. I am facing a problem again. Would you please give me some hints about what I am doing wrong? I have downloaded data from http://www.ebi.ac.uk/ena/data/search...genome+pair+en
                                Maybe there is a problem in creating the eg.raw.bcf file, but I have done this many times with the same protocol.

                                Best Regards
                                Zillur

                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ bowtie2 -x ../genome -1 err029139_1.fastq -2 err029139_2.fastq -S eg.sam
                                30723562 reads; of these:
                                30723562 (100.00%) were paired; of these:
                                30723562 (100.00%) aligned concordantly 0 times
                                0 (0.00%) aligned concordantly exactly 1 time
                                0 (0.00%) aligned concordantly >1 times
                                ----
                                30723562 pairs aligned concordantly 0 times; of these:
                                0 (0.00%) aligned discordantly 1 time
                                ----
                                30723562 pairs aligned 0 times concordantly or discordantly; of these:
                                61447124 mates make up the pairs; of these:
                                44425291 (72.30%) aligned 0 times
                                531380 (0.86%) aligned exactly 1 time
                                16490453 (26.84%) aligned >1 times
                                27.70% overall alignment rate
                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools view -bS eg.sam > eg.bam
                                [W::sam_hdr_parse] duplicated sequence 'I'
                                [W::sam_hdr_parse] duplicated sequence 'VI'
                                [W::sam_hdr_parse] duplicated sequence 'III'
                                [W::sam_hdr_parse] duplicated sequence 'IX'
                                [W::sam_hdr_parse] duplicated sequence 'VIII'
                                [W::sam_hdr_parse] duplicated sequence 'V'
                                [W::sam_hdr_parse] duplicated sequence 'XI'
                                [W::sam_hdr_parse] duplicated sequence 'X'
                                [W::sam_hdr_parse] duplicated sequence 'XIV'
                                [W::sam_hdr_parse] duplicated sequence 'II'
                                [W::sam_hdr_parse] duplicated sequence 'XIII'
                                [W::sam_hdr_parse] duplicated sequence 'XVI'
                                [W::sam_hdr_parse] duplicated sequence 'XII'
                                [W::sam_hdr_parse] duplicated sequence 'VII'
                                [W::sam_hdr_parse] duplicated sequence 'XV'
                                [W::sam_hdr_parse] duplicated sequence 'IV'
                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools sort eg.bam eg.sorted
                                [bam_sort_core] merging from 26 files...
                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools mpileup -uf../genome eg.sorted.bam | bcftools view -o - > eg.raw.bcf
                                [fai_load] build FASTA index.
                                [fai_build] fail to open the FASTA file ../genome
                                [fai_load] fail to open FASTA index.
                                [vcf.c:1224 vcf_hdr_read] Could not read the header
                                Failed to open or the file not indexed: -
                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools mpileup -uf../genome.fa eg.sorted.bam | bcftools view -o - > eg.raw.bcf
                                [mpileup] 1 samples in 1 input files
                                <mpileup> Set max per-file depth to 8000
                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ bcftools stats eg.raw.bcf
                                # This file was produced by bcftools stats (1.1+htslib-1.1) and can be plotted using plot-vcfstats.
                                # The command line was: bcftools stats eg.raw.bcf
                                #
                                # Definition of sets:
                                # ID [2]id [3]tab-separated file names
                                ID 0 eg.raw.bcf
                                # SN, Summary numbers:
                                # SN [2]id [3]key [4]value
                                SN 0 number of samples: 1
                                SN 0 number of records: 0
                                SN 0 number of SNPs: 0
                                SN 0 number of MNPs: 0
                                SN 0 number of indels: 0
                                SN 0 number of others: 0
                                SN 0 number of multiallelic sites: 0
                                SN 0 number of multiallelic SNP sites: 0
                                # TSTV, transitions/transversions:
                                # TSTV [2]id [3]ts [4]tv [5]ts/tv [6]ts (1st ALT) [7]tv (1st ALT) [8]ts/tv (1st ALT)
                                TSTV 0 0 0 0.00 0 0 0.00
                                # Sis, Singleton stats:
                                # SiS [2]id [3]allele count [4]number of SNPs [5]number of transitions [6]number of transversions [7]number of indels [8]repeat-consistent [9]repeat-inconsistent [10]not applicable
                                SiS 0 1 0 0 0 0 0 0 0
                                # AF, Stats by non-reference allele frequency:
                                # AF [2]id [3]allele frequency [4]number of SNPs [5]number of transitions [6]number of transversions [7]number of indels [8]repeat-consistent [9]repeat-inconsistent [10]not applicable
                                # QUAL, Stats by quality:
                                # QUAL [2]id [3]Quality [4]number of SNPs [5]number of transitions (1st ALT) [6]number of transversions (1st ALT) [7]number of indels
                                # IDD, InDel distribution:
                                # IDD [2]id [3]length (deletions negative) [4]count
                                # ST, Substitution types:
                                # ST [2]id [3]type [4]count
                                ST 0 A>C 0
                                ST 0 A>G 0
                                ST 0 A>T 0
                                ST 0 C>A 0
                                ST 0 C>G 0
                                ST 0 C>T 0
                                ST 0 G>A 0
                                ST 0 G>C 0
                                ST 0 G>T 0
                                ST 0 T>A 0
                                ST 0 T>C 0
                                ST 0 T>G 0
                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools mpileup -uf ../genome.fa eg.sorted.bam | bcftools view -o - > eg.raw.bcf
                                [mpileup] 1 samples in 1 input files
                                <mpileup> Set max per-file depth to 8000
                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools mpileup -uf genome.fa eg.sorted.bam | bcftools view -o u -v -c > eg.raw.bcf
                                [fai_load] build FASTA index.
                                [fai_build] fail to open the FASTA file genome.fa
                                [fai_load] fail to open FASTA index.
                                [vcf.c:1224 vcf_hdr_read] Could not read the header
                                Failed to open or the file not indexed: -
                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ samtools mpileup -uf ../genome.fa eg.sorted.bam | bcftools view -o u -v -c > eg.raw.bcf
                                [mpileup] 1 samples in 1 input files
                                <mpileup> Set max per-file depth to 8000
                                [E::-c] unknown type
                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ ls
                                ERR029139_1.fastq ERR029139_2.fastq eg.bam eg.sam
                                ERR029139_1.fastq.gz ERR029139_2.fastq.gz eg.raw.bcf eg.sorted.bam
                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$ bcftools stats eg.raw.bcf
                                [vcf.c:1224 vcf_hdr_read] Could not read the header
                                Could not read the file or the file is not indexed: eg.raw.bcf
                                Zillur-Rahman:1:26:2015 ZILLURRAHMAN$
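Two things in the log above can be checked quickly. The repeated `duplicated sequence` warnings suggest that some chromosome names appear more than once in the reference used for the alignment. The `[E::-c] unknown type` error is consistent with bcftools 1.x, where `-v` on `bcftools view` expects a list of variant types and variant calling is done by `bcftools call`. The sketch below is only an illustration under those assumptions: `dup_fasta_names` and `call_variants` are hypothetical helper names, and it assumes samtools and bcftools 1.x are on the PATH.

```shell
# Hypothetical helper: print sequence names that occur more than once in a FASTA file.
# The name is the first whitespace-delimited token on a '>' line, without the '>'.
dup_fasta_names() {
    awk '/^>/ { print substr($1, 2) }' "$1" | sort | uniq -d
}

# Hypothetical helper: call variants the bcftools 1.x way (assumes samtools and
# bcftools 1.x are installed, as the "1.1+htslib-1.1" line in the log suggests).
# Arguments: reference FASTA, sorted BAM, output BCF.
call_variants() {
    samtools mpileup -uf "$1" "$2" | bcftools call -mv -Ob -o "$3"
}
```

Assumed usage with the files from the log: `dup_fasta_names ../genome.fa` prints nothing if every sequence name is unique, and `call_variants ../genome.fa eg.sorted.bam eg.raw.bcf` would stand in for the `bcftools view` step before running `bcftools stats eg.raw.bcf`.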

