Hi,
I am trying to analyze some of the CSHL long RNAseq datasets, but I ran into some problems.
1. When I try to use cufflinks on the BAM files, I encounter errors (the output is quoted further down in this post).
2. When I convert the BAM into SAM, I get errors about missing XS tags, which, as far as I understand, is a consequence of the data having been mapped with STAR.
3. I then proceeded to map the data myself with TopHat2/bowtie2, but I get this error:
[2013-12-03 18:27:15]
Beginning TopHat run (v2.0.7)
-----------------------------------------------
[2013-12-03 18:27:15] Checking for Bowtie
Bowtie version: 2.0.6.0
[2013-12-03 18:27:15] Checking for Samtools
Samtools version: 0.1.19.0
[2013-12-03 18:27:16] Checking for Bowtie index files
[2013-12-03 18:27:16] Checking for reference FASTA file
[2013-12-03 18:27:16] Generating SAM header for /mnt/genomeDB/genomeIndices/hg19/bowtie2_index/nucleotide/hg19
format: fastq
quality scale: phred33 (default)
[2013-12-03 18:27:45] Preparing reads
left reads: min. length=76, max. length=76, 97206949 kept reads (16049 discarded)
right reads: min. length=76, max. length=76, 97058564 kept reads (164434 discarded)
[2013-12-03 21:07:23] Mapping left_kept_reads to genome hg19 with Bowtie2
/mnt/software/stow/tophat-2.0.7/bin/bam2fastx: /lib64/libz.so.1: no version information available (required by /mnt/software/stow/tophat-2.0.7/bin/bam2fastx)
/mnt/software/stow/tophat-2.0.7/bin/fix_map_ordering: /lib64/libz.so.1: no version information available (required by /mnt/software/stow/tophat-2.0.7/bin/fix_map_ordering)
/mnt/software/stow/tophat-2.0.7/bin/bam2fastx: /lib64/libz.so.1: no version information available (required by /mnt/software/stow/tophat-2.0.7/bin/bam2fastx)
[2013-12-04 02:21:50] Mapping left_kept_reads_seg1 to genome hg19 with Bowtie2 (1/3)
/mnt/software/stow/tophat-2.0.7/bin/fix_map_ordering: /lib64/libz.so.1: no version information available (required by /mnt/software/stow/tophat-2.0.7/bin/fix_map_ordering)
[2013-12-04 09:22:40] Mapping left_kept_reads_seg2 to genome hg19 with Bowtie2 (2/3)
/mnt/software/stow/tophat-2.0.7/bin/fix_map_ordering: /lib64/libz.so.1: no version information available (required by /mnt/software/stow/tophat-2.0.7/bin/fix_map_ordering)
Parse error at line 2849595: sequence and quality are inconsistent
gzip: stdout: Broken pipe
[FAILED]
Error running bowtie:
As I don't have much experience with this kind of data, I am now rather stuck. Is there a way to convert the already mapped BAM files into a cufflinks-compatible format? And what could be the reason for the error I encounter during mapping with bowtie2?
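To narrow down the "sequence and quality are inconsistent" error, one option would be to check whether the input FASTQ files themselves contain records whose sequence and quality strings differ in length. The following is only a minimal sketch under assumptions: it expects strict four-line FASTQ records, the function names are made up for illustration, and the line number reported by TopHat may refer to an intermediate file rather than to the original reads.

```python
import gzip
import sys
from itertools import islice


def open_maybe_gzip(path):
    """Open a FASTQ file as text, handling .gz files transparently."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_inconsistent_records(lines, max_reports=10):
    """Return (first_line_number, header, reason) for each malformed record.

    Assumes strict 4-line FASTQ records (header, sequence, '+', quality).
    Stops after max_reports problems or at the first truncated record.
    """
    problems = []
    it = iter(lines)
    line_no = 1  # 1-based line number of the current record's header
    while len(problems) < max_reports:
        chunk = [line.rstrip("\r\n") for line in islice(it, 4)]
        if not chunk:
            break  # clean end of input
        header = chunk[0]
        if len(chunk) < 4:
            problems.append((line_no, header, "truncated record"))
            break
        _, seq, plus, qual = chunk
        if not header.startswith("@"):
            problems.append((line_no, header, "header does not start with @"))
        elif not plus.startswith("+"):
            problems.append((line_no, header, "separator does not start with +"))
        elif len(seq) != len(qual):
            reason = f"sequence length {len(seq)} != quality length {len(qual)}"
            problems.append((line_no, header, reason))
        line_no += 4
    return problems


# Only runs when a FASTQ path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open_maybe_gzip(sys.argv[1]) as handle:
        for line_no, header, reason in find_inconsistent_records(handle):
            print(f"line {line_no}: {header}: {reason}")
```

If this reports nothing for both the left and the right read files, the problem is more likely in an intermediate file produced during the run (for example after an interrupted write or a full disk) than in the original reads.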
Any help is greatly appreciated
For reference, these are the cufflinks errors mentioned in point 1:
Warning: BAM header has 0 length or is corrupted. Try using 'samtools reheader'.
File /mnt/genomeDB/ucsc/goldenPath/hg19/encodeDCC/wgEncodeCshlLongRnaSeq/wgEncodeCshlLongRnaSeqH1hescCytosolPapAlnRep2.bam doesn't appear to be a valid BAM file, trying SAM...
[11:08:25] Loading reference annotation.
[11:09:13] Inspecting reads and determining fragment length distribution.
SAM error on line 24909: CIGAR op has zero length
SAM error on line 26579: CIGAR op has zero length
SAM error on line 40345: CIGAR op has zero length
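To get a feeling for how widespread these two problems are in the converted SAM file, one could count the alignments with a zero-length CIGAR operation and the spliced alignments (CIGAR containing N) that lack an XS:A tag. The following is only a diagnostic sketch: the function name and the statistics keys are made up for illustration, and it does not repair anything.

```python
import re
import sys

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def scan_sam(lines):
    """Count zero-length CIGAR ops and spliced alignments without an XS tag."""
    stats = {
        "records": 0,
        "zero_length_cigar": 0,
        "spliced": 0,
        "spliced_without_xs": 0,
    }
    for line in lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        stats["records"] += 1
        ops = CIGAR_OP.findall(fields[5])
        if any(int(length) == 0 for length, _ in ops):
            stats["zero_length_cigar"] += 1
        if any(op == "N" for _, op in ops):
            stats["spliced"] += 1
            # Optional tags start at the 12th column.
            if not any(tag.startswith("XS:A:") for tag in fields[11:]):
                stats["spliced_without_xs"] += 1
    return stats


# Only runs when a SAM path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for key, value in scan_sam(handle).items():
            print(f"{key}\t{value}")
```

The SAM input could come from `samtools view -h in.bam > out.sam` (file names are placeholders). If only a handful of records have zero-length operations, filtering them out may be enough to see whether cufflinks gets any further.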