I am using Bowtie/TopHat (allowing up to 10 multimaps) to map my RNA-seq data.
I have run cuffdiff, and I am now trying other methods such as DESeq and edgeR.
These methods need count data.
Therefore, I tried HTSeq. Before running HTSeq, I sorted my BAM file by query name using Picard tools SortSam.jar.
The problem I am running into is that, after sorting, some reads in the BAM file have no names, and I believe this is why HTSeq gives me the following error:
Error occured when processing SAM input (record #109 in file /Users/mparida/Oiko_Oceanic_Acid_stuff/dnacore454.healthcare.uiowa.edu/Project_Manak_code_RNAseq/Sample_1_code15801B2R/1_code15801B2R_ATCACG_L008_SORTED_QUERY_NAME.bam):
'pair_alignments' needs a sequence of paired-end alignments
[Exception type: ValueError, raised in __init__.py:603]
I am not sure what these alignments are. When I look at the alignment, it looks like the following:
89 scaffold_10 477964 50 88M * 0 0 AGACTTTTAACGGACTTTAACTCGAAATGGCCCGCAACAGCACCGTCAATGNCGTCGCCAAGTACTCTCGANCGAAGATGTATTCTCG CDCDDDDDDDDDDDDDECDCDDCEEDDBDDDDDDDDDDDDDDFFFHHHA;-#JJJIIIIIGHD?IHHFF?1#JJJJJJIIGJJJJHHH AS:i:-2 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:51G19T16 YT:Z:UU NH:i:1
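For reference, here is a minimal Python sketch for decoding the first column of that record. It assumes that the leading 89 is the SAM FLAG field (i.e. that the read name column is simply empty in the output above); the function name `decode_sam_flag` and the label strings are my own, and the bit meanings follow the SAM format specification.

```python
# Bit values of the SAM FLAG field, as defined in the SAM format specification.
# The label strings are illustrative names, not part of any library.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "read_unmapped",
    0x8: "mate_unmapped",
    0x10: "read_reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary_alignment",
    0x200: "fails_qc",
    0x400: "duplicate",
    0x800: "supplementary_alignment",
}


def decode_sam_flag(flag):
    """Return the labels of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# Assumption: 89 is the FLAG of the record shown above.
print(decode_sam_flag(89))
# ['paired', 'mate_unmapped', 'read_reverse_strand', 'first_in_pair']
```

If that assumption holds, the record is the first read of a pair whose mate did not align, which may be relevant to the 'pair_alignments' error.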
Does anyone have any insight into what these alignments might be and why they have no names in my alignment file?
I can't even find this read in my raw data file.
Rocky