  • getting dexseq_count.py to work

    Hi,

    Can you help me with this? I have paired-end reads in SAM format, and I have followed all the steps to create the sorted SAM file according to the pasilla tutorial. I have successfully created the maize GTF file using the first Python script. This is what my SAM file looks like:

    Code:
    GALZUI2_0001:8:100:0:142#0/1	4	*	0	0	*	*	0	0	NCCTGGTGGAGACCGGAGGAGCCTCGGCAGAGATCG	#0--011+++858::7=386>?;=:9?9==8??###	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:0:1654#0/1	16	chr1	128019411	0	35M1S	*	0	0	TTTTCTCTGTTGATATTTCAATCTTCTTCCTCAGAN	FFFFFFFFFFFFFF=FFFFFF??===44574,,00#	MD:Z:35	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:0:1914#0/1	0	chr5	84824688	0	1S24M240N11M	*	0	0	NTTTGCAGCTGATGCTGAGAGCAAGATTGTCCCTGC	####################################	MD:Z:8X26	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1	YS:A:+
    GALZUI2_0001:8:100:0:1953#0/1	4	*	0	0	*	*	0	0	NCATAAACGATGCCGACCAGGGATCAGCGAGATCGG	####################################	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:0:1957#0/1	0	chr8	166503511	0	1S35M	*	0	0	NACAAGGTAGGCCTCAGCCGCCTCCTGCAGCGCGGA	####################################	MD:Z:35	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:0:2018#0/1	0	chr8	48995004	0	1S35M	*	0	0	NAAGGGTATAACATCTCTGATGTTCTCCATTCCGGT	#/-,+//.,,9:???==<<=FFF<<=:=?=881:8=	MD:Z:35	RG:Z:s_8_sequence.txt.gz	IH:i:2	NH:i:2	YR:i:1
    GALZUI2_0001:8:100:0:2018#0/1	0	chr8	49024165	0	1S35M	*	0	0	NAAGGGTATAACATCTCTGATGTTCTCCATTCCGGT	#/-,+//.,,9:???==<<=FFF<<=:=?=881:8=	MD:Z:35	RG:Z:s_8_sequence.txt.gz	IH:i:2	NH:i:2	YR:i:1
    GALZUI2_0001:8:100:0:287#0/1	4	*	0	0	*	*	0	0	NCGGGGTTTCTTATGCGTGGATCCGGGAGATCGGAA	####################################	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:0:356#0/1	0	chr8	175216115	0	1S35M	*	0	0	NTCGAATACATGTCCTCTCTTCTGGTTCAGAACACC	####################################	MD:Z:35	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:0:362#0/1	4	*	0	0	*	*	0	0	NAACAGCATGGATCCACCTTTTTCCCAACCTTTGAG	#*+('(++)+8::::1:60:==:<;=====FFF;FF	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:0:434#0/1	16	chr1	300430123	0	35M1S	*	0	0	ACCTTACTCTATGCAAGGCATGCCTTACTATCCTGN	####################################	MD:Z:35	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:0:596#0/1	0	chr2	104752145	0	1S35M	*	0	0	NGATGTGGTTGCGAAGAATGGCATGACGATGGTTGA	####################################	MD:Z:35	RG:Z:s_8_sequence.txt.gz	IH:i:4	NH:i:4	YR:i:1
    GALZUI2_0001:8:100:0:596#0/1	0	chr4	68856217	0	1S35M	*	0	0	NGATGTGGTTGCGAAGAATGGCATGACGATGGTTGA	####################################	MD:Z:35	RG:Z:s_8_sequence.txt.gz	IH:i:4	NH:i:4	YR:i:1
    GALZUI2_0001:8:100:0:596#0/1	0	chr5	193863017	0	1S35M	*	0	0	NGATGTGGTTGCGAAGAATGGCATGACGATGGTTGA	####################################	MD:Z:35	RG:Z:s_8_sequence.txt.gz	IH:i:4	NH:i:4	YR:i:1
    GALZUI2_0001:8:100:0:596#0/1	0	chr9	90763379	0	1S35M	*	0	0	NGATGTGGTTGCGAAGAATGGCATGACGATGGTTGA	####################################	MD:Z:35	RG:Z:s_8_sequence.txt.gz	IH:i:4	NH:i:4	YR:i:1
    GALZUI2_0001:8:100:1000:1005#0/1	16	chr1	31201451	0	36M	*	0	0	AAATATGGCACATATCAGGTGAACAGTGACCAAAAC	=886A<A?8)CEB=.CFEF88>CEHHDHBHEEEGEG	MD:Z:36	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:1000:1011#0/1	4	*	0	0	*	*	0	0	GCTACATCGACCTTTCGAAGCGTCGCGAGATCGGAA	AGEFBAEGGGECFHHHGCHDHHHFFHDHFHHH?HBC	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:1000:1039#0/1	4	*	0	0	*	*	0	0	TCTCATGTGATGAGAAGTAGAACTAGTGGAGAGATC	FFFFF:FFEF4EBFGFE>FFEFFFFFFFFFFFFFFF	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:1000:1073#0/1	4	*	0	0	*	*	0	0	ACCAGAGCCTGTCCGTGGATGGGACCGGAGATCGGA	HHEEHCHEHEGFHHEE=H/HEH9DHFEEF?=@GA##	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:1000:1076#0/1	4	*	0	0	*	*	0	0	GACAAGTTGGCCCACCAGAATATGAGCCTACAGGAA	GH1HHHEHHHHHHHFFFF6FFFFFFFF?FFF-FEF<	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:1000:1089#0/1	4	*	0	0	*	*	0	0	GCTCGATGGCGGATGAAAATCAGGCAGATCGGAAGA	HHCHHEHDHHF@EFF6=C;AE?BFFEE?EFBDA?FF	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:1000:1099#0/1	16	chr9	15614973	0	2S34M	*	0	0	TCTCAACGTTTGAAGAAAAACCGTGAGATATACCGG	8HHHGHGE=BCHHHHFGDB6EHFGEHHBDHD?GFFG	MD:Z:34	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:1000:11#0/1	4	*	0	0	*	*	0	0	GTGGAGACGCAGGCGTGGAAGAGATCGGAAGAGCGG	AAA@>6>@/<DG;C?GGGDGGCGGCEEGEGGGGAGG	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:1000:1145#0/1	0	chr6	147883848	0	32M4S	*	0	0	ATTGTCGGCAACGGCGGGAAGCACCGCTGCCCCGCC	FGFG>DDHDHGEFADA####################	MD:Z:32	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:1000:1173#0/1	16	chr1	269036723	0	36M	*	0	0	TGTCAGGGACATGAAGGAGAAGCTCGCCTACATTGC	FD?BDFCC?6<<CC@>C6:1AGGEAG???7@AAC@=	MD:Z:36	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:1000:1195#0/1	16	chr3	135445725	0	36M	*	0	0	ACGGCGCCTGCCGCAAAGATCATAGATACAGTTGGA	###@?=6.GGCCGGGFG6GDGGGGGCGGCGGGGGGG	MD:Z:36	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:1000:121#0/1	0	chr3	3313715	0	34M2S	*	0	0	GCCTCACTGAATATTCCAGCAGCTGTTGGCTGGGAG	HHHHHHHHHHIHHHHHHHHHHHHHHHHHHHHHHFGH	MD:Z:34	RG:Z:s_8_sequence.txt.gz	IH:i:3	NH:i:3	YR:i:1
    GALZUI2_0001:8:100:1000:121#0/1	0	chr8	25015676	0	34M2S	*	0	0	GCCTCACTGAATATTCCAGCAGCTGTTGGCTGGGAG	HHHHHHHHHHIHHHHHHHHHHHHHHHHHHHHHHFGH	MD:Z:34	RG:Z:s_8_sequence.txt.gz	IH:i:3	NH:i:3	YR:i:1
    GALZUI2_0001:8:100:1000:121#0/1	16	chr4	63782093	0	2S34M	*	0	0	CTCCCAGCCAACAGCTGCTGGAATATTCAGTGAGGC	HGFHHHHHHHHHHHHHHHHHHHHHHIHHHHHHHHHH	MD:Z:34	RG:Z:s_8_sequence.txt.gz	IH:i:3	NH:i:3	YR:i:1
    GALZUI2_0001:8:100:1000:1217#0/1	16	chr10	68420921	0	36M	*	0	0	ACCTGAAGAGTGTTAGGGAATTGATCTACAAAAGAG	50?EEEGEGEGEEBGED??@;CDGGDC?CCBBD=DD	MD:Z:36	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:1000:1249#0/1	0	chr1	222562953	0	36M	*	0	0	AAAGGATGAGACCAGAGAATCATAGCAATCAGCTCA	HHGCHH@HHGDHHEHAGE5GHHHHHHF7HHHHEHHE	MD:Z:36	RG:Z:s_8_sequence.txt.gz	IH:i:3	NH:i:3	YR:i:1
    GALZUI2_0001:8:100:1000:1249#0/1	16	chr2	41634471	0	36M	*	0	0	TGAGCTGATTGCTATGATTCTCTGGTCTCATCCTTT	EHHEHHHH7FHHHHHHG5EGAHEHHDGHH@HHCGHH	MD:Z:36	RG:Z:s_8_sequence.txt.gz	IH:i:3	NH:i:3	YR:i:1
    GALZUI2_0001:8:100:1000:1249#0/1	16	chr5	161906550	0	36M	*	0	0	TGAGCTGATTGCTATGATTCTCTGGTCTCATCCTTT	EHHEHHHH7FHHHHHHG5EGAHEHHDGHH@HHCGHH	MD:Z:36	RG:Z:s_8_sequence.txt.gz	IH:i:3	NH:i:3	YR:i:1
    GALZUI2_0001:8:100:1000:1371#0/1	0	chr10	89544264	0	35M1S	*	0	0	CTTATGCTTCACTTTTACTATAGGCTCAGAACTTTT	GGGFAHHHHHHHHHHHHGGHHHHGHHHGHHHHHHHE	MD:Z:35	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:1000:1465#0/1	4	*	0	0	*	*	0	0	CGGTTCAGCAGGAATGCCGAGATCGGAAGAGCGGTT	>BBEEBEEEF3FCC@FFFEFFFFFBFEFFF=FBFFE	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:1000:1489#0/1	16	chr2	176558351	0	36M	*	0	0	CGTGACAAGTGCAGGAAACAAACCACTGAAAAGAAT	@=DA@6F<<@>F?DDG>7C=?6==4;A@6ADEBEBE	MD:Z:36	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:1000:1496#0/1	4	*	0	0	*	*	0	0	CCAGATCAGCGTCGACTCATTTCGGGAGATCGGAAG	GHEHHHHHFHGGGG??AG=GEGGGGGGGDGBGEGGE	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:1000:1536#0/1	16	chr9	94009423	0	36M	*	0	0	AGTCAAATGTAACAACTGGTTTTAGCTTGATCTCTT	FHHHHHHGHHHEHHHHFHHHHHHHGHHHHHHHHHHH	MD:Z:36	RG:Z:s_8_sequence.txt.gz	IH:i:1	NH:i:1
    GALZUI2_0001:8:100:1000:1555#0/1	4	*	0	0	*	*	0	0	CGTCGTTGGGGTAGTAGACGGCAGATCGGAAGAGCG	FFBBFFHHHFHHHFEHHHHAHHHHHFDHHHFHHGHH	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped
    GALZUI2_0001:8:100:1000:1559#0/1	4	*	0	0	*	*	0	0	GCAGATTTCACCAAGTGTTGGATTGTTCAGATCGAA	HHDGHFHHHHHFH@B?HGHCHHHHE:HHFFHEHHFH	RG:Z:s_8_sequence.txt.gz	IH:i:0	NH:i:0	YU:Z:unmapped

    I'm getting the following error. Although the reads are paired-end, the program does not recognize them as paired-end. How do I get the program to run? If I don't specify any parameter (i.e. I don't use --pair=yes), it gives me an output in which all counts are 0.


    Code:
      File "dexseq_count.py", line 132, in <module>
        for af, ar in HTSeq.pair_SAM_alignments( HTSeq.SAM_Reader( sam_file ) ):
      File "/usr/lib64/python2.6/site-packages/HTSeq-0.5.4p3-py2.6-linux-x86_64.egg/HTSeq/__init__.py", line 612, in pair_SAM_alignments
        raise ValueError, "'pair_alignments' needs a sequence of paired-end alignments"
    ValueError: 'pair_alignments' needs a sequence of paired-end alignments

  • #2
    None of your reads are mapped as paired-end (at least none of the reads in your example). Why you still get no counts when you don't specify that you have paired-end reads, I don't know. Can you post the exact command that you're using?



    • #3
      Code:
      python dexseq_count.py   zea_mays.AGPv2.62.gff sorted_samfile.sam exon_counts.txt
      Here is the command I used. Every exon (for 30k genes) comes up with a count of 0 in the output if nothing is specified about paired-end reads.

      Top of the output file:

      Code:
      AC147602.5_FG004:001    0
      AC147602.5_FG004:002    0
      AC147602.5_FG004:003    0
      AC147602.5_FG004:004    0
      AC147602.5_FG004:005    0
      AC148152.3_FG001:001    0
      AC148152.3_FG001:002    0
      AC148152.3_FG001:003    0
      AC148152.3_FG001:004    0
      AC148152.3_FG002:001    0



      • #4
        Well, I don't know about the AGPv2.62 version, but at least in the AGPv3.18 version available from Ensembl there are no chromosomes/contigs with names like chr1, chr2, etc. That could certainly result in 0 counts for everything (though probably some warnings too). Perhaps try opening the coordinate-sorted BAM file in IGV or something similar and see if there's any obvious visual reason to get 0 counts.
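One way to check for such a naming mismatch is to compare the sequence names used in the annotation with those used in the alignments. The following is a minimal sketch in Python 3 using only the standard library. The helper names `gff_seqnames` and `sam_seqnames` are made up for illustration and are not part of HTSeq or DEXSeq, and the sample records are abbreviated; `read3` on sequence `5` is a hypothetical record added to show what a mismatch would look like.

```python
def gff_seqnames(lines):
    """Collect the sequence names (column 1) of a GFF/GTF file."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def sam_seqnames(lines):
    """Collect the reference names (column 3) of the mapped SAM records."""
    names = set()
    for line in lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.split("\t")
        if len(fields) > 2 and fields[2] != "*":
            names.add(fields[2])
    return names


# Abbreviated sample records (SEQ and QUAL replaced by "*").
gff_lines = [
    'chr1\tdexseq_prepare_annotation.py\taggregate_gene\t300422454\t300435350\t.\t+\t.\tgene_id "GRMZM2G077596"',
]
sam_lines = [
    "@HD\tVN:1.0\tSO:coordinate",
    "read1\t16\tchr1\t128019411\t0\t35M1S\t*\t0\t0\t*\t*",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "read3\t0\t5\t84824688\t0\t36M\t*\t0\t0\t*\t*",  # hypothetical mismatch
]

annotated = gff_seqnames(gff_lines)
aligned = sam_seqnames(sam_lines)
print("only in alignments:", sorted(aligned - annotated))
print("only in annotation:", sorted(annotated - aligned))
```

For real data, both functions accept open file handles, so the annotation and SAM files can be passed in directly instead of the sample lists.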



        • #5
          Hi dpryan, I have to use the AGPv2.62 annotation, and my GTF file has chr1, chr2, etc. When you say none of the reads are paired-end, is that because the flags in column 2 are 16 (reverse strand), 0 (forward strand) and 4 (unmapped)?
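For reference, the FLAG column is a bit field defined by the SAM specification, and bit 0x1 is the one that marks a read as paired. Below is a minimal Python 3 sketch that decodes the flag values seen in the excerpt. The function names are illustrative only, and 99 is included purely for comparison as a typical flag of a properly paired read.

```python
# Bit values defined by the SAM specification for the FLAG column.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in sorted(SAM_FLAG_BITS.items()) if flag & bit]


def is_paired(flag):
    """True if the aligner marked the read as part of a pair (bit 0x1)."""
    return bool(flag & 0x1)


for flag in (0, 4, 16, 99):
    status = "paired" if is_paired(flag) else "not paired"
    print(flag, decode_flag(flag), status)
```

Flags 0, 4 and 16 all leave bit 0x1 unset, so every record in the excerpt is a single-end alignment as far as the SAM file is concerned.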



          • #6
            Let me give an example.

            A couple of lines from my GTF file show the coordinates of a gene and its first exon:

            Code:
            chr1	dexseq_prepare_annotation.py	aggregate_gene	300422454	300435350	.	+	.	gene_id "GRMZM2G077596"
            chr1	dexseq_prepare_annotation.py	exonic_part	300422454	300422756	.	+	.	transcripts "GRMZM2G077596_T01"; exonic_part_number "001"; gene_id "GRMZM2G077596"
            If I want to view all reads mapping to the first exon, I can use samtools:

            Code:
            samtools view input.bam chr1:300422454-300422756
            Any help will be greatly appreciated.

            This is the output of the previous command:

            Code:
            GALZUI2_0001:2:103:1190:211#0/1	0	chr1	300422484	0	36M	*	0	0	CTTCTTTTGTTCTTTAATTTGGTTCGTACGTACAAG	HHHHFEHCCHCCHHHHHGHHG<BEGHCFHHHHHCHH	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:6:220:1972#0/1	0	chr1	300422515	0	36M	*	0	0	ACAAGACTTCTCGGATCACTCGTCTTCTTTGATTGC	HHHHHHHHHFHHHHHHHHHHHHHHHHGHHHHHHHHG	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:42:1567:1468#0/1	0	chr1	300422526	0	36M	*	0	0	CGGATCACTCGTCTTCTTTGATTGCATCATCGAGAC	HHHHHHHHHHIHHHHHHHHHHHHHHGHHHHHHHHEH	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:120:1465:1034#0/1	0	chr1	300422541	0	36M	*	0	0	CTTTGATTGCATCATCGAGACCTGCATTTTCCCTTC	CDCCC;GDGG;GGGGGG7G<GGGGGGGGGGGGGGGG	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:11:502:602#0/1	0	chr1	300422554	0	36M	*	0	0	ATCGAGACCTGCATTTTCCCTTCCAAATTCGTCACT	HHHFHHHHHFHHFHHHHHHHHHDHHHEEHHCEHHHH	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:99:1503:859#0/1	16	chr1	300422569	0	36M	*	0	0	TTCCCTTCCAAATTCGTCACTCACTCTGGTTGGCCG	FHHGHCHDGHHHGHCHGHGHHHHHEHHHHHBGGGBG	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:118:626:857#0/1	16	chr1	300422569	0	36M	*	0	0	TTCCCTTCCAAATTCGTCACTCACTCTGGTTGGCCG	H?DDHH<GHHGHHHHD;HEHHDHHHGHHHHGHHDHH	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:59:1772:1903#0/1	0	chr1	300422580	0	36M	*	0	0	ATTCGTCACTCACTCTGGTTGGCCGCCTTCTGTCTT	FHHHHHHGHHHDHHHHHDHHHHHHEHHHHHHHGHHH	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:75:95:1416#0/1	16	chr1	300422586	0	1S35M	*	0	0	ACACTCACTCTGGTTGGCCGCCTTATGTCTTCTGAT	##########B?288<10*7=?..'7?C3CBD?DHF	MD:Z:23X11	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:18:1001:1651#0/1	0	chr1	300422587	0	36M	*	0	0	ACTCACTCTGGTTGGCCGCCTTCTGTCTTCTGATCC	HHHHHHHGHHHHHGEHHFHHHHHHHHHHHHHHHHHH	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:54:199:1717#0/1	16	chr1	300422600	0	36M	*	0	0	GGCCGCCTTCTGTCTTCTGATCCAATCCGGTTGAAA	DHHFHIHHHHHHEHHHGHHDHHHFHHHHHDHHHHHH	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:27:889:358#0/1	16	chr1	300422647	0	1S35M	*	0	0	TCTTCCAGCAAGATCTGGCACATAAGGAGAATCGGC	GHHH=HHHHHHHHFH=HHFEHHHHHHHGHHHHHHHC	MD:Z:35	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:37:242:1709#0/1	16	chr1	300422679	0	36M	*	0	0	GGCAAGAACCATTCTGCAAATGAGGCCGGATACGCG	HFHHHHHHHHHHHHGHHHHHHHHHHHHHHHHHHHHH	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:78:1327:822#0/1	16	chr1	300422679	0	36M	*	0	0	GGCAAGAACCATTCTGCAAATGAGGCCGGATACGCG	HHHHHHHFHGHHHHFHHHGHHHHHHHHHHHHHHHHH	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:106:745:1364#0/1	16	chr1	300422680	0	1S35M	*	0	0	TGCAAGAACCATTCTGCAAATGAGGCCGGATACGCG	HHHHHHHEHHHHH8HHHHHHHBHHHHHEGHIHIGHH	MD:Z:35	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:112:510:330#0/1	16	chr1	300422712	0	36M	*	0	0	GCGGCTTGAATCGGCGGTGTTCCAGCTCACCCCGAC	HHHGHIGFHIHFHGHIHHHHGHEHHHGEHGHHHHHH	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1
            GALZUI2_0001:2:76:91:1422#0/1	0	chr1	300422734	0	23M357N13M	*	0	0	CAGCTCACCCCGACCCGCACCAGGTGTGATTTAGTT	HHHHHHHGIHHGHGHGHHHHHBHH=HHFEHHHEHHH	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1	YS:A:+
            GALZUI2_0001:2:36:34:147#0/1	0	chr1	300422741	0	16M357N20M	*	0	0	CCCCGACCCGCACCAGGTGTGATTTAGTTGTGGTGG	HHHHHGHHHHHHHIHHHHHHHHIHHHHHHHHHHHHG	MD:Z:36	RG:Z:s_2_sequence.txt.gz	IH:i:1	NH:i:1	YS:A:+

            But running


            Code:
            python dexseq_count.py zea_mays.AGPv2.62_mod.gff sorted_sam.sam counts_exons.txt

            and then searching for the gene gives

            Code:
            grep GRMZM2G077596 counts_exons.txt
            
            GRMZM2G077596:001       0
            GRMZM2G077596:002       0
            GRMZM2G077596:003       0
            GRMZM2G077596:004       0
            GRMZM2G077596:005       0
            GRMZM2G077596:006       0
            GRMZM2G077596:007       0
            GRMZM2G077596:008       0
            GRMZM2G077596:009       0
            GRMZM2G077596:010       0
            GRMZM2G077596:011       0
            GRMZM2G077596:012       0
            GRMZM2G077596:013       0
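The overlap between individual alignments and the exonic part can also be cross-checked without samtools. The sketch below, in Python 3, computes the reference span of an alignment from its POS and CIGAR fields and tests it against the exon coordinates from the GTF lines above. The function names are hypothetical, and this is only a rough consistency check, not a reimplementation of how dexseq_count.py assigns reads.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def reference_end(pos, cigar):
    """1-based inclusive end coordinate of an alignment starting at pos."""
    span = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos + span - 1


def overlaps(pos, cigar, start, end):
    """True if the alignment span intersects the interval [start, end]."""
    return pos <= end and reference_end(pos, cigar) >= start


# First exonic part of GRMZM2G077596, taken from the GTF lines above.
EXON_START, EXON_END = 300422454, 300422756

# (POS, CIGAR) pairs taken from the samtools output above.
alignments = [
    (300422484, "36M"),
    (300422586, "1S35M"),
    (300422734, "23M357N13M"),
]

for pos, cigar in alignments:
    print(pos, cigar, reference_end(pos, cigar),
          overlaps(pos, cigar, EXON_START, EXON_END))
```

Note that soft-clipped bases (S) do not consume reference positions, while skipped regions (N) do, so a spliced read spans more of the reference than its read length.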
            Last edited by alpesh; 06-18-2013, 10:28 PM.



            • #7
              All the alignments in your SAM file excerpt have an alignment quality (5th column) of zero but, by default, dexseq_count.py only counts reads with an alignment quality of at least 10. Try to find out why they are all zero and, if this is just bogus output from your aligner, use the option '-a 0' to set the minimum quality to 0.
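To see how many alignments would survive a given threshold, one can tally the MAPQ column directly. This is a rough sketch in Python 3, not the logic of dexseq_count.py itself: the function names are hypothetical, the sample records are abbreviated from the excerpt in the first post (read names shortened, SEQ and QUAL replaced by `*`), and the threshold of 10 is the default mentioned above.

```python
from collections import Counter


def mapq_histogram(sam_lines):
    """Count mapped alignments per MAPQ value (column 5 of a SAM record)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 0x4:
            continue  # skip unmapped reads
        counts[int(fields[4])] += 1
    return counts


def fraction_passing(counts, min_qual=10):
    """Fraction of mapped alignments with MAPQ >= min_qual."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    passing = sum(n for qual, n in counts.items() if qual >= min_qual)
    return passing / total


# Abbreviated records based on the excerpt in the first post.
sample = [
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "read2\t16\tchr1\t128019411\t0\t35M1S\t*\t0\t0\t*\t*",
    "read3\t0\tchr8\t166503511\t0\t1S35M\t*\t0\t0\t*\t*",
]

counts = mapq_histogram(sample)
print(dict(counts))
print(fraction_passing(counts, min_qual=10))
print(fraction_passing(counts, min_qual=0))
```

If the fraction passing at 10 is zero while the fraction passing at 0 is one, as in this sample, the quality filter alone would explain an output with all counts at 0.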
