  • metheuse
    Member
    • Jan 2013
    • 84

    #16
    Originally posted by Simon Anders
    Have you checked your alignments with a genome browser? Load the SAM file and the GFF file produced by dexseq_prepare into, e.g., IGV, and look at one of the loci with zero counts. If there really are no reads, your experiment has failed (or you are using the wrong annotation file).
    I converted the BAM file to BED and intersected it with "chr1 29385323 29385364 ENSG00000159023:023" (which has a zero count in the dexseq_count.py output).
    This resulted in 71 intersecting reads. Here are the first 10 of them:
    Code:
    chr1    29379725        29391495        HWI-ST1235:101:C1WW9ACXX:6:1312:1770:71045/1    50      +
    chr1    29379725        29391495        HWI-ST1235:101:C1WW9ACXX:6:2116:17025:56846/2   50      -
    chr1    29379731        29391501        HWI-ST1235:101:C1WW9ACXX:6:2307:18153:8715/1    50      +
    chr1    29379734        29391504        HWI-ST1235:101:C1WW9ACXX:6:2314:2163:52123/2    50      +
    chr1    29379736        29391506        HWI-ST1235:101:C1WW9ACXX:6:2314:2163:52123/1    50      -
    chr1    29379741        29391511        HWI-ST1235:101:C1WW9ACXX:6:2313:2799:76858/1    50      +
    chr1    29379742        29391512        HWI-ST1235:101:C1WW9ACXX:6:2210:9177:71165/1    50      +
    chr1    29379742        29391512        HWI-ST1235:101:C1WW9ACXX:6:1101:15865:15952/2   50      -
    chr1    29379742        29391512        HWI-ST1235:101:C1WW9ACXX:6:1307:5858:86759/2    50      -
    chr1    29379742        29391512        HWI-ST1235:101:C1WW9ACXX:6:1308:20389:32047/1   50      -
    This should mean that neither my reads nor the annotation file is the problem.
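For readers who want to reproduce this kind of check without a dedicated tool, here is a minimal sketch of the overlap test in Python. The helper names `overlaps` and `count_overlapping` are hypothetical, and the exon coordinates are treated as BED-style (0-based, half-open) purely as an assumption; only two of the reads listed above are used as input.

```python
# Minimal sketch: count BED records that overlap one exonic part.
# Assumption: all coordinates are 0-based, half-open (BED convention).
# The function names are hypothetical, not part of any DEXSeq script.

def overlaps(start_a, end_a, start_b, end_b):
    """Return True if half-open intervals [start_a, end_a) and [start_b, end_b) share a base."""
    return start_a < end_b and start_b < end_a

def count_overlapping(bed_lines, chrom, start, end):
    """Count BED records on `chrom` whose span overlaps [start, end)."""
    n = 0
    for line in bed_lines:
        fields = line.split()
        # Skip malformed lines and records on other chromosomes.
        if len(fields) < 3 or fields[0] != chrom:
            continue
        if overlaps(int(fields[1]), int(fields[2]), start, end):
            n += 1
    return n

reads = [
    "chr1 29379725 29391495 HWI-ST1235:101:C1WW9ACXX:6:1312:1770:71045/1 50 +",
    "chr1 29379742 29391512 HWI-ST1235:101:C1WW9ACXX:6:1308:20389:32047/1 50 -",
]
print(count_overlapping(reads, "chr1", 29385323, 29385364))
```

A plain interval test like this treats each BED record as a single contiguous span, so on its own it cannot tell whether the exon is covered by aligned bases or falls inside a spliced gap.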
    Last edited by metheuse; 04-17-2013, 05:12 PM.

    • metheuse
      Member
      • Jan 2013
      • 84

      #17
      By the way, these are the commands I used:
      Code:
      samtools index 21722_mapped_hg19/accepted_hits.bam
      samtools view 21722_mapped_hg19/accepted_hits.bam >21722_accepted_hits.sam
      sort -k1,1 -k2,2n 21722_accepted_hits.sam >21722_accepted_hits_sorted.sam
      python ~/scripts_64/dexseq_count.py -p yes -s no ~/scripts_64/Homo_sapiens.GRCh37.71.DEXSeq.chr.gff 21722_accepted_hits_sorted.sam KARPAS299_CEP.txt
      Last edited by metheuse; 04-17-2013, 08:54 PM.
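One sanity check that can be run before the counting step is to compare the sequence names used in the SAM file with those in the GFF file. The sketch below is only an illustration: the function names are hypothetical, the SAM line is an invented toy record, and the GFF line is abbreviated from an Ensembl-style entry.

```python
# Minimal sketch: do the SAM file and the GFF file use the same sequence names?
# Function names and the toy SAM record are hypothetical illustrations.

def sam_reference_names(sam_lines):
    """Collect RNAME (column 3) from SAM alignment lines, skipping headers and unmapped reads."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2 and fields[2] != "*":
            names.add(fields[2])
    return names

def gff_sequence_names(gff_lines):
    """Collect the seqname (column 1) from GFF lines, skipping comments and blanks."""
    names = set()
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names

sam = ["read1\t99\tchr1\t29379726\t50\t50M\t=\t29379900\t224\t*\t*"]
gff = ['1\tHomo_sapiens.GRCh37.71.gtf\texonic_part\t11869\t11871\t.\t+\t.\tgene_id "ENSG00000223972"']

shared = sam_reference_names(sam) & gff_sequence_names(gff)
print(sorted(shared))
```

An empty intersection suggests the two files name their sequences differently, in which case no read can be assigned to any feature.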

      • metheuse
        Member
        • Jan 2013
        • 84

        #18
        I just noticed that the chromosome names in the GFF file don't contain "chr":
        Code:
        1    Homo_sapiens.GRCh37.71.gtf      exonic_part     11869   11871   .       +       .       transcripts "ENST00000456328"; exonic_part_number "001"; gene_id "ENSG00000223972"
        This is probably the reason. I've added "chr" to each line and will see if it works.
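A minimal sketch of that kind of fix in Python, assuming a tab-separated GFF file whose first column is the sequence name. The function name `add_chr_prefix` is hypothetical; this is not necessarily how the file was actually edited in this thread.

```python
# Minimal sketch: prepend "chr" to the sequence name of each GFF line.
# Assumption: the GFF is tab-separated and column 1 holds the sequence name.
# Note: this would turn Ensembl's "MT" into "chrMT", whereas UCSC-style hg19
# names the mitochondrion "chrM"; that case is not handled here.

def add_chr_prefix(line, prefix="chr"):
    """Return `line` with `prefix` added to its first column, if not already present."""
    if not line.strip() or line.startswith("#"):
        return line                       # leave blank lines and comments untouched
    seqname, sep, rest = line.partition("\t")
    if seqname.startswith(prefix):
        return line                       # already prefixed
    return prefix + seqname + sep + rest

line = ('1\tHomo_sapiens.GRCh37.71.gtf\texonic_part\t11869\t11871\t.\t+\t.\t'
        'transcripts "ENST00000456328"; exonic_part_number "001"; gene_id "ENSG00000223972"')
print(add_chr_prefix(line).split("\t")[0])
```

Applying the function to every line of the GFF and writing the result to a new file leaves the original annotation untouched, which makes it easy to compare counts before and after.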

        • sindrle
          Senior Member
          • Aug 2013
          • 266

          #19
          I also have a question about warnings and dispersion:

          ecs <- estimateSizeFactors( ecs )
          the matrix is either rank-deficient or indefinite

          ecs <- fitDispersionFunction( ecs )
          Too much damping - convergence tolerance not achievable

          [Attached image: Screen Shot 2014-02-09 at 12.58.40.jpg]

          And does this look ok?

          Thanks! This is my first time using DEXSeq.

          • czelin
            Junior Member
            • Feb 2013
            • 1

            #20
            Dear all,

            I have recently started with exon-wise analysis and would appreciate your help.

            I have paired-end 100 bp reads, and I prepared the annotation file with the DEXSeq Python scripts. I noticed that for a shorter exon (e.g. 150 bp) the number of tags is 0, although in IGV I can see lots (>500) of reads spanning this exon. Somehow only a few of the reads were counted for the control samples and none for the comparison samples, and this exon was reported as significantly DE.
            I suspect this is because the two reads of a pair together are longer than this exon and overlap the adjacent exon[s], which causes the reads to be classified as "_ambiguous_readpair_position". If my understanding is correct, is there any way to solve the short-exon issue? If it is wrong, could you please correct me?

            • areyes
              Senior Member
              • Aug 2010
              • 165

              #21
              Hi czelin,

              This is unlikely, since reads that overlap several exons should be counted once for each exon. Could you check whether the reads that the script ignores are properly paired, and whether they map to multiple regions of the genome?

              Alejandro
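Both checks suggested above can be read off a SAM record. The sketch below is a hedged illustration, not part of dexseq_count.py: the function name `describe_alignment` and the toy records are hypothetical. It relies on the SAM FLAG bit 0x2 (properly paired) and on the optional `NH:i:` tag (number of reported alignments); whether a given aligner writes `NH` is an assumption that should be verified on the actual file.

```python
# Minimal sketch: report pairing status and multi-mapping for one SAM record.
# Assumption: the aligner writes the optional NH:i: tag; if absent, num_hits is None.

PROPER_PAIR = 0x2  # SAM FLAG bit: each segment properly aligned according to the aligner

def describe_alignment(sam_line):
    """Return the read name, proper-pair status and NH-based multi-mapping status."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    nh = None
    for tag in fields[11:]:               # optional fields start at column 12
        if tag.startswith("NH:i:"):
            nh = int(tag[5:])
    return {
        "name": fields[0],
        "properly_paired": bool(flag & PROPER_PAIR),
        "num_hits": nh,
        "multi_mapped": nh is not None and nh > 1,
    }

# Toy records (invented for illustration).
good = "read1\t99\tchr1\t100\t50\t50M\t=\t300\t250\t*\t*\tNH:i:1"
odd = "read2\t65\tchr1\t100\t3\t50M\tchr2\t300\t0\t*\t*\tNH:i:3"

for record in (good, odd):
    print(describe_alignment(record))
```

Running this over the reads that cover the short exon in IGV would show whether they differ from counted reads in either respect.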
