
  • #31
    Hi,
    It would work the same if your File.mapped.bam has been aligned to a transcriptome index.

    By the way, I have a general question about the -r parameter:
    Have any of you tried omitting this parameter (or, as I did, accidentally forgotten to set it)? I know the manual says "There is no default, and this parameter is required for paired end runs.", but it works quite well if you forget it!
    I tested with and without the option on the same dataset and the difference was marginal: without the parameter, 1 junction was not found (out of >100,000) and 2000 alignments were missing (out of >700,000).
    I have no idea what the default value is. The accepted_hits.bam file is well formatted and the reads are properly paired.
    Similarly, I have always wanted to test a range of values and see how sensitive the results are to this parameter. My assumption is "not very", but I may be wrong. Has anybody tried?

    Please let me know if I am the only one who got it to work without the -r parameter.



    • #32
      Hi Nicolas.

      TopHat doesn't have much need for the mean inner distance nowadays, and will need it even less in the future, according to Cole Trapnell. If you don't specify -r with a value, it will default to 50, as stated in the release notes of an earlier version; they just haven't updated the manual yet.



      • #33
        I looked again at the Picard MarkDuplicates output and I think I may have misinterpreted percent_duplication=0.250123. Perhaps it means the duplication is as high as 25%, not 0.25% as I initially thought. But having read what both samtools rmdup and Picard MarkDuplicates actually do (they mark reads as duplicates if they have the same 5' coordinates), it doesn't seem strange that so many reads were marked as duplicates: my inner distance is -50, so my reads overlap heavily and will sometimes start at the same coordinate.



        • #34
          [QUOTE=Jon_Keats;68654]Couple of questions:
          Last edited by swebb; 04-17-2012, 07:32 AM.



          • #35
            I ran some experiments on the same sample using TopHat 1.3.3. It seems the -r parameter of TopHat influences the Cufflinks analysis. When I use a wrong -r value, the TopHat output BAM file does not change much (as you can see in IGV). But in Cufflinks, some genes that had an FPKM value (and read coverage in IGV) now have an FPKM of zero. So I guess Cufflinks needs proper pairing information to estimate a gene's FPKM.
            I am wondering whether anyone else has observed this.



            • #36
              setting up the -r parameter in tophat

              Originally posted by glados
              Hi Nicolas.

              TopHat doesn't have much need for the mean inner distance nowadays, and will need it even less in the future, according to Cole Trapnell. If you don't specify -r with a value, it will default to 50, as stated in the release notes of an earlier version; they just haven't updated the manual yet.

              Hi Glados,

              I have been reading the thread and your comments, and I have a question about setting the -r parameter. I asked the sequencing lab for the insert size, and they told me it is 180 bp for 96 bp Illumina paired-end reads. What would be the ideal -r setting for me? Would it be 180 - (2 x 94) = -8? Should I use -8 (approx. -10) for -r, or should I keep the default?

              Regards



              • #37
                Dear figo1019.

                Make sure the lab meant 180 as insert size and not inner distance. If I were you, I would try with both the default and -8 as inner distance. See if you get any differences in tophat output. You can view the assembly in a genome browser, like IGV, and see if the reads on average overlap approximately 8 bp. If you are certain it is -8, I would use it in tophat.
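The arithmetic behind the -8 figure can be sketched as below. This assumes the usual definition of the inner distance as the fragment length minus the combined length of both mates; `inner_distance` is a hypothetical helper written for illustration, not part of TopHat.

```shell
# Hypothetical helper (not a TopHat command):
# inner distance = fragment size - 2 * read length
inner_distance() {
    fragment_size=$1
    read_length=$2
    echo $(( fragment_size - 2 * read_length ))
}

inner_distance 180 94   # the calculation from the question
inner_distance 180 96   # the same fragment size with 96 bp reads
```

With 94 bp reads this gives -8; with the 96 bp read length mentioned in the question it gives -12, so it may be worth confirming the actual read length (for example after trimming) before settling on a value.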



                • #38
                  How to estimate the fragment length for single-end RNAseq reads

                  Hi Nicolas,

                  I am stuck on estimating the fragment length for single-end RNA-seq reads.

                  I applied your script (http://seqanswers.com/forums/showthread.php?t=16472) to my bowtie2 output but got no results.
                  My command:
                  samtools view -S -F 0x4 Ery_rep1.ip.sam | awk '{if ($9 >0) {sum+=$9;sumsq+=$9*$9;N+=1}} END {print "mean = " sum/N " SD=" sqrt(sumsq/N - (sum/N)**2)}'

                  output:
                  [samopen] SAM header is present: 22 sequences.
                  awk: (FILENAME=- FNR=18471674) fatal: division by zero attempted

                  My SAM file looks like the records below; the inferred fragment lengths all seem to be zero. Is that because my reads are single-end?

                  42A08AAXX:6:098:00152:01628/1 16 chr1 3011033 42 51M * 0 0 TCACCTGTTCTTCTCACTGTTGTGGCCTGAGTCAGAACAACTAGAGTCCTC ############################9.(<.6BA/@8>([email protected].= AS:i:-2 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:19G31 YT:Z:UU
                  42A08AAXX:6:097:00856:00542/1 16 chr1 3011742 42 51M * 0 0 AGCTCTGTGTTCTGCTTGAGCTGACTCTCTAGACAGCTATGTGGATATTTC >;A:B>>B@?ABB@BB<@B>BBBBBBBABBBBBCBBBBBCBBBBCBCBBAB AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:51 YT:Z:UU


                  Thanks,
                  Holly
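A note on the error above: in SAM, column 9 (TLEN) is 0 when no mate information is available, and both example records show `* 0 0` in columns 7 to 9. The `$9 > 0` filter therefore never matches, N stays 0, and awk divides by zero. The sketch below is a guarded variant of the same awk step that reports this case instead of failing. `tlen_stats` is a hypothetical name chosen for illustration; it reads SAM records from standard input and assumes tab-separated fields.

```shell
# Hypothetical helper: mean and SD of positive TLEN values (SAM column 9).
# Intended use, as an assumption based on the command in the post:
#   samtools view -S -F 0x4 Ery_rep1.ip.sam | tlen_stats
tlen_stats() {
    awk -F '\t' '
        /^@/   { next }                               # skip header lines, if any
        $9 > 0 { sum += $9; sumsq += $9 * $9; n++ }   # count each pair once
        END {
            if (n == 0) {
                print "no records with TLEN > 0 (single-end data?)"
                exit
            }
            mean = sum / n
            var = sumsq / n - mean * mean
            if (var < 0) var = 0                      # guard against rounding error
            printf "mean = %.2f SD = %.2f\n", mean, sqrt(var)
        }'
}
```

For single-end data this only makes the failure explicit: TLEN carries no fragment information there, so the fragment length would have to come from another source, such as the library preparation records.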
