  • A Question about Duplicated Genes

    Hi all.

    I have a question about duplicated genes that have several copies in the genome, such as Ccl21c. TopHat was used to map the RNA-Seq reads back to the genome, and HTSeq was used to count the reads mapped to each gene. All reads stemming from the duplicated genes are discarded by HTSeq because they align to multiple places. Is there any way to quantify these genes? Or could I mask the genomic regions of the duplicated genes before running TopHat?

    Any suggestion will be much appreciated.

    Best

  • #2
    Don't mask the genome.

    The last thing you want is reads being forced to align to the wrong place, because you masked away the right place.

    If the genes are true duplicates, it's going to be pretty much impossible to separate out which reads came from where.



    • #3
      Thanks very much for your reply. For a gene with several copies in the genome, we don't want to distinguish which locus the reads come from; we want to count all the reads that map only to this gene. For example, if a gene has 4 copies in the genome, we want to count the reads that map to those regions and to no other gene. Is there any way to achieve that?
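The counting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a feature of TopHat or HTSeq: it assumes you have already extracted, for every read, the gene overlapped by each of its alignments (the `read_hits` dictionary), and that you have defined which gene copies belong together (the `gene_to_group` dictionary). Both structures, the function name, and the toy data are hypothetical.

```python
from collections import Counter


def count_by_gene_group(read_hits, gene_to_group):
    """Count reads whose alignments all fall within a single gene group.

    read_hits: dict mapping read ID -> list of gene IDs, one per alignment
               (None where an alignment overlaps no annotated gene).
    gene_to_group: dict mapping gene ID -> group name; a gene missing from
                   this dict is treated as its own single-gene group.
    """
    counts = Counter()
    for read_id, genes in read_hits.items():
        # Translate each hit to its group; ungrouped genes keep their own ID.
        groups = {gene_to_group.get(g, g) for g in genes}
        # Keep the read only if every alignment lands in the same group.
        if len(groups) == 1 and None not in groups:
            counts[groups.pop()] += 1
    return counts


# Hypothetical toy data: three copies treated as one "Ccl21" group.
gene_to_group = {"Ccl21a": "Ccl21", "Ccl21b": "Ccl21", "Ccl21c": "Ccl21"}
read_hits = {
    "r1": ["Ccl21a", "Ccl21b", "Ccl21c"],  # multi-mapped, but only within the group
    "r2": ["Ccl21c", "Actb"],              # also hits another gene: not counted
    "r3": ["Actb"],                        # ordinary uniquely mapped read
    "r4": ["Ccl21b", None],                # one alignment outside any gene: not counted
}
counts = count_by_gene_group(read_hits, gene_to_group)
```

Whether a read such as `r4` should be dropped or kept is a design choice; the sketch takes the strict reading of "not mapped to other genes" and also excludes alignments outside any annotation.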

      One more request: could anyone recommend a tool for modifying the GTF file used in the RNA-Seq analysis? We used the reference GTF file from UCSC and found that lots of genes seem to be identical, such as Gm14430, Gm4724 and Gm14434. We wish to merge such genes into a single gene.
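One way to approach the GTF request is to rewrite the `gene_id` attribute so that the copies share a single identifier. The sketch below is a minimal example of that idea, assuming a standard nine-column, tab-separated GTF with attributes of the form `gene_id "..."`; the function name, the merged ID `Gm14430_family`, and the example line are made up for illustration. Note that relabeling the annotation does not by itself change how a counting tool treats reads the aligner has flagged as multiply aligned.

```python
import re

# Matches the gene_id attribute in the ninth GTF column.
GENE_ID = re.compile(r'gene_id "([^"]+)"')


def merge_gene_ids(gtf_lines, gene_to_merged):
    """Yield GTF lines with selected gene_id values replaced.

    gtf_lines: iterable of GTF lines (strings).
    gene_to_merged: dict mapping an original gene ID -> merged gene ID
                    (hypothetical mapping that you define yourself).
    """
    for line in gtf_lines:
        if line.startswith("#"):
            # Header and comment lines pass through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 9:
            # Swap the ID only if it is in the mapping; otherwise keep it.
            fields[8] = GENE_ID.sub(
                lambda m: 'gene_id "%s"' % gene_to_merged.get(m.group(1), m.group(1)),
                fields[8],
            )
        yield "\t".join(fields) + "\n"


# Hypothetical mapping and input line for illustration only.
mapping = {"Gm14430": "Gm14430_family", "Gm4724": "Gm14430_family",
           "Gm14434": "Gm14430_family"}
example = ['chr2\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "Gm4724"; transcript_id "T1";\n']
merged = list(merge_gene_ids(example, mapping))
```

To process a file, the same generator can be fed an open file handle and its output written line by line to a new file.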

      Thanks a bunch!

      Best

      Originally posted by swbarnes2
      Don't mask the genome.

      The last thing you want is reads being forced to align to the wrong place, because you masked away the right place.

      If the genes are true duplicates, it's going to be pretty much impossible to separate out which reads came from where.



      • #4


        An international, peer-reviewed genome sciences journal featuring outstanding original research that offers novel insights into the biology of all organisms



        • #5
          We mask all but one copy if the duplicates are exactly the same (~50% of genes) or differ by less than 1% (another ~40%), and flag the rest.

          It is probably OK if you are just looking at gene expression. Recently we started to integrate ChIP-seq data with mRNA-seq data, and I can see this approach causing problems there.
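For readers unfamiliar with what masking means here, the sketch below replaces given intervals of a sequence with `N` so that an aligner cannot place reads there. It is not necessarily how the poster above did it; the function name and the 0-based, half-open coordinate convention (as in BED files) are assumptions made for this example. Existing tools such as `bedtools maskfasta` apply the same idea to whole FASTA files, and the caution from earlier in the thread still applies: reads from a masked copy may be forced onto the wrong place.

```python
def mask_intervals(sequence, intervals, mask_char="N"):
    """Return a copy of `sequence` with each interval overwritten.

    intervals: iterable of (start, end) pairs, 0-based and half-open,
               i.e. positions start .. end-1 are masked (assumed convention).
    """
    seq = list(sequence)
    for start, end in intervals:
        # Clip to the sequence bounds so out-of-range intervals are harmless.
        start = max(0, start)
        end = min(len(seq), end)
        if start < end:
            seq[start:end] = mask_char * (end - start)
    return "".join(seq)


# Toy example: mask positions 2, 3 and 4 of a 10-base sequence.
masked = mask_intervals("ACGTACGTAC", [(2, 5)])
```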



          • #6
            Originally posted by liux
            We mask all but one copy if the duplicates are exactly the same (~50% of genes) or differ by less than 1% (another ~40%), and flag the rest.

            It is probably OK if you are just looking at gene expression. Recently we started to integrate ChIP-seq data with mRNA-seq data, and I can see this approach causing problems there.

            Thanks very much for your reply. How do you mask them? Is there any software for masking them?

            We will also combine ChIP-Seq data with mRNA-Seq data later on. What problems arise if the duplicates are masked?

            Thanks very much!
