
  • How to define XS tag

    Hi.

    Does anyone know how to define the XS tag? I'm trying to use Cufflinks to analyze GSNAP results, but it reports many errors in the SAM file: "SAM error on line ***: found spliced alignment without XS attribute". I tracked down these lines, and they all have XS:A:?. So I guess the problem is that GSNAP couldn't determine the direction for these reads. What should I do?

    Any help would be appreciated. Thanks very much!
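To see how many alignments are affected before deciding what to do, the SAM file can be scanned for spliced reads whose XS tag is absent or set to "?". The following is a minimal sketch, not part of GSNAP or Cufflinks: it assumes a standard tab-separated SAM file, treats any CIGAR containing an N operation as spliced, and the function names `xs_status` and `count_xs` are made up for illustration.

```python
def xs_status(sam_line):
    """Classify one SAM alignment line by its XS:A tag.

    Returns 'unspliced', 'ok', 'unknown' (XS:A:?) or 'missing'.
    """
    fields = sam_line.rstrip("\n").split("\t")
    cigar = fields[5]
    # An N operation in the CIGAR marks a skipped region, i.e. an intron.
    if "N" not in cigar:
        return "unspliced"
    # Optional tags start at the 12th column.
    for tag in fields[11:]:
        if tag.startswith("XS:A:"):
            return "ok" if tag[5:] in ("+", "-") else "unknown"
    return "missing"


def count_xs(lines):
    """Tally the XS status of every alignment line, skipping @ headers."""
    counts = {}
    for line in lines:
        if line.startswith("@"):
            continue
        status = xs_status(line)
        counts[status] = counts.get(status, 0) + 1
    return counts
```

Calling `count_xs(open("alignments.sam"))` (the file name is a placeholder) gives a dictionary of counts per category.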

  • #2
    Just encountered the same problem.

    I also looked at the output of Cufflinks and compared it to what I had got with the output of a previous version of GSNAP (version 2011-03-28).

    The main issue is that I now get many more Ensembl annotated genes with a FAIL FPKM_status (around 14000), compared to the 127 that I got when using the previous version of GSNAP.

    If anybody has come up with a solution to that or knows why GSNAP behaves in such a way, I'd love to hear your comments.



    • #3
      From http://research-pub.gene.com/gmap/src/README :

      To provide splice orientation, though,
      our SAM output includes information about splice orientation in an
      extra "XS" field, which has possible values "+" (meaning the expected
      GT-AG, GC-AG, or AT-AC dinucleotide pair is on the plus strand of the
      genome), "-" (the dinucleotides are on the minus strand), or "?" (the
      direction is unknown, because the dinucleotides do not match GT-AG,
      GC-AG, AT-AC, or their complements).

      Edit:
      I forgot to say: maybe you should not rely on this spliced alignment...

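The rule quoted from the README can be written out as a small lookup. This is only a sketch of the rule as the README describes it, not GSNAP's actual code; the function name `xs_from_dinucleotides` and its two arguments (the first and last two intron bases as they appear on the plus strand of the genome) are assumptions made for illustration.

```python
# Donor/acceptor dinucleotides of an intron whose gene is on the plus strand.
PLUS_PAIRS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}

# The same three intron types when the gene is on the minus strand:
# read on the plus strand they appear reverse-complemented.
MINUS_PAIRS = {("CT", "AC"), ("CT", "GC"), ("GT", "AT")}


def xs_from_dinucleotides(first2, last2):
    """Return '+', '-' or '?' for an intron, given its first and last two
    bases as they appear on the plus strand of the genome."""
    pair = (first2.upper(), last2.upper())
    if pair in PLUS_PAIRS:
        return "+"
    if pair in MINUS_PAIRS:
        return "-"
    # Neither the expected dinucleotides nor their complements.
    return "?"
```

An intron that starts with CT and ends with AC on the plus strand is a GT-AG intron of a minus-strand gene, so it maps to "-"; any pair outside both sets gives the "?" that Cufflinks rejects.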


      • #4
        sam output error

        So, did anyone find out how to solve the SAM output error (XS tag) problem?


        Thanks



        • #5
          I am in the same situation. The way I am trying to resolve the problem is by creating a program that would find the strand information (in addition to those located on a splice junction). There doesn't seem to be any other method.

          Brdido, why did you say "maybe you should not rely on this spliced alignment..."?



          • #6
            Zorph,

            I wrote that because, if the alignment program was not able to detect the "expected" splicing bases, something might be wrong with the read.

            But recently I put this question to the developers of Cufflinks, and they suggested adding this tag "manually".

            I did not answer here because my problem was that I was using 2 different aligners, and one of them didn't assign the XS:A tag.

            But I do believe now that adding the XS tag is not a problem at all, and that it is a good solution for compatibility with Cufflinks. It is what I did.

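Adding the tag "manually" can be as simple as rewriting the optional fields of each SAM line. The sketch below assumes tab-separated SAM text and that the strand ("+" or "-") has already been decided by some other means, which this post does not describe; `set_xs_tag` is a hypothetical helper name, not something provided by Cufflinks or either aligner.

```python
def set_xs_tag(sam_line, strand):
    """Return the SAM line with its XS:A tag set to the given strand.

    Any existing XS:A tag (including XS:A:?) is replaced; all other
    optional tags are kept in their original order.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    fields = sam_line.rstrip("\n").split("\t")
    mandatory, tags = fields[:11], fields[11:]
    # Drop only character-typed XS tags; other tag types are left alone.
    tags = [t for t in tags if not t.startswith("XS:A:")]
    tags.append("XS:A:" + strand)
    return "\t".join(mandatory + tags)
```

Header lines (those starting with "@") should be passed through unchanged, and the tag is only needed on spliced alignments.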


            • #7
              Hi Zorph, how did you find the strand information from a sam file? The first three lines of the sam file I'm using are

              HWI-EAS440:2:25:193:1474#0 0 2L 7532 25 75M * 0 0 CTCGCATGTAGAGATTTCCACTTATGTTTTCTCTACTTTCAGCAACCGAGAAGAGAACCCANGTTTGAACAAGTA abbaba`b^_`abaa`aa_aaaabaa_`aaaa_aa`aaa`aaba`]^`aa_aa``[[Z]XWDVR\\\YX^^^ZQU NM:i:1 X1:i:1 MD:Z:61N13
              HWI-EAS179:1:29:802:267#0 16 2L 7621 25 75M * 0 0 ACAGCTATCCCCGCTTCATAACGAATGAGGCTGCCGAGGACCTGATTTACAAGAAGTCCATGGGCGAGCGGGATC TQTYWRRQQXX]\\XOYZNZYZ]^YZ]^^^\X_^^a``a[\`^`_Z]^^aaaaaaa_aa`]aaa`a`aabbbaaa NM:i:0 X0:i:1 MD:Z:75
              HWUSI-EASXXX:2:65:1779:826#0 160 2L 7622 25 76M = 7711 165 CAGCTATCCCCGCTTCATAACGAATGAGGCTGCCGAGGACCTGATCTACAAGAAGTCCATGGGCGAGCGGGATCAG BC>CBCBACCBC@CCBCCCCBBCBCCCBBCCCCBBBBBA@ABBB@*<A;?BB>BABA@ABBABAB:BACB?7A=?= NM:i:1 X1:i:1 MD:Z:45C30

              I can't tell what their strand information is. Thanks!



              • #8
                hey lijy03,

                "Hi Zorph, how did you find the strand information from a sam file?"

                I didn't get the strand information from the SAM file. I knew it because of the way I prepped my reads.
                With my preparation using Epicentre's ScriptSeq kit, I knew that my PE-1 reads aligned to the 5' end of the sequence and that my PE-2 reads aligned to the 3' end of the sequence. Given the way Illumina sequences these reads, it works out that all my PE-1 reads aligned to the sense strand of the RNA and my PE-2 reads aligned to the opposite strand of my transcribed RNA.

                Using BOTH this information and the SAM flag, I was able to tell which strand my RNA was generated from.

                If you have a stranded prep and you know which read has the adaptor indicating directionality (or, in the case of single reads, which end of the read corresponds to the 5' or 3' direction), then you should be able to figure out the directionality of the reads in your library.

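Combining the library orientation with the SAM flag might look like the sketch below. It assumes a stranded library in which read 1 is sense to the RNA and read 2 is antisense, as described in this post; the function name `transcript_strand` and its `read1_is_sense` parameter are made up for illustration, and the orientation should be checked against your own protocol. Only the standard SAM flag bits 0x10 (read reverse-complemented) and 0x80 (second read in pair) are used.

```python
def transcript_strand(flag, read1_is_sense=True):
    """Infer the strand of the originating transcript from a SAM flag.

    Assumes a stranded library. Reads without the 0x80 bit (including
    single-end reads) are treated as read 1.
    """
    read_is_reverse = bool(flag & 0x10)  # read aligned to the minus strand
    is_read2 = bool(flag & 0x80)         # second read of a pair

    # Read 2 has the opposite sense to read 1.
    read_is_sense = (not read1_is_sense) if is_read2 else read1_is_sense

    read_strand = "-" if read_is_reverse else "+"
    if read_is_sense:
        return read_strand
    # An antisense read aligns to the strand opposite to the transcript.
    return "+" if read_strand == "-" else "-"
```

For the flags 0, 16 and 160 in the lines posted in #7, this returns "+", "-" and "-" under the default assumption, but the answer only means something if that library was actually stranded.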


                • #9
                  hey lijy03,

                  This thread might help you:

                  Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc
