  • repinementer
    Member
    • Dec 2009
    • 80

    SAM to CUFFLINKS SAM format

    Does anyone know how to convert SAM to the SAM format that Cufflinks expects?
    Even a manual procedure is fine; I can write a simple script that turns the manual steps into a command-line tool.
    All I want to know is whether the first snippet below is really SAM (it should be). If it is, which column actually represents the original SAM format?

    Thanks

    Code:
    IL26_1184:1:109:734:594	67	clone::AL662826.11:1:145431:1	27827	0	36M	*	0	80	GGCCGCTGTGCGCGCCCCGCCTGCTGGACCACTTCA	>>>>>>><<>>>>>>>>>>>>8<<>8,,<<3<<8<3	MF:i:18	Aq:i:0	NM:i:0	UQ:i:0	H0:i:3	H1:i:0
    IL26_1184:1:109:734:594	147	clone::AL662826.11:1:145431:1	27871	0	36M	*	0	-80	CTGCCGGCGTTGCTCAAGCTGGCCTGCGGAGGCGAC	7.6<4667<64<<47<<<<.<<<<2<<<<<<<<<<<	MF:i:18	Aq:i:0	NM:i:0	UQ:i:0	H0:i:3	H1:i:0
    Code:
    s6.25mer.txt-913508	16	chr1 4482736 255 14M431N11M * 0 0 CAAGATGCTAGGCAAGTCTTGGAAG IIIIIIIIIIIIIIIIIIIIIIIII NM:i:0 XS:A:-
  • kopi-o
    Senior Member
    • Feb 2008
    • 319

    #2
    It looks as if your first snippet could be SAM. Usually the UNIX command

    sort -k3,3 -k4,4n in.sam > out.sam

    is sufficient for Cufflinks to accept a SAM file.
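For anyone who prefers a script to the one-liner, the same ordering (reference name in column 3, then numeric position in column 4) can be sketched in Python. This is only an illustration: the function name `sort_sam_lines` is made up here, the input is assumed to be tab-separated SAM text, and header lines starting with `@` are kept at the top, which a plain `sort` would not guarantee. Python compares strings by code point, so the order of reference names may differ from a locale-aware `sort`.

```python
def sort_sam_lines(lines):
    """Return SAM lines with header lines first, followed by alignment
    records sorted by reference name (column 3) and position (column 4)."""
    headers = [ln for ln in lines if ln.startswith("@")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("@")]

    def sort_key(line):
        fields = line.split("\t")
        # Column 3 is the reference name, column 4 the 1-based position.
        return (fields[2], int(fields[3]))

    return headers + sorted(records, key=sort_key)
```

Reading the lines from `in.sam`, passing them through `sort_sam_lines`, and writing the result to `out.sam` should then give a file ordered the way the command above intends.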

    • lindylou
      Junior Member
      • Oct 2008
      • 4

      #3
      They are both SAM format; however, the second snippet contains the XS:A field. This field tells Cufflinks which strand the RNA that produced the read came from. Cufflinks will not accept SAM files that are unsorted or that lack this field. You can write a simple script that adds this information to your SAM file by reading the bitwise flag in field 2, where the strand information is stored, and translating it.

      Hope that helps.
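A minimal sketch of the script described above, assuming tab-separated SAM records. The function name `add_xs_tag` is hypothetical. It relies on FLAG bit 0x10, which the SAM specification defines as "read aligned to the reverse strand". Note that for an unstranded library this bit reflects the orientation of the alignment, not necessarily the strand of the transcript, so treat the result as an illustration of the idea rather than a verified fix.

```python
REVERSE_STRAND = 0x10  # SAM FLAG bit: read is aligned to the reverse strand


def add_xs_tag(line):
    """Append an XS:A tag to one SAM record, derived from its FLAG (column 2).

    Header lines and records that already carry XS:A are returned unchanged
    (apart from the trailing newline being stripped).
    """
    line = line.rstrip("\n")
    if line.startswith("@"):
        return line
    fields = line.split("\t")
    # Optional tags start at column 12; do not add a second XS:A tag.
    if any(field.startswith("XS:A:") for field in fields[11:]):
        return line
    flag = int(fields[1])
    strand = "-" if flag & REVERSE_STRAND else "+"
    return "\t".join(fields + ["XS:A:" + strand])
```

Applied to the first snippet in this thread, the record with FLAG 67 would get `XS:A:+` and the one with FLAG 147 would get `XS:A:-`, because 147 has the 0x10 bit set and 67 does not.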

      • repinementer
        Member
        • Dec 2009
        • 80

        #4
        But my data is Illumina single-end. As far as I know, Illumina still doesn't produce strand-specific data?
        Anyway, I did what you suggested, but it didn't help. See below.

        After adding strand info using this command:
        head -6 3125_8.sam | awk '{if ($9 ~ /-/) {print $0"\t""XS:A:-"} else {print $0"\t""XS:A:+"}}'|sort -k 3,3 -k 4,4n
        IL6_3125:8:58:1625:1479 67 clone::AL662826.11:1:145431:1 1261 0 37M * 0 247 AAAAGGAGTAGGCAGGAAAACAGTCAATTATGGATTC ?BBCBBBB@<BBBBCB@A@?B?B>@A@B@BABB?B@? MF:i:18 Aq:i:0 NM:i:0 UQ:i:0 H0:i:4 H1:i:0 XS:A:+
        IL6_3125:8:37:57:1851 131 clone::AL662826.11:1:145431:1 1458 0 37M * 0 262 GTGAATTGGAGTCCTGNGTTTTATTTTCCTTTCCCAC AB?@BBBAB<@?AAB:!<<BBB@BBBB@;BBBB=ABA MF:i:18 Aq:i:0 NM:i:1 UQ:i:0 H0:i:0 H1:i:4 XS:A:+
        IL6_3125:8:58:1625:1479 147 clone::AL662826.11:1:145431:1 1471 0 37M * 0 -247 CTGAGTTTTATTTTCCTTTCCCACCTCAAACCCCACA @8???@<?:>@;<6@B:96BBB6BB>BAB;BBBB>BB MF:i:18 Aq:i:0 NM:i:0 UQ:i:0 H0:i:4 H1:i:0 XS:A:-
        IL6_3125:8:37:57:1851 83 clone::AL662826.11:1:145431:1 1683 0 37M * 0 -262 GAAGGACTTACTGAGATGGCTGCTCCCACTCTCCAGC BBACA?=BB=;BCABB9BC7AC9BAAA>AB5@/?CC@ MF:i:18 Aq:i:0 NM:i:0 UQ:i:0 H0:i:4 H1:i:0 XS:A:-
        IL6_3125:8:93:491:1573 67 clone::AL662826.11:1:145431:1 3983 0 37M * 0 221 CTGGAATACAGAGGTTTTCACGGAAGCCCAGGGGACC BCB?BBCCBBBBBBABCCBBC>>AB>ABA@;9>8>=B MF:i:18 Aq:i:0 NM:i:0 UQ:i:0 H0:i:3 H1:i:0 XS:A:+
        IL6_3125:8:93:491:1573 147 clone::AL662826.11:1:145431:1 4167 0 37M * 0 -221 CTCCCCCAGCCCAGGGGTCTGGCTTCCCCAGGAGGAC =;?>A?A>=9A@A?<5>?=@@AA?ABAAAA@BA@AAA MF:i:18 Aq:i:0 NM:i:0 UQ:i:0 H0:i:3 H1:i:1 XS:A:-
        I then ran Cufflinks and got this error:

        cufflinks 1006_tester.sam
        cufflinks: /usr/lib64/libz.so.1: no version information available (required by cufflinks)
        [bam_header_read] EOF marker is absent.
        File 1006_tester.sam doesn't appear to be a valid BAM file, trying SAM...
        [10:52:52] Inspecting reads and determining fragment length distribution.
        SAM error on line 57: CIGAR op has zero length
        SAM error on line 71: CIGAR op has zero length
        SAM error on line 109: CIGAR op has zero length
        SAM error on line 206: CIGAR op has zero length
        SAM error on line 249: CIGAR op has zero length
        SAM error on line 290: CIGAR op has zero length
        SAM error on line 312: CIGAR op has zero length
        SAM error on line 354: CIGAR op has zero length
        SAM error on line 356: CIGAR op has zero length
        SAM error on line 360: CIGAR op has zero length
        SAM error on line 416: CIGAR op has zero length
        SAM error on line 455: CIGAR op has zero length
        SAM error on line 496: CIGAR op has zero length
        SAM error on line 502: CIGAR op has zero length
        SAM error on line 546: CIGAR op has zero length
        SAM error on line 566: CIGAR op has zero length
        SAM error on line 594: CIGAR op has zero length
        SAM error on line 668: CIGAR op has zero length
        SAM error on line 708: CIGAR op has zero length
        SAM error on line 714: CIGAR op has zero length
        SAM error on line 717: CIGAR op has zero length
        SAM error on line 744: CIGAR op has zero length
        SAM error on line 814: CIGAR op has zero length
        SAM error on line 824: CIGAR op has zero length
        SAM error on line 834: CIGAR op has zero length
        SAM error on line 866: CIGAR op has zero length
        SAM error on line 872: CIGAR op has zero length
        SAM error on line 875: CIGAR op has zero length
        SAM error on line 877: CIGAR op has zero length
        SAM error on line 901: CIGAR op has zero length
        SAM error on line 912: CIGAR op has zero length
        SAM error on line 934: CIGAR op has zero length
        SAM error on line 940: CIGAR op has zero length
        SAM error on line 979: CIGAR op has zero length
        SAM error on line 994: CIGAR op has zero length
        SAM error on line 996: CIGAR op has zero length
        SAM error on line 999: CIGAR op has zero length
        > Processed 392 loci. [*************************] 100%
        > Map Properties:
        > Total Map Mass: 28.92
        > Read Type: 37bp single-end
        > Fragment Length Distribution: Gaussian (default)
        > Estimated Mean: 203.69
        > Estimated Std Dev: 75.10
        [10:52:53] Assembling transcripts and estimating abundances.
        > Processing Locus clone::AL662824.9:1:187964:?4 [* ] SAM error on line 1063: CIGAR op has zero length
        > Processing Locus clone::AL662824.9:1:187964:?4 [* ] SAM error on line 1077: CIGAR op has zero length
        > Processing Locus clone::AL662824.9:1:187964:?4 [* ] SAM error on line 1115: CIGAR op has zero length
        > Processing Locus clone::AL662824.9:1:187964:?4 [** ] 1SAM error on line 1212: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [*** ] 1SAM error on line 1255: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [***** ] 2SAM error on line 1296: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [****** ] 2SAM error on line 1318: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [******* ] 3SAM error on line 1360: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [******* ] 3SAM error on line 1362: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [******* ] 3SAM error on line 1366: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [********* ] 3SAM error on line 1422: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [********** ] 4SAM error on line 1461: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [*********** ] 4SAM error on line 1502: CIGAR op has zero length
        > Processing Locus clone::AL662825.5:1:81768:1?4 [************ ] 4SAM error on line 1508: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [************* ] 5SAM error on line 1552: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [************** ] 5SAM error on line 1572: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [*************** ] 6SAM error on line 1600: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [**************** ] 6SAM error on line 1674: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [**************** ] 6SAM error on line 1714: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [***************** ] 6SAM error on line 1720: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [***************** ] 6SAM error on line 1723: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [****************** ] 7SAM error on line 1750: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [******************* ] 7SAM error on line 1820: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [******************* ] 7SAM error on line 1830: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [******************** ] 8SAM error on line 1840: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************* ] 8SAM error on line 1872: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************* ] 8SAM error on line 1878: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************* ] 8SAM error on line 1881: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************* ] 8SAM error on line 1883: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************** ] 8SAM error on line 1907: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************** ] 9SAM error on line 1918: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [********************** ] 9SAM error on line 1940: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [*********************** ] 9SAM error on line 1946: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [*********************** ] 9SAM error on line 1985: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [************************ ] 9SAM error on line 2000: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [************************ ] 9SAM error on line 2002: CIGAR op has zero length
        > Processing Locus clone::AL662826.11:1:145431?4 [************************ ] 9SAM error on line 2005: CIGAR op has zero length
        > Processed 392 loci. [*************************] 100%
        Last edited by repinementer; 11-09-2010, 06:58 PM.
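One way to look at the records behind the "CIGAR op has zero length" messages is to scan the file for CIGAR strings that contain an operation of length 0 (for example `0M36M`). The sketch below assumes tab-separated SAM; the name `zero_length_cigar_lines` is made up for illustration. It counts every line of the file, header included, which may or may not match how Cufflinks numbers lines, and it does not claim to reproduce the exact check Cufflinks performs.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def zero_length_cigar_lines(lines):
    """Yield (line_number, cigar) for records whose CIGAR has a zero-length op."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("@"):
            continue  # header line, no CIGAR
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[5] == "*":
            continue  # malformed record or CIGAR not available
        cigar = fields[5]
        if any(int(length) == 0 for length, _op in CIGAR_OP.findall(cigar)):
            yield number, cigar
```

Printing the reported lines next to the Cufflinks messages should show whether the offending records really carry a zero-length operation or whether the parser is tripping over something else, such as a missing header.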

        • dharan
          Junior Member
          • Jan 2012
          • 7

          #5
          Hi all,
          I faced the same problem when I tried to run Cufflinks. If you provide your SAM file with its header, you should not see this error. It worked for my data, but I am not sure about other datasets. It may be worth trying.
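If the aligner output has no header, one can be put in front of the records. The sketch below is an assumption-laden illustration: `with_header` is a hypothetical helper, the reference names and lengths must come from your own reference (the values in the usage line are placeholders, not taken from this thread), and only the `@HD` and `@SQ` lines defined in the SAM specification are written.

```python
def with_header(records, reference_lengths, sorted_by_coordinate=True):
    """Return SAM lines with @HD and @SQ header lines placed before the records.

    reference_lengths maps each reference name to its length in bases.
    Any header lines already present in records are dropped.
    """
    sort_order = "coordinate" if sorted_by_coordinate else "unsorted"
    header = ["@HD\tVN:1.0\tSO:" + sort_order]
    for name, length in reference_lengths.items():
        header.append("@SQ\tSN:{}\tLN:{}".format(name, length))
    return header + [rec for rec in records if not rec.startswith("@")]
```

For example, `with_header(records, {"chr1": 1000})` would place `@HD` and one `@SQ` line for a placeholder reference `chr1` of length 1000 ahead of the alignment records.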
