
  • cufflinks running problem

    Hi,

    I am using Bowtie and Cufflinks to estimate gene expression from RNA-seq data.

    My workflow is like this:

    bowtie -p 4 --best --strata -m 1 --sam mm9/mm9 -q SRR032476.fastq SRR032476.sam

    When I run

    cufflinks SRR032476.sam

    I get this error:

    Code:
    cufflinks: /usr/lib64/libz.so.1: no version information available (required by cufflinks)
    [bam_header_read] EOF marker is absent.
    File SRR032476.sam doesn't appear to be a valid BAM file, trying SAM...
    [15:46:53] Inspecting reads and determining fragment length distribution.
    > Processing Locus chr2:131113838-131113873    [                         ]   0%
    Error: this SAM file doesn't appear to be correctly sorted!
    	current hit is at chr1:162968481, last one was at chr9:108470655
    Cufflinks requires that if your file has SQ records in
    the SAM header that they appear in the same order as the chromosomes names 
    in the alignments.
    If there are no SQ records in the header, or if the header is missing,
    the alignments must be sorted lexicographically by chromsome
    name and by position.
    The head of the SAM file is:

    Code:
    @HD	VN:1.0	SO:unsorted
    @SQ	SN:chr1	LN:197195432
    @SQ	SN:chr2	LN:181748087
    @SQ	SN:chr3	LN:159599783
    @SQ	SN:chr4	LN:155630120
    @SQ	SN:chr5	LN:152537259
    @SQ	SN:chr6	LN:149517037
    @SQ	SN:chr7	LN:152524553
    @SQ	SN:chr8	LN:131738871
    @SQ	SN:chr9	LN:124076172
    @SQ	SN:chr10	LN:129993255
    @SQ	SN:chr11	LN:121843856
    @SQ	SN:chr12	LN:121257530
    @SQ	SN:chr13	LN:120284312
    @SQ	SN:chr14	LN:125194864
    @SQ	SN:chr15	LN:103494974
    @SQ	SN:chr16	LN:98319150
    @SQ	SN:chr17	LN:95272651
    @SQ	SN:chr18	LN:90772031
    @SQ	SN:chr19	LN:61342430
    @SQ	SN:chrX	LN:166650296
    @SQ	SN:chrY	LN:15902555
    @SQ	SN:chrM	LN:16299
    @PG	ID:Bowtie	VN:0.12.7	CL:"bowtie -p 4 --best --strata -m 1 --sam mm9/mm9 -q SRR032476.fastq SRR032476.sam"
    SRR032476.4 I354_2_FC30605AAXX:2:1:2:1524 length=35	4	*	0	0	*	*	0	0	NGAGGTAGTAGGTTGTATAGTTATCGTATTCCGTT	!IIIIIIIIIIIIIIIIII9IIIII,III+IB0H$	XM:i:0
    Can anyone help with this? Thanks so much.
    Last edited by camelbbs; 07-01-2011, 01:59 PM.

  • #2
    Any suggestions?



    • #3
      Hi,
      Have you tried the sort command described in the Cufflinks manual?

      "The SAM file supplied to Cufflinks must be sorted by reference position. If you aligned your reads with TopHat, your alignments will be properly sorted already. If you used another tool, you may want to make sure they are properly sorted as follows:

      sort -k 3,3 -k 4,4n hits.sam > hits.sam.sorted "
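
A note on applying that command to the file in this thread: the SAM posted above starts with header lines and its read names contain spaces, so a plain whitespace-delimited sort may mix header lines into the records or pick the wrong columns. The following is a minimal sketch, assuming a POSIX shell with standard grep and sort. It drops the header (the "header is missing" case from the error message, which asks for lexicographic order) and sorts on tab-delimited fields 3 and 4. The function name and file names are placeholders, not part of Cufflinks.

```shell
# Sketch: produce a headerless SAM sorted by reference name, then position.
# $1 = input SAM (may contain @ header lines), $2 = output SAM.
sort_sam_no_header() {
    tab=$(printf '\t')
    # Header lines start with '@'; alignment records never do.
    # LC_ALL=C gives byte-wise (lexicographic) ordering of reference names.
    # -t "$tab" keeps spaces inside read names from shifting the columns.
    grep -v '^@' "$1" | LC_ALL=C sort -t "$tab" -k 3,3 -k 4,4n > "$2"
}
```

Usage would look like `sort_sam_no_header SRR032476.sam SRR032476.sorted.sam`, followed by `cufflinks SRR032476.sorted.sam`. Unaligned reads (reference name `*`) are kept and sort to the top; filter them out first if that is a problem.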



      • #4
        Originally posted by blue mood
        Any suggestions?
        Get the newest version of samtools (0.1.17) and convert the SAM to a sorted BAM, preferably with an index.
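
For readers who want the commands spelled out, here is a minimal sketch of that conversion, assuming samtools is installed and on PATH. It uses the legacy 0.1.x command syntax that matches the version named in this post. The function name and the file names are placeholders.

```shell
# Sketch: convert a SAM file to a coordinate-sorted, indexed BAM.
# $1 = input SAM (e.g. SRR032476.sam), $2 = output prefix (e.g. SRR032476.sorted)
sam_to_sorted_bam() {
    # -b: write BAM, -S: input is SAM (required by samtools 0.1.x)
    samtools view -bS "$1" > "$2.unsorted.bam" &&
    # Legacy syntax: second argument is an output prefix; writes "$2.bam"
    samtools sort "$2.unsorted.bam" "$2" &&
    # Writes the index next to the sorted BAM as "$2.bam.bai"
    samtools index "$2.bam"
}

# With samtools 1.x the sort step takes an explicit output file instead:
#   samtools sort -o out.sorted.bam in.sam && samtools index out.sorted.bam
```

After `sam_to_sorted_bam SRR032476.sam SRR032476.sorted`, the sorted file can be passed directly as `cufflinks SRR032476.sorted.bam`; the log above shows that Cufflinks tries BAM before falling back to SAM.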



        • #5



          • #6
            Originally posted by darked89
            Get the newest version of samtools (0.1.17) and convert the SAM to a sorted BAM, preferably with an index.
            Thanks, it works well.



            • #7
              I wonder if there is any way to avoid this issue?
