  • lukas1848
    Member
    • Jun 2011
    • 54

    extracting certain scaffolds from SAM file

    Hi,

    I tried to extract only those reads from a SAM file that mapped to larger scaffolds. I did this with

    >grep -f list_of_large_scfs.txt alignment.sorted.sam > aln.large_scfs.sam

    I wanted to pass the output file to Cufflinks, but I keep getting an error saying the file isn't sorted correctly, even after sorting it with sort -k 3,3 -k 4,4n.

    Does anyone know what could be happening here?
  • swbarnes2
    Senior Member
    • May 2008
    • 910

    #2
    Cufflinks probably isn't actually checking whether the file is sorted. It's probably looking for a header line that says it's sorted. There are a few ways around this. Using Picard to sort the .bam is probably the best, since Picard will put the sorted header on there. There are also other ways to edit the header.
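
    As a rough sketch of the "edit the header" route, assuming the tool only reads the SO: tag on the @HD line: the function below is a hypothetical helper (the name set_sorted_header and the VN:1.0 value are my assumptions, not anything from Cufflinks or Picard). It rewrites or adds the @HD line on a SAM stream and does not reorder any records, so it is only safe if the records really are coordinate-sorted already.

```shell
# Hypothetical helper: mark a SAM stream (stdin) as coordinate-sorted.
# It edits the header only; the alignment records pass through untouched.
set_sorted_header() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^@HD/ {
         # Rebuild the @HD line without any existing SO: tag,
         # then append SO:coordinate.
         line = $1
         for (i = 2; i <= NF; i++) if ($i !~ /^SO:/) line = line OFS $i
         print line, "SO:coordinate"
         next
       }
       NR == 1 {
         # No @HD line at the top: add one (VN:1.0 is an assumed version).
         print "@HD", "VN:1.0", "SO:coordinate"
       }
       { print }'
}
```

    Usage would look like: set_sorted_header < aln.large_scfs.sam > aln.large_scfs.hd.sam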

    Using grep on a .sam file is a little silly. All things being equal, you don't want to work with large .sam files; you'd rather work with compressed .bam files. SAMtools or BEDTools should be able to filter your .bam given a list of the chromosomes you want, without your having to work with a huge .sam file.
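
    One more reason to avoid grep -f here: it matches the pattern anywhere in the line, so a name like scf1 also hits scf10 and hits the mate-reference column. A minimal sketch of an exact match on the reference name (column 3) is below. keep_scaffolds is a hypothetical helper name, and it assumes a non-empty list file with one scaffold name per line.

```shell
# Hypothetical helper: read SAM on stdin, keep header lines plus records
# whose RNAME (column 3) exactly equals a name in the list file ($1).
# Assumes the list file is non-empty, one scaffold name per line.
keep_scaffolds() {
  awk -F '\t' '
    NR == FNR { keep[$1] = 1; next }   # first input: the scaffold list
    /^@/      { print; next }          # header lines pass through unchanged
    $3 in keep                         # exact match on the reference name
  ' "$1" -
}
```

    To stay in .bam on disk, this can sit in the middle of a pipe: samtools view -h alignment.sorted.bam | keep_scaffolds list_of_large_scfs.txt | samtools view -b -o aln.large_scfs.bam -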

    You do also realize that if your reads were processed as paired-end, unmapped reads are given the mapping coordinates of their mapped mate? So by simply grepping for the chromosome, you are including a lot of unmapped reads whose mates did map to the chromosomes you specified.
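
    Those reads can be recognized by the 0x4 bit in the FLAG column (column 2), which is set when the read itself is unmapped. On a .bam, samtools view -F 4 excludes them. The sketch below does the same test in awk on a SAM stream; drop_unmapped is a hypothetical helper name.

```shell
# Hypothetical helper: read SAM on stdin, keep header lines and drop any
# record whose FLAG (column 2) has the 0x4 "read unmapped" bit set.
drop_unmapped() {
  awk -F '\t' '
    /^@/ { print; next }        # header lines pass through unchanged
    int($2 / 4) % 2 == 0        # bit 0x4 clear: the read itself is mapped
  '
}
```

    For example, a record with FLAG 133 (paired, unmapped, second in pair) is dropped even though it carries its mate's scaffold and position, while its mapped mate with FLAG 73 is kept.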
