  • shi
    replied
    Originally posted by wmseq
    Hi everyone,
    After converting my Bowtie SAM files to BAM, I don't know which BAM file should be used to count reads, because there are two: a .bam file and a .bam.sorted file. Could anyone help me out?

    Thanks a lot!!

    Richard
    The featureCounts program included in the Subread package can count reads in both sorted and unsorted BAM files. If your BAM file contains paired-end reads that were sorted by coordinate, you can use the '-S' flag to instruct featureCounts to re-sort the reads by name before counting.


  • dpryan
    replied
    Yes, it means "standard input", which is needed for the pipe to work.


  • wmseq
    replied
    Thanks a lot, Devon!!
    Is the "-" following sort and before 0_1Q_3.sorted necessary?


  • dpryan
    replied
    For htseq-count, 0_1Q_3.bam would work and the sorted file wouldn't, since you coordinate sorted it (as I mentioned earlier, if you have single-end reads, they both will work). htseq-count needs mates to be next to each other in a file in order to work, so if you feed it a coordinate-sorted file (e.g., 0_1Q_3_sorted.bam), you'll get a lot of warnings and incorrect counts if you have paired-end reads. BTW, in the future, just do this:

    Code:
    samtools view -bS 0_1Q_3.sam | samtools sort - 0_1Q_3.sorted
    samtools index 0_1Q_3.sorted.bam
    Just give htseq-count the SAM file and then delete it. There's no need to use the old import command, which is just an alias for the "view" command and probably needs an indexed fasta file.
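For readers on a newer samtools: the pipeline above uses the legacy `sort <in> <out.prefix>` form from the 0.1.x series. A rough equivalent, assuming samtools 1.3 or later (where `sort` writes to the file named by `-o`), might look like the sketch below. The function name `sam_to_sorted_bam` is hypothetical and only used for illustration; the "-" still stands for standard input.

```shell
# Hypothetical helper: convert a SAM file to a coordinate-sorted, indexed BAM.
# Assumes samtools >= 1.3, where `sort` takes the output file via -o.
sam_to_sorted_bam() {
    sam="$1"                      # e.g. 0_1Q_3.sam
    out="${sam%.sam}.sorted.bam"  # e.g. 0_1Q_3.sorted.bam
    # "-" tells sort to read the BAM stream from standard input (the pipe).
    samtools view -b "$sam" | samtools sort -o "$out" - &&
        samtools index "$out"
}
```

Calling `sam_to_sorted_bam 0_1Q_3.sam` would then leave 0_1Q_3.sorted.bam and its .bai index next to the input, without an intermediate unsorted BAM on disk.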


  • wmseq
    replied
    Devon,
    After running the following commands, I got three outputs. Two of them (0_1Q_3.sam and 0_1Q_3_sorted.bam) are in fact folders, containing a file named 0_1Q_3 and a file named 0_1Q_3_sorted respectively; the third is the file 0_1Q_3_sorted.bam.bai. That is why I am not sure which file I need.

    $ /home/wenfu/bin/samtools import /media/wenfu/LaCie/my_rnaseq_dat/Amhg45.fa 0_1Q_3.sam 0_1Q_3.bam

    $ /home/wenfu/bin/samtools sort 0_1Q_3.bam 0_1Q_3_sorted

    $ /home/wenfu/bin/samtools index 0_1Q_3_sorted.bam


  • dpryan
    replied
    Sorting is for sorting. If you sort by coordinate, then you can create an index to quickly randomly seek to a given portion of the file. You can also name sort, which is really the ideal input to htseq-count. A name-sorted BAM file can't be indexed (I assume this throws an error). You can also have a simple unsorted file. Normally, those actually work fine for use in htseq-count, you just need mates in a pair to be next to each other.

    If you have single-end reads, then any BAM file (sorted or not) will work for htseq-count.


  • wmseq
    replied
    crazyhottommy,
    You mean that I need the "name_sorted.bam" file for HTSeq count, not the "name.bam" file from the samtools view command?


  • crazyhottommy
    replied
    Originally posted by wmseq
    Although the bam file has been sorted after running the samtools sort command, why is the sorted bam file still kept, and what is the purpose of storing it?
    samtools sort does not sort in place, so it generates a new sorted file. Use that file for HTSeq count, but remember to sort it by name (-n flag), since that is what HTSeq needs (a name-sorted SAM file).
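As a rough illustration of this advice, the sketch below name-sorts a BAM file and passes the result to htseq-count. It assumes samtools 1.3 or later and an HTSeq release that accepts `-f bam` and `-r name`; the function name `count_name_sorted` and the file names in the usage note are placeholders, not anything from this thread.

```shell
# Hypothetical helper: name-sort a BAM file, then count reads with htseq-count.
# Assumes samtools >= 1.3 and an htseq-count that supports -f bam and -r name.
count_name_sorted() {
    bam="$1"   # input BAM, in any sort order
    gtf="$2"   # annotation file (placeholder)
    sorted="${bam%.bam}.namesorted.bam"
    # -n sorts by read name so that mates end up next to each other;
    # -r name then tells htseq-count to expect that order.
    samtools sort -n -o "$sorted" "$bam" &&
        htseq-count -f bam -r name "$sorted" "$gtf"
}
```

For example, `count_name_sorted aln.bam genes.gtf > counts.txt` would write the counts table to counts.txt. With single-end reads the sorting step is unnecessary.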


  • wmseq
    replied
    I am sorry, Devon!
    I got it. Sorting the BAM file is for its index.


  • wmseq
    replied
    Although the bam file has been sorted after running the samtools sort command, why is the sorted bam file still kept, and what is the purpose of storing it?


  • dpryan
    replied
    HTSeq-count doesn't perform random access, so it won't use the index (you can't index a non-coordinate sorted BAM file anyway). I've never used bedtools multicov, so I don't know what it should be fed as input.


  • wmseq
    replied
    Thanks Devon!
    Yes, I will count them via htseq-count and bedtools multicov. If I understand you correctly, the BAM file is what I need. Does that mean these two tools automatically use the sorted BAM file and its index internally?


  • dpryan
    replied
    I'll add that if you need to see how a file was sorted, just run
    Code:
    samtools view -H file.bam | grep "@HD"
    and see if it says "unsorted", "queryname", or "coordinate". Practically speaking, "unsorted" is usually sufficient and you likely don't need to actually have "queryname" there (I'm sure some aligner or other actually interleaves paired-reads, but I've never seen it).
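The header check above can also be scripted. The sketch below reads SAM header text on standard input and prints only the value of the SO: tag on the @HD line. The function name `sort_order` is hypothetical; printing "unknown" when no SO: tag is present follows the SAM specification's stated default and is otherwise an assumption of this sketch.

```shell
# Hypothetical helper: print the sort order recorded in a SAM/BAM header.
# Reads header lines on stdin and looks for the SO: tag on the @HD line.
sort_order() {
    awk -F '\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1 }
        }
        # No @HD line or no SO: tag: report the default, "unknown".
        END { if (!found) print "unknown" }
    '
}
```

Typical use would be `samtools view -H file.bam | sort_order`, which prints a single word such as "coordinate" that a script can test before deciding whether to re-sort.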


  • Heisman
    replied
    I don't quite understand your question. Could you provide the line of code you used to run bowtie? I use Bowtie2 and I believe when it's finished aligning it provides output stating how many reads did and did not align uniquely or more than once.


  • dpryan
    replied
    I assume you mean counting via htseq-count or something like that. In that case, use whichever file is name (rather than coordinate) sorted. Bowtie produces name-sorted output (as opposed to tophat, which defaults to coordinate sorting things, though you can disable this behavior).
