**shi** · 06-27-2013, 03:31 PM

Hi Bruce,

I would suggest counting reads instead of fragments to see whether you still get a large number of reads overlapping multiple genes. This will help determine whether the large number of multi-overlapping fragments you observed was due to summarization.

If you still get a large number, then it is either a mapping problem, a problem with your data generation, or a case of many genes in your annotation overlapping each other.

You can simply remove those paired-end parameters, such as -p -P -D and -C, from your command to summarize reads instead of fragments.

Best wishes,

Wei

**bruce01** · 06-28-2013, 01:54 AM

Hi Wei,

I had looked at these before, but I am keen to keep counting fragments, as I believe this method is more accurate. I have run it for the initial trimmo SAM and for the one with all non-pairs removed:

Code:

::::::::::::::
featco.pair.read.diags
::::::::::::::
35364121 ACCEPTED_GENE
4944952 MULTI_MAPPING
6848219 NOTFOUND_GENE
 338020 OVERLAPPED_GENES
::::::::::::::
featco.read.diags
::::::::::::::
35711285 ACCEPTED_GENE
4944952 MULTI_MAPPING
6902210 NOTFOUND_GENE
 340271 OVERLAPPED_GENES
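
To compare the two listings on the same scale, the tallies can be turned into percentages of all counted reads. The following is only a minimal sketch: the dictionary names and the `category_percentages` helper are illustrative and not part of featureCounts, and the numbers are copied from the two listings above.

```python
# Read-level tallies copied from the two diagnostic listings above.
pair_read_diags = {
    "ACCEPTED_GENE": 35364121,
    "MULTI_MAPPING": 4944952,
    "NOTFOUND_GENE": 6848219,
    "OVERLAPPED_GENES": 338020,
}
read_diags = {
    "ACCEPTED_GENE": 35711285,
    "MULTI_MAPPING": 4944952,
    "NOTFOUND_GENE": 6902210,
    "OVERLAPPED_GENES": 340271,
}


def category_percentages(tally):
    """Return each category's share of the total, in percent."""
    total = sum(tally.values())
    return {name: 100.0 * count / total for name, count in tally.items()}


for label, tally in (("featco.pair.read.diags", pair_read_diags),
                     ("featco.read.diags", read_diags)):
    shares = category_percentages(tally)
    print(label)
    for name, pct in shares.items():
        print(f"  {name}: {pct:.2f}%")
```

The same helper could be applied to a fragment-level tally to compare the OVERLAPPED_GENES share between the two summarization modes.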

**shi** · 06-28-2013, 02:42 AM

Hi Bruce,

I agree it is better to count fragments for paired-end data. Looking at read counts was just to help diagnose the problem, and it turned out to be quite helpful.

The percentage of multi-overlapping reads is much smaller than that of multi-overlapping fragments, suggesting that something went wrong when read pairs were being summarized. We have never seen this before when summarizing mapping results from Subread and a few other aligners.

One possibility is that the order of the two reads from the same pair was altered in your mapping results, i.e. the second read appeared before the first read. If this is the case, the read pair might be wrongly assigned. You may check a few multi-overlapping fragments to see if this is the case.
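
One rough way to run this check over a whole file rather than a few fragments is to scan the SAM records in order. The sketch below is an illustration only, not part of featureCounts: the function name `check_pair_order` is hypothetical, it assumes uncompressed SAM text (a BAM file would first have to be converted, for example with samtools view), and it only looks at the read name and FLAG columns, using the standard SAM flag bits 0x1 (paired), 0x40 (first in pair) and 0x80 (second in pair).

```python
def check_pair_order(sam_lines):
    """Report paired reads whose mate is missing from the next line or is swapped.

    Returns a list of (read_name, problem) tuples in the order they are found.
    """
    problems = []
    pending = None  # (read name, FLAG) of a paired read still waiting for its mate
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        name, flag = fields[0], int(fields[1])
        if not flag & 0x1:
            # Single-end record: a read that was waiting has lost its mate.
            if pending is not None:
                problems.append((pending[0], "mate not adjacent"))
                pending = None
            continue
        if pending is None:
            pending = (name, flag)
        elif pending[0] == name:
            # The two mates are adjacent; check that the first read came first.
            if pending[1] & 0x80 and flag & 0x40:
                problems.append((name, "second read precedes first"))
            pending = None
        else:
            problems.append((pending[0], "mate not adjacent"))
            pending = (name, flag)
    if pending is not None:
        problems.append((pending[0], "mate not adjacent"))
    return problems


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python check_pairs.py < alignments.sam
    for read_name, problem in check_pair_order(sys.stdin):
        print(read_name, problem, sep="\t")
```

Files that report multiple alignments per read may need extra handling, since this sketch treats every paired record the same way.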

Alternatively, you may try other aligners as well. Subread is guaranteed to work with featureCounts.

Hope this helps.

Best wishes,

Wei

**choseqid** · 10-02-2013, 06:27 AM

Dear Wei,

I have been trying to run featureCounts, both from R and from the command line, but I keep getting a segmentation fault. I checked the different things suggested in this discussion thread and also tried to allocate more memory to the job or the R session, but it didn't help. I use Subread 1.3.6 and here is my command line:

featureCounts -p -P -d 50 -D 600 -a mm10/annotation/mm10.allmrna.gtf -t exon -g gene_id -b -f -i tophat_out/accepted_hits.sort.bam -o subread_counts.txt

Here is the error message:

/var/spool/gridengine/node-hp0211/job_scripts/1095023: line 10: 57569 Segmentation fault (core dumped)

Have I overlooked anything?

Thanks in advance,
Cho

**shi** · 10-02-2013, 06:06 PM

Dear Cho,

Could you provide the complete output of your featureCounts run? It is hard to figure out what went wrong from the information you have provided so far.

Also could you provide the first 100 lines of your annotation file and also the first 100 reads in your BAM file?

Cheers,
Wei

**choseqid** · 10-03-2013, 02:48 AM

Dear Wei,

Thanks for the quick reply. Attached are the files you asked for. I am including the R output, as I do not have any output from the command line other than the error message I already quoted.

Cheers,
Cho

Attached Files

Archive.zip (12.0 KB, 144 views)

**ddb** · 10-03-2013, 11:08 PM

"featureCounts requires that for paired-end read data both ends must be included in the SAM/BAM file and the two reads from the same pair must be next to each other."

If this is not stated in the User Guide (I did not see it there) then it should be added as it is essential for correct functioning of the program.

**choseqid** · 10-04-2013, 01:53 AM

Thanks, ddb, for the reminder. It looks like the problem lies in the way I aligned the reads with tophat: I allowed multiple hits (which apparently hampers the sorting by name) and didn't disable the separate alignment reporting for unpairable reads (i.e. didn't use --no-mixed). Would fixing these two parameters help?

**shi** · 10-04-2013, 04:13 PM

I'm still not sure whether it is the handling of paired-end reads that caused the problem. You can try changing those parameters to see if it works. But you may also try to count your reads as single-end reads by NOT using the '-p' option. This will tell us whether the problem arose from dealing with the paired-end reads. Your command should look like this:

featureCounts -a mm10/annotation/mm10.allmrna.gtf -t exon -g gene_id -b -f -i tophat_out/accepted_hits.sort.bam -o subread_counts.txt

Wei

**choseqid** · 10-08-2013, 08:54 AM

Dear Wei,

I tried that command line, but it still ends in a segmentation fault. I also tried aligning my reads using Subread (which succeeded), but when I ran featureCounts on the resulting SAM file I also got a segmentation fault. The output is the same as the one I attached to a previous post.

Any more ideas?

**adaigle** · 10-08-2013, 10:27 AM

Hi, quick question. I was wondering whether there is a way to get featureCounts to work on Windows 7. Going through R and Bioconductor would be perfect, but it looks like Rsubread does not have a Windows version. Is there any other way?

**shi** · 10-11-2013, 02:29 AM

Originally posted by choseqid

Dear Wei,

I tried that command line, but it still ends in a segmentation fault. I also tried aligning my reads using Subread (which succeeded), but when I ran featureCounts on the resulting SAM file I also got a segmentation fault. The output is the same as the one I attached to a previous post.

Any more ideas?

Hi,

Thank you for trying these options. We have found that featureCounts always works nicely with Subread, so the segmentation fault is likely to be due to some unexpected data in the annotation. We have also received some similar bug reports recently. The 1.3.x version of featureCounts allows up to 60 features to overlap with each other in the annotation, and we found that the program crashed if the number of such features exceeded this limit. This is rare, but it can happen, and we suspect it might be the cause of the segmentation fault seen with your data.

We have removed this limit in the latest version 1.4.0 and hopefully this will solve the problem.
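
To get a rough idea of whether an annotation comes anywhere near that limit, one can count how many features cover the same position. The sketch below is an approximation under stated assumptions, not a reproduction of how featureCounts counts overlaps internally: the function name `max_overlap_depth` is hypothetical, strand is ignored, and the input is assumed to be a tab-delimited GTF with the feature type in column 3 and 1-based inclusive start and end in columns 4 and 5.

```python
from collections import defaultdict


def max_overlap_depth(gtf_lines, feature_type="exon"):
    """Return {chromosome: largest number of features covering one position}."""
    events = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[2] != feature_type:
            continue
        chrom, start, end = fields[0], int(fields[3]), int(fields[4])
        # GTF coordinates are inclusive, so coverage stops at end + 1.
        events[chrom].append((start, 1))
        events[chrom].append((end + 1, -1))

    depth = {}
    for chrom, chrom_events in events.items():
        current = best = 0
        # Sorted tuples place -1 before +1 at the same position,
        # so features that merely touch are not counted as overlapping.
        for _, delta in sorted(chrom_events):
            current += delta
            best = max(best, current)
        depth[chrom] = best
    return depth


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file name): python overlap_depth.py < annotation.gtf
    for chrom, value in sorted(max_overlap_depth(sys.stdin).items()):
        print(chrom, value, sep="\t")
```

A chromosome whose reported depth is well above 60 would be a candidate for the crash described above when using a 1.3.x release.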

Also, if the reads in your BAM file are sorted by chromosomal location, you should include the '-S' option in your command. Omitting it will not crash the program, but it will result in incorrect read counts.

Let me know if the problem persists.

Wei

**shi** · 10-11-2013, 02:39 AM

Originally posted by adaigle

Hi, quick question. I was wondering whether there is a way to get featureCounts to work on Windows 7. Going through R and Bioconductor would be perfect, but it looks like Rsubread does not have a Windows version. Is there any other way?

You are correct that Rsubread does not have a Windows version. It is pretty hard to develop a Windows version of this package because most of the code is written in C. I think we might eventually come up with a Windows version, but it will take a fair bit of time. If you have access to a Unix machine, you can fairly easily use featureCounts via the Bioconductor package Rsubread.

Wei

**bw.** · 11-04-2013, 09:20 PM

Hi,
I would like to use featureCounts, but I miss the stats provided by htseq-count (copied below), as these let me check that I got the 'strand' setting and other things right.
Any chance you could add similar output to featureCounts (either as a separate 'stats.txt' file or as part of the main table)?

no_feature 20123817
ambiguous 9026940
too_low_aQual 0
not_aligned 0
alignment_not_unique 3034042

Thanks
-Ben

**bruce01** · 11-05-2013, 02:05 AM

Ben, I had the same issue, so I put together a command to get this info. It requires you to produce the 'reads' output using the -R flag. Replace the two file names with your own:

cut -f 2 featco.counts.reads | sort | uniq -c > featco.counts.diags

Output looks like:

154266 ACCEPTED_2VOTE_GENE
23169444 ACCEPTED_GENE
40066 MULTI_MAPPING
4470627 NOTFOUND_GENE
100013 OVERLAPPED_GENES
2850 PAIR_DISTANCE

Hope that helps, Bruce.
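
For anyone without a Unix shell at hand, the same tally can be sketched in Python. This is only an illustration: the function name `tally_assignments` is hypothetical, and it assumes, as the cut command above does, that the per-read output is tab-delimited with the assignment status in the second column.

```python
from collections import Counter


def tally_assignments(lines):
    """Count how often each status appears in column 2 of tab-delimited lines."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage (file name as in the command above): python tally.py < featco.counts.reads
    for status, count in sorted(tally_assignments(sys.stdin).items()):
        print(count, status)
```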
