Thanks for your quick reply. I ran a test run with both the references in the same command.
Introducing BBSplit: Read Binning Tool for Metagenomes and Contaminated Libraries
-
Hi Brian,
I'm trying to use BBSplit to separate RNA-seq reads from two mixed fungal samples, using the individual transcriptomes as references. I was getting some unexpected results: more reads seemed to map unambiguously to whichever reference is listed first, so I swapped the order of the references and the results changed dramatically. I have ambiguous2=toss set, but it seems like it's still using the first best site. Below are my commands and refstats output. Is there anything I'm doing wrong?
Thanks,
Brian
Code:
bbsplit.sh ref=53.fasta,17.fasta \
    in=53_30_r1_S7_R1_001.fastq.gz in2=53_30_r1_S7_R2_001.fastq.gz \
    out_17=map17_53_30_r1_S7_R#_001.fastq.gz \
    out_53=map53_53_30_r1_S7_R#_001.fastq.gz \
    refstats=53_30_r1_S7.stats ambiguous2=toss

#name	%unambiguousReads	unambiguousMB	%ambiguousReads	ambiguousMB	unambiguousReads	ambiguousReads
53	41.51013	1625.01508	57.30665	2219.25878	11241396	15519266
17	1.13394	44.03152	57.30665	2219.25878	307084	15519266

bbsplit.sh ref=17.fasta,53.fasta \
    in=53_30_r1_S7_R1_001.fastq.gz in2=53_30_r1_S7_R2_001.fastq.gz \
    out_17=map17_53_30_r1_S7_R#_001.fastq.gz \
    out_53=map53_53_30_r1_S7_R#_001.fastq.gz \
    refstats=53_30_r1_S7.stats2 ambiguous2=toss

#name	%unambiguousReads	unambiguousMB	%ambiguousReads	ambiguousMB	unambiguousReads	ambiguousReads
53	21.37940	838.36051	67.54242	2623.22348	5789774	18291224
17	11.02890	426.72088	67.54242	2623.22348	2986746	18291224
Last edited by GenoMax; 08-20-2018, 08:03 AM.
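For anyone comparing two refstats files like the ones above, here is a minimal sketch that pulls out only the read-count columns so the two reference orders can be read side by side. The temporary file names and the helper name `summarize_refstats` are made up for illustration; the numbers are copied from the post, and the column positions are assumed to match the header shown there.

```shell
#!/bin/sh
# Recreate the two refstats tables from the post (whitespace-separated).
workdir=$(mktemp -d)
cat > "$workdir/order_53_17.stats" <<'EOF'
#name %unambiguousReads unambiguousMB %ambiguousReads ambiguousMB unambiguousReads ambiguousReads
53 41.51013 1625.01508 57.30665 2219.25878 11241396 15519266
17 1.13394 44.03152 57.30665 2219.25878 307084 15519266
EOF
cat > "$workdir/order_17_53.stats" <<'EOF'
#name %unambiguousReads unambiguousMB %ambiguousReads ambiguousMB unambiguousReads ambiguousReads
53 21.37940 838.36051 67.54242 2623.22348 5789774 18291224
17 11.02890 426.72088 67.54242 2623.22348 2986746 18291224
EOF

# summarize_refstats FILE (hypothetical helper):
# skip the '#name' header and print reference name, unambiguousReads (column 6)
# and ambiguousReads (column 7), tab-separated.
summarize_refstats() {
  awk '!/^#/ && NF >= 7 { printf "%s\t%s\t%s\n", $1, $6, $7 }' "$1"
}

for f in "$workdir"/order_*.stats; do
  echo "== $(basename "$f")"
  summarize_refstats "$f"
done
```

This only restates the numbers already in the post; it does not explain why the counts shift when the reference order changes.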
-
Contamination from human genome?
Hi,
I am working with RNA-seq data from a non-model fish and am considering removing human contamination from the reads. Is this feasible, given the number of orthologs shared between human and fish?
Is there a recommended value of "-minratio" for this case? It seems that 0.56 may be too low. (I don't have a reference genome for this non-model fish, by the way.)
P.S. I think the sensitivity/specificity strategy should differ between binning (two references, i.e. host vs. contaminant, so alignment scores can be compared between them) and decontamination (only the contaminant reference is available, so the judgement rests solely on the alignment to that reference).
Thank you very much for your suggestions!
-
Question about BBsplit ambig2=toss and bam files
Hello!
I am using BBsplit to separate reads from a paired-end, three-species bacterial RNA-seq project. I set the flag ambig2=toss, but then I see this sentence in the printed output of the run:
"Retaining first best site only for ambiguous mappings."
To me, that looks like the default ambiguous=best. Is that what I should be seeing? How do I know if the ambiguous reads are being tossed?
Additionally, I am mapping directly into a BAM file. From earlier posts, it looks like BBsplit BAM files are incompatible with IGV, but would they be okay with a feature counter like HTSeq or with edgeR?
Thanks very much,
Amanda
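One rough way to check whether reads are being left out of the per-reference bins is to count records in the input and in each output and compare. The sketch below assumes FASTQ outputs (not BAM) with the usual four lines per record; the helper name `count_fastq_reads` and the tiny demo files are hypothetical. A shortfall only shows that some reads were not binned; it does not say whether they were unmapped or tossed as ambiguous.

```shell
#!/bin/sh
# count_fastq_reads FILE (hypothetical helper):
# number of records in a plain or gzip-compressed FASTQ file,
# assuming exactly four lines per record.
count_fastq_reads() {
  case "$1" in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac | awk 'END { print NR / 4 }'
}

# Tiny made-up files standing in for the input and one binned output.
demo=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > "$demo/in.fastq"
printf '@r1\nACGT\n+\nIIII\n' > "$demo/out_A.fastq"

in_n=$(count_fastq_reads "$demo/in.fastq")
out_n=$(count_fastq_reads "$demo/out_A.fastq")
echo "input=$in_n binned=$out_n not_binned=$((in_n - out_n))"
```

With real data, the per-reference outputs would be summed before subtracting from the input count.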
-
@Amanda: I will need to dig through some past correspondence with Brian, but I think he had recommended splitting first and then mapping, to avoid the problem of having all references present in the BAM file, which does indeed cause issues with visualization programs.
If you look at the in-line help for "ambiguous2" you can see what it is doing:
Code:
ambiguous2=<best>   Set behavior only for reads that map ambiguously to multiple different references.
                    Normal 'ambiguous=' controls behavior on all ambiguous reads;
                    Ambiguous2 excludes reads that map ambiguously within a single reference.
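Since the help text treats `ambiguous=` and `ambiguous2=` as separate settings, it may help to write both out explicitly rather than rely on defaults. The sketch below only prints candidate command lines and does not run BBSplit. The helper name `print_bbsplit_cmd` and the read and output file names are placeholders, and whether the "Retaining first best site only" message reflects the `ambiguous=` setting rather than `ambiguous2=` is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# print_bbsplit_cmd AMBIGUOUS AMBIGUOUS2 (hypothetical helper):
# echo, but do not execute, a bbsplit.sh command with both settings spelled out.
# File names are placeholders modelled on the commands earlier in the thread.
print_bbsplit_cmd() {
  echo "bbsplit.sh ref=17.fasta,53.fasta in=reads_R1.fastq.gz in2=reads_R2.fastq.gz" \
       "out_17=map17_R#.fastq.gz out_53=map53_R#.fastq.gz" \
       "refstats=refstats.txt ambiguous=$1 ambiguous2=$2"
}

# Within-reference ambiguity keeps the best site; cross-reference ambiguity is tossed.
print_bbsplit_cmd best toss
# Both kinds of ambiguous reads are tossed.
print_bbsplit_cmd toss toss
```

Running each variant on a small subset of reads and comparing the refstats output would show which setting actually changes the counts.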