**rajarapupriya** · 06-01-2018, 10:09 AM

Thanks for your quick reply. I ran a test with both references in the same command.

**kcamnairb** · 08-20-2018, 05:59 AM

Hi Brian,

I'm trying to use BBSplit to separate RNA-seq reads from two mixed fungal samples, using the individual transcriptomes as references. I was getting some unexpected results: more reads seemed to map unambiguously to whichever reference is listed first, so I swapped the order of the references and the results changed dramatically. I have ambiguous2=toss set, but it seems like it's still using the first best site. Below are my commands and refstats output. Is there anything I'm doing wrong?

Thanks,
Brian

Code:

bbsplit.sh ref=53.fasta,17.fasta \
        in=53_30_r1_S7_R1_001.fastq.gz in2=53_30_r1_S7_R2_001.fastq.gz \
        out_17=map17_53_30_r1_S7_R#_001.fastq.gz \
        out_53=map53_53_30_r1_S7_R#_001.fastq.gz \
        refstats=53_30_r1_S7.stats ambiguous2=toss

#name	%unambiguousReads	unambiguousMB	%ambiguousReads	ambiguousMB	unambiguousReads	ambiguousReads
53	41.51013	1625.01508	57.30665	2219.25878	11241396	15519266
17	1.13394	44.03152	57.30665	2219.25878	307084	15519266        
        
bbsplit.sh ref=17.fasta,53.fasta \
        in=53_30_r1_S7_R1_001.fastq.gz in2=53_30_r1_S7_R2_001.fastq.gz \
        out_17=map17_53_30_r1_S7_R#_001.fastq.gz \
        out_53=map53_53_30_r1_S7_R#_001.fastq.gz \
        refstats=53_30_r1_S7.stats2 ambiguous2=toss

#name	%unambiguousReads	unambiguousMB	%ambiguousReads	ambiguousMB	unambiguousReads	ambiguousReads
53	21.37940	838.36051	67.54242	2623.22348	5789774	18291224
17	11.02890	426.72088	67.54242	2623.22348	2986746	18291224
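To make the size of the order effect explicit, the two refstats tables above can be compared column by column. The following is a minimal sketch, assuming the refstats output is tab-separated with the header shown in the post; the helper names `parse_refstats` and `compare_runs` are made up for illustration and are not part of BBTools.

```python
# Hypothetical helpers for comparing two BBSplit refstats tables.
# The numbers are copied from the two runs posted above.

HEADER = ("#name\t%unambiguousReads\tunambiguousMB\t%ambiguousReads\t"
          "ambiguousMB\tunambiguousReads\tambiguousReads")

# Run 1: ref=53.fasta,17.fasta
RUN_53_FIRST = "\n".join([
    HEADER,
    "53\t41.51013\t1625.01508\t57.30665\t2219.25878\t11241396\t15519266",
    "17\t1.13394\t44.03152\t57.30665\t2219.25878\t307084\t15519266",
])

# Run 2: ref=17.fasta,53.fasta
RUN_17_FIRST = "\n".join([
    HEADER,
    "53\t21.37940\t838.36051\t67.54242\t2623.22348\t5789774\t18291224",
    "17\t11.02890\t426.72088\t67.54242\t2623.22348\t2986746\t18291224",
])


def parse_refstats(text):
    """Parse refstats text into {reference_name: {count_column: int}}."""
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    header = lines[0].lstrip("#").split("\t")
    table = {}
    for ln in lines[1:]:
        row = dict(zip(header, ln.split("\t")))
        table[row["name"]] = {
            "unambiguousReads": int(row["unambiguousReads"]),
            "ambiguousReads": int(row["ambiguousReads"]),
        }
    return table


def compare_runs(first, second):
    """Return per-reference differences (second minus first) for each count."""
    return {
        name: {col: second[name][col] - first[name][col] for col in first[name]}
        for name in first
    }


run1 = parse_refstats(RUN_53_FIRST)
run2 = parse_refstats(RUN_17_FIRST)
for name, delta in compare_runs(run1, run2).items():
    print(name, delta)
```

With the posted numbers, swapping the order moves several million reads between the unambiguous and ambiguous columns for both references, which is the change described in the question.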

**phuongbigbig** · 10-08-2018, 01:48 AM

Contamination from human genome?

Hi,

I am working on RNA-seq data from a non-model fish, and I am considering removing human contamination from the reads. Is this feasible, given that there are a number of orthologs between human and fish?
Is there any recommendation regarding the choice of "-minratio" for this case? It seems that 0.56 may be too low. (I don't have a reference genome for this non-model fish, by the way.)

P.S. I think the strategy for balancing sensitivity and specificity should differ between binning (two references, i.e. host vs. contaminant, so the alignment scores to both can be compared) and decontamination (only the contaminant reference is available, so the judgement is based solely on alignment to the contaminant).

Thank you very much for your suggestions!
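The distinction drawn in the P.S. can be illustrated with a toy decision rule. This is only a sketch of the idea, not how BBSplit or BBMap score reads: the functions `bin_read` and `decontaminate_read` are hypothetical, and the scores are assumed to be normalized to the range 0 to 1 (a fraction of the best possible alignment score).

```python
# Toy illustration of binning vs. decontamination decisions.
# Assumption: scores are normalized to [0, 1]; this is not BBMap's scoring code.

def bin_read(score_host, score_contaminant):
    """Binning: two references exist, so the two scores can be compared."""
    if score_host > score_contaminant:
        return "host"
    if score_contaminant > score_host:
        return "contaminant"
    return "ambiguous"


def decontaminate_read(score_contaminant, min_ratio):
    """Decontamination: only the contaminant reference exists,
    so the decision rests on a single cutoff."""
    return "remove" if score_contaminant >= min_ratio else "keep"


# A read from a conserved ortholog: aligns well to the contaminant,
# but (hypothetically) even better to the host.
ortholog_vs_host = 0.95
ortholog_vs_contaminant = 0.80

print(bin_read(ortholog_vs_host, ortholog_vs_contaminant))   # host
print(decontaminate_read(ortholog_vs_contaminant, 0.56))     # remove
print(decontaminate_read(ortholog_vs_contaminant, 0.90))     # keep
```

In this toy model the same ortholog read is retained when a host reference is available for comparison, but is discarded by a permissive single cutoff, which is the concern raised about using a low cutoff without a host reference.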

**ahurley2** · 03-27-2020, 12:49 PM

Question about BBsplit ambig2=toss and bam files

Hello!

I am using BBSplit to separate reads from a paired-end, three-species bacterial RNA-seq project. I set the flag ambig2=toss, but then I see this sentence in the program's output:

"Retaining first best site only for ambiguous mappings."

To me, that looks like the default, ambiguous=best. Is that what I should be seeing? How do I know whether the ambiguous reads are being tossed?

Additionally, I am mapping directly into a BAM file. From earlier posts, it looks like BBSplit BAM files are incompatible with IGV, but would they be okay with a feature counter like HTSeq or edgeR?

Thanks very much,
Amanda

**GenoMax** · 03-28-2020, 03:54 AM

@Amanda: I will need to dig through some past correspondence with Brian, but I think he recommended splitting first and then mapping, to avoid having all the references present in the BAM file, which does indeed cause issues with visualization programs.

If you look at the in-line help for "ambiguous2" you can see what it is doing:

Code:

ambiguous2=<best>    Set behavior only for reads that map ambiguously to multiple different references.
                     Normal 'ambiguous=' controls behavior on all ambiguous reads;
                     Ambiguous2 excludes reads that map ambiguously within a single reference.
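One way to read that help text is as a three-way classification of each read by where its best-scoring sites fall. The sketch below is a toy model of that reading only; `classify_read` is a hypothetical function, not BBSplit's implementation, and the reference names are placeholders.

```python
# Toy model of the distinction described in the ambiguous2 help text.
# Assumption: a read's equally best-scoring sites are given as
# (reference_name, position) tuples. Not BBSplit's actual code.

def classify_read(top_sites):
    """Classify a read by where its equally best-scoring sites fall."""
    if not top_sites:
        return "unmapped"
    if len(top_sites) == 1:
        return "unique"
    references = {ref for ref, _position in top_sites}
    if len(references) == 1:
        # Several best sites, all in one reference: per the help text,
        # ambiguous2 excludes these reads; ambiguous= still covers them.
        return "ambiguous_within_reference"
    # Best sites in two or more references: the case ambiguous2= is for.
    return "ambiguous_across_references"


print(classify_read([("refA", 100)]))                 # unique
print(classify_read([("refA", 100), ("refA", 900)]))  # ambiguous_within_reference
print(classify_read([("refA", 100), ("refB", 42)]))   # ambiguous_across_references
```

Under this reading, a log message about the `ambiguous=` setting would not by itself show that `ambiguous2=toss` was ignored, though I have not checked that against the BBSplit source.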

**Yumeko** · 12-19-2023, 08:34 PM

Hi there,
I am trying to run BBSplit on a huge chromosome-level reference genome assembly (~24 Gb) together with its contigs that were not assembled to chromosome level (ca. 1 Gb), using the following command on a remote server (I set the maximum memory use on the server to 64G).

bbsplit.sh -Xmx40g ambiguous=toss ambiguous2=toss in1=HKs_fq/HK002_L1_1_trimmed.fastq.gz in2=HKs_fq/HK002_L1_2_trimmed.fastq.gz ref=P.tabuliformis_V1.0_contig.fa,P.tabuliformis_V1.0_chr.fa basename=out_%_#.fq.gz

But the reference-merging step produces a much smaller (8 Gb) FASTA, and the mapping step also fails with the following error:

Exception in thread "main" java.lang.AssertionError: Resizing to an non-longer array (2147483627); probable array size overflow.
    at structures.ByteBuilder.expand(ByteBuilder.java:606)
    at structures.ByteBuilder.append(ByteBuilder.java:379)
    at dna.FastaToChromArrays2.nextScaffold(FastaToChromArrays2.java:539)
    at dna.FastaToChromArrays2.makeNextChrom(FastaToChromArrays2.java:460)
    at dna.FastaToChromArrays2.makeChroms(FastaToChromArrays2.java:345)
    at dna.FastaToChromArrays2.main2(FastaToChromArrays2.java:153)
    at align2.RefToIndex.makeIndex(RefToIndex.java:147)
    at align2.BBMap.setup(BBMap.java:280)
    at align2.AbstractMapper.<init>(AbstractMapper.java:58)
    at align2.BBMap.<init>(BBMap.java:42)
    at align2.BBMap.main(BBMap.java:30)
    at align2.BBSplitter.main(BBSplitter.java:48)

Is there any way for me to handle this large genome so that the merging and mapping proceed correctly?
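The size in the error message (2147483627) sits just below 2^31 - 1 = 2147483647, the largest length a Java array can have, and the trace shows the failure while a single scaffold is being read (`FastaToChromArrays2.nextScaffold`). One possible explanation, not confirmed here, is that at least one sequence in the reference is longer than that limit. A minimal sketch for checking this, assuming plain or gzipped FASTA input; the function names are hypothetical and not part of BBTools:

```python
# Sketch: list FASTA records longer than a given limit.
# JAVA_MAX_ARRAY is the hard upper bound on Java array length (2**31 - 1).
import gzip
import io

JAVA_MAX_ARRAY = 2**31 - 1


def open_fasta(path):
    """Open a FASTA file as text, transparently handling .gz files."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def sequence_lengths(handle):
    """Yield (record_name, length) for each record, streaming line by line."""
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name, length = (parts[0] if parts else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def oversized(handle, limit=JAVA_MAX_ARRAY):
    """Return the records whose length exceeds the limit."""
    return [(n, l) for n, l in sequence_lengths(handle) if l > limit]


# Small in-memory demonstration with an artificially low limit.
demo = io.StringIO(">chr1 demo\nACGT\nACGT\n>contig1\nAC\n")
print(oversized(demo, limit=5))
```

On the real data this would be run as `with open_fasta("P.tabuliformis_V1.0_chr.fa") as fh: print(oversized(fh))`. If any chromosome is reported, splitting it into shorter pieces before indexing may be worth trying, though whether that resolves this particular error is an assumption.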
