Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • lmusgrove
    replied
    Finding mapped rate for rpkm output

    Hi Brian,

    I'm running BBMap with the rpkm output option and would like to know how to see mapped rate for each read file. I run multiple files consecutively with nohup. Here's my code:

    bbmap.sh ref=data/Assembly.fasta \
    in1=data/clean/A_1.clean.fq.gz \
    in2=data/clean/A_2.clean.fq.gz \
    rpkm=data/fpkm/A.fpkm \
    t=5 &

    bbmap.sh ref=data/Assembly.fasta \
    in1=data/clean/B_1.clean.fq.gz \
    in2=data/clean/B_2.clean.fq.gz \
    rpkm=data/fpkm/B.fpkm \
    t=5

    etc etc

    So far it's worked really well. However, while the stdout file shows the mapped rates it doesn't tell me which read files relate to which stats. It just has a number of repeated --- Results 1 ---- records and I don't know which is which.

    Is there a flag I need to add to ensure I can see which stats are for which files?

    Thanks!

    Lisa

    Leave a comment:


  • mewu3
    replied
    pileup.sh explication

    Hello,

    Can some please kindly explain the output file of pileup.sh ?
    • basecov
    • bincove
    • covstat

    How the coverage is calculated ?

    Leave a comment:


  • mewu3
    replied
    Originally posted by HESmith View Post
    "Exception in thread "main" java.lang.AssertionError: Attempting to output paired reads to different sam files."

    Typically, BBMap tools keep paired reads together. You're attempting to write aligned and unaligned reads to separate files, which violates that function.
    Thank you !

    Leave a comment:


  • GenoMax
    replied
    @mewu3: Since paired-end reads are aligned together you should use a single "out=output.sam". If you wanted to capture unmapped reads into separate files then you would want to do that as "outu1=R1.unmapped.fq outu2=R2.unmapped.fq".

    You may be able to write them out as an unmapped sam file "outu=unmapped.sam" but then again you should use only one output for that. This is untested.

    Leave a comment:


  • kesmarl
    replied
    Thank you very so much !

    I downloaded it

    Leave a comment:


  • HESmith
    replied
    "Exception in thread "main" java.lang.AssertionError: Attempting to output paired reads to different sam files."

    Typically, BBMap tools keep paired reads together. You're attempting to write aligned and unaligned reads to separate files, which violates that function.

    Leave a comment:


  • mewu3
    replied
    [help]

    Hello,

    I am using bbmap on HPC and I get the fallowing message :

    Aligning C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001 reads to fasta file ...
    java -Djava.library.path=/opt/apps/bbtools-37.97/jni/ -ea -Xmx50G -cp /opt/apps/bbtools-37.97/current/ align2.BBWrap build=1 overwrite=true fastareadlen=500 build=1 in1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_se_trim.fastq.gz in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz,null trimreaddescriptions=t outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam outu1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_se.sam outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam threads=8 pairlen=1000 pairedonly=t minid=0.9 mdtag=t xstag=fs nmtag=t sam=1.3 ambiguous=best secondary=t saa=f maxsites=10 -Xmx50G
    Executing align2.BBWrap [build=1, overwrite=true, fastareadlen=500, build=1, in1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_se_trim.fastq.gz, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz,null, trimreaddescriptions=t, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, outu1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_se.sam, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, -Xmx50G]

    Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, build=1, trimreaddescriptions=t, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, in=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz, outu=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam]
    Version 37.97 [build=1, overwrite=true, fastareadlen=500, build=1, trimreaddescriptions=t, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, in=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz, outu=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam]

    Set threads to 8
    Retaining first best site only for ambiguous mappings.
    Set MINIMUM_ALIGNMENT_SCORE_RATIO to 0.816
    Set genome to 1

    Loaded Reference: 3.463 seconds.
    Loading index for chunk 1-1, build 1
    Generated Index: 2.556 seconds.
    Analyzed Index: 7.028 seconds.
    Started output stream: 0.043 seconds.
    Exception in thread "main" java.lang.AssertionError: Attempting to output paired reads to different sam files.
    at stream.ReadStreamWriter.<init>(ReadStreamWriter.java:51)
    at stream.ReadStreamByteWriter.<init>(ReadStreamByteWriter.java:17)
    at stream.ConcurrentGenericReadOutputStream.<init>(ConcurrentGenericReadOutputStream.java:40)
    at stream.ConcurrentReadOutputStream.getStream(ConcurrentReadOutputStream.java:52)
    at stream.ConcurrentReadOutputStream.getStream(ConcurrentReadOutputStream.java:29)
    at align2.AbstractMapper.openStreams(AbstractMapper.java:873)
    at align2.BBMap.testSpeed(BBMap.java:437)
    at align2.BBMap.main(BBMap.java:34)
    at align2.BBWrap.execute(BBWrap.java:144)
    at align2.BBWrap.main(BBWrap.java:22)
    I get the impression that bbmap is stuck on something and i don't know what's wrong with it. Please help !

    mewu3

    Leave a comment:


  • GenoMax
    replied
    @mewu3: Unfortunately it can't. You will need to restart the job.

    Originally posted by mewu3 View Post
    Hello, Brian,

    I am wondering whether bbmap.sh could resume an unfished job.

    mewu3

    Leave a comment:


  • mewu3
    replied
    bbmap.sh unfished job

    Hello, Brian,

    I am wondering whether bbmap.sh could resume an unfished job.

    mewu3

    Leave a comment:


  • GenoMax
    replied
    Originally posted by pck0 View Post
    Hey, did a little test run mapping sequences against a reference fasta that contains two identical sequences called >one and >two, then repeated the process with the sequences renamed to >three and >four. The amount of reads mapping to each were as follows:

    1st run:

    >one: 47,699
    >two: 330

    2nd run:

    >three: 47,688
    >four: 338

    BBmap options were all default, just minidentity = 90 and T=12

    How come that (1) BBmap apparently misses some reads that map on the first sequence and then maps them on the second, identical sequence, and (2) how come the runs give different results?

    Just curious, my apologies if this was addressed elsewhere!

    cheers
    Most NGS aligners are non-deterministic i.e. they will not produce exactly identical results if run multiple times.

    Fortunately, BBMap does have an option to run in deterministic mode.
    Code:
    deterministic=f         Run in deterministic mode.  In this case it is good
                            to set averagepairdist.  BBMap is deterministic
                            without this flag if using single-ended reads,
                            or run singlethreaded.
    You could also run the analysis using just a single thread.

    Leave a comment:


  • pck0
    replied
    Is mapping stochastic?

    Hey, did a little test run mapping sequences against a reference fasta that contains two identical sequences called >one and >two, then repeated the process with the sequences renamed to >three and >four. The amount of reads mapping to each were as follows:

    1st run:

    >one: 47,699
    >two: 330

    2nd run:

    >three: 47,688
    >four: 338

    BBmap options were all default, just minidentity = 90 and T=12

    How come that (1) BBmap apparently misses some reads that map on the first sequence and then maps them on the second, identical sequence, and (2) how come the runs give different results?

    Just curious, my apologies if this was addressed elsewhere!

    cheers

    Leave a comment:


  • OwenLeiser
    replied
    mapPacBio strange behavior

    Hi there, I really like this tool for Illumina reads and am trying it on some PacBio/Nanopore reads using the mapPacBio.sh function.

    I am "successfully" getting aligned bam files out the other end, but the log file lists >99% of reads as Low-Q Discards. I also get unmapped fastq files out (I do this for my own sanity) and these are usually larger than the input files themselves.

    Should I be suspicious of this behavior? I have also tried using the low-quality data suggestion (setting the key to a lower value, adjusting the minimum score ratio, etc), but this did not result in any improvement. I know the input files themselves - subread fastq.gz - are "good enough" for Canu assembly but for I'd also like to align the reads for consensus sequence generation.

    Thanks!

    Leave a comment:


  • GenoMax
    replied
    @Robinsleith: I suggest that you post this as a ticket on BBMap SF page. Brian no longer visits SeqAnswers regularly. He would be the only person who can answer this.

    Leave a comment:


  • robinsleith
    replied
    Change in minid implementation?

    Hello, I recently noticed that newer versions >=38.68 do not seem to implement minid in the same way as previous versions.

    For the same command (same minid, 99), same reads, same reference, with versions <38.68 I get 0 mated pairs mapped, 4 Read 1 mapped, and 6 Read 2 mapped.

    Code:
    bbmap.sh nodisk minid=99 mappedonly=T threads=4 ref=parcu18S.fasta in=LKH421_FPE_q24_minlen100.fastq.gz in2=LKH421_RPE_q24_minlen100.fastq.gz out=stdout.sam | reformat.sh in=stdin.sam out=stdout.sam minlength=80 | samtools view -h -b -S | samtools sort >67_99parcu.bam
    For versions >=38.68 I get 94 mated pairs mapped, 94 Read 1 reads mapped, and 94 Read 2 reads mapped.

    Is the minid command just not working in newer versions or is there a new detail I am missing?
    Thanks!

    Leave a comment:


  • SNPsaurus
    replied
    Did you try the suggested "Please manually set qin=33 or qin=64"?

    Leave a comment:

Latest Articles

Collapse

  • seqadmin
    Current Approaches to Protein Sequencing
    by seqadmin


    Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
    04-04-2024, 04:25 PM
  • seqadmin
    Strategies for Sequencing Challenging Samples
    by seqadmin


    Despite advancements in sequencing platforms and related sample preparation technologies, certain sample types continue to present significant challenges that can compromise sequencing results. Pedro Echave, Senior Manager of the Global Business Segment at Revvity, explained that the success of a sequencing experiment ultimately depends on the amount and integrity of the nucleic acid template (RNA or DNA) obtained from a sample. “The better the quality of the nucleic acid isolated...
    03-22-2024, 06:39 AM

ad_right_rmr

Collapse

News

Collapse

Topics Statistics Last Post
Started by seqadmin, 04-11-2024, 12:08 PM
0 responses
30 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-10-2024, 10:19 PM
0 responses
32 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-10-2024, 09:21 AM
0 responses
28 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-04-2024, 09:00 AM
0 responses
52 views
0 likes
Last Post seqadmin  
Working...
X