BBMap (aligner for DNA/RNAseq) is now open-source and available for download.


  • mewu3
    replied
    [help]

    Hello,

    I am using bbmap on HPC and I get the following message:

    Aligning C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001 reads to fasta file ...
    java -Djava.library.path=/opt/apps/bbtools-37.97/jni/ -ea -Xmx50G -cp /opt/apps/bbtools-37.97/current/ align2.BBWrap build=1 overwrite=true fastareadlen=500 build=1 in1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_se_trim.fastq.gz in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz,null trimreaddescriptions=t outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam outu1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_se.sam outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam threads=8 pairlen=1000 pairedonly=t minid=0.9 mdtag=t xstag=fs nmtag=t sam=1.3 ambiguous=best secondary=t saa=f maxsites=10 -Xmx50G
    Executing align2.BBWrap [build=1, overwrite=true, fastareadlen=500, build=1, in1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_se_trim.fastq.gz, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz,null, trimreaddescriptions=t, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, outu1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_se.sam, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, -Xmx50G]

    Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, build=1, trimreaddescriptions=t, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, in=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz, outu=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam]
    Version 37.97 [build=1, overwrite=true, fastareadlen=500, build=1, trimreaddescriptions=t, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, in=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz, outu=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam]

    Set threads to 8
    Retaining first best site only for ambiguous mappings.
    Set MINIMUM_ALIGNMENT_SCORE_RATIO to 0.816
    Set genome to 1

    Loaded Reference: 3.463 seconds.
    Loading index for chunk 1-1, build 1
    Generated Index: 2.556 seconds.
    Analyzed Index: 7.028 seconds.
    Started output stream: 0.043 seconds.
    Exception in thread "main" java.lang.AssertionError: Attempting to output paired reads to different sam files.
    at stream.ReadStreamWriter.<init>(ReadStreamWriter.java:51)
    at stream.ReadStreamByteWriter.<init>(ReadStreamByteWriter.java:17)
    at stream.ConcurrentGenericReadOutputStream.<init>(ConcurrentGenericReadOutputStream.java:40)
    at stream.ConcurrentReadOutputStream.getStream(ConcurrentReadOutputStream.java:52)
    at stream.ConcurrentReadOutputStream.getStream(ConcurrentReadOutputStream.java:29)
    at align2.AbstractMapper.openStreams(AbstractMapper.java:873)
    at align2.BBMap.testSpeed(BBMap.java:437)
    at align2.BBMap.main(BBMap.java:34)
    at align2.BBWrap.execute(BBWrap.java:144)
    at align2.BBWrap.main(BBWrap.java:22)
    I get the impression that bbmap is stuck on something and I don't know what's wrong with it. Please help!

    mewu3



  • GenoMax
    replied
    @mewu3: Unfortunately it can't. You will need to restart the job.

    Originally posted by mewu3 View Post
    Hello, Brian,

    I am wondering whether bbmap.sh could resume an unfinished job.

    mewu3



  • mewu3
    replied
    bbmap.sh unfinished job

    Hello, Brian,

    I am wondering whether bbmap.sh could resume an unfinished job.

    mewu3



  • GenoMax
    replied
    Originally posted by pck0 View Post
    Hey, did a little test run mapping sequences against a reference fasta that contains two identical sequences called >one and >two, then repeated the process with the sequences renamed to >three and >four. The number of reads mapping to each was as follows:

    1st run:

    >one: 47,699
    >two: 330

    2nd run:

    >three: 47,688
    >four: 338

    BBmap options were all default, just minidentity = 90 and T=12

    How come (1) BBMap apparently misses some reads that map to the first sequence and then maps them to the second, identical sequence, and (2) the runs give different results?

    Just curious, my apologies if this was addressed elsewhere!

    cheers

    Most NGS aligners are non-deterministic, i.e. they will not produce exactly identical results if run multiple times.

    Fortunately, BBMap does have an option to run in deterministic mode.
    Code:
    deterministic=f         Run in deterministic mode.  In this case it is good
                            to set averagepairdist.  BBMap is deterministic
                            without this flag if using single-ended reads,
                            or run singlethreaded.
    You could also run the analysis using just a single thread.
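
    As a rough sketch of the two options mentioned above: the file names ref.fa and reads.fq and the output names are placeholders, not taken from pck0's run, while deterministic and threads are the BBMap options quoted in this thread. The loop only prints the commands unless bbmap.sh and both input files are actually present.

```shell
# Hypothetical file names; replace with your own reference and reads.
ref=ref.fa
reads=reads.fq

# Option 1: keep multithreading but request deterministic output.
cmd_deterministic="bbmap.sh ref=$ref in=$reads out=mapped_det.sam deterministic=t threads=12"
# Option 2: use a single thread, which the help text says is deterministic by itself.
cmd_single="bbmap.sh ref=$ref in=$reads out=mapped_t1.sam threads=1"

# Run only if BBMap and the inputs are available; otherwise just print the commands.
for cmd in "$cmd_deterministic" "$cmd_single"; do
    if command -v bbmap.sh >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$reads" ]; then
        $cmd
    else
        echo "would run: $cmd"
    fi
done
```

    For paired reads, the help text above also suggests setting averagepairdist when deterministic mode is on.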



  • pck0
    replied
    Is mapping stochastic?

    Hey, did a little test run mapping sequences against a reference fasta that contains two identical sequences called >one and >two, then repeated the process with the sequences renamed to >three and >four. The number of reads mapping to each was as follows:

    1st run:

    >one: 47,699
    >two: 330

    2nd run:

    >three: 47,688
    >four: 338

    BBmap options were all default, just minidentity = 90 and T=12

    How come (1) BBMap apparently misses some reads that map to the first sequence and then maps them to the second, identical sequence, and (2) the runs give different results?

    Just curious, my apologies if this was addressed elsewhere!

    cheers



  • OwenLeiser
    replied
    mapPacBio strange behavior

    Hi there, I really like this tool for Illumina reads and am trying it on some PacBio/Nanopore reads using the mapPacBio.sh function.

    I am "successfully" getting aligned bam files out the other end, but the log file lists >99% of reads as Low-Q Discards. I also get unmapped fastq files out (I do this for my own sanity) and these are usually larger than the input files themselves.

    Should I be suspicious of this behavior? I have also tried using the low-quality data suggestion (setting the key to a lower value, adjusting the minimum score ratio, etc.), but this did not result in any improvement. I know the input files themselves - subread fastq.gz - are "good enough" for Canu assembly, but I'd also like to align the reads for consensus sequence generation.

    Thanks!



  • GenoMax
    replied
    @Robinsleith: I suggest that you post this as a ticket on the BBMap SourceForge page. Brian no longer visits SeqAnswers regularly. He would be the only person who can answer this.



  • robinsleith
    replied
    Change in minid implementation?

    Hello, I recently noticed that newer versions >=38.68 do not seem to implement minid in the same way as previous versions.

    For the same command (same minid, 99), same reads, same reference, with versions <38.68 I get 0 mated pairs mapped, 4 Read 1 mapped, and 6 Read 2 mapped.

    Code:
    bbmap.sh nodisk minid=99 mappedonly=T threads=4 ref=parcu18S.fasta in=LKH421_FPE_q24_minlen100.fastq.gz in2=LKH421_RPE_q24_minlen100.fastq.gz out=stdout.sam | reformat.sh in=stdin.sam out=stdout.sam minlength=80 | samtools view -h -b -S | samtools sort >67_99parcu.bam
    For versions >=38.68 I get 94 mated pairs mapped, 94 Read 1 reads mapped, and 94 Read 2 reads mapped.

    Is the minid command just not working in newer versions or is there a new detail I am missing?
    Thanks!



  • SNPsaurus
    replied
    Did you try the suggested "Please manually set qin=33 or qin=64"?
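
    A minimal sketch for checking which offset a file appears to use before choosing qin=33 or qin=64, so that a pipeline could set the value per file. The function name guess_qin and the cutoff of 59 are assumptions for illustration (characters below ASCII 59 do not occur in ASCII-64 encoded qualities). It expects an uncompressed FASTQ with four lines per record on standard input.

```shell
# guess_qin: read FASTQ on stdin, print the lowest quality character code
# followed by a guessed offset (33 if any code is below 59, otherwise 64).
guess_qin() {
    awk '
        BEGIN {
            # Lookup table from printable character to its ASCII code.
            for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i
            min = 127
        }
        NR % 4 == 0 {
            # Every fourth line of a record is the quality string.
            n = length($0)
            for (j = 1; j <= n; j++) {
                ch = substr($0, j, 1)
                if ((ch in ord) && ord[ch] < min) min = ord[ch]
            }
        }
        END { print min, (min < 59 ? 33 : 64) }'
}
```

    For example, gzip -cd SRR1_1.fastq.gz | head -n 40000 | guess_qin prints the lowest code seen in the first 10,000 reads and the guess. A guess of 33 is firm; a guess of 64 only means that no low characters were seen in the sampled reads, so it is worth scanning more of the file before relying on it.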



  • kmavrommatis
    replied
    reformat.sh Failed to auto-detect quality coding

    Hi,
    when I am running reformat.sh on a set of fastq files downloaded from SRA

    reformat.sh maxcalledquality=40 out=SRR1_R1.fq.gz qout=33 mincalledquality=2 out2=SRR1_R2.fq.gz qin=auto fixjunk=t in=/RNA-Seq/Raw/fastq.20190712/SRR1_1.fastq.gz in2=/RNA-Seq/Raw/fastq.20190712/SRR1_2.fastq.gz

    The program crashes with the following error:

    Set INTERLEAVED to false
    Changed from ASCII-33 to ASCII-64 on input quality 64 for base N while prescanning.
    Changed from ASCII-64 to ASCII-33 on input quality 35 while prescanning.
    Exception in thread "main" java.lang.AssertionError: Failed to auto-detect quality coding; quitting. Please manually set qin=33 or qin=64.
    at stream.FASTQ.testQuality(FASTQ.java:218)
    at stream.FASTQ.isInterleaved(FASTQ.java:129)
    at stream.ConcurrentReadInputStream.getReadInputStream(ConcurrentReadInputStream.java:119)
    at stream.ConcurrentReadInputStream.getReadInputStream(ConcurrentReadInputStream.java:55)
    at jgi.ReformatReads.process(ReformatReads.java:377)
    at jgi.ReformatReads.main(ReformatReads.java:45)
    The version of BBMAP used is 37.64.
    Any advice on how to address this issue?
    This is part of a bigger pipeline and the intention is to apply it on 100s of files.
    Thanks in advance for your help.



  • kmavrommatis
    replied
    reformat.sh Failed to auto-detect quality coding

    Hi,
    when I process some fastq files from SRA using the following command



    reformat.sh maxcalledquality=40 out=SRR1_R1.fq.gz qout=33 mincalledquality=2 out2=SRR1_R2.fq.gz qin=auto fixjunk=t in=/RNA-Seq/Raw/fastq.20190712/SRR1_1.fastq.gz in2=/RNA-Seq/Raw/fastq.20190712/SRR1_2.fastq.gz

    reformat crashes with the following error:


    Set INTERLEAVED to false
    Changed from ASCII-33 to ASCII-64 on input quality 64 for base N while prescanning.
    Changed from ASCII-64 to ASCII-33 on input quality 35 while prescanning.
    Exception in thread "main" java.lang.AssertionError: Failed to auto-detect quality coding; quitting. Please manually set qin=33 or qin=64.
    at stream.FASTQ.testQuality(FASTQ.java:218)
    at stream.FASTQ.isInterleaved(FASTQ.java:129)
    at stream.ConcurrentReadInputStream.getReadInputStream(ConcurrentReadInputStream.java:119)
    at stream.ConcurrentReadInputStream.getReadInputStream(ConcurrentReadInputStream.java:55)
    at jgi.ReformatReads.process(ReformatReads.java:377)
    at jgi.ReformatReads.main(ReformatReads.java:45)

    Any suggestions on how to address this problem?
    I would prefer to avoid checking each file individually with a different tool and setting the qin value for each. This is part of a bigger pipeline that is intended to be applied to 100s of samples.

    Thanks in advance for your help



  • tamu_anand
    replied
    Has anyone used bbmap for QuantSeq data analysis (more precisely, the QuantSeq FWD protocol)? The Lexogen website recommends bbduk for quality trimming and suggests the use of STAR for downstream analysis.

    Is it possible to do something similar (to STAR) with bbmap? In other words, is there an analogous bbmap command similar to how one does mapping with STAR (using the genome index and gtf together)?

    Thanks in advance.



  • darencard
    replied
    Divide by 0 error in randomreads.sh

    I am having an issue with randomreads.sh that I cannot make sense of myself.

    I am using this tool to try to extract a random subset of a genome. Most tools subset by selecting some proportion of sequences, but I want to randomly sample pieces of randomly-sampled sequences. So read simulators seem to be the better option for this.

    In this case, I'm trying to sample a (giant!) salamander genome from NCBI. For now I just have some arbitrary length/number settings, as follows:

    randomreads.sh ref=GCA_002915635.2_ASM291563v2_genomic.fna out=test.fq reads=100 minlength=50000 maxlength=500000 seed=5 banns=t adderrors=f

    As the command shows, I do not want variants or errors added in at all; the sequences should be identical to the reference genome.

    Here is the output I'm getting, which indicates some sort of 'divide by 0' error. Hopefully someone can help me diagnose and overcome this issue.

    Executing align2.RandomReads3 [build=1, ref=GCA_002915635.2_ASM291563v2_genomic.fna, out=test.fq, reads=100, minlength=50000, maxlength=500000, seed=5, banns=t, adderrors=f]

    Writing reference.
    Executing dna.FastaToChromArrays2 [GCA_002915635.2_ASM291563v2_genomic.fna, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=500, nodisk=false]

    Set genScaffoldInfo=true
    Exception in thread "main" java.lang.ArithmeticException: / by zero
    at align2.RandomReads3.fillRandomChrom(RandomReads3.java:1758)
    at align2.RandomReads3.<init>(RandomReads3.java:585)
    at align2.RandomReads3.main(RandomReads3.java:389)

    Thanks!
    Daren



  • seqmore
    replied
    additional information:
    The outputs from both sort methods are listed below. Sorry to put so many words here. In fact, I have been tortured by this error for several weeks but cannot figure it out by myself. I would be very grateful if GenoMax or Brian or anyone could shed light on this issue. Thanks a lot!

    sort method #1:
    $sort -k 3,3 -k 4,4n bbmap.bam >bbmap.bam.sort

    $cufflinks bbmap.bam.sort -G /mnt/e/database/ensembl_grch38_gtf/Homo_sapiens.GRCh38.95.chr_patch_hapl_scaff.gtf -o cuff_out
    Warning: Could not connect to update server to verify current version. Please check at the Cufflinks website (http://cufflinks.cbcb.umd.edu).
    [bam_header_read] EOF marker is absent. The input is probably truncated.
    [bam_header_read] invalid BAM binary header (this is not a BAM file).
    File bbmap.bam.sort doesn't appear to be a valid BAM file, trying SAM...
    [10:04:49] Loading reference annotation.
    [10:05:15] Inspecting reads and determining fragment length distribution.
    SAM error on line 148483: CIGAR op has zero length
    SAM error on line 171044: CIGAR op has zero length
    SAM error on line 172120: CIGAR op has zero length
    SAM error on line 173571: CIGAR op has zero length
    SAM error on line 186806: CIGAR op has zero length
    [unprintable binary data echoed to the terminal; omitted]
    SAM error on line 191159: CIGAR op has zero length
    SAM error on line 199865: CIGAR op has zero length
    SAM error on line 213871: CIGAR op has zero length
    > Processed 37488 loci. [*************************] 100%
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    [10:05:19] Estimating transcript abundances.
    SAM error on line 401632: CIGAR op has zero length
    SAM error on line 424193: CIGAR op has zero length
    SAM error on line 425269: CIGAR op has zero length
    SAM error on line 426720: CIGAR op has zero length
    SAM error on line 439955: CIGAR op has zero length
    [unprintable binary data echoed to the terminal; omitted]
    SAM error on line 444308: CIGAR op has zero length
    SAM error on line 453014: CIGAR op has zero length
    SAM error on line 467020: CIGAR op has zero length
    > Processed 37488 loci. [*************************] 100%


    $more ./cuff_out/transcripts.gtf
    1 Cufflinks transcript 11869 14409 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks exon 11869 12227 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks exon 12613 12721 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "2"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks exon 13221 14409 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "3"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks transcript 12010 13670 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";

    sort method #2:
    $samtools sort -n bbmap.bam >bbmap.sortn.bam

    $cufflinks bbmap.sortn.bam -G /mnt/e/database/ensembl_grch38_gtf/Homo_sapiens.GRCh38.95.chr_patch_hapl_scaff.gtf -o cuff.sortn
    Warning: Could not connect to update server to verify current version. Please check at the Cufflinks website (http://cufflinks.cbcb.umd.edu).
    [09:48:22] Loading reference annotation.
    [09:48:48] Inspecting reads and determining fragment length distribution.
    > Processed 37488 loci. [*************************] 100%
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    [09:48:53] Estimating transcript abundances.

    $ more ./cuff.sortn/transcripts.gtf
    1 Cufflinks transcript 11869 14409 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks exon 11869 12227 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks exon 12613 12721 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "2"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks exon 13221 14409 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "3"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks transcript 12010 13670 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    ............................
    Last edited by seqmore; 04-02-2019, 07:22 PM.



  • seqmore
    replied
    @GenoMax, your suggestion was great! I uninstalled Samtools and reinstalled version 1.9, and samtools flagstat is now working. Then I tried to output BAM directly, as you suggested, like this:
    bbmap.sh ref=Homo_sapiens.GRCh38.dna.primary_assembly.fa ambig=all xstag=unstranded xmtag=t maxindel=100k intronlen=10 in=a.fq out=bbmap.bam outu=unbbmap.fq

    Next, I ran cufflinks on the BAM file. The command line is
    cufflinks bbmap.bam -G /mnt/e/database/ensembl_grch38_gtf/Homo_sapiens.GRCh38.95.chr_patch_hapl_scaff.gtf -o cuff_out
    The std output:
    Warning: Could not connect to update server to verify current version. Please check at the Cufflinks website (http://cufflinks.cbcb.umd.edu).
    [09:21:15] Loading reference annotation.
    [09:21:42] Inspecting reads and determining fragment length distribution.
    > Processed 37488 loci. [*************************] 100%
    > Map Properties:
    > Normalized Map Mass: 0.00
    > Raw Map Mass: 0.00
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    [09:21:46] Estimating transcript abundances.
    > Processed 37488 loci. [*************************] 100%

    The transcripts.gtf looks strange, with all FPKM values equal to 0:
    $more ./cuff_out/transcripts.gtf
    1 Cufflinks transcript 11869 14409 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks exon 11869 12227 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "1"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks exon 12613 12721 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "2"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks exon 13221 14409 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "3"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";
    1 Cufflinks transcript 12010 13670 1 + . gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; FPKM "0.0000000000"; frac "0.000000"; conf_lo "0.000000"; conf_hi "0.000000"; cov "0.000000";

    I tried two methods to sort the BAM file: one is "$sort -k 3,3 -k 4,4n bbmap.bam >bbmap.bam.sort" and the other is "$samtools sort -n bbmap.bam >bbmap.sortn.bam". Both failed to produce FPKM values.
    However, I get normal FPKMs with Tophat2 using the same genome assembly and GTF annotation.

    Any comments or suggestions are greatly appreciated.
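
    Two notes on the sorting first: text sort cannot be applied to a binary BAM file, and samtools sort -n orders by read name rather than by coordinate. Separately, one possible cause of all-zero FPKMs that may be worth ruling out is a mismatch between the reference names in the BAM header and the names in the first column of the GTF. This is an assumption, not something confirmed in this thread. A minimal sketch of such a check follows; the function name and file names are hypothetical, and the header is expected as text, for example saved from samtools view -H bbmap.bam.

```shell
# names_missing_from_gtf HEADER GTF
# Print every @SQ reference name from a text SAM header that never
# appears as a sequence name (column 1) in the GTF file.
names_missing_from_gtf() {
    awk -F '\t' '
        # First file (the GTF): remember each sequence name, skipping comments.
        NR == FNR { if ($0 !~ /^#/) seen[$1] = 1; next }
        # Second file (the SAM header): report SN values not seen in the GTF.
        /^@SQ/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) {
                    name = substr($i, 4)
                    if (!(name in seen)) print name
                }
            }
        }' "$2" "$1"
}

# Example (hypothetical file names):
#   samtools view -H bbmap.bam > header.sam
#   names_missing_from_gtf header.sam Homo_sapiens.GRCh38.95.chr_patch_hapl_scaff.gtf
```

    If the printed names carry the full FASTA header line rather than just "1", "2" and so on, rerunning bbmap.sh with trimreaddescriptions=t (the option that appears in another command in this thread) might bring them in line with the GTF.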
