
  • kiran0991
    replied
    Originally posted by zchou View Post
    Hi All,

    I use Bowtie/TopHat/Cufflinks to analyze RNA-Seq data. I want to extract the transcripts rebuilt by these programs. However, I can only get the BED file. Can anyone give me an idea of how to extract these assembled transcript sequences?

    Thanks,
    ZC
    To extract the assembled transfrags from Cufflinks, the following command is useful:

    $ gffread -w transcripts.fa -g /path/to/genome.fa transcripts.gtf

    Source: Cufflinks
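
For anyone curious what this step amounts to: as far as I understand it, `gffread -w` writes one FASTA record per transcript by joining that transcript's exons from the genome passed with `-g`. Below is a minimal Python sketch of the same idea. It is not gffread's actual implementation; the function names are made up for illustration, the genome is assumed to be already loaded into a dict, and the attribute parsing is simplified compared with what real GTF files may need.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string, preserving case."""
    return seq.translate(COMPLEMENT)[::-1]


def spliced_transcripts(gtf_lines, genome):
    """Return {transcript_id: spliced sequence} built from GTF exon records.

    genome is a dict {chromosome name: sequence string} (hypothetical input).
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        chrom, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
        # Simplified attribute parsing: 'key "value"; key "value";'
        attrs = dict(
            item.strip().split(" ", 1)
            for item in fields[8].strip().strip(";").split(";")
            if item.strip()
        )
        tid = attrs["transcript_id"].strip('"')
        exons[tid].append((chrom, start, end, strand))

    sequences = {}
    for tid, parts in exons.items():
        parts.sort(key=lambda e: e[1])
        # GTF coordinates are 1-based and inclusive.
        seq = "".join(genome[c][s - 1:e] for c, s, e, _ in parts)
        if parts[0][3] == "-":
            seq = reverse_complement(seq)
        sequences[tid] = seq
    return sequences
```

Each returned sequence can then be written as `>transcript_id` followed by the sequence. Note that this, like the gffread command, reads bases from the reference genome, not from the aligned reads.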



  • xmubingo
    replied
    Originally posted by yueluo View Post
    Take a look at this post:

    As gringer mentioned:
    Thanks, yueluo!! I asked gringer, and he replied. I guess there were some incorrect operations in my pipeline.



  • yueluo
    replied
    Originally posted by xmubingo View Post
    Hi pkerrwall, I used the same commands as yours, but I found that the sequences in cns.fq generated by fq2fa.pl differ in length from the corresponding reference sequences. Do you know why?
    Take a look at this post:

    As gringer mentioned:
    Originally posted by gringer View Post
    But bear in mind that the vcf2fq script is designed for SNPs, not INDELs. If there is an INDEL in your sequence relative to the reference, then the INDEL and a few flanking bases will be changed to lower case, but not replaced. This means that any fastq file generated from this script will have the same length as the input reference sequence.

    I sent a patch to their sourceforge page to fix this (and allow more compact partial vcf files with a provided reference sequence), but I don't think it's been implemented yet.
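
A quick way to check gringer's point on your own data is to compare lengths and look for lower-case runs in the consensus. This is a rough Python sketch that assumes both sequences are already loaded as plain strings; the function names are hypothetical. Lower case may have causes other than INDELs, so treat the reported runs as places to inspect, not as confirmed INDEL calls.

```python
import re


def lowercase_runs(consensus):
    """Return (start, end) spans (0-based, end exclusive) of lower-case bases."""
    return [m.span() for m in re.finditer(r"[a-z]+", consensus)]


def compare_to_reference(reference, consensus):
    """Summarise whether the lengths match and where lower-case runs occur."""
    return {
        "same_length": len(reference) == len(consensus),
        "lowercase_runs": lowercase_runs(consensus),
    }
```

If `same_length` is false for a sequence that came straight out of vcf2fq, the difference was probably introduced by a later step in the pipeline.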



  • xmubingo
    replied
    Originally posted by pkerrwall View Post
    Here is the process that I am using:

    samtools mpileup -uf ref.fa accepted_hits.bam | bcftools view -cg - | vcfutils.pl vcf2fq | fq2fa.pl > new_ref.fa
    gffread -w transcripts.fa -g new_ref.fa transcripts.gtf

    where fq2fa.pl is a bioperl script to convert from fq to fasta

    I also have an email into the cufflinks developers to see if there is a way that the gffread utility can be enhanced to get the consensus sequence from the bam file and not the reference genome.

    Hi pkerrwall, I used the same commands as yours, but I found that the sequences in cns.fq generated by fq2fa.pl differ in length from the corresponding reference sequences. Do you know why?
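
The fq2fa.pl script itself is not shown in this thread, so the following is only a stand-in for that step, written in Python. One thing worth ruling out when lengths disagree is a converter that assumes four lines per FASTQ record: a consensus FASTQ can wrap a long sequence over many lines, and such a converter would then keep only part of it. The sketch below reads the sequence up to the `+` line and then consumes the quality block by length. The function name is an assumption for illustration.

```python
def fastq_to_fasta(lines):
    """Yield (name, sequence) from FASTQ text whose records may span many lines."""
    it = iter(line.rstrip("\n") for line in lines)
    for line in it:
        if not line:
            continue
        if not line.startswith("@"):
            raise ValueError(f"expected '@' header, got: {line[:20]!r}")
        header = line[1:].split()
        name = header[0] if header else ""
        # Sequence lines run until the '+' separator line.
        seq_parts = []
        for line in it:
            if line.startswith("+"):
                break
            seq_parts.append(line)
        seq = "".join(seq_parts)
        # Quality lines may begin with '@' or '+', so consume them by length.
        qual_len = 0
        while qual_len < len(seq):
            qual_line = next(it, None)
            if qual_line is None:
                raise ValueError(f"truncated quality for record {name!r}")
            qual_len += len(qual_line)
        yield name, seq
```

To write FASTA, loop over the generator and print `>name` followed by the sequence for each record.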



  • YazBraimah
    replied
    FYI,

    A Trinity plug-in can also do this. See:



    under "Starting from a genome-based transcript structure GTF file (eg. cufflinks)"

    YB



  • kenietz
    replied
    Hi YazBraimah,
    glad to hear that it worked for you

    Cheers



  • YazBraimah
    replied
    Kenietz,

    I just used your script and it works beautifully. Thanks for writing it.

    Yaz



  • Lizex
    replied
    Hi Kenietz
    Thanks for the reply.
    I've sent you my e-mail address.

    Regards



  • kenietz
    replied
    Hi Lizex,
    I'm sorry for the late reply; I've been very busy lately and could not find the time. But better late than never.

    Firstly, it seems to me that you had already compiled samtools before, I mean when you first installed the program. So try this:
    1. Edit the source file
    2. make clean
    3. ./configure
    4. make


    This should work.

    'Permission denied' suggests that the program is most probably owned by root and you are trying to run it as a regular user. Two things are possible here:
    1. Copy/Move the program to a place in your PATH.
    2. Change the owner of the file to be your user with 'chown'. Check 'man chown' for syntax. Also you need to be root to use chown.
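
Besides ownership, the same 'Permission denied' message also appears when a file simply lacks the execute bit, so that is worth checking too. A small Python sketch of such a check follows; the function name is made up, and the chmod call will itself fail with a permission error if you do not own the file, which brings you back to the two options above.

```python
import os
import stat


def ensure_executable(path):
    """Add the owner-execute bit if needed; return True if the mode was changed."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"not a regular file: {path}")
    if os.access(path, os.X_OK):
        return False
    mode = os.stat(path).st_mode
    # Raises PermissionError when the current user does not own the file.
    os.chmod(path, mode | stat.S_IXUSR)
    return True
```

For example, `ensure_executable("samtools-0.1.18/samtools_ndsm")` would report whether the binary named in the error message needed fixing.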

    Otherwise, give me your email and I will send you the compiled version of the modified samtools plus the latest version of my script.

    Cheers



  • Lizex
    replied
    Originally posted by figo1019 View Post
    Hi Kenietz

    I have sent you a personal message

    Regards
    Hi Kenietz

    I've tried to implement what you've suggested by adding the extra lines to samtools-0.1.18, saving it as samtools_ndsm and re-compiling it by changing into the directory where I keep samtools and typing:
    make
    make[2]: Nothing to be done for `lib'.
    make[2]: Nothing to be done for `lib'.
    make[2]: Nothing to be done for `lib'.
    gcc -g -Wall -O2 -o samtools bam_tview.o bam_plcmd.o sam_view.o bam_rmdup.o bam_rmdupse.o bam_mate.o bam_stat.o bam_color.o bamtk.o kaln.o bam2bcf.o bam2bcf_indel.o errmod.o sample.o cut_target.o phase.o bam2depth.o -Lbcftools libbam.a -lbcf -lcurses -lm -lz
    gcc -g -Wall -O2 -o bcftools call1.o main.o ../kstring.o ../bgzf.o ../knetfile.o ../bedidx.o -L. -lbcf -lm -lz
    make[1]: Nothing to be done for `all'.

    I don't know what this means?

    I continued with this command: samtools mpileup -m 100000 accepted_hits.bam > parsed.bam
    [mpileup] 1 samples in 1 input files
    <mpileup> Set max per-file depth to 8000

    I do get a file parsed.bam, which I then used with the script.

    Followed by this script which I've downloaded:
    perl get_consensus_bam_batch_v3.pl -b parsed.bam -t transcripts.gtf > consensus_sequences
    sh: samtools-0.1.18/samtools_ndsm: Permission denied

    I do get a file with this in it:
    E2_Tophat/parsed.bam;MDC000004.264:487-777;/Data_Analysis/E2_data/E2_Tophat/parsed.bam.v3.pileup

    Clearly there's something wrong somewhere? Any pointers???



  • figo1019
    replied
    Originally posted by kenietz View Post
    Hi,
    well i created a script which extracts a consensus seq. I think it works but you may try and see if it works for you. Comments and suggestions are welcomed.

    I attach here the script.

    But you need to modify bam_pileup.c in the samtools 0.1.18 directory and then recompile, rename the compiled binary to samtools_ndsm (the name I used in the script), and put the binary in your PATH.

    In bam_pileup.c you have to modify the function bam_plp_t bam_plp_init. Just after the line:

    iter->flag_mask = BAM_DEF_MASK;

    you have to add the following lines:

    iter->flag_mask &= ~BAM_FSECONDARY;
    iter->flag_mask &= ~BAM_FDUP;

    Otherwise, give me your email and I will send all of it to you. I tried to upload it, but the binary is bigger than the allowed size.

    Hope it does the job for you
    Cheers
    Hi Kenietz

    I have sent you a personal message

    Regards



  • upendra_35
    replied
    Originally posted by kenietz View Post
    Hi,
    well i created a script which extracts a consensus seq. I think it works but you may try and see if it works for you. Comments and suggestions are welcomed.

    I attach here the script.

    But you need to modify bam_pileup.c in the samtools 0.1.18 directory and then recompile, rename the compiled binary to samtools_ndsm (the name I used in the script), and put the binary in your PATH.

    In bam_pileup.c you have to modify the function bam_plp_t bam_plp_init. Just after the line:

    iter->flag_mask = BAM_DEF_MASK;

    you have to add the following lines:

    iter->flag_mask &= ~BAM_FSECONDARY;
    iter->flag_mask &= ~BAM_FDUP;

    Otherwise, give me your email and I will send all of it to you. I tried to upload it, but the binary is bigger than the allowed size.

    Hope it does the job for you
    Cheers
    Thanks. I have sent a PM to you........



  • kenietz
    replied
    Hi,
    well i created a script which extracts a consensus seq. I think it works but you may try and see if it works for you. Comments and suggestions are welcomed.

    I attach here the script.

    But you need to modify bam_pileup.c in the samtools 0.1.18 directory and then recompile, rename the compiled binary to samtools_ndsm (the name I used in the script), and put the binary in your PATH.

    In bam_pileup.c you have to modify the function bam_plp_t bam_plp_init. Just after the line:

    iter->flag_mask = BAM_DEF_MASK;

    you have to add the following lines:

    iter->flag_mask &= ~BAM_FSECONDARY;
    iter->flag_mask &= ~BAM_FDUP;

    Otherwise, give me your email and I will send all of it to you. I tried to upload it, but the binary is bigger than the allowed size.

    Hope it does the job for you
    Cheers
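
To see what the two added C lines do to the mask, here is the same bit arithmetic in Python. The flag values are the standard SAM ones; the definition of BAM_DEF_MASK as the OR of the four flags is my recollection of bam.h in samtools 0.1.x, and the skip rule (a read is dropped when any of its flag bits is in the mask) is likewise an assumption about how pileup applies it. The two lines have to use the bitwise NOT operator `~` to clear a bit.

```python
# Flag values from the SAM specification.
BAM_FUNMAP = 0x4
BAM_FSECONDARY = 0x100
BAM_FQCFAIL = 0x200
BAM_FDUP = 0x400

# Assumed default pileup mask in samtools 0.1.x (bam.h).
BAM_DEF_MASK = BAM_FUNMAP | BAM_FSECONDARY | BAM_FQCFAIL | BAM_FDUP


def patched_mask(mask=BAM_DEF_MASK):
    """Clear the secondary and duplicate bits, as the two added C lines do."""
    mask &= ~BAM_FSECONDARY
    mask &= ~BAM_FDUP
    return mask


def is_skipped(flag, mask):
    """Assumed pileup rule: skip a read when any of its flag bits is in the mask."""
    return (flag & mask) != 0
```

With these values the default mask is 0x704 and the patched mask is 0x204, so unmapped and QC-failed reads are still skipped while secondary and duplicate reads are kept. If a unary minus is typed in place of `~`, the same two lines leave 0x400, which would skip only duplicates and is not what the patch intends.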


  • upendra_35
    replied
    Originally posted by kenietz View Post
    Hi,
    I just figured out that my workaround does not work for paired-end reads.
    Somehow pileup -c removes some reads except these ones:

    0x0100 s the alignment is not primary
    0x0200 f the read fails platform/vendor quality checks
    0x0400 d the read is either a PCR or an optical duplicate

    I suppose I will have to use the command samtools pileup BAM > pileup.out directly, then parse the output. It's a bit trickier, but I think it will do the job.
    I will update when I'm done.
    Hi
    I am just wondering whether you have fixed your script for paired-end reads. I would like to try your script to extract the consensus sequences from a BAM file based on Cufflinks GTF coordinates.



  • kenietz
    replied
    Hi,
    I just figured out that my workaround does not work for paired-end reads.
    Somehow pileup -c removes some reads except these ones:

    0x0100 s the alignment is not primary
    0x0200 f the read fails platform/vendor quality checks
    0x0400 d the read is either a PCR or an optical duplicate

    I suppose I will have to use the command samtools pileup BAM > pileup.out directly, then parse the output. It's a bit trickier, but I think it will do the job.
    I will update when I'm done.
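
If you want to see which of your reads carry the three flags listed above, the FLAG value (column 2 of the SAM text produced by `samtools view`) can be decoded bit by bit. A minimal Python sketch, with a made-up function name and only these three bits covered:

```python
FILTER_FLAGS = {
    0x0100: "not primary alignment",
    0x0200: "fails platform/vendor quality checks",
    0x0400: "PCR or optical duplicate",
}


def filter_reasons(flag):
    """Return descriptions of the three listed flag bits set in a SAM FLAG value."""
    return [name for bit, name in FILTER_FLAGS.items() if flag & bit]
```

An empty list means none of the three bits is set for that read.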
