cuffdiff for groups - SEQanswers

bvk replied

05-06-2015, 02:35 AM
Originally posted by GenoMax

Technically, the cuffdiff command as outlined above will work. But Devon has already warned you about the consequence in post #5.

I ran this on the command line:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L larval,early -u merged_asm/merged.gtf ../tophat/em/SRR493359_60_61_thout/accepted_hits.bam ../tophat/em/SRR493363_64_65_thout/accepted_hits.bam

Now I want to run this from a Python script, so I made the call this way:

do.call([cfg.tool_cmd("cuffdiff"), "-p", str(cfg.project["analysis"]["threads"]), "-b", str(cfg.project["genome"]["fasta"]), "-u", cfg.project["samples"][0]["files"]["merging_gtf"], "-L", str(cfg.project["phenotype"][0]), str(cfg.project["phenotype"][1]), "-o", output_folder] + [cfg.project["samples"][0]["files"]["bam"] cfg.project["samples"][1]["files"]["bam"]], cfg.project["analysis"]["log_file"])

But I get an error: invalid syntax. Can you please help with this?
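A likely cause of the `invalid syntax` error is the missing comma between the two BAM entries in the second list of the call above. Separately, cuffdiff takes the `-L` labels as a single comma-separated argument, so passing the two phenotypes as separate list items would probably not be read as two labels. The sketch below builds the same argument list in plain Python. `build_cuffdiff_cmd` and its parameter names are hypothetical, and the poster's own `do.call` and `cfg` helpers are not reproduced because their behaviour is not shown in the thread.

```python
def build_cuffdiff_cmd(gtf, bams, labels, genome_fasta, threads, output_dir,
                       cuffdiff="cuffdiff"):
    """Return the cuffdiff argument list (hypothetical helper).

    `bams` holds one entry per condition, in the same order as `labels`.
    """
    if len(labels) != len(bams):
        raise ValueError("need exactly one label per condition")
    return [
        cuffdiff,
        "-o", output_dir,
        "-b", genome_fasta,
        "-p", str(threads),
        "-L", ",".join(labels),  # all labels in ONE comma-separated argument
        "-u",                    # assumed to be a flag without a value
        gtf,                     # the GTF is a positional argument
    ] + list(bams)               # note the commas between list items


cmd = build_cuffdiff_cmd(
    gtf="merged_asm/merged.gtf",
    bams=["../tophat/em/SRR493359_60_61_thout/accepted_hits.bam",
          "../tophat/em/SRR493363_64_65_thout/accepted_hits.bam"],
    labels=["larval", "early"],
    genome_fasta="../genome/ce10.fa",
    threads=2,
    output_dir="diff_out4",
)
print(" ".join(cmd))
```

The resulting list reproduces the command line that already worked in the shell, and could then be handed to `subprocess.run(cmd, check=True)` or to a wrapper such as the poster's `do.call`.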
bvk replied

05-05-2015, 04:54 AM
Originally posted by GenoMax

Technically, the cuffdiff command as outlined above will work. But Devon has already warned you about the consequence in post #5.

Yes, I agree, but I would like to know how to check the expression levels of each sample if I run this with a merged BAM file.
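For reading expression levels out of a finished run, one option is to parse the `genes.fpkm_tracking` table in the cuffdiff output directory. The sketch below assumes that table is tab-separated and has one `<label>_FPKM` column per `-L` label, which is how cuffdiff output is usually laid out; the function name `fpkm_by_label` and the one-row example table are made up for illustration. It yields one value per condition label, so it cannot recover per-run values from a merged BAM file.

```python
import csv
import io


def fpkm_by_label(handle, labels):
    """Map tracking_id -> {label: FPKM} from an fpkm_tracking-style table."""
    reader = csv.DictReader(handle, delimiter="\t")
    result = {}
    for row in reader:
        result[row["tracking_id"]] = {
            label: float(row[label + "_FPKM"]) for label in labels
        }
    return result


# Made-up stand-in for diff_out4/genes.fpkm_tracking (values are not real data)
example = (
    "tracking_id\tgene_short_name\tlarval_FPKM\tlarval_status"
    "\tearly_FPKM\tearly_status\n"
    "XLOC_000001\tgeneA\t12.5\tOK\t3.0\tOK\n"
)
table = fpkm_by_label(io.StringIO(example), ["larval", "early"])
print(table)
```

With a real run, the in-memory example would be replaced by something like `open("diff_out4/genes.fpkm_tracking")`.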
bvk replied

05-05-2015, 04:46 AM
Originally posted by dpryan

You only have 2 samples anyway. Any sort of metric you'd get from each of the subfiles isn't terribly meaningful unless you're interested in looking at technical variance.

OK, I will see. Thanks! But I would like to know how to check the expression levels of each sample if I run this with a merged BAM file.
GenoMax replied

05-05-2015, 04:42 AM
Technically, the cuffdiff command as outlined above will work. But Devon has already warned you about the consequence in post #5.
bvk replied

05-05-2015, 04:32 AM
Originally posted by GenoMax

Arrange the labels (larval/early) according to their correspondence with the BAM files.

Code:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L larval,early -u merged_asm/merged.gtf ../tophat/em/SRR493359_60_61_thout/accepted_hits.bam ../tophat/em/SRR493363_64_65_thout/accepted_hits.bam

As you said, I can run the following command:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L larval,early -u merged_asm/merged.gtf ../tophat/em/SRR493359_60_61_thout/accepted_hits.bam ../tophat/em/SRR493363_64_65_thout/accepted_hits.bam

Now, if I want to find the expression levels for each sample, I guess that is not possible because of the merged BAM files.

Is it possible to give the labels as larval,early and pass the BAM files of all 6 samples? Will this work?

For example:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L embryo,larva -u merged_asm/merged.gtf ../tophat/em/SRR493359_thout/accepted_hits.bam,../tophat/em/SRR493360_thout/accepted_hits.bam,../tophat/em/SRR493361_thout/accepted_hits.bam ../tophat/la/SRR493363_thout/accepted_hits.bam,../tophat/la/SRR493364_thout/accepted_hits.bam,../tophat/la/SRR493365_thout/accepted_hits.bam
dpryan replied

05-05-2015, 04:29 AM
You only have 2 samples anyway. Any sort of metric you'd get from each of the subfiles isn't terribly meaningful unless you're interested in looking at technical variance.
bvk replied

05-05-2015, 04:26 AM
Originally posted by dpryan

No, it's not. You're making two groups with the BAM files and are giving those two groups 6 labels rather than 2. "-L larval,early" would make more sense, though see my earlier reply.

As you said, I can run the following command:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L larval,early -u merged_asm/merged.gtf ../tophat/em/SRR493359_60_61_thout/accepted_hits.bam ../tophat/em/SRR493363_64_65_thout/accepted_hits.bam

Now, if I want to find the expression levels for each sample, I guess that is not possible because of the merged BAM files.

Is it possible to give the labels as larval,early and pass the BAM files of all 6 samples? Will this work?

For example:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L embryo,larva -u merged_asm/merged.gtf ../tophat/em/SRR493359_thout/accepted_hits.bam,../tophat/em/SRR493360_thout/accepted_hits.bam,../tophat/em/SRR493361_thout/accepted_hits.bam ../tophat/la/SRR493363_thout/accepted_hits.bam,../tophat/la/SRR493364_thout/accepted_hits.bam,../tophat/la/SRR493365_thout/accepted_hits.bam
bvk replied

05-05-2015, 03:53 AM
Originally posted by dpryan

No, it's not. You're making two groups with the BAM files and are giving those two groups 6 labels rather than 2. "-L larval,early" would make more sense, though see my earlier reply.

Thank you! I got it.
bvk replied

05-05-2015, 03:44 AM
Originally posted by GenoMax

Arrange the labels (larval/early) according to their correspondence with the BAM files.

Code:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L larval,early -u merged_asm/merged.gtf ../tophat/em/SRR493359_60_61_thout/accepted_hits.bam ../tophat/em/SRR493363_64_65_thout/accepted_hits.bam

Yes, I understand. Thank you very much!
GenoMax replied

05-05-2015, 03:43 AM
Originally posted by bvk

So you are saying that it should look like this:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L SRR493359,SRR493360,SRR493361,SRR493363,SRR493364,SRR493365 -u merged_asm/merged.gtf ../tophat/em/SRR493359_60_61_thout/accepted_hits.bam ../tophat/em/SRR493363_64_65_thout/accepted_hits.bam

SRR493359_60_61_thout contains the merged BAM file of runs 59, 60 and 61.
SRR493363_64_65_thout contains the merged BAM file of runs 63, 64 and 65.

Arrange the labels (larval/early) according to their correspondence with the BAM files.

Code:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L larval,early -u merged_asm/merged.gtf ../tophat/em/SRR493359_60_61_thout/accepted_hits.bam ../tophat/em/SRR493363_64_65_thout/accepted_hits.bam
bvk replied

05-05-2015, 03:37 AM
Originally posted by dpryan

Aside from your syntax problems, you have a more serious issue in that you're not specifying the actual experiment correctly. SRR493359, SRR493360 and SRR493361 are from the same sample and should just be merged together into a single BAM file. Similarly SRR493363, SRR493364 and SRR493365 are from the same sample. So, you actually have a 1 vs. 1 sample comparison. Do NOT lump each of the files for each sample into a group, since then you're making fake replicates and will have largely meaningless results (of course, you're doing a 1 vs. 1 comparison, so the results aren't exactly robust to begin with).

So you are saying that it should look like this:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L SRR493359,SRR493360,SRR493361,SRR493363,SRR493364,SRR493365 -u merged_asm/merged.gtf ../tophat/em/SRR493359_60_61_thout/accepted_hits.bam ../tophat/em/SRR493363_64_65_thout/accepted_hits.bam

SRR493359_60_61_thout contains the merged BAM file of runs 59, 60 and 61.
SRR493363_64_65_thout contains the merged BAM file of runs 63, 64 and 65.
dpryan replied

05-05-2015, 03:34 AM
Originally posted by bvk

OK, I guess the command below looks fine now:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L SRR493359,SRR493360,SRR493361,SRR493363,SRR493364,SRR493365 -u merged_asm/merged.gtf ../tophat/em/SRR493359_thout/accepted_hits.bam,../tophat/em/SRR493360_thout/accepted_hits.bam,../tophat/em/SRR493361_thout/accepted_hits.bam ../tophat/la/SRR493363_thout/accepted_hits.bam,../tophat/la/SRR493364_thout/accepted_hits.bam,../tophat/la/SRR493365_thout/accepted_hits.bam

No, it's not. You're making two groups with the BAM files and are giving those two groups 6 labels rather than 2. "-L larval,early" would make more sense, though see my earlier reply.
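The mismatch described here can be checked mechanically before running anything. The following is only a sketch: `check_labels` is a hypothetical helper, and it relies on the convention used in the commands in this thread, where replicates within a group are comma-separated and groups are separated by spaces.

```python
def check_labels(labels_arg, bam_args):
    """Compare the number of -L labels with the number of BAM groups.

    labels_arg: the string given to -L, e.g. "larval,early"
    bam_args:   the positional BAM arguments, one string per group
    """
    labels = labels_arg.split(",")
    groups = [arg.split(",") for arg in bam_args]
    return len(labels) == len(groups), labels, groups


bam_args = [
    "../tophat/em/SRR493359_thout/accepted_hits.bam,"
    "../tophat/em/SRR493360_thout/accepted_hits.bam,"
    "../tophat/em/SRR493361_thout/accepted_hits.bam",
    "../tophat/la/SRR493363_thout/accepted_hits.bam,"
    "../tophat/la/SRR493364_thout/accepted_hits.bam,"
    "../tophat/la/SRR493365_thout/accepted_hits.bam",
]

# Six labels for two groups: the case quoted above
bad_ok, bad_labels, groups = check_labels(
    "SRR493359,SRR493360,SRR493361,SRR493363,SRR493364,SRR493365", bam_args)
# Two labels for two groups
good_ok, good_labels, _ = check_labels("larval,early", bam_args)
print(bad_ok, len(bad_labels), len(groups))
print(good_ok, len(good_labels))
```

A matching label count only makes the command well-formed. It does not address the earlier point that the files within each group are not true replicates.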
bvk replied

05-05-2015, 03:29 AM
Originally posted by GenoMax

The -L option in your command does not give two group names, one for each group of three samples.

", ../tophat/la/SRR493365_thout/accepted_hits.bam " It also looks like there is a space between , and the .. but that may be an illusion in the way the browser is displaying it.

OK, I guess the command below looks fine now:

cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L SRR493359,SRR493360,SRR493361,SRR493363,SRR493364,SRR493365 -u merged_asm/merged.gtf ../tophat/em/SRR493359_thout/accepted_hits.bam,../tophat/em/SRR493360_thout/accepted_hits.bam,../tophat/em/SRR493361_thout/accepted_hits.bam ../tophat/la/SRR493363_thout/accepted_hits.bam,../tophat/la/SRR493364_thout/accepted_hits.bam,../tophat/la/SRR493365_thout/accepted_hits.bam
dpryan replied

05-05-2015, 03:29 AM
Aside from your syntax problems, you have a more serious issue in that you're not specifying the actual experiment correctly. SRR493359, SRR493360 and SRR493361 are from the same sample and should just be merged together into a single BAM file. Similarly SRR493363, SRR493364 and SRR493365 are from the same sample. So, you actually have a 1 vs. 1 sample comparison. Do NOT lump each of the files for each sample into a group, since then you're making fake replicates and will have largely meaningless results (of course, you're doing a 1 vs. 1 comparison, so the results aren't exactly robust to begin with).
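One possible way to do the merge is `samtools merge`, which takes the output BAM followed by the input BAMs. The sketch below only builds the two commands, using the per-run and merged paths that appear elsewhere in this thread; `merge_cmd` is a hypothetical helper, and it assumes samtools is installed and on the PATH.

```python
def merge_cmd(out_bam, in_bams, samtools="samtools"):
    """Build a `samtools merge <out.bam> <in1.bam> <in2.bam> ...` command."""
    return [samtools, "merge", out_bam] + list(in_bams)


merges = {
    "../tophat/em/SRR493359_60_61_thout/accepted_hits.bam": [
        "../tophat/em/SRR493359_thout/accepted_hits.bam",
        "../tophat/em/SRR493360_thout/accepted_hits.bam",
        "../tophat/em/SRR493361_thout/accepted_hits.bam",
    ],
    "../tophat/em/SRR493363_64_65_thout/accepted_hits.bam": [
        "../tophat/la/SRR493363_thout/accepted_hits.bam",
        "../tophat/la/SRR493364_thout/accepted_hits.bam",
        "../tophat/la/SRR493365_thout/accepted_hits.bam",
    ],
}

commands = [merge_cmd(out, ins) for out, ins in merges.items()]
for cmd in commands:
    print(" ".join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Each merged file then stands for one sample, which gives the 1 vs. 1 comparison described above.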
GenoMax replied

05-05-2015, 03:23 AM
The -L option in your command does not give two group names, one for each group of three samples.

", ../tophat/la/SRR493365_thout/accepted_hits.bam " It also looks like there is a space between , and the .. but that may be an illusion in the way the browser is displaying it.