I am using your Cufflinks and Cuffdiff tools for some RNA-Seq analysis.
1. The commands are:
Cufflinks command:
cufflinks --no-update-check -p 4 -G /Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf -b /Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa -o cufflinks accepted_hits_C1.bam &
cuffdiff command:
cuffdiff -o cuffdiff_output -p 8 -L C1,C2 -b /Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa -u /Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf accepted_hits_C1.bam accepted_hits_C2.bam
2. The results are (one gene as an example):
1) cufflinks results:
SDF4 - - SDF4 SDF4 TSS23074 chr1:1152287-1167447 - - 3.8779 3.48702 4.27041 OK
2) cuffdiff results:
SDF4 - - SDF4 SDF4 TSS23074 chr1:1152287-1167447 - - 35.3147 18.8787 51.6994 OK 37.1116 19.9766 54.2858 OK
The FPKM for the same gene, SDF4, differs between the cufflinks result and the cuffdiff result (3.8779 vs. 35.3147 for sample C1).
Many other genes also have different FPKMs in the cufflinks and cuffdiff outputs.
My understanding is that the two should be consistent. Do you know why they differ?
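To make the discrepancy concrete, here is a minimal Python sketch that pulls the FPKM values out of the two SDF4 rows quoted above and compares them. It assumes the usual genes.fpkm_tracking layout (nine descriptive columns, then FPKM, conf_lo, conf_hi and status for each sample); the helper name parse_row is made up for illustration, and the rows are split on whitespace because that is how they appear in this post.

```python
# Assumed layout of a genes.fpkm_tracking row: 9 descriptive columns,
# then FPKM / conf_lo / conf_hi / status repeated once per sample.
FIRST_FPKM_COL = 9
COLS_PER_SAMPLE = 4

def parse_row(line):
    """Return (tracking_id, locus, [FPKM per sample]) for one row (hypothetical helper)."""
    fields = line.split()
    fpkms = [float(fields[i])
             for i in range(FIRST_FPKM_COL, len(fields), COLS_PER_SAMPLE)]
    return fields[0], fields[6], fpkms

# The two SDF4 rows exactly as quoted in this post.
cufflinks_row = ("SDF4 - - SDF4 SDF4 TSS23074 chr1:1152287-1167447 - - "
                 "3.8779 3.48702 4.27041 OK")
cuffdiff_row = ("SDF4 - - SDF4 SDF4 TSS23074 chr1:1152287-1167447 - - "
                "35.3147 18.8787 51.6994 OK 37.1116 19.9766 54.2858 OK")

gene, locus, cufflinks_fpkm = parse_row(cufflinks_row)
_, _, cuffdiff_fpkm = parse_row(cuffdiff_row)

# cufflinks was run on accepted_hits_C1.bam, so compare it with the C1 column of cuffdiff.
ratio = cuffdiff_fpkm[0] / cufflinks_fpkm[0]
print(f"{gene} {locus}: cufflinks={cufflinks_fpkm[0]} cuffdiff C1={cuffdiff_fpkm[0]} "
      f"ratio={ratio:.2f}")
```

The same loop could be run over every row of both files to see whether the ratio is roughly constant across genes or varies from gene to gene.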
My other question is about the gene locus boundaries:
1) cufflinks result:
The gene locus boundary matches the RefSeq gene boundary. All the gene loci are correct.
2) cuffdiff result:
For some genes, the gene locus boundary differs from the actual RefSeq gene boundary.
For example,
AAGAB - - AAGAB AAGAB TSS3153 chr15:67493366-67547074 - - 20.5908 9.4824 29.4023 OK 16.6396 9.07309 24.1494 OK
Cuffdiff reports the locus as chr15:67493366-67547074, but the UCSC Genome Browser lists the following RefSeq coordinates:
AAGAB at chr15:67493013-67547074 - (NM_024666) alpha- and gamma-adaptin-binding protein p34 isoform 1
AAGAB at chr15:67493013-67547536 - (NM_001271885) alpha- and gamma-adaptin-binding protein p34 isoform 2
AAGAB at chr15:67493013-67547074 - (NM_001271886) alpha- and gamma-adaptin-binding protein p34 isoform 2
Can you tell me why the boundaries differ?
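As a quick check of the numbers, the sketch below takes the three RefSeq coordinates listed above, computes the span covering all of them, and compares it with the locus cuffdiff printed. It only does arithmetic on the coordinates quoted in this post and makes no assumption about how cuffdiff derives its locus; parse_locus is a hypothetical helper.

```python
def parse_locus(locus):
    """Split 'chrom:start-end' into (chrom, start, end) (hypothetical helper)."""
    chrom, span = locus.split(":")
    start, end = span.split("-")
    return chrom, int(start), int(end)

# RefSeq AAGAB transcripts as shown in the UCSC Genome Browser (copied from this post).
refseq = {
    "NM_024666": "chr15:67493013-67547074",
    "NM_001271885": "chr15:67493013-67547536",
    "NM_001271886": "chr15:67493013-67547074",
}
cuffdiff_locus = "chr15:67493366-67547074"

spans = [parse_locus(locus) for locus in refseq.values()]
union_start = min(start for _, start, _ in spans)
union_end = max(end for _, _, end in spans)
_, cd_start, cd_end = parse_locus(cuffdiff_locus)

print(f"RefSeq span: chr15:{union_start}-{union_end}")
print(f"cuffdiff:    chr15:{cd_start}-{cd_end}")
print(f"start offset: {cd_start - union_start} bp, end offset: {union_end - cd_end} bp")
```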
Thanks,