
**Meyana** · 03-21-2019, 09:57 PM

I run UpSetR by passing the individual sets as a list; the package then calculates the overlaps itself (I don't know whether it lets you input the overlaps "manually", I have never tried that).

```r
# make input
list.Input = list(set1 = data1, set2 = data2, set3 = data3)
# run UpSetR
upset(fromList(list.Input), sets = c("set1", "set2", "set3"))
```

... and then just add further arguments (keep.order, nintersects, etc.) as needed.

**deKoch13** · 03-22-2019, 01:05 AM

Tried it out, but...

Thank you, Meyana.

I tried your idea, but it still won't work.
What do your input data look like?

I just input 3 text files that each contain one column (read identifiers from BAM files).
The UpSet output plot shows me the three sets, but no intersections.
Any suggestions?

Best regards

**Meyana** · 03-22-2019, 01:13 AM

My data1/data2/data3 are just vectors of the observations, which I then store in the list list.Input, nothing special. The observations themselves can have any format; mine look something like "A344D".

Did you store your data in the list?

**deKoch13** · 03-22-2019, 01:31 AM

This is what I've done:

```r
# imported
library(UpSetR)

# make input
list.Input = list(set1 = "trimmed_bismark_bt2_pe.bam_mapped_reads.txt",
                  set2 = "shuffled_bismark_bt2_pe.bam_mapped_reads.txt",
                  set3 = "econstructed_bismark_bt2_pe.bam_mapped_reads.txt")

upset(fromList(list.Input), sets = c("set1", "set2", "set3"),
      number.angles = 30, point.size = 3.5, line.size = 2,
      mainbar.y.label = "Read Intersections", sets.x.label = "Blabla",
      text.scale = c(1.3, 1.3, 1, 1, 2, 0.75), mb.ratio = c(0.55, 0.45),
      order.by = 'freq', keep.order = TRUE)
```

So I think I stored the sets in a list; I also checked this with print(class(list.Input)).
Maybe the package does not accept my input: three text files, one column each, containing just read identifiers.
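
A note on the code in this post, offered as a hedged sketch and not a confirmed diagnosis: the list holds three file names, not the contents of the files. Each set would then contain a single string, and no two sets would share an element, which is consistent with a plot that shows three sets but no intersections. The example below contrasts the two cases using base R only; the temporary file and its read identifiers are invented for illustration.

```r
# Hypothetical one-column file, written to a temp location so the
# sketch is self-contained.
path <- tempfile(fileext = ".txt")
writeLines(c("read_1", "read_2", "read_3"), path)

# A list of file NAMES: the only member of the set is the path string.
by_name <- list(set1 = path)

# A list of file CONTENTS: the set is a character vector of identifiers.
by_content <- list(set1 = readLines(path))

length(by_name$set1)     # 1 (just the file name)
length(by_content$set1)  # 3 (one entry per line of the file)
```

Checking `length()` or `head()` on each list element before calling `fromList` is a quick way to confirm that the sets hold the observations themselves.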

**Meyana** · 03-24-2019, 04:37 PM

Your code works fine on my data.
Could you post a snippet of your data?

**deKoch13** · 03-25-2019, 02:40 AM

Works now!

Hi Meyana,

It works now!
You were absolutely right about building a list of sets and using the fromList function.
I was not aware that fromList creates a binary data frame that is compatible with the UpSetR package.

For other forum users, here is my working code:

```r
library(UpSetR)

# read each one-column text file
trimmed_df <- read.csv(file = "tri.txt", header = FALSE, sep = "\n")
shuffled_df <- read.csv(file = "shu.txt", header = FALSE, sep = "\n")
reconstructed_df <- read.csv(file = "rec.txt", header = FALSE, sep = "\n")

# extract the single column as a vector
trimmed <- as.vector(trimmed_df$V1)
shuffled <- as.vector(shuffled_df$V1)
reconstructed <- as.vector(reconstructed_df$V1)

read_sets = list(
  trimmed_reads = trimmed,
  shuffled_reads = shuffled,
  reconstructed_reads = reconstructed)

upset(fromList(read_sets),
      sets = c("trimmed_reads", "shuffled_reads", "reconstructed_reads"),
      number.angles = 20, point.size = 2.5, line.size = 1.5,
      mainbar.y.label = "read intersection", sets.x.label = "read set size",
      text.scale = c(1.5, 1.5, 1.25, 1.25, 1.5, 1.5), mb.ratio = c(0.65, 0.35),
      group.by = "freq", keep.order = TRUE)
```

Again, thank you Meyana!
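
To illustrate the binary data frame mentioned in this post, here is a minimal base-R sketch of the kind of 0/1 membership table that fromList builds: one row per unique element and one column per set. It is an approximation written for illustration, not UpSetR's own code, and the set names and read identifiers are made up.

```r
# Toy sets with hypothetical read identifiers.
read_sets <- list(
  trimmed  = c("r1", "r2", "r3"),
  shuffled = c("r2", "r3", "r4")
)

# Every distinct element across all sets becomes one row.
elements <- unique(unlist(read_sets))

# Each set becomes a column: 1 if the element is in the set, 0 otherwise.
membership <- as.data.frame(
  lapply(read_sets, function(s) as.integer(elements %in% s))
)
rownames(membership) <- elements

membership
```

Rows with a 1 in several columns are the elements counted in the corresponding intersection bar; here only r2 and r3 belong to both sets.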

**Meyana** · 03-25-2019, 03:58 PM

Great, happy to see it working for you!

In addition to the UpSetR package, there is also the SuperExactTest package, which you may find interesting (though its graphical output is not the prettiest).

**guri** · 05-10-2019, 10:46 AM

Upset error

Hi,

I have tried making an UpSet plot for three VCF files from different pipelines. I extracted the variant column (SNPs) and imported these single-column CSV files into R. I used this code:

```r
set1 <- read.csv("set1.vcf", sep="")
set2 <- read.csv("set2.vcf", sep="")
set3 <- read.csv("set3.vcf", sep="")

set1 <- as.vector(set1$V1)
set2 <- as.vector(set2$v1)
set3 <- as.vector(set3$V1)

read_sets = list(set1_reads = set1,
                 set2_reads = set2,
                 set3_reads = set3)

upset(fromList(read_sets),
      sets = c("set1_reads", "set2_reads", "set3_reads"),
      number.angles = 20, point.size = 2.5, line.size = 1.5,
      mainbar.y.label = "read intersection", sets.x.label = "read set size",
      text.scale = c(1.5, 1.5, 1.25, 1.25, 1.5, 1.5), mb.ratio = c(0.65, 0.35),
      group.by = "freq", keep.order = TRUE)
```

It gives an intersection plot, but the numbers of SNPs in the UpSet plot are really low compared with the vcf-compare results for the same VCF files. I am not sure why I am getting different numbers with the UpSet plot.
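
The thread has no reply to this question, so the following is only a sketch of checks that might explain the low counts, not a confirmed answer. Three things in the posted code are worth verifying: read.csv defaults to header = TRUE, so the first line of each file is consumed as a column name; the posted code reads `set2$v1` with a lowercase v, and R column names are case-sensitive; and set membership is based on unique values, so an extracted column that does not identify a variant uniquely would collapse to a few distinct entries. The toy lines below are invented, and the chromosome:position:ref:alt key format is an assumption used only for illustration.

```r
# Toy single-column input (hypothetical variant keys, one duplicated).
raw_lines <- c("chr1:100:A:G", "chr1:100:A:G", "chr2:50:C:T")

# Default header = TRUE: the first line becomes the column name.
with_header <- read.csv(text = raw_lines, sep = "", stringsAsFactors = FALSE)
names(with_header)   # not "V1"
nrow(with_header)    # 2 (one line was used as the header)

# header = FALSE keeps every line and names the column V1.
no_header <- read.csv(text = raw_lines, sep = "", header = FALSE,
                      stringsAsFactors = FALSE)
nrow(no_header)      # 3

# Column names are case-sensitive: $v1 does not find V1.
is.null(no_header$v1)          # TRUE
length(no_header$V1)           # 3

# Duplicates collapse when sets are compared.
length(unique(no_header$V1))   # 2
```

Comparing `length(x)` and `length(unique(x))` for each set against the record counts reported by vcf-compare should show at which step the numbers start to diverge.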

UpSet R plot, input data format wrong?
