ChIP-Seq Challenge - SEQanswers

Nix replied

03-18-2010, 06:32 AM
Yes, I've built a rather sophisticated tool for doing just this sort of thing. See IntersectRegions in the USeq package.

**************************************************************************************
** Intersect Regions: August 2008 **
**************************************************************************************
IR intersects lists of regions (tab delimited: chrom start stop(inclusive)). Random
regions can also be used to calculate a p-value and fold enrichment.

-f First regions files, a single file, or a directory of files.
-s Second regions files, a single file, or a directory of files.
-g Max gap, defaults to 0. A max gap of 0 = regions must abut, negative values force
overlap (ie -1= 1bp overlap, be careful not to exceed the length of the smaller
region), positive values enable gaps (ie 1=1bp gap).
-e Score intersections where second regions are entirely contained by first regions.
-r Make random regions matched to the second regions file(s) and intersect with the
first. Enter the full path directory text containing chromosome specific
interrogated regions files (ie named: chr1, chr2 ...: chrom start stop(inclusive)).
-c Match GC content of second regions file(s) when selecting random regions, rather
slow. Provide a full path directory text containing chromosome specific genomic
sequences. To speed the matching place the fraction GC in the last column of
your region file(s).
-n Number of random region trials, defaults to 1000.
-w Write intersections and differences.
-x Write paired intersections.
-p Print length distribution histogram for gaps between first and closest second.
-q Parameters for histogram, comma delimited list, no spaces:
minimum length, maximum length, number of bins. Defaults to -100, 2400, 100.

Example: java -Xmx1500M -jar pathTo/Apps/IntersectRegions -f /data/miRNAs.txt
-s /data/DroshaLists/ -g 500 -n 1000 -r /data/InterrogatedRegions/
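
The max gap rule in the help text above can be sketched in a few lines of Python. This is only an illustration based on the wording of the -g option, not USeq's actual code. The function names are made up here, regions are assumed to be inclusive (start, stop) pairs on a single chromosome, and the p-value and fold enrichment formulas are common empirical choices, assumed rather than taken from USeq.

```python
def gap_between(a, b):
    """Signed distance in bp between two inclusive (start, stop) regions.

    0 means the regions abut, negative values are bp of overlap,
    positive values are bp of gap (matching the -g description).
    """
    return max(a[0], b[0]) - min(a[1], b[1]) - 1


def intersects(a, b, max_gap=0):
    """True if the two regions are within max_gap of each other."""
    return gap_between(a, b) <= max_gap


def count_hits(first, second, max_gap=0):
    """Number of second regions that intersect at least one first region."""
    return sum(any(intersects(f, s, max_gap) for f in first) for s in second)


def enrichment(observed, random_counts):
    """Fold enrichment and empirical p-value from random-region trials.

    Assumed formulas (not confirmed against USeq):
    fold = observed / mean(random counts)
    p    = (1 + trials with count >= observed) / (1 + number of trials)
    """
    mean = sum(random_counts) / len(random_counts)
    fold = observed / mean if mean > 0 else float("inf")
    at_least = sum(1 for c in random_counts if c >= observed)
    p_value = (1 + at_least) / (1 + len(random_counts))
    return fold, p_value
```

With this definition, (1, 10) and (11, 20) abut (gap 0), (1, 10) and (10, 20) overlap by 1bp (gap -1), and (1, 10) and (12, 20) are separated by a 1bp gap (gap 1).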
avilella replied

03-18-2010, 01:59 AM
comparing peak set profiles in chip-seq datasets

Hi,

Is there any tool that will tell me how different/similar two chip-seq peak sets are in two different parts of the genome?

E.g. if I have a ~10Kb region in the genome with a series of peaks and another ~10Kb region in the genome with another set of peaks from the same experiment, can I calculate a distance measure between these two peak set profiles with any available tool?

Cheers
Nix replied

03-16-2010, 06:16 AM
Hello Gonghong,

Yes, Novoalign is its own beast (an excellent one at that) and is from Novocraft. So first run your reads through their aligner and then process your data with USeq. For ChIP-seq you can probably get by with little loss in resolution using the xxx.sorted.gz alignments that came off the default Eland aligner that runs with the Illumina pipeline. Or, barring those, use Bowtie for fast ungapped alignments.

-cheers, D
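
A quick pre-flight check along these lines can tell you whether a file looks like aligner output before you hand it to a parser. This is a minimal sketch that only mirrors the wording of the error message quoted in this thread ("No '>chr' found in 1st 1000 lines"). The function name is made up, and it is not how NovoalignParser itself is implemented.

```python
def looks_aligned(path, marker=">chr", max_lines=1000):
    """Return True if `marker` occurs in the first `max_lines` lines of the file.

    The marker and line limit are taken from the error message text,
    so treat both as assumptions about what the parser expects.
    """
    with open(path, "r", errors="replace") as handle:
        for line_number, line in enumerate(handle):
            if line_number >= max_lines:
                break
            if marker in line:
                return True
    return False
```

A raw reads file such as s_4_sequence.txt would return False here, which is a hint that it still needs to go through an aligner first.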
weigonghong replied

03-15-2010, 07:55 AM
Hello David,

I'm a new USeq user. For ChIP-seq analysis, the first step is genome mapping with Novoalign. However, I can't find Novoalign in USeq_5.6/Apps. Should I use NovoalignParser instead?
You gave an example for mRNA-seq by using NovoalignParser: java -Xmx1500M -jar pathToUSeq/Apps/NovoalignParser -f /Novo/Run7/
-v H_sapiens_Mar_2006 -p 20 -q 30 -r /Novo/Run7/mRNASeq/ -i -g
/Anno/Hg18/mergedUCSCKnownGenes.bed

Then I put together this command: java -jar USeq_5.6/Apps/NovoalignParser -f /wrk/data/biomedicum_solexa-090805/s_4_sequence.txt / -v /wrk/data/genomes/homo_sapiens/dna/Homo_sapiens.NCBI36.49.dna.all_chromosomes.fasta -p 20 -q 30 -r /wrk/data/gonghong/useq -i

It then printed the following output:
20.0 Posterior probability threshold
30.0 Alignment score threshold

Parsing and filtering...
/wrk/data/biomedicum_solexa-090805/s_4_sequence.txt
Problem identifing chromosome column? No '>chr' found in 1st 1000 lines?

Could you please help me figure out what happened? I'm a wet-lab postdoc and very much want to use USeq for ChIP-seq data analysis.

I'm looking forward to your reply.

Thanks a lot.
Gonghong
Nix replied

02-03-2010, 08:49 AM
There are a lot of files associated with the results. I also wanted this archived so follow the link above and download the README_Report.doc.zip file for the summary.
golharam replied

02-03-2010, 08:06 AM
Any chance someone can post a summary of the results of the challenge on here? I know this is late, but it would be interesting for others to see.
ewilbanks replied

08-20-2009, 09:03 AM
OK thanks!
Nix replied

08-20-2009, 06:14 AM
I would cite this thread and the archive on SourceForge via HTML links: https://sourceforge.net/projects/use...PSeqChallenge/
ewilbanks replied

08-19-2009, 04:57 PM
Hi David,

This is a great resource! If we were to cite it, how would you like us to do that?

Thanks!
Lizzy
bioinfosm replied

07-14-2009, 07:03 AM
Well, there is a different one, but it is also a ChIP-seq challenge: http://camda2009.bioinformatics.northwestern.edu/
inesdesantiago replied

06-30-2009, 11:39 AM
I would like to hear about the Chip-Seq Challenge 2.0!
simulation11 replied

05-04-2009, 05:45 PM
Wow... those are great posts. Thanks a lot for sharing.
Nix replied

04-07-2009, 08:33 AM
Final report

Hello Folks,

The ChIP-Seq Challenge 1.0 is over! It's been a resounding success with 13 submissions representing 12 analysis packages. Many congrats and thanks to the players, and to Illumina and Applied Biosystems for providing prizes.

The datasets, submissions, analysis, and results have been archived on SourceForge on the USeq project site under CommunityChIPSeqChallenge (https://sourceforge.net/project/show...kage_id=317544).

-cheers, David
Nix replied

04-02-2009, 08:13 AM
JSP, you are correct: there are a couple of key regions in close proximity that can be intersected by one candidate, so it is possible to hit 501 key regions with a top 500 list.

As far as I am aware, folks' candidate regions aren't excessively large; all are under 500bp.

The number of double hits is small and won't affect the overall results.

And no, multiple hits to the same key only count once.

I'll put together a list of the actual centers used to generate the random fragments and let those interested calculate the intersections. There are several problems with this approach, namely the observed center is not the same as the actual center since read distribution is skewed by the presence of poorly alignable repeats and low complexity regions. Which do you use? Again, I very much doubt it will change the overall results.

As for additional methods, by all means run them using the simulated data and I can add them to the charts.
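
To make the counting rule concrete, here is a minimal Python sketch: one candidate may hit several key regions, but each key region is counted only once. It assumes regions are (chrom, start, stop) tuples with inclusive coordinates and uses plain overlap as the hit test. The function names are made up, and the actual challenge scoring may differ in its details.

```python
def overlaps(a, b):
    """True if two inclusive (chrom, start, stop) regions share at least 1bp."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def keys_hit(candidates, keys, top_n=500):
    """Count distinct key regions hit by the first top_n candidates.

    Repeat hits to the same key region are collapsed by the set,
    so the result can exceed top_n only when a single candidate
    spans more than one key region.
    """
    hit = set()
    for candidate in candidates[:top_n]:
        for index, key in enumerate(keys):
            if overlaps(candidate, key):
                hit.add(index)
    return len(hit)
```

With two neighboring key regions and a single candidate spanning both, keys_hit with top_n=1 returns 2, which is how a top 500 list can reach 501.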
jsp replied

04-02-2009, 07:48 AM
Hello David,

I saw that some top 500 peak lists identify 501 key regions, and this doesn’t make any sense to me. Either two key regions overlap too much or the identified peak region is too big. So I propose the following suggestions:

1. Cleaner key regions -- neighboring key regions with too much overlap (for example more than 40%) should be merged into a single key region. (A good method should be able to identify key regions with some limited amount of overlap, and that might be the theme for Community ChIP-Seq Challenge 2.0?)

2. A more objective criterion (related to the resolution of the submitted binding regions) -- take the midpoint of each identified peak region and check whether it falls within a key region. ChipMaster raised a question earlier about submitting a list with “chr1:1-lengthOfChr1”, and the “1kb rule” still favors results with larger peak regions.

3. The two suggestions above are meant to avoid cases where one peak covers two key regions. We also need to avoid cases where a single key region is identified multiple times by small peaks (I don’t know whether this has been taken care of already).

It would be interesting to see the distribution of distances between the identified peak centers and their corresponding key region centers.
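
The midpoint criterion from suggestion 2 and the center-to-center distances can be sketched as follows. This is only an illustration under stated assumptions: regions are inclusive (chrom, start, stop) tuples, the midpoint is rounded down to an integer, and the function names are made up for this example.

```python
def center(region):
    """Integer midpoint of an inclusive (chrom, start, stop) region."""
    return (region[1] + region[2]) // 2


def midpoint_hit(peak, keys):
    """Index of the first key region containing the peak midpoint, else None."""
    mid = center(peak)
    for index, key in enumerate(keys):
        if key[0] == peak[0] and key[1] <= mid <= key[2]:
            return index
    return None


def center_distances(peaks, keys):
    """Signed distances (peak center minus key center) for peaks that hit a key."""
    distances = []
    for peak in peaks:
        index = midpoint_hit(peak, keys)
        if index is not None:
            distances.append(center(peak) - center(keys[index]))
    return distances
```

Under this rule a wide peak that spans two key regions is credited to at most one of them, and to neither if its midpoint falls in the space between them.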

Please change “ParkLab” to “BPC” (which stands for binding profile construction) in the report. My lab mate published a package (spp: http://compbio.med.harvard.edu/Supplements/ChIP-seq/) for ChIP-seq peak detection, and it has performed really well on many published real ChIP-seq data sets. I hope that my participation in this challenge with my beta version of BPC won’t mislead people into thinking it’s the best method from the Park Lab.

Thanks for putting all these together.

Looking forward to challenge 2.0