Inexperienced Question

cambridge101 replied

12-21-2014, 09:43 AM
Originally posted by GenoMax

Look for studies that have > 10 samples (since you need 10) or take 10 samples from different cancer types.

http://sra.dnanexus.com/?result_type=Study&show=25&q=tumor+exome+

http://sra.dnanexus.com/?result_type...q=cancer+exome

Thank you. I think I want to get the same region from 10 different people.

Access an online cancer related DNA database resource and select ten (10) DNA sequence strings of length at least 1Mb related to a cancer gene from ten different individuals. Make sure the sequence data is in FASTQ format and stored in one file “DNA.fas”.
Richard Finney replied

12-20-2014, 06:40 PM
The comments here have links to sequences for PUBLIC human cancers ...

RNA-seq Data on Prostate Cancer Publicly Available from BGI

http://www.homolog.us/blogs/blog/2013/06/10/rna-seq-data-on-prostate-cancer-publicly-available-from-bgi/

A comment from reader dogfacemacgee was informative enough that we made it a main post.

BGI liver cancer
Seoul Genomic Medicine Institute lung cancer
Changhai Hospital prostate cancer
MD Anderson Asian Gastric cancer

I think the data is in NCBI's SRA

You'll need a lot of disk space and, if you're relatively new, a lot of patience.

Sadly, a "bam slicer" that cuts out the reads for a region isn't available; though they say some folks are working on it.
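To make the "bam slicer" idea concrete, here is a minimal sketch of the logic such a tool would apply, working on SAM text lines rather than binary BAM. The function names (`reference_span`, `slice_sam`) are made up for illustration and are not part of any existing tool. For a local, indexed BAM file, `samtools view` with a region argument already does this; the sketch only shows the overlap test.

```python
import re
from typing import Iterable, Iterator

# One CIGAR operation: a length followed by an operation letter.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference sequence.
_REF_CONSUMING = set("MDN=X")


def reference_span(cigar: str) -> int:
    """Number of reference bases covered by an alignment with this CIGAR."""
    return sum(int(n) for n, op in _CIGAR_OP.findall(cigar) if op in _REF_CONSUMING)


def slice_sam(lines: Iterable[str], chrom: str, start: int, end: int) -> Iterator[str]:
    """Yield SAM alignment lines overlapping chrom:start-end (1-based, inclusive).

    Hypothetical helper: header lines and unmapped reads are skipped.
    """
    for line in lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        rname, pos, cigar = fields[2], int(fields[3]), fields[5]
        if rname != chrom or cigar == "*":
            continue  # other chromosome, or no alignment information
        aln_end = pos + reference_span(cigar) - 1
        if pos <= end and aln_end >= start:
            yield line
```

Usage would look like `slice_sam(open("sample.sam"), "11", 15000000, 16000000)`, where `sample.sam` is a placeholder file name.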
GenoMax replied

12-20-2014, 01:59 PM
Look for studies that have > 10 samples (since you need 10) or take 10 samples from different cancer types.

http://sra.dnanexus.com/?result_type=Study&show=25&q=tumor+exome+

http://sra.dnanexus.com/?result_type=Study&show=25&q=cancer+exome

Last edited by GenoMax; 12-20-2014, 02:07 PM.
cambridge101 replied

12-20-2014, 12:05 PM
Alright... Let's say my oncogene of interest is in the region of 11:15000000..16000000. Therefore, all I need is that region from 10 different people.

Problem:
Where do I find that data???

Any assistance is appreciated.
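For anyone following along, a region written as `11:15000000..16000000` can be turned into explicit coordinates before querying any data source. The sketch below is only an illustration: `Region`, `parse_region` and `overlaps` are hypothetical names, and it assumes 1-based inclusive coordinates with either `..` or `-` as the separator.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumption)
    end: int    # 1-based, inclusive (assumption)


# Accepts "11:15000000..16000000" or "11:15000000-16000000", commas allowed.
_REGION = re.compile(r"^(?P<chrom>[^:]+):(?P<start>[\d,]+)(?:\.\.|-)(?P<end>[\d,]+)$")


def parse_region(text: str) -> Region:
    """Parse a 'chrom:start..end' string into a Region."""
    match = _REGION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised region: {text!r}")
    start = int(match["start"].replace(",", ""))
    end = int(match["end"].replace(",", ""))
    if start > end:
        raise ValueError(f"start is after end in {text!r}")
    return Region(match["chrom"], start, end)


def overlaps(region: Region, chrom: str, start: int, end: int) -> bool:
    """True if the interval chrom:start-end shares at least one base with region."""
    return chrom == region.chrom and start <= region.end and end >= region.start


region = parse_region("11:15000000..16000000")
```

Note that with inclusive coordinates this particular string covers 1,000,001 bases, one more than exactly 1 Mb.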
GenoMax replied

12-20-2014, 05:42 AM
I like SNPsaurus' interpretation. 1 Mb (total amount of data or length of region covered) worth of fastq reads in and/or around a cancer gene makes sense.

@SNPsaurus: Main input for the assignment is still in post #6. The rest of the assignment was informatics goals.

This will entail a significant amount of work (data collection part) and I hope the assignment has an appropriate amount of credit (unless it is a PhD qualifier exam).
SNPsaurus replied

12-19-2014, 08:41 PM
I only vaguely remember the details of the project from your deleted post, but if I were to guess what a reasonable assignment would be, it would be to select a cancer gene, then extract 1 Mb of sequence around the gene from ten different individual genomes, then analyze those 1 Mb regions for the various things asked for in the post.
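As a rough sketch of the "1 Mb around the gene" step, the window coordinates could be computed as below. The function name `window_around` is invented for this example, coordinates are assumed to be 1-based and inclusive, and the gene position and chromosome length in the usage line are placeholders rather than real annotation values.

```python
from typing import Tuple


def window_around(gene_start: int, gene_end: int, chrom_length: int,
                  size: int = 1_000_000) -> Tuple[int, int]:
    """Return a window of `size` bases centred on the gene midpoint.

    The window is shifted, not shrunk, when it would run off either
    end of the chromosome, so it always spans exactly `size` bases.
    """
    if size > chrom_length:
        raise ValueError("window is longer than the chromosome")
    midpoint = (gene_start + gene_end) // 2
    start = midpoint - size // 2
    # Clamp so that both start and end stay within 1..chrom_length.
    start = max(1, min(start, chrom_length - size + 1))
    end = start + size - 1
    return start, end


# Placeholder gene coordinates and chromosome length, for illustration only.
window = window_around(15_400_000, 15_450_000, chrom_length=135_000_000)
```

The same window would then be requested from each of the ten individual genomes, so that the regions are directly comparable.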

You aren't going to find 1 Mb fastq reads, but you can find different individual genomes, or even different "cancer" genomes. You can definitely identify genes related to cancer. As others have said, I'd check back with the assigner of this project for clarification.

edit: I teach an upper level course in genomic methods and analysis, so am definitely curious what this assignment is about!
cambridge101 replied

12-19-2014, 05:34 AM
Originally posted by GenoMax

One place to look for long sequences (in fastq format) will be here: http://www.pacificbiosciences.com/ne.../publications/ I see only a couple of cancer-related publications and they are from 2012 (when the reads were not as long as they are today).

If you drop the cancer requirement then you will get some really long reads here: http://blog.pacificbiosciences.com/2...d-shotgun.html They are not going to be 1Mb so you would still need to do some assembly.

Thanks! I'm not sure I can drop the cancer requirement. I checked: http://www.pacificbiosciences.com/ne.../publications/ I couldn't locate the data I need.

I'm open to other suggestions if anyone has any.
GenoMax replied

12-19-2014, 05:16 AM
One place to look for long sequences (in fastq format) will be here: http://www.pacificbiosciences.com/ne.../publications/ I see only a couple of cancer-related publications and they are from 2012 (when the reads were not as long as they are today).

If you drop the cancer requirement then you will get some really long reads here: http://blog.pacificbiosciences.com/2...d-shotgun.html They are not going to be 1Mb so you would still need to do some assembly.
cambridge101 replied

12-19-2014, 05:06 AM
Thanks for your suggestion. I don't think I'll delete it [Did decide to delete]. I don't feel that I'm doing anything unethical. I hope that it's clear that I'm not asking for anyone to complete the project for me. I'm just having a hard time finding the FASTQ data I need.

Last edited by cambridge101; 12-19-2014, 08:10 AM. Reason: Inconsistent with edit to earlier post.
GenoMax replied

12-18-2014, 03:42 PM
I think you should delete the description from the post above. This appears to be a class project/homework?

Perhaps you should ask whoever assigned the project if they are certain about the fastq requirement.
cambridge101 replied

12-18-2014, 03:22 PM
Though I don't feel I was doing anything unethical, others may disagree. Therefore, I deleted the complete requirements to avoid any conflict. Thank you.

Last edited by cambridge101; 12-19-2014, 08:08 AM. Reason: Information not necessary.
cambridge101 replied

12-18-2014, 03:09 PM
GenoMax - Thanks again for your suggestions and assistance. I'm just a little reluctant to say more about this project right now. I hope you understand.
GenoMax replied

12-18-2014, 03:01 PM
Tell us more about what the entire project is about. DNA sequence is just a part of it?
cambridge101 replied

12-18-2014, 02:57 PM
I have a few days to work on this project. I'm lost.
GenoMax replied

12-18-2014, 02:54 PM
Now you say

You are not going to find DNA sequences in FASTQ format that are 1 Mb in length unless you assemble them yourself while preserving the Q-scores (assembly programs do not include Q-scores in the final sequence).
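To illustrate why Q-scores matter here: a FASTQ record carries one quality character per base, which is exactly what a plain assembled sequence lacks. The sketch below reads simple four-line FASTQ records and decodes the quality string. The names `FastqRecord`, `read_fastq` and `phred_scores` are made up for this example, and the Phred+33 offset is an assumption (it is the common modern encoding, but older files can differ).

```python
from io import StringIO
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str  # one ASCII character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a simple four-lines-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@"):
            raise ValueError(f"bad FASTQ header: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a quality string into integer scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality]


# Tiny in-memory example record, invented for illustration.
example = StringIO("@read1\nACGT\n+\nII5!\n")
records = list(read_fastq(example))
```

Checking `len(record.sequence)` on real data quickly shows how far typical read lengths are from 1 Mb.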