
  • cambridge101
    replied
    Originally posted by GenoMax
    Look for studies that have > 10 samples (since you need 10) or take 10 samples from different cancer types.



    http://sra.dnanexus.com/?result_type...q=cancer+exome
    Thank you. I think I want to get the same region from 10 different people.

    Access an online cancer related DNA database resource and select ten (10) DNA sequence strings of length at least 1Mb related to a cancer gene from ten different individuals. Make sure the sequence data is in FASTQ format and stored in one file “DNA.fas”.



  • Richard Finney
    replied
    The comments here have links to sequences for PUBLIC human cancers ...



    BGI liver cancer
    Seoul Genomic Medicine Institute lung cancer
    Changhai Hospital prostate cancer
    MD Anderson Asian Gastric cancer

    I think the data is in NCBI's SRA.

    You'll need a lot of disk space and, if you're relatively new, a lot of patience.

    Sadly, a "bam slicer" that cuts out the reads for a region isn't available, though they say some folks are working on it.



  • GenoMax
    replied
    Look for studies that have > 10 samples (since you need 10) or take 10 samples from different cancer types.



    Last edited by GenoMax; 12-20-2014, 02:07 PM.



  • cambridge101
    replied
    Alright... Let's say my oncogene of interest is in the region of 11:15000000..16000000. Therefore, all I need is that region from 10 different people.

    Problem:
    Where do I find that data???

    Any assistance is appreciated.
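
A minimal sketch of how the region quoted above could be handled in code. It only parses the `chrom:start..end` string, reports its length, and rewrites it in the `chrom:start-end` form that tools such as samtools accept for indexed alignment files. The function names are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
import re


def parse_region(text):
    """Parse 'chrom:start..end' or 'chrom:start-end' into (chrom, start, end)."""
    match = re.fullmatch(r"\s*(\w+):([\d,]+)(?:\.\.|-)([\d,]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised region: {text!r}")
    chrom = match.group(1)
    # Allow thousands separators such as 15,000,000.
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in region: {text!r}")
    return chrom, start, end


def region_length(start, end):
    """Length in bases, assuming 1-based inclusive coordinates."""
    return end - start + 1


def to_samtools_region(chrom, start, end):
    """Format the region as 'chrom:start-end'."""
    return f"{chrom}:{start}-{end}"


chrom, start, end = parse_region("11:15000000..16000000")
print(to_samtools_region(chrom, start, end), region_length(start, end))
```

The same region string would then be applied to each of the ten individuals' alignment files; where those files come from is the open question in this thread.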



  • GenoMax
    replied
    I like SNPsaurus' interpretation. 1 Mb (total amount of data or length of region covered) worth of fastq reads in and/or around a cancer gene makes sense.

    @SNPsaurus: Main input for the assignment is still in post #6. The rest of the assignment was informatics goals.

    This will entail a significant amount of work (data collection part) and I hope the assignment has an appropriate amount of credit (unless it is a PhD qualifier exam).



  • SNPsaurus
    replied
    I only vaguely remember the details of the project from your deleted post, but if I were to guess what a reasonable assignment would be, it would be to select a cancer gene, then extract 1 Mb of sequence around the gene from ten different individual genomes, then analyze those 1 Mb regions for the various things asked for in the post.

    You aren't going to find 1 Mb fastq reads, but you can find different individual genomes, or even different "cancer" genomes. You can definitely identify genes related to cancer. As others have said, I'd check back with the assigner of this project for clarification.

    edit: I teach an upper level course in genomic methods and analysis, so am definitely curious what this assignment is about!
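
One way to read "1 Mb of sequence around the gene" is a fixed-size window centred on the gene and clamped to the chromosome ends. The sketch below shows only that arithmetic. The function name, the gene coordinates and the chromosome length are made-up illustrations, not values from any annotation, and it assumes the gene is shorter than the window.

```python
def window_around_gene(gene_start, gene_end, chrom_length, window=1_000_000):
    """Return 1-based inclusive (start, end) of a window centred on a gene.

    Assumes the gene is shorter than the window; the window is shifted,
    not shrunk, when it would run past either end of the chromosome.
    """
    if not (1 <= gene_start <= gene_end <= chrom_length):
        raise ValueError("gene coordinates must lie within the chromosome")
    window = min(window, chrom_length)
    centre = (gene_start + gene_end) // 2
    start = centre - window // 2
    end = start + window - 1
    if start < 1:
        # Window would start before base 1: pin it to the chromosome start.
        start, end = 1, window
    elif end > chrom_length:
        # Window would run off the end: pin it to the chromosome end.
        start, end = chrom_length - window + 1, chrom_length
    return start, end


# Hypothetical gene at 15,400,000-15,600,000 on a 135 Mb chromosome.
print(window_around_gene(15_400_000, 15_600_000, 135_000_000))
```

Using identical window coordinates for all ten individuals keeps the regions directly comparable.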



  • cambridge101
    replied
    Originally posted by GenoMax
    One place to look for long sequences (in fastq format) will be here: http://www.pacificbiosciences.com/ne.../publications/ I see only a couple of cancer-related publications, and they are from 2012 (when the reads were not as long as they are today).

    If you drop the cancer requirement then you will get some really long reads here: http://blog.pacificbiosciences.com/2...d-shotgun.html They are not going to be 1Mb so you would still need to do some assembly.
    Thanks! I'm not sure I can drop the cancer requirement. I checked: http://www.pacificbiosciences.com/ne.../publications/ I couldn't locate the data I need.

    I'm open to other suggestions if anyone has any.



  • GenoMax
    replied
    One place to look for long sequences (in fastq format) will be here: http://www.pacificbiosciences.com/ne.../publications/ I see only a couple of cancer-related publications, and they are from 2012 (when the reads were not as long as they are today).

    If you drop the cancer requirement then you will get some really long reads here: http://blog.pacificbiosciences.com/2...d-shotgun.html They are not going to be 1Mb so you would still need to do some assembly.



  • cambridge101
    replied
    Thanks for your suggestion. I don't think I'll delete it [Did decide to delete]. I don't feel that I'm doing anything unethical. I hope that it's clear that I'm not asking for anyone to complete the project for me. I'm just having a hard time finding the FASTQ data I need.
    Last edited by cambridge101; 12-19-2014, 08:10 AM. Reason: Inconsistent with edit to earlier post.



  • GenoMax
    replied
    I think you should delete the description from the post above. This appears to be a class project/homework?

    Perhaps you should ask whoever assigned the project if they are certain about the fastq requirement.



  • cambridge101
    replied
    Though I don't feel I was doing anything unethical, others may disagree. Therefore, I deleted the complete requirements to avoid any conflict. Thank you.
    Last edited by cambridge101; 12-19-2014, 08:08 AM. Reason: Information not necessary.



  • cambridge101
    replied
    GenoMax - Thanks again for your suggestions and assistance. I'm just a little reluctant to say more about this project right now. I hope you understand.



  • GenoMax
    replied
    Tell us more about what the entire project is about. DNA sequence is just a part of it?



  • cambridge101
    replied
    I have a few days to work on this project. I'm lost.



  • GenoMax
    replied
    Now you say

    You are not going to find DNA sequences in fastq format that are 1Mb in length unless you assemble them yourself preserving the Q-scores (assembly programs do not include Q-scores in final sequence).
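
To illustrate why the Q-scores matter here: a FASTQ record pairs every base with a quality character, so a 1 Mb FASTQ "sequence" would need a quality value for each of its million bases. The sketch below is a minimal reader, assuming the common four-lines-per-record layout and Phred+33 quality encoding; the function name and the example read are made up for illustration.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, qualities) from a 4-line-per-record FASTQ stream.

    Assumes Phred+33 encoding; multi-line FASTQ records are not handled.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # skip blank lines between records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        # One Phred score per base: ASCII code minus 33.
        yield header[1:], seq, [ord(c) - 33 for c in qual]


example = io.StringIO("@read1\nACGT\n+\nII5!\n")
records = list(read_fastq(example))

# Keep only records that meet the 1 Mb length requirement.
long_enough = [name for name, seq, _ in records if len(seq) >= 1_000_000]
print(records, long_enough)
```

Run over real sequencer output, a length filter like this would be expected to return nothing, which is the point made above.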
