NCBI SRA database


  • NCBI SRA database

    Hello,

    Is anyone familiar with the NCBI SRA database: http://www.ncbi.nlm.nih.gov/sites/entrez

    Searching SRA for SRP000607 (a Korean genome study), I got 5 experiments.

    What is the relationship between experiments, runs and spots?

    These 5 experiments were sampled from the same person and are all supposed to have paired reads, but SRX002757 does not have paired data.

    Under SRX002761, the read files look strange to me, for example:

    06/11/2009 12:00AM 239 SRR016027.fastq.gz
    06/11/2009 12:00AM 788,821,597 SRR016027_1.fastq.gz
    06/11/2009 12:00AM 797,621,364 SRR016027_2.fastq.gz
    06/11/2009 12:00AM 22,470 SRR016028.fastq.gz
    06/11/2009 12:00AM 809,891,610 SRR016028_1.fastq.gz
    06/11/2009 12:00AM 810,659,524 SRR016028_2.fastq.gz

    SRR016027_1.fastq.gz mates to SRR016027_2.fastq.gz, how about SRR016027.fastq.gz?

    I want to play with this dataset. Can I just use all the paired files in these 5 experiments and ignore the unpaired files like SRR016027.fastq.gz?

    There are lots of experts here, so any help will be appreciated!

  • #2
    This link may be helpful to you (it really should be featured more prominently on the SRA)
    http://www.ncbi.nlm.nih.gov/bookshel...cbi&part=Aug09

    Excerpt (I've added linebreaks for clarity). One might think that in your case each Experiment had different instrument parameters or library characteristics and somewhere it would be documented, but as far as I can tell these were all 80x1 runs. Weird.
    An Experiment describes specifically what was sequenced and the method used. It includes information about the source of the DNA, the Sample, the sequencing platform, and the processing of the data.

    Each Experiment is made up of one or more instrument Runs.

    A Run contains the results or reads from each spot in the instrument run.

    In the future, some data will also have an associated Analysis. These Analyses may include assemblies of the short reads into genomic or transcript contigs and alignment to existing genomes or alignments with SRA data.

    Records at each level have unique accession identifiers with a specific three letter prefix that indicates the type of record: ERP or SRP for Studies, SRS for samples, SRX for Experiments, and SRR for Runs.
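
As a rough illustration of the prefix scheme in the excerpt above, here is a minimal Python sketch. The table contains only the prefixes the excerpt names; the function name `record_type` is made up for this example, and any other prefix simply comes back as unknown.

```python
# Prefix table taken only from the excerpt above; other prefixes are not covered.
RECORD_TYPES = {
    "SRP": "Study",
    "ERP": "Study",
    "SRS": "Sample",
    "SRX": "Experiment",
    "SRR": "Run",
}


def record_type(accession):
    """Return the record type for an accession, or None if the prefix is unknown."""
    prefix = accession.strip().upper()[:3]
    return RECORD_TYPES.get(prefix)


if __name__ == "__main__":
    # Accessions mentioned in this thread.
    for acc in ("SRP000607", "SRX002761", "SRR016027"):
        print(acc, "->", record_type(acc))
```

So SRP000607 is the Study, SRX002761 is one of its Experiments, and SRR016027 is one Run inside that Experiment.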


    • #3
      Thank you, krobison

      That information is quite helpful.


      • #4
        SRR016027_1.fastq.gz mates to SRR016027_2.fastq.gz, how about SRR016027.fastq.gz?
        Hi! I've actually got the same question, albeit for a different dataset. If SRR123456_1.fastq mates with SRR123456_2.fastq, then what is the (much smaller), but still "properly" formatted and reasonably sized (~25 Mb in my case) SRR123456.fastq file???
        Thanks in advance!


        • #5
          Originally posted by dvanic
          Hi! I've actually got the same question, albeit for a different dataset. If SRR123456_1.fastq mates with SRR123456_2.fastq, then what is the (much smaller), but still "properly" formatted and reasonably sized (~25 Mb in my case) SRR123456.fastq file???
          Thanks in advance!
          I believe SRR123456.fastq contains the "leftovers": reads with missing mates (due to filtering etc. )
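
If that reading is right, the paired files can be picked out by name alone and the leftover file set aside. Below is a minimal Python sketch of that idea. It assumes the naming pattern seen in this thread (`SRRnnn_1`, `SRRnnn_2`, and a bare `SRRnnn` file); the function names are made up for this example, and it looks only at file names, not at the reads inside.

```python
import re
from collections import defaultdict

# Matches names like SRR016027.fastq.gz, SRR016027_1.fastq.gz, SRR016027_2.fastq
FASTQ_NAME = re.compile(r"^(SRR\d+)(?:_([12]))?\.fastq(?:\.gz)?$")


def group_by_run(filenames):
    """Group FASTQ file names by run accession into mate1, mate2 and unpaired."""
    runs = defaultdict(dict)
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match is None:
            continue  # not a run FASTQ file name; skip it
        accession, mate = match.groups()
        key = {"1": "mate1", "2": "mate2", None: "unpaired"}[mate]
        runs[accession][key] = name
    return dict(runs)


def paired_files(filenames):
    """Return (mate1, mate2) tuples for runs that have both mate files."""
    pairs = []
    for accession, files in sorted(group_by_run(filenames).items()):
        if "mate1" in files and "mate2" in files:
            pairs.append((files["mate1"], files["mate2"]))
    return pairs


if __name__ == "__main__":
    listing = [
        "SRR016027.fastq.gz", "SRR016027_1.fastq.gz", "SRR016027_2.fastq.gz",
        "SRR016028.fastq.gz", "SRR016028_1.fastq.gz", "SRR016028_2.fastq.gz",
    ]
    for mate1, mate2 in paired_files(listing):
        print(mate1, mate2)
```

The unpaired files stay available under the `unpaired` key, so they can still be inspected or aligned as single reads later if they turn out to matter.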


          • #6
            Hi! Can someone tell me how I can search SRA files through metadata features (whether in GEO, ENA, ...)? Thanks in advance!


            • #7
              Originally posted by VC87
              Hi! Can someone tell me how I can search SRA files through metadata features (whether in GEO, ENA, ...)? Thanks in advance!
              Not sure what exactly you are looking for but have you tried the advanced search: http://www.ncbi.nlm.nih.gov/sra/advanced


              • #8
                Yes, I have. I want to search all SRA files from bisulfite-seq libraries, fixing certain features such as organism, tissue, age, sex, etc. Thanks anyway for your reply!


                • #9
                  A search found this: http://sra.dbcls.jp/search

                  Project here: https://github.com/inutano/soylatte

                  R-solution: https://www.bioconductor.org/package...tml/SRAdb.html
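
Another possible route is to query the SRA through NCBI's E-utilities. The Python sketch below only builds an ESearch URL and does not contact NCBI. The field tags `[Strategy]` and `[Organism]` and the value "bisulfite seq" are assumptions based on the advanced search builder, so check them there first; features such as tissue, age, or sex may not be searchable as fields at all. The function names are made up for this example.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_sra_query(filters):
    """Join (value, field) pairs into an Entrez-style query: "value"[Field] AND ..."""
    return " AND ".join(f'"{value}"[{field}]' for value, field in filters)


def build_esearch_url(term, retmax=20):
    """Build an ESearch URL against the sra database (no network access here)."""
    params = {"db": "sra", "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Field names and values below are assumptions; verify them in the advanced search.
    term = build_sra_query([
        ("bisulfite seq", "Strategy"),
        ("Homo sapiens", "Organism"),
    ])
    print(term)
    print(build_esearch_url(term))
```

Pasting the printed query term into the advanced search page is a quick way to see whether the fields behave as expected before scripting anything.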


                  • #10
                    Genomax, thanks for your reply! I'll check that out.


                    • #11
                      Does anyone know how to get the raw SRA files associated with the samples that can be searched in the browser of the NCBI epigenomics database? I suppose it should be possible to get them from the sample ID, but I don't know how.


                      • #12
                        Do you want the SRA files or the fastq files?


                        • #13
                          SRA, for now


                          • #14
                            SRAtoolkit makes it easy to download the actual fastq data since you would have to uncompress the SRA files locally anyway. The toolkit saves you a step. You are most likely going to use the "fastq-dump" program. Help here: http://www.ncbi.nlm.nih.gov/Traces/s...ew=toolkit_doc
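
For anyone scripting this, here is a minimal Python sketch that wraps a `fastq-dump` call. It assumes the SRA Toolkit is installed and on the PATH; the wrapper function names are made up for this example. The `--split-3` option writes paired reads to `_1` and `_2` files and puts any reads without a mate into a separate file, which would explain the small extra file asked about earlier in this thread.

```python
import shutil
import subprocess


def build_fastq_dump_command(accession, outdir="."):
    """Build a fastq-dump command that splits mates and gzips the output."""
    return ["fastq-dump", "--split-3", "--gzip", "--outdir", outdir, accession]


def run_fastq_dump(accession, outdir="."):
    """Run fastq-dump for one run accession; requires the SRA Toolkit on PATH."""
    if shutil.which("fastq-dump") is None:
        raise RuntimeError("fastq-dump not found on PATH; install the SRA Toolkit first")
    subprocess.run(build_fastq_dump_command(accession, outdir), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_fastq_dump() to actually download.
    print(" ".join(build_fastq_dump_command("SRR016027")))
```

Whole-genome runs are large, so it is worth trying the command on one run before looping over every run in an experiment.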


                            • #15
                              Thanks again. By the way, do you know if it is possible to convert wig to fasta (or SRA)?
