  • GenoMax
    replied
    Harold: Take a look at ART as an additional option. http://www.niehs.nih.gov/research/re...tatistics/art/


  • HESmith
    replied
    I want to generate synthetic reads to benchmark some variant-calling pipelines. The strategy is to introduce variants into the reference genome, generate random reads from this reference, then align to the non-variant reference. BBTools randomreads.sh seems perfect for this application, esp. the ability to introduce errors into the reads. Two questions:

    1) I'd like to generate ~40M 50bp SE reads. Is there an upper limit on the number of independent reads that can be produced?

    2) I'd also like to create heterozygous variant reads (by combining the output from a second variant reference). I noticed that randomreads produces the identical output (chromosome/position) by default. Is there a way to change the random generator (e.g., by specifying a different seed) to obtain different output reads?
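The strategy described above (plant variants in the reference, then mix reads from the plain and variant references to mimic heterozygous sites) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how randomreads.sh works internally: the function names `introduce_snvs`, `sample_reads` and `heterozygous_reads` are hypothetical, only single-nucleotide variants are planted, and the reads are error-free. The point it shows is that giving each sampling pass its own seed keeps every run reproducible while letting the two passes draw different positions.

```python
import random

def introduce_snvs(reference, n_variants, seed):
    """Return (mutated sequence, truth list of (pos, ref_base, alt_base))."""
    rng = random.Random(seed)  # fixed seed: the planted variant set is reproducible
    seq = list(reference)
    truth = []
    for pos in sorted(rng.sample(range(len(seq)), n_variants)):
        ref_base = seq[pos]
        alt_base = rng.choice([b for b in "ACGT" if b != ref_base])
        seq[pos] = alt_base
        truth.append((pos, ref_base, alt_base))
    return "".join(seq), truth

def sample_reads(sequence, n_reads, read_len, seed):
    """Sample error-free reads at uniform random starts; returns (start, read) pairs."""
    rng = random.Random(seed)
    last_start = len(sequence) - read_len
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    return [(s, sequence[s:s + read_len]) for s in starts]

def heterozygous_reads(reference, variant_ref, n_reads, read_len, seed):
    """Draw half the reads from each haplotype, using a different seed per pass."""
    half = n_reads // 2
    ref_reads = sample_reads(reference, half, read_len, seed)
    var_reads = sample_reads(variant_ref, n_reads - half, read_len, seed + 1)
    return ref_reads + var_reads
```

The truth list returned by `introduce_snvs` is what a variant caller's output would be compared against after aligning the reads to the unmodified reference.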

    Thanks,
    Harold


  • Brian Bushnell
    replied
    I wrote a program for that purpose; it's part of BBTools. Basic usage:

    randomreads.sh ref=genome.fasta out=reads.fq len=100 reads=10000

    You can specify paired reads, an insert size distribution, read lengths (or length ranges), and so forth. But because I developed it to benchmark mapping algorithms, it is specifically designed to give excellent control over mutations. You can specify the number of SNPs, insertions, deletions, and Ns per read, either exactly or probabilistically; the lengths of these events are individually customizable; the quality values can alternatively be set so that errors are generated on the basis of quality; there's a PacBio error model; and all of the reads are annotated with their genomic origin, so you will know the correct answer when mapping.
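To make the idea of probabilistic per-read errors plus origin annotation concrete, here is a minimal Python sketch. It is not the randomreads.sh implementation: the function `simulate_read`, its rate parameters, and the `chrom_start_end` read-name layout are all assumptions chosen for illustration, and the real tool's naming scheme and error model may differ.

```python
import random

def simulate_read(reference, chrom, read_len, rng,
                  sub_rate=0.01, ins_rate=0.001, del_rate=0.001):
    """Sample one read and apply per-base errors; returns (name, sequence).

    The name records the 0-based half-open origin as chrom_start_end
    (a hypothetical layout), so the true location survives alignment.
    """
    start = rng.randint(0, len(reference) - read_len)
    bases = []
    for base in reference[start:start + read_len]:
        r = rng.random()
        if r < del_rate:
            continue  # deletion: drop this reference base
        if r < del_rate + ins_rate:
            bases.append(rng.choice("ACGT"))  # insertion: extra base before this one
            bases.append(base)
        elif r < del_rate + ins_rate + sub_rate:
            # substitution: replace with a different base
            bases.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            bases.append(base)
    name = f"{chrom}_{start}_{start + read_len}"
    return name, "".join(bases)
```

Because insertions and deletions change the number of emitted bases, the returned sequence can be slightly longer or shorter than `read_len`; the name always records the reference span that was sampled. Passing in a `random.Random(seed)` instance keeps the output reproducible.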

    For usage information, run the shell script with no arguments (or open it in a text editor).

    I also have a couple of programs for grading sam files generated using these reads by parsing the read names (samtoroc.sh and gradesam.sh).
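The grading idea can be illustrated with a short Python sketch that reads SAM records and compares each alignment against the origin encoded in the read name. This is not samtoroc.sh or gradesam.sh: the function `grade_sam_lines` is hypothetical, and it assumes read names of the form `chrom_start_end` with a 0-based start, which is an invented layout and not necessarily what randomreads.sh writes. Only the SAM column order, the 1-based POS field and the 0x4 unmapped flag come from the SAM format itself.

```python
def grade_sam_lines(sam_lines, tolerance=0):
    """Count correct, incorrect and unmapped alignments from SAM text lines."""
    counts = {"correct": 0, "incorrect": 0, "unmapped": 0}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4:  # SAM flag bit 0x4 marks an unmapped read
            counts["unmapped"] += 1
            continue
        # Assumed name layout: chrom_start_end, with a 0-based start
        true_chrom, true_start, _ = qname.rsplit("_", 2)
        # SAM POS is 1-based, so convert before comparing
        if rname == true_chrom and abs((pos - 1) - int(true_start)) <= tolerance:
            counts["correct"] += 1
        else:
            counts["incorrect"] += 1
    return counts
```

A nonzero `tolerance` is useful when the aligner soft-clips read ends or when indels near the start shift the reported position by a few bases.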


  • How to generate random short reads from a reference genome

    Hi,

    I am in the process of generating accuracy benchmarks for different assembly algorithms/tools (including one of my own). Is there an established procedure or common tool to generate simulated short reads from a reference genome?

    I know bedtools can generate reads at random positions in the genome, but from what I have read, they will be exact matches to the genome (just randomly placed).

    I am looking for a tool that is able to add 'noise' to these generated reads in the form of insertions, deletions or mutations on single or few nucleotides.

    Does such a tool or procedure exist?

    Thanks in advance.
