
  • whole exome sequencing analysis

    Hello everybody.
    I am new to this forum, so I hope I am using it properly.
    I am working on whole exome sequencing data and I am pretty new to the field of bioinformatics, so I am facing lots of problems.
    I started the analysis by aligning with bwa and obtained my SAM and BAM files.
    After that I tried to run samtools mpileup (after sorting my BAM files and filtering out reads that were not aligned and paired), but the software stalls after a while.
    Is there a way to check whether my SAM or BAM files are OK?
    Or do you have any suggestions for pipelines more suitable for whole exome sequencing?
    Thank you all.

  • #2
    A couple of good links for exome pipelines: http://seqanswers.com/wiki/How-to/exome_analysis and https://www.broadinstitute.org/gatk/...best-practices

    Mpileup can take some time to run. What do you mean by "the software stalls after a while"? Are you able to see the process consuming CPU cycles in a process monitor (e.g. top)?



    • #3
      I am working with PuTTY, connected to a university server.
      When I say that samtools mpileup stalls, I mean that the connection between my computer and the server shuts down after roughly three quarters of an hour because no process is running any more. Mpileup runs for a while, as I can see from the output BCF file growing in size, but at a certain point the process stops, no more output is written and the connection closes.
      PuTTY is set to shut down the connection when no process is running.
      It seems that samtools reaches a point in the BAM file that it can no longer process.



      • #4
        If you feel that there is something wrong with your BAM file then you can use the ValidateSamFile tool from Picard to check it: http://broadinstitute.github.io/pica...alidateSamFile
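
A much cheaper first check than a full validation is to see whether the BAM file was truncated. The sketch below assumes a standard BGZF-compressed BAM, which the SAM/BAM specification says should end with a fixed 28-byte empty block; the function name `bam_has_eof_marker` is made up for illustration. It only detects a missing end-of-file marker and says nothing about whether the records inside are valid, so it complements ValidateSamFile rather than replacing it. Recent samtools releases also provide a `quickcheck` subcommand that performs a similar test.

```python
import os

# The 28-byte empty BGZF block that the SAM/BAM specification defines
# as the end-of-file marker of a complete BAM file.
BGZF_EOF = bytes.fromhex(
    "1f8b08040000000000ff0600424302001b0003000000000000000000"
)


def bam_has_eof_marker(path):
    """Return True if the file at `path` ends with the BGZF EOF marker.

    A False result suggests the file was truncated (for example, the
    program writing it was interrupted). A True result does not prove
    that the records inside the file are valid.
    """
    size = os.path.getsize(path)
    if size < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read only the last 28 bytes of the file.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF


# Example call (the file name is a placeholder):
# print(bam_has_eof_marker("sample.sorted.bam"))
```

If this returns False for the sorted and filtered BAM, the sorting or filtering step probably did not finish and should be rerun before trying mpileup again.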

        One other possibility is that at certain times firewalls (between you and the university server) are set to terminate an active ssh session following a period of "inactivity". If that is happening then you should submit the mpileup job (I am assuming that you are not using a compute cluster/job scheduler) using the "nohup" command so it continues to run in the background (http://linux.101hacks.com/unix/nohup-command/) even if your SSH session terminates for any reason.
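
If you would rather launch the job from a script, here is a rough Python sketch of the same idea. Note that it is not `nohup` itself: instead of ignoring the hangup signal, it starts the command in a new session (POSIX only) so that it is no longer attached to the terminal of the SSH login. The helper name `run_detached` and all file names are placeholders, and the mpileup command in the comment stands in for whatever command you were already running.

```python
import subprocess


def run_detached(command, stdout_path, stderr_path):
    """Start `command` detached from the current terminal session.

    Standard output and standard error are written to files so that the
    results and any error messages survive a dropped connection.
    Returns the subprocess.Popen object for the started process.
    """
    with open(stdout_path, "wb") as out, open(stderr_path, "wb") as err:
        return subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,   # never wait on terminal input
            stdout=out,
            stderr=err,
            start_new_session=True,     # POSIX: run the child in its own session
        )


# Example call (all names are placeholders, not taken from this thread):
# proc = run_detached(
#     ["samtools", "mpileup", "-uf", "reference.fa", "sample.sorted.bam"],
#     "sample.bcf",
#     "mpileup.log",
# )
# print("started with PID", proc.pid)
```

After logging back in, the log file and the size of the output file show whether the job is still making progress, and `top` shows whether the process is still alive.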



        • #5
          If you have paired-end fastq.gz files, you can upload them directly to HiPipe (http://hipipe.ncgm.sinica.edu.tw/) for exome analysis. HiPipe, which is driven by high-performance computing, has a few pre-configured pipelines for NGS data analysis, such as whole genome variant analysis and RNA-seq analysis. However, most of the pipelines are for human data only.



          • #6
            OK, thank you all for your valuable suggestions. I am trying everything you told me and I will let you know the result.
