  • fkrueger
    Senior Member
    • Sep 2009
    • 627

    #46
    The methylation information can be imported into SeqMonk whereby '+' reads are methylated and '-' reads are non-methylated cytosines. Use the position value for both start and end of the cytosine methylation calls. You can then perform a probe generation over individual C positions (e.g. read position probe generation) and do a relative quantitation of 'FORWARD' reads 'as percentage of' 'ALL READS'. You could also look at other genomic features such as CGIs, promoters etc.
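As a rough sketch of the import layout described above, the snippet below turns methylation calls into rows where the cytosine position is used for both start and end, and then computes "forward reads as a percentage of all reads" per position. It assumes tab-separated input with the columns read id, methylation state ('+' or '-'), chromosome, position and call letter; the function names are hypothetical and this is not SeqMonk's own code.

```python
def calls_to_import_rows(lines):
    """Yield (chromosome, start, end, strand) for each methylation call.

    Assumed columns (tab separated): read id, '+' or '-', chromosome,
    position, call letter. Header or malformed lines are skipped.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4 or fields[1] not in ("+", "-"):
            continue
        chromosome, position = fields[2], int(fields[3])
        # The single cytosine position serves as both start and end.
        yield chromosome, position, position, fields[1]


def percent_methylated(rows):
    """Percentage of '+' calls among all calls at each (chromosome, position)."""
    counts = {}
    for chromosome, start, _end, strand in rows:
        methylated, total = counts.get((chromosome, start), (0, 0))
        counts[(chromosome, start)] = (methylated + (strand == "+"), total + 1)
    return {pos: 100.0 * m / t for pos, (m, t) in counts.items()}


# Made-up example data, for illustration only.
example = [
    "a header line that will be skipped\n",
    "read1\t+\tchr1\t1000\tZ\n",
    "read2\t-\tchr1\t1000\tz\n",
    "read3\t+\tchr1\t1000\tZ\n",
    "read4\t-\tchr2\t50\tz\n",
]
rows = list(calls_to_import_rows(example))
print(percent_methylated(rows))
```

The percentage here mirrors the relative quantitation of 'FORWARD' reads against 'ALL READS', since '+' calls are imported as forward-strand reads.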

    If you are not necessarily interested in strand specific methylation you can also import *bismark.txt files directly into SeqMonk using the Bismark import filter where you can select the context you are interested in. Hope this helps.

    Comment

    • simonandrews
      Simon Andrews
      • May 2009
      • 870

      #47
      SeqMonk v0.19.0 released

      As a somewhat belated Christmas present I've just put up the release of SeqMonk v0.19.0 onto our project web page. In this release we've made some fairly major changes to the core data model, which mean that we get a significant increase in loading and saving speeds (load times are around half what they were before), along with a big decrease in the running memory footprint (down by around 4X) as well as a nice speed increase in many of the analysis functions.

      Along with this we've improved some of the plots (aligned probes and probe trend), and have put some new display options into the main chromosome view (greater raw read density, fixed colours for individual datasets).

      We've been running this build internally for a while and have seen large increases in the amount of data we've been able to handle - along with a pleasant reduction in the amount of time we spend watching little red bars slowly crawl across the screen.

      The updated version is available from our project page. As always, if you don't see the new version try pressing shift+refresh in your browser to bypass the annoying BBSRC proxy server.

      If you have any problems, either add a note to this thread, or report them in our bugzilla system.

      Comment

      • beajorrin
        Junior Member
        • Jan 2012
        • 6

        #48
        I'm trying to visualize my data with SeqMonk. My data are Illumina paired-end sequences. I worked first in Galaxy and ran Bowtie there, so now I have SAM and BAM files. I could import my reference genome after changing the AC, product and locus_tag entries. I tried the BAM file first, but when I import it and SeqMonk reads it, it tells me "Couldn't extract a valid name from <name>".
        So I went to the reference genome that I used in Galaxy (the same one I used in SeqMonk) and changed the AC/ID in the FASTA file to the one I use in the SeqMonk reference genome. The result is the same.

        I haven't tried the SAM file yet, but I think the problem is the reference genome used in Galaxy.

        Thanks

        Comment

        • simonandrews
          Simon Andrews
          • May 2009
          • 870

          #49
          If your reference genome is chromosome based, but the identifiers are not chromosome names but accession numbers or something similar then you need to define some custom chromosome name mappings so SeqMonk can figure out which accession refers to which chromosome. Once you have the mappings set up then the import should work.
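To make the idea of a name mapping concrete, here is a small sketch that applies an accession-to-chromosome table to SAM lines before import. This is an alternative workaround and not SeqMonk's own mapping mechanism; the accession "ACC0001" and the function name are made up for the example.

```python
def rename_reference(sam_line, mapping):
    """Replace accession-style sequence names in one SAM line.

    `mapping` is a dict from accession to chromosome name. Names that are
    not in the mapping are left unchanged.
    """
    fields = sam_line.rstrip("\n").split("\t")
    if sam_line.startswith("@SQ"):
        # Header line: the sequence name lives in the SN: tag.
        fields = [
            "SN:" + mapping.get(f[3:], f[3:]) if f.startswith("SN:") else f
            for f in fields
        ]
    elif not sam_line.startswith("@"):
        # Alignment line: column 3 is RNAME, column 7 is RNEXT.
        fields[2] = mapping.get(fields[2], fields[2])
        if fields[6] not in ("=", "*"):
            fields[6] = mapping.get(fields[6], fields[6])
    return "\t".join(fields) + "\n"


mapping = {"ACC0001": "1"}  # hypothetical accession -> chromosome name
print(rename_reference("@SQ\tSN:ACC0001\tLN:1000\n", mapping), end="")
print(rename_reference("r1\t0\tACC0001\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n", mapping), end="")
```

Defining the mappings inside SeqMonk, as described above, avoids having to rewrite the files at all, so this is only useful if you want the names fixed at the source.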

          Comment

          • goofy
            Junior Member
            • Jan 2012
            • 1

            #50
            Total Read Count Difference

            Hi,

            I'm trying to quantitate the percentage distribution of TF enrichment in my control and treated samples, but I got a massive difference in Total Read Count between the samples, and I'm wondering what it means. I've designed probes based on promoter, intron and exon regions and they're OK, but I want to normalize against the total read count. The total read counts for my other ChIP-Seq experiments are relatively similar between control and treated samples, but this TF ChIP-Seq has a massive difference. Does anyone know what this means?

            Comment

            • beajorrin
              Junior Member
              • Jan 2012
              • 6

              #51
              Originally posted by simonandrews View Post
              If your reference genome is chromosome based, but the identifiers are not chromosome names but accession numbers or something similar then you need to define some custom chromosome name mappings so SeqMonk can figure out which accession refers to which chromosome. Once you have the mappings set up then the import should work.
              Thanks, It works!

              Comment

              • simonandrews
                Simon Andrews
                • May 2009
                • 870

                #52
                Originally posted by goofy View Post
                Hi,

                I'm trying to quantitate the percentage distribution of TF enrichment in my control and treated samples, but I got a massive difference in Total Read Count between the samples, and I'm wondering what it means. I've designed probes based on promoter, intron and exon regions and they're OK, but I want to normalize against the total read count. The total read counts for my other ChIP-Seq experiments are relatively similar between control and treated samples, but this TF ChIP-Seq has a massive difference. Does anyone know what this means?
                Total read count isn't always a great thing to normalise to. In some cases (particularly in ChIP samples) you can get a huge number of sequences mapping to a small number of loci. Often these will be mis-mappings, maybe even of regions which aren't in the assembly (telomeric or centromeric repeats for example). We've seen cases where 40% of reads in a ChIP (a MeDIP actually) came from this kind of sequence and mapped to just 12 locations. This kind of bias can hugely throw off your normalisation.

                Within SeqMonk you can use the cumulative distribution plot to look at how well your samples are normalised. If your total count has thrown off the normalisation then you'll probably see lines running parallel to each other. In this case you can then use the percentile normalisation quantitation method to correct your normalisation to a specific point in your distribution where the distributions look to be equivalent, and this should remove any odd biases in the total counts.
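To illustrate why a handful of loci can throw off total count normalisation, here is a toy sketch with made-up numbers. It compares scaling by the total count with scaling to a chosen percentile of the distribution. The nearest-rank percentile and the function name are assumptions for the example, and this is not how SeqMonk implements its percentile normalisation.

```python
import math


def percentile_value(counts, percentile):
    """Value at the given percentile of `counts` (nearest-rank method)."""
    ordered = sorted(counts)
    rank = max(1, math.ceil(percentile / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Made-up data: 100 probes per sample. The samples are identical except
# that one probe in the treated sample soaks up a huge number of reads,
# for example from mis-mapped repeats.
control = [10] * 100
treated = [10] * 99 + [10000]

# Normalising to the total count makes an ordinary probe look roughly
# ten-fold lower in the treated sample, purely because of the one artefact.
print(control[0] / sum(control), treated[0] / sum(treated))

# Normalising to the 75th percentile ignores the artefact, so ordinary
# probes end up with the same value in both samples.
print(control[0] / percentile_value(control, 75),
      treated[0] / percentile_value(treated, 75))
```

The percentile you pick should be a point where the cumulative distributions of your samples look equivalent, as described above; 75 is just a placeholder here.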

                I'm actually going to be releasing our Advanced SeqMonk course documentation in the next couple of weeks, and there will be a whole section on sorting out data normalisation which will go through these kinds of issues in much more detail.

                Comment

                • simonandrews
                  Simon Andrews
                  • May 2009
                  • 870

                  #53
                  I've just released SeqMonk v0.20.0 onto our repositories. This addresses a potentially nasty bug in v0.19 which may have truncated some filtered probe lists in any projects saved with that version.

                  The bug would affect you if your probe set contained multiple probes at exactly the same genomic position. In practice this only really happens if you make feature based probes and don't select the option to remove exact duplicates. If you made probe sets like this in v0.19.0 you should recalculate any filtered lists you have made with that version. Most of these won't actually have been affected, but since we can't spot a truncated list automatically it's better to be safe than sorry.
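If you have your probe set as a list of positions outside SeqMonk, a quick way to see whether it contains probes at exactly the same position is sketched below. It assumes probes are (chromosome, start, end, name) tuples and that "same position" means identical chromosome, start and end; both are assumptions for the example rather than a description of SeqMonk's internals.

```python
from collections import Counter


def duplicated_positions(probes):
    """Return the (chromosome, start, end) positions shared by 2+ probes."""
    counts = Counter((chrom, start, end) for chrom, start, end, _name in probes)
    return sorted(pos for pos, n in counts.items() if n > 1)


# Made-up probes: two features that happen to cover the same coordinates.
probes = [
    ("1", 100, 500, "geneA"),
    ("1", 100, 500, "geneA_alt"),
    ("1", 900, 1500, "geneB"),
]
print(duplicated_positions(probes))
```

An empty result would suggest the probe set could not have triggered the bug, although recalculating the filtered lists is still the safe option.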

                  The gory details of the bug can be found on our bugzilla server.

                  Other changes in this release are:
                  • We fixed a bug in the Intensity Difference Filter which was adding the same hit multiple times. All reported hits were real hits, but some may have been duplicated.
                  • We fixed a display bug for deduplicated HiC data when it was first imported. Saving and reloading the project would fix the problem.
                  • We added a new quantitation pipeline to allow you to easily make 'wiggle' type plots.


                  The new version is now available from our project page and all users of the previous version are strongly advised to upgrade immediately.

                  Comment

                  • mediator
                    Member
                    • Nov 2010
                    • 27

                    #54
                    Hi Simon,
                    I am using SeqMonk to analyze my RNA-Seq data right now. It's very straightforward and intuitive. I just have one question: after I use the quantitation pipeline to perform the RPKM calculation on my data, how do I save the RPKM values for all the probes to an export file? Thank you!



                    Comment

                    • simonandrews
                      Simon Andrews
                      • May 2009
                      • 870

                      #55
                      Originally posted by mediator View Post
                      Hi Simon,
                      I am using SeqMonk to analyze my RNA-Seq data right now. It's very straightforward and intuitive. I just have one question: after I use the quantitation pipeline to perform the RPKM calculation on my data, how do I save the RPKM values for all the probes to an export file? Thank you!
                      Simply create an annotated probe report (Reports > Create Annotated Probe Report). You don't actually need to add any additional annotation as the probes themselves will be named after the transcript to which they relate.
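For reference, the standard RPKM definition (reads per kilobase of feature per million mapped reads) can be sketched as below. This is the textbook formula only; it does not reproduce the options of SeqMonk's quantitation pipeline, and the function name is made up.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


# Made-up example: 500 reads on a 2,000 bp transcript in a sample
# with 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))
```

The values in the annotated probe report are the quantitated values for each probe, so after the RPKM pipeline they are what ends up in the export.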

                      Comment

                      • beajorrin
                        Junior Member
                        • Jan 2012
                        • 6

                        #56
                        I really think that SeqMonk is very useful, but I have a problem. I'm working with Illumina paired-end reads. I've trimmed my reads by quality, mapped them with Bowtie and finally converted the output from SAM to BAM. I've visualized it with SeqMonk, and I've observed that my reads are assembled (and Bowtie doesn't assemble, it just maps). It doesn't happen if I don't trim my data. What could be the problem?
                        Thanks

                        Comment

                        • simonandrews
                          Simon Andrews
                          • May 2009
                          • 870

                          #57
                          Originally posted by beajorrin View Post
                          I really think that SeqMonk is very useful, but I have a problem. I'm working with Illumina paired-end reads. I've trimmed my reads by quality, mapped them with Bowtie and finally converted the output from SAM to BAM.
                          OK, I'm with you so far (but for the record you could have left out the last step since SeqMonk would have read the SAM files directly - and doesn't care whether they're sorted or not).

                          Originally posted by beajorrin View Post
                          I've visualized it with SeqMonk, and I've observed that my reads are assembled (and Bowtie doesn't assemble, it just maps). It doesn't happen if I don't trim my data. What could be the problem?
                          I'm not sure what you mean here when you say your reads are assembled. SeqMonk will pack your mapped reads together so you can see as many as possible on the screen, but this isn't an assembly - it's just showing the positions of the reads in the existing genome assembly you mapped against with bowtie. You should have got this whether your data was trimmed or not (except that your untrimmed data might have been more spread out since the mapping efficiency might have been much lower). Could you describe (or post small pictures of) exactly what you're seeing which concerns you?

                          Comment

                          • beajorrin
                            Junior Member
                            • Jan 2012
                            • 6

                            #58
                            Originally posted by simonandrews View Post
                            OK, I'm with you so far (but for the record you could have left out the last step since SeqMonk would have read the SAM files directly - and doesn't care whether they're sorted or not).



                            I'm not sure what you mean here when you say your reads are assembled. SeqMonk will pack your mapped reads together so you can see as many as possible on the screen, but this isn't an assembly - it's just showing the positions of the reads in the existing genome assembly you mapped against with bowtie. You should have got this whether your data was trimmed or not (except that your untrimmed data might have been more spread out since the mapping efficiency might have been much lower). Could you describe (or post small pictures of) exactly what you're seeing which concerns you?
                            Hi!
                            First, thanks for your quick answer.

                            What I see is different read lengths in my data. The reads in my original data are at least 100 bp, but when I visualize them with SeqMonk I have reads of 9000 bp or more. Could it be because of the maximum insert size for valid paired-end alignments? I've set it to 10000. Could SeqMonk be joining reads that are far away from each other? Or is it how I mapped the reads?
                            Thanks

                            (I've uploaded an image)
                            Attached Files

                            Comment

                            • simonandrews
                              Simon Andrews
                              • May 2009
                              • 870

                              #59
                              Originally posted by beajorrin View Post
                              Hi!
                              First, thanks for your quick answer.

                              What I see is different read lengths in my data. The reads in my original data are at least 100 bp, but when I visualize them with SeqMonk I have reads of 9000 bp or more. Could it be because of the maximum insert size for valid paired-end alignments? I've set it to 10000. Could SeqMonk be joining reads that are far away from each other? Or is it how I mapped the reads?
                              Thanks

                              (I've uploaded an image)
                              Ah, OK. When you import paired end data SeqMonk displays the inferred insert from the paired set of reads. If you have two reads from the same transcript which mapped 100,000 bases apart then you'll see a read which is 100,000 bases long. Because of this SeqMonk sets a limit on how far apart paired end reads can be. The default is 1 kb, which is about the limit for insert sizes on the Illumina platform. Unless you're working on a platform which can actually work with much longer insert sizes, you probably don't want to increase this.
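The idea of an inferred insert with a distance limit can be sketched roughly as follows. Reads are (chromosome, start, end) tuples, the function name is made up, and simply dropping pairs that exceed the limit is an assumption for the example rather than a description of exactly what SeqMonk does with them.

```python
def inferred_fragment(read1, read2, max_distance=1000):
    """Return the span covered by a read pair, or None if it is implausible.

    Each read is a (chromosome, start, end) tuple with inclusive coordinates.
    """
    if read1[0] != read2[0]:
        return None  # mates on different chromosomes cannot form one insert
    start = min(read1[1], read2[1])
    end = max(read1[2], read2[2])
    if end - start + 1 > max_distance:
        return None  # further apart than the allowed limit
    return read1[0], start, end


# Two 100 bp mates 300 bases apart are shown as one 500 bp "read".
print(inferred_fragment(("chr1", 100, 199), ("chr1", 500, 599)))
# With the limit raised to 10000, mates about 9 kb apart become one long "read".
print(inferred_fragment(("chr1", 100, 199), ("chr1", 9000, 9099), max_distance=10000))
```

This is why raising the limit to 10000 produces displayed reads of 9000 bp or more from 100 bp sequences.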

                              Looking at the screenshot you posted you seem to have a big discrepancy between the number of reads mapped before and after trimming your data. This leads me to suspect that something may have gone wrong with your mapping of the trimmed data. When you trim your data you need to ensure that you keep the sequences in your two fastq files exactly paired, i.e. if you trim one sequence down to no bases, then you still need to leave it in the file, or remove it completely from both fastq files, so that bowtie always sees correctly paired sequences when it does the paired end mapping. My initial guess would be that your fastq files have ended up with different numbers of reads in them, causing your data to be mispaired, which will lead to this odd kind of pairing.
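A quick way to check whether two fastq files are still exactly paired is sketched below. It assumes plain 4-line fastq records and read names that match once any trailing /1 or /2 and any comment after whitespace are removed; the function names are made up for the example.

```python
from itertools import zip_longest


def read_ids(fastq_lines):
    """Yield the read name from every 4-line fastq record."""
    for index, line in enumerate(fastq_lines):
        if index % 4 == 0:
            # Drop the leading '@', any comment after whitespace,
            # and a trailing /1 or /2 mate suffix.
            name = line[1:].split()[0]
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mispaired(fastq1_lines, fastq2_lines):
    """Return (record number, name1, name2) for the first mismatch, else None.

    A name of None means that file ran out of records first.
    """
    pairs = zip_longest(read_ids(fastq1_lines), read_ids(fastq2_lines))
    for number, (name1, name2) in enumerate(pairs, start=1):
        if name1 != name2:
            return number, name1, name2
    return None


# Made-up example: the second record of the mate file belongs to another read.
mates1 = ["@a/1\n", "ACGT\n", "+\n", "IIII\n", "@b/1\n", "ACGT\n", "+\n", "IIII\n"]
mates2 = ["@a/2\n", "ACGT\n", "+\n", "IIII\n", "@c/2\n", "ACGT\n", "+\n", "IIII\n"]
print(first_mispaired(mates1, mates2))
```

With real data you would pass two open file handles instead of lists, since the functions only need to iterate over lines.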

                              Comment

                              • beajorrin
                                Junior Member
                                • Jan 2012
                                • 6

                                #60
                                Originally posted by simonandrews View Post
                                Ah, OK. When you import paired end data SeqMonk displays the inferred insert from the paired set of reads. If you have two reads from the same transcript which mapped 100,000 bases apart then you'll see a read which is 100,000 bases long. Because of this SeqMonk sets a limit on how far apart paired end reads can be. The default is 1 kb, which is about the limit for insert sizes on the Illumina platform. Unless you're working on a platform which can actually work with much longer insert sizes, you probably don't want to increase this.

                                Looking at the screenshot you posted you seem to have a big discrepancy between the number of reads mapped before and after trimming your data. This leads me to suspect that something may have gone wrong with your mapping of the trimmed data. When you trim your data you need to ensure that you keep the sequences in your two fastq files exactly paired, i.e. if you trim one sequence down to no bases, then you still need to leave it in the file, or remove it completely from both fastq files, so that bowtie always sees correctly paired sequences when it does the paired end mapping. My initial guess would be that your fastq files have ended up with different numbers of reads in them, causing your data to be mispaired, which will lead to this odd kind of pairing.
                                OK! In fact I have an insert size of 500 bp, so I have to change it. I also have to check the trimmed fastq files to fix the mispairing.
                                Thanks

                                Comment
