SeqMonk: Visualisation and analysis for large mapped data sets


  • Custom Pseudo-chromosomes

    I am creating a new custom genome. I have 25 chr, 1 mt and a whole lot of scaffolds. I can only see automatic pseudo-chromosome creation and it doesn't do exactly what I want. I would like to group the scaffolds into pseudo-chromosomes in a custom manner. Also I would like to keep mt as a separate chromosome.
    Is it possible to select some regions and convert them to a pseudo-chromosome?
    Last edited by rmf; 05-16-2016, 08:59 AM. Reason: added title


    • There's no built in support for this kind of customisation, but you could build this yourself if you like.

      If you have a look in the automated genome you will quickly see how to play around with the way the pseudo chromosomes are made. There are two files which matter here:

      chr_list is a text file giving the names and total lengths of the chromosomes. In a normal build only the pseudo chromosomes would appear in here, but you could add in some individual scaffolds on their own if you like.

      aliases.txt is the file which says how the individual sequence files you have map into the chromosomes (or pseudo chromosomes in this case). For each sequence it says which chromosome it maps to and where in that chromosome it starts. If the number is negative then the sequence is assumed to be reverse complemented and inserted at that position.

      By editing these two files manually you should be able to group your sequences however you like in the newly built genome.

      Let me know how you get on.
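
As a rough illustration of the manual edit described above, the following Python sketch lays sequences out in custom groups and produces the lines for both files. It assumes the three-column (sequence, chromosome, start) and two-column (chromosome, length) layouts visible in this thread. The function name, the `sep` and `gap` parameters, and the example sequence names and lengths are all hypothetical. Use whatever delimiter the automatically built files use, and note that reverse-complemented (negative) placements are not handled here.

```python
from collections import defaultdict

def build_genome_files(sequences, grouping, sep="\t", gap=0):
    """Return (aliases_lines, chr_list_lines) for a custom grouping.

    sequences: list of (name, length) in the order they should be laid out.
    grouping:  dict mapping sequence name -> chromosome or pseudo-chromosome.
    sep:       column delimiter (assumption: match the auto-built files).
    gap:       hypothetical spacer in bp left between consecutive sequences.
    """
    next_start = defaultdict(int)  # next free offset in each chromosome
    total_len = {}                 # chromosome -> length, in first-seen order
    aliases = []
    for name, length in sequences:
        chrom = grouping[name]
        start = next_start[chrom]
        aliases.append(f"{name}{sep}{chrom}{sep}{start}")
        total_len[chrom] = start + length
        next_start[chrom] = start + length + gap
    chr_list = [f"{chrom}{sep}{length}" for chrom, length in total_len.items()]
    return aliases, chr_list

# Made-up example: one real chromosome, MT kept separate,
# and two scaffolds merged into a single pseudo-chromosome.
sequences = [("1", 1000), ("MT", 200), ("scafA", 300), ("scafB", 500)]
grouping = {"1": "1", "MT": "MT", "scafA": "pseudo1", "scafB": "pseudo1"}
aliases, chr_list = build_genome_files(sequences, grouping, sep=" ")
print("\n".join(aliases))
print("\n".join(chr_list))
```

Keeping MT in its own group, as in the example, is one way to get the separate mitochondrial chromosome asked for in the original question.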


      • Custom Pseudo-chromosomes

        I tried modifying aliases.txt and chr_list as shown below. I renamed the pseudo-chromosomes in aliases.txt and moved the chromosome lengths around in chr_list to match. But when I reopen SeqMonk, create a new project and load the custom genome, it still looks like the original build.

        old aliases.txt (automatically created)
        1 pseudo1 0
        10 pseudo2 0
        11 pseudo3 0
        12 pseudo4 0
        13 pseudo5 0
        14 pseudo6 0
        15 pseudo7 0
        16 pseudo8 0
        17 pseudo9 0
        18 pseudo10 0
        19 pseudo11 0
        2 pseudo12 0
        20 pseudo13 0
        21 pseudo14 0
        22 pseudo15 0
        23 pseudo16 0
        24 pseudo17 0
        25 pseudo18 0
        3 pseudo19 0
        4 pseudo20 0
        5 pseudo21 0
        6 pseudo22 0
        7 pseudo23 0
        8 pseudo24 0
        9 pseudo25 0
        MT pseudo26 0
        KN149696.1 pseudo26 16696
        KN149690.1 pseudo26 385433
        ...<lot more scaffolds>

        new aliases.txt (manually corrected)
        1 pseudo1 0
        10 pseudo10 0
        11 pseudo11 0
        12 pseudo12 0
        13 pseudo13 0
        14 pseudo14 0
        15 pseudo15 0
        16 pseudo16 0
        17 pseudo17 0
        18 pseudo18 0
        19 pseudo19 0
        2 pseudo2 0
        20 pseudo20 0
        21 pseudo21 0
        22 pseudo22 0
        23 pseudo23 0
        24 pseudo24 0
        25 pseudo25 0
        3 pseudo3 0
        4 pseudo4 0
        5 pseudo5 0
        6 pseudo6 0
        7 pseudo7 0
        8 pseudo8 0
        9 pseudo9 0
        MT pseudo26 0
        KN149696.1 pseudo26 16696
        KN149690.1 pseudo26 385433
        ...<lot more scaffolds>

        old chr_list (automatically created)
        pseudo1 58871917
        pseudo2 45574255
        pseudo3 45107271
        pseudo4 49229541
        pseudo5 51780250
        pseudo6 51944548
        pseudo7 47771147
        pseudo8 55381981
        pseudo9 53345113
        pseudo10 51008593
        pseudo11 48790377
        pseudo12 59543403
        pseudo13 55370968
        pseudo14 45895719
        pseudo15 39226288
        pseudo16 46272358
        pseudo17 42251103
        pseudo18 36898761
        pseudo19 62385949
        pseudo20 76625712
        pseudo21 71715914
        pseudo22 60272633
        pseudo23 74082188
        pseudo24 54191831
        pseudo25 56892771
        pseudo26 31392292

        new chr_list (manually corrected)
        pseudo1 58871917
        pseudo2 59543403
        pseudo3 62385949
        pseudo4 76625712
        pseudo5 71715914
        pseudo6 60272633
        pseudo7 74082188
        pseudo8 54191831
        pseudo9 56892771
        pseudo10 45574255
        pseudo11 45107271
        pseudo12 49229541
        pseudo13 51780250
        pseudo14 51944548
        pseudo15 47771147
        pseudo16 55381981
        pseudo17 53345113
        pseudo18 51008593
        pseudo19 48790377
        pseudo20 55370968
        pseudo21 45895719
        pseudo22 39226288
        pseudo23 46272358
        pseudo24 42251103
        pseudo25 36898761
        pseudo26 31392292
        Last edited by rmf; 05-16-2016, 09:02 AM. Reason: added text
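
When editing the two files by hand, it can help to check that they still agree with each other. The sketch below is one possible check, not part of SeqMonk: it assumes whitespace-separated columns as shown in the listings above, and the function names are hypothetical. It reports any sequence mapped to a chromosome missing from chr_list, or starting beyond that chromosome's length.

```python
def parse_table(text):
    """Split whitespace-separated rows, skipping blank lines."""
    return [line.split() for line in text.splitlines() if line.strip()]

def check_consistency(aliases_text, chr_list_text):
    """Return a list of human-readable problems (empty if none were found)."""
    lengths = {name: int(length) for name, length in parse_table(chr_list_text)}
    problems = []
    for seq, chrom, start in parse_table(aliases_text):
        if chrom not in lengths:
            problems.append(f"{seq}: chromosome {chrom} missing from chr_list")
        elif abs(int(start)) >= lengths[chrom]:
            # abs() because a negative start marks a reverse-complemented sequence
            problems.append(f"{seq}: start {start} is beyond the end of {chrom}")
    return problems

# A few rows taken from the manually corrected files in this post.
aliases_text = """1 pseudo1 0
10 pseudo10 0
MT pseudo26 0
KN149696.1 pseudo26 16696
"""
chr_list_text = """pseudo1 58871917
pseudo10 45574255
pseudo26 31392292
"""
problems = check_consistency(aliases_text, chr_list_text)
print(problems or "no problems found")
```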


        • Can you try deleting the cache folder in your assembly folder? SeqMonk might not have recognised that those files have changed and may still be using an older version.

          If it's still not working for you then drop me an email and I'll set up an FTP site where you can push the files to me and I can take a look.
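
For anyone scripting this step, here is a minimal Python sketch of removing the cache folder. The only detail taken from the thread is that the folder is called "cache" and sits inside the assembly folder. The function name is hypothetical, and the demonstration runs against a temporary directory rather than a real genome. Close SeqMonk before doing this on a real assembly folder.

```python
import shutil
import tempfile
from pathlib import Path

def clear_genome_cache(assembly_dir):
    """Delete the 'cache' subfolder of an assembly folder, if present.

    Returns True if a folder was removed, False if there was nothing to remove.
    """
    cache = Path(assembly_dir) / "cache"
    if cache.is_dir():
        shutil.rmtree(cache)
        return True
    return False

# Demonstration on a throwaway directory standing in for an assembly folder.
with tempfile.TemporaryDirectory() as demo:
    (Path(demo) / "cache").mkdir()
    removed_first = clear_genome_cache(demo)  # cache folder existed
    removed_again = clear_genome_cache(demo)  # already gone
```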


          • Yes! It works now. Thanks a lot.


            • Expand annotations

              Hi Simon,
              The annotations overlap a lot and it's hard to read.



              Is there an option to expand annotations like that in IGV?



              Thanks,
              Roy
              Last edited by rmf; 06-08-2016, 09:28 AM. Reason: Typo


              • Originally posted by rmf
                Hi Simon,
                The annotations overlap a lot and it's hard to read.

                There's always a trade-off to make in these kinds of display. Internally we've tried a few different ways of adjusting the layout to show more on screen clearly, but have kept the current layout. In the next release we're actually down-weighting the amount of space given to the annotation tracks so we can give more to the data, since the trend seems to be towards more data, and data tracks become unusable fairly quickly as they compress too much.

                Whilst the view is somewhat minimal, our aim is to make it more usable through the interactive features (putting your mouse over a feature highlights it and tells you what it is), as this scales much better. Obviously for publications this doesn't help though - but what we do (and would recommend others do) is to use the option to add all labels (Control+L) then export out the SVG. You can then re-organise the layout of the features to make better use of the space you have and to highlight the information which is important for that figure. I should probably make up a video showing this process...


                • Expand annotations

                  It's strange that such a feature is down-weighted. I would think that it is important to see exactly what features your reads are overlapping with. Perhaps that's not so important when doing a whole-transcriptome DGE analysis, but it is when a user is interested in certain genes or certain regions. As in my example figure, 4-5 layers of overlapping transcripts in one location are extremely hard to assess properly, even with the hover effect.


                  • Originally posted by rmf
                    It's strange that such a feature is down-weighted. I would think that it is important to see exactly what features your reads are overlapping with. Perhaps that's not so important when doing a whole-transcriptome DGE analysis, but it is when a user is interested in certain genes or certain regions. As in my example figure, 4-5 layers of overlapping transcripts in one location are extremely hard to assess properly, even with the hover effect.

                    The down-weighting here is simply the proportional space offered to each of the tracks. The point is that, for an annotation track with the same layout, once you have around 50px of vertical space, giving it more doesn't make it any clearer. Providing more vertical space without a better layout would just waste it.

                    I absolutely agree that there is an issue seeing exactly what's going on in regions where lots of features overlap, and that hovering, although it helps, isn't perfect. The problem is that a completely non-overlapping feature set takes a huge amount of vertical space in the general case, since there are places in the genome where many tens of features overlap and these would take lots of space to show clearly.

                    I'm very happy to hear suggestions for ways in which we could improve the layout we have whilst still keeping the overall vertical space in check.

                    One other little tip which can be useful: when I want to look at lots of annotation for a region I'm viewing in SeqMonk, it's possible to link SeqMonk up with a browser view of the same genome in either UCSC or Ensembl. If you are looking at a chromosome view in SeqMonk then selecting Edit > Copy current position (control/command + c) will copy the genomic location into your clipboard. You can then paste this into the UCSC or Ensembl search box to be taken directly to the equivalent region. This is especially useful for tracks which SeqMonk doesn't have or can't calculate.
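
If you do this often, the copied position can also be turned straight into a browser link. The Python sketch below assumes the clipboard text looks like `1:1000-2000` or `chr1:1,000-2,000`. That format is an assumption and not something confirmed in this thread, so adjust the pattern to whatever SeqMonk actually copies. The function names are hypothetical, and the genome build and species are placeholders you would set for your own genome.

```python
import re
from urllib.parse import urlencode

def parse_position(text):
    """Parse 'chr1:1,000-2,000' or '1:1000-2000' into (chrom, start, end)."""
    m = re.fullmatch(r"\s*(?:chr)?([^:\s]+):([\d,]+)-([\d,]+)\s*", text)
    if m is None:
        raise ValueError(f"unrecognised position: {text!r}")
    chrom, start, end = m.groups()
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))

def ucsc_url(position, db):
    """Build a UCSC Genome Browser link; db is the assembly name, e.g. 'mm10'."""
    chrom, start, end = parse_position(position)
    # UCSC uses 'chr'-prefixed names (and 'chrM' for the mitochondrion,
    # which this sketch does not translate).
    query = urlencode({"db": db, "position": f"chr{chrom}:{start}-{end}"}, safe=":")
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"

def ensembl_url(position, species):
    """Build an Ensembl location link; species is e.g. 'Mus_musculus'."""
    chrom, start, end = parse_position(position)
    query = urlencode({"r": f"{chrom}:{start}-{end}"}, safe=":")
    return f"https://www.ensembl.org/{species}/Location/View?{query}"

print(ucsc_url("1:1000-2000", "mm10"))
print(ensembl_url("chr1:1,000-2,000", "Mus_musculus"))
```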


                    • That's a neat little trick. Thanks for that.


                      • How can I save a dataset to a file (.txt or .bed)?
                        For example, I combined two datasets into one group and imported it as an _import dataset through File > Import data > visible data source. I'd now like to save this dataset to a file, but I couldn't find a menu option to do this.

                        I can export probe data through Reports > Annotated probe report, but I can't export the dataset itself. One workaround is to make each position in the dataset a probe and then export the probes, but this is slow for large datasets. Is there an easier way to do this?

                        Seems a really simple question, I apologize if this has been mentioned in the tutorial or in this thread. I couldn't find the solution.


                        • Is there any way to plot this in SeqMonk?

                          The x-axis would have the TSS in the centre at 0, flanked by a user-defined number of bp (for example -2000 to +2000 bp), while the y-axis would show the average binding signal.
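
To make the requested plot concrete, here is a small Python sketch of the underlying calculation, done outside SeqMonk: the signal in a fixed window around each TSS is collected, minus-strand windows are flipped so upstream is always on the left, and the windows are averaged position by position. The function name, the per-base signal representation and the toy values are all assumptions for illustration. This is not a description of how SeqMonk computes anything.

```python
def average_tss_profile(signal, tss_list, flank):
    """Average per-base signal from -flank to +flank around each TSS.

    signal:   dict mapping chromosome -> list of per-base values (0-based).
    tss_list: list of (chrom, position, strand) with strand '+' or '-'.
    Returns a list of 2*flank+1 averages; index `flank` is the TSS itself.
    """
    totals = [0.0] * (2 * flank + 1)
    used = 0
    for chrom, pos, strand in tss_list:
        values = signal[chrom]
        if pos - flank < 0 or pos + flank >= len(values):
            continue  # skip TSSs whose window runs off the chromosome
        window = values[pos - flank : pos + flank + 1]
        if strand == "-":
            window = window[::-1]  # orient so upstream is always on the left
        totals = [t + v for t, v in zip(totals, window)]
        used += 1
    if used == 0:
        raise ValueError("no TSS had a complete window")
    return [t / used for t in totals]

# Toy data: one '+' TSS at index 3 and one '-' TSS at index 9.
signal = {"1": [0, 1, 2, 5, 0, 0, 0, 0, 0, 5, 2, 1]}
tss_list = [("1", 3, "+"), ("1", 9, "-")]
profile = average_tss_profile(signal, tss_list, flank=2)
print(profile)  # x positions run from -2 to +2 relative to the TSS
```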


                          • Hello,

                            I have a problem: after importing a mouse GTF annotation downloaded from Ensembl, SeqMonk does not recognize gene/transcript names. I would like to filter on probe names and see the names of the genes in my plots. I also do not want to use the default annotation, as I noticed it's probably an older version and the names of some genes and transcript annotations have changed. Could you please help? Thank you.


                            • Import bedGraph

                              Is it possible to import bedGraph files? Or more generally speaking can I plot any quantitative data as a track with the following info:

                              chr start end value

                              And value being some continuous variable.
                              Thanks,
                              Roy
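
For reference, the four-column layout described above can be read with a few lines of Python. This sketch only parses the text; it says nothing about whether or how SeqMonk imports such files. The function name and the example rows are made up. In the bedGraph format, coordinates are zero-based with an exclusive end, and files may begin with `track` or `browser` header lines, which are skipped here.

```python
def parse_bedgraph(lines):
    """Yield (chrom, start, end, value) tuples from bedGraph-style lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue  # skip blank, header and comment lines
        chrom, start, end, value = line.split()[:4]
        yield chrom, int(start), int(end), float(value)

# Made-up example rows in the "chr start end value" layout.
example = [
    "track type=bedGraph name=example",
    "chr1\t0\t100\t1.5",
    "chr1\t100\t250\t-0.25",
    "",
]
records = list(parse_bedgraph(example))
print(records)
```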
