**rmf** · 05-12-2016, 02:05 AM

Custom Pseudo-chromosomes

I am creating a new custom genome. I have 25 chr, 1 mt and a whole lot of scaffolds. I can only see automatic pseudo-chromosome creation and it doesn't do exactly what I want. I would like to group the scaffolds into pseudo-chromosomes in a custom manner. Also I would like to keep mt as a separate chromosome.
Is it possible to select some regions and convert them to a pseudo-chromosome?

**simonandrews** · 05-12-2016, 03:26 AM

There's no built in support for this kind of customisation, but you could build this yourself if you like.

If you have a look in the automated genome you will quickly see how to play around with the way the pseudo chromosomes are made. There are two files which matter here:

chr_list is a text file giving the names and total lengths of the chromosomes. In a normal build only the pseudo chromosomes would appear in here, but you could add in some individual scaffolds on their own if you like.

aliases.txt is the file which says how the individual sequence files you have map into the chromosomes (or pseudo chromosomes in this case). For each sequence it says which chromosome it maps to and where in that chromosome it starts. If the number is negative then the sequence is assumed to be reverse complemented and inserted at that position.

By editing these two files manually you should be able to group your sequences however you like in the newly built genome.
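As a rough illustration of that manual edit (this is not SeqMonk code), the sketch below lays out a custom grouping as rows for the two files. The function names, the tab separator and the `gap` parameter are assumptions made for the example; the thread does not say whether any padding is expected between sequences, and reverse-complemented (negative) positions are not handled here.

```python
def build_layout(groups, lengths, gap=0):
    """Lay out sequences inside custom pseudo-chromosomes.

    groups:  dict mapping chromosome name -> ordered list of sequence names
    lengths: dict mapping sequence name -> length in bp
    gap:     hypothetical spacer (bp) between neighbouring sequences
    Returns (alias_rows, chr_rows) in the column order described above.
    """
    alias_rows = []  # (sequence, chromosome, start position)
    chr_rows = []    # (chromosome, total length)
    for chrom, members in groups.items():
        offset = 0
        end = 0
        for name in members:
            alias_rows.append((name, chrom, offset))
            end = offset + lengths[name]
            offset = end + gap
        chr_rows.append((chrom, end))
    return alias_rows, chr_rows


def to_lines(rows, sep="\t"):
    """Format rows as text lines; the tab separator is an assumption."""
    return [sep.join(str(field) for field in row) for row in rows]
```

Keeping MT in a group of its own, for instance `{"MT": ["MT"]}`, gives it a separate entry in both sets of rows.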

Let me know how you get on.

**rmf** · 05-16-2016, 09:00 AM

Custom Pseudo-chromosomes

I have tried to modify aliases.txt and chr_list as shown below. I renamed the pseudo-chromosomes in aliases.txt and moved the chr lengths around in chr_list to match. But when I reopen SeqMonk, create a new project and load the custom genome, it still looks like the original build.

old aliases.txt (automatically created)
1 pseudo1 0
10 pseudo2 0
11 pseudo3 0
12 pseudo4 0
13 pseudo5 0
14 pseudo6 0
15 pseudo7 0
16 pseudo8 0
17 pseudo9 0
18 pseudo10 0
19 pseudo11 0
2 pseudo12 0
20 pseudo13 0
21 pseudo14 0
22 pseudo15 0
23 pseudo16 0
24 pseudo17 0
25 pseudo18 0
3 pseudo19 0
4 pseudo20 0
5 pseudo21 0
6 pseudo22 0
7 pseudo23 0
8 pseudo24 0
9 pseudo25 0
MT pseudo26 0
KN149696.1 pseudo26 16696
KN149690.1 pseudo26 385433
...<lot more scaffolds>

new aliases.txt (manually corrected)
1 pseudo1 0
10 pseudo10 0
11 pseudo11 0
12 pseudo12 0
13 pseudo13 0
14 pseudo14 0
15 pseudo15 0
16 pseudo16 0
17 pseudo17 0
18 pseudo18 0
19 pseudo19 0
2 pseudo2 0
20 pseudo20 0
21 pseudo21 0
22 pseudo22 0
23 pseudo23 0
24 pseudo24 0
25 pseudo25 0
3 pseudo3 0
4 pseudo4 0
5 pseudo5 0
6 pseudo6 0
7 pseudo7 0
8 pseudo8 0
9 pseudo9 0
MT pseudo26 0
KN149696.1 pseudo26 16696
KN149690.1 pseudo26 385433
...<lot more scaffolds>

old chr_list (automatically created)
pseudo1 58871917
pseudo2 45574255
pseudo3 45107271
pseudo4 49229541
pseudo5 51780250
pseudo6 51944548
pseudo7 47771147
pseudo8 55381981
pseudo9 53345113
pseudo10 51008593
pseudo11 48790377
pseudo12 59543403
pseudo13 55370968
pseudo14 45895719
pseudo15 39226288
pseudo16 46272358
pseudo17 42251103
pseudo18 36898761
pseudo19 62385949
pseudo20 76625712
pseudo21 71715914
pseudo22 60272633
pseudo23 74082188
pseudo24 54191831
pseudo25 56892771
pseudo26 31392292

new chr_list (manually corrected)
pseudo1 58871917
pseudo2 59543403
pseudo3 62385949
pseudo4 76625712
pseudo5 71715914
pseudo6 60272633
pseudo7 74082188
pseudo8 54191831
pseudo9 56892771
pseudo10 45574255
pseudo11 45107271
pseudo12 49229541
pseudo13 51780250
pseudo14 51944548
pseudo15 47771147
pseudo16 55381981
pseudo17 53345113
pseudo18 51008593
pseudo19 48790377
pseudo20 55370968
pseudo21 45895719
pseudo22 39226288
pseudo23 46272358
pseudo24 42251103
pseudo25 36898761
pseudo26 31392292
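Moving the lengths by hand is easy to get wrong, so here is a minimal sketch of doing the same renaming programmatically. It assumes whitespace-separated rows like those listed above and only covers pure renaming (each old pseudo-chromosome becomes exactly one new one); the function names are made up for the example.

```python
def parse_rows(text):
    """Split whitespace-separated rows into lists of fields, skipping blank lines."""
    return [line.split() for line in text.splitlines() if line.strip()]


def remap_chr_list(old_aliases, new_aliases, old_chr_list):
    """Carry chromosome lengths over from the old names to the new names.

    old_aliases / new_aliases: rows of (sequence, chromosome, position)
    old_chr_list:              rows of (chromosome, length)
    Returns a dict mapping new chromosome name -> length.
    """
    old_chr_of = {seq: chrom for seq, chrom, _ in old_aliases}
    old_len = {chrom: int(length) for chrom, length in old_chr_list}
    new_len = {}
    for seq, new_chrom, _ in new_aliases:
        length = old_len[old_chr_of[seq]]
        # Two sequences sharing a new chromosome must agree on its length,
        # otherwise this was a regrouping rather than a rename.
        if new_len.setdefault(new_chrom, length) != length:
            raise ValueError(f"{new_chrom} mixes sequences from different old chromosomes")
    return new_len
```

If sequences are regrouped rather than renamed, the lengths have to be recomputed from the sequences themselves instead.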

**simonandrews** · 05-17-2016, 06:50 AM

Can you try deleting the cache folder in your assembly folder? SeqMonk might not have recognised that those files have changed and may be using an older version.
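Deleting the folder in a file manager is enough. If you want to script it, a minimal sketch might look like this; the folder name `cache` and the function name are assumptions, so check what the folder is actually called in your assembly folder first.

```python
import shutil
from pathlib import Path


def clear_genome_cache(assembly_dir, cache_name="cache"):
    """Delete the cache folder inside an assembly folder, if it exists.

    cache_name is an assumed folder name, not confirmed in this thread.
    Returns True if a folder was removed, False if there was nothing to do.
    """
    cache = Path(assembly_dir) / cache_name
    if cache.is_dir():
        shutil.rmtree(cache)
        return True
    return False
```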

If it's still not working for you then drop me an email and I'll set up an FTP site where you can push the files to me and I can take a look.

**rmf** · 05-17-2016, 12:59 PM

Yes! It works now. Thanks a lot.

**rmf** · 06-08-2016, 09:27 AM

Expand annotations

Hi Simon,
The annotations overlap a lot and it's hard to read.

Is there an option to expand annotations like that in IGV?

Thanks,
Roy

**simonandrews** · 06-09-2016, 12:52 AM

Originally posted by rmf:

Hi Simon,
The annotations overlap a lot and it's hard to read.

There's always a trade-off to make in these kinds of display. Internally we've tried a few different ways to adjust the layout to show more on screen clearly, but have kept the current layout. In the next release we're actually down-weighting the amount of space given to the annotation tracks so we can give more to the data, since the trend seems to be for more data, and data tracks get unusable fairly quickly as they compress too much.

Whilst the view is somewhat minimal, our aim is to make it more usable through the interactive features (putting your mouse over a feature highlights it and tells you what it is), as this scales much better. Obviously for publications this doesn't help though - but what we do (and would recommend others do) is to use the option to add all labels (Control+L) then export out the SVG. You can then re-organise the layout of the features to make better use of the space you have and to highlight the information which is important for that figure. I should probably make up a video showing this process...

**rmf** · 06-09-2016, 11:39 PM

Expand annotations

It's strange that such a feature is down-weighted. I would think that it is important to see exactly what features your reads are overlapping with. Perhaps it is not that important when doing a whole-transcriptome DGE analysis, but it matters when a user is interested in certain genes or certain regions. As in my example figure, 4-5 layers of overlapping transcripts in one location are extremely hard to assess properly even with the hover effect.

**simonandrews** · 06-10-2016, 02:36 AM

Originally posted by rmf:

It's strange that such a feature is down-weighted. I would think that it is important to see exactly what features your reads are overlapping with. Perhaps it is not that important when doing a whole-transcriptome DGE analysis, but it matters when a user is interested in certain genes or certain regions. As in my example figure, 4-5 layers of overlapping transcripts in one location are extremely hard to assess properly even with the hover effect.

The down-weighting here is simply the proportional space offered to each of the tracks. The point is that, for an annotation track with the same layout, once you have around 50px of vertical space, giving it more doesn't make it any clearer. Providing more vertical space without a better layout is just wasting space.

I absolutely agree that there is an issue seeing exactly what's going on in regions where lots of features overlap, and that hovering, although it helps, isn't perfect. The problem is that a completely non-overlapping feature set takes a huge amount of vertical space in the general case, since there are places in the genome where many tens of features overlap and these would take lots of space to show clearly.

I'm very happy to hear suggestions for ways in which we could improve the layout we have whilst still keeping the overall vertical space in check.

One other little tip which can be useful: when I want to look at lots of annotation for a region I'm viewing in SeqMonk, it's possible to link SeqMonk up with a browser view of the same genome in either UCSC or Ensembl. If you are looking at a chromosome view in SeqMonk then selecting Edit > Copy current position (control/command + c) will copy the genomic location into your clipboard. You can then paste this into the UCSC or Ensembl search box to be taken directly to the equivalent region. This is especially useful for tracks which SeqMonk doesn't have or can't calculate.

**rmf** · 06-10-2016, 05:29 AM

That's a neat little trick. Thanks for that.

**xhuister** · 09-26-2016, 01:02 AM

How can I save a dataset to a file (.txt or .bed..)?
For example, I grouped two datasets into one group and imported it as an _import dataset through File>Import data>visible data source. Then I'd like to save this dataset to a file, but I couldn't find any menu to do this.

I can export probe data through Reports>Annotated probe report, but I can't export the dataset. A workaround is to make each position in the dataset a probe and then export the probes, but this is slow for a large dataset. Is there an easier way to do this?

Seems a really simple question, I apologize if this has been mentioned in the tutorial or in this thread. I couldn't find the solution.

**Niranjanks** · 10-06-2016, 02:05 AM

Is there any way to plot this in SeqMonk?

On the X-axis, the TSS in the centre at 0 flanked by a fixed number of bp decided by the user for example -2000 to +2000 bp

while the y axis contains the average binding signal
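To make the calculation concrete, here is a minimal sketch of the averaging I mean, done outside SeqMonk. The function name and input layout are made up for the example, and positions with no signal are assumed to count as zero.

```python
def average_profile(signal, tss_sites, flank=2000):
    """Average a per-base signal around a set of TSSs.

    signal:    dict chromosome -> dict position -> value (sparse signal)
    tss_sites: list of (chromosome, position, strand), strand "+" or "-"
    flank:     number of bp shown on each side of the TSS
    Returns a list of (relative_position, mean_signal) from -flank to +flank.
    """
    totals = [0.0] * (2 * flank + 1)
    for chrom, tss, strand in tss_sites:
        track = signal.get(chrom, {})
        for rel in range(-flank, flank + 1):
            # On the reverse strand, "upstream" lies at higher coordinates.
            pos = tss + rel if strand == "+" else tss - rel
            totals[rel + flank] += track.get(pos, 0.0)
    n = len(tss_sites)
    return [(rel, totals[rel + flank] / n if n else 0.0)
            for rel in range(-flank, flank + 1)]
```

The returned pairs are the x and y values of the plot described above.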

**pander** · 08-24-2017, 04:32 AM

Hello,

I have a problem: after importing a mouse GTF annotation downloaded from Ensembl, SeqMonk does not recognize gene/transcript names. I would like to filter on probe names and see the names of the genes in my plots. I also do not want to use the default annotation, as I noticed it's probably an older version and the names of some genes and transcript annotations have changed. Could you please help? Thank you.

**rmf** · 08-02-2018, 01:01 PM

Import bedGraph

Is it possible to import bedGraph files? Or more generally speaking can I plot any quantitative data as a track with the following info:

chr start end value

And value being some continuous variable.
Thanks,
Roy
