I got two different Trinity.fasta files using two versions of Trinity

Hi guys,
We have RNA-seq data from an insect, sequenced in 2012, which we assembled at the time with one of the 2011 Trinity releases (producing a Trinity.fasta). I have now analyzed the sequence length distribution in that file and got the following result:

Code:

kurban@kurban-X550VC:~/Downloads/bbmap$ sh stats.sh in=~/Downloads/gene.fa
stats.sh: 52: stats.sh: Bad substitution
stats.sh: 59: stats.sh: [[: not found
stats.sh: 59: stats.sh: [[: not found
stats.sh: 65: stats.sh: source: not found
stats.sh: 66: stats.sh: parseXmx: not found
A	C	G	T	N	IUPAC	Other	GC	GC_stdev
0.2875	0.2118	0.2067	0.2940	0.0000	0.0000	0.0000	0.4186	0.0894

Main genome scaffold total:         	144777
Main genome contig total:           	144777
Main genome scaffold sequence total:	67.067 MB
Main genome contig sequence total:  	67.067 MB  	0.000% gap
Main genome scaffold N/L50:         	15033/1.075 KB
Main genome contig N/L50:           	15033/1.075 KB
Max scaffold length:                	24.081 KB
Max contig length:                  	24.081 KB
Number of scaffolds > 50 KB:        	0
% main genome in scaffolds > 50 KB: 	0.00%


Minimum 	Number        	Number        	Total         	Total         	Scaffold
Scaffold	of            	of            	Scaffold      	Contig        	Contig  
Length  	Scaffolds     	Contigs       	Length        	Length        	Coverage
--------	--------------	--------------	--------------	--------------	--------
    All 	       144,777	       144,777	    67,066,997	    67,066,997	 100.00%
    100 	       144,777	       144,777	    67,066,997	    67,066,997	 100.00%
    250 	        56,929	        56,929	    53,670,774	    53,670,774	 100.00%
    500 	        30,137	        30,137	    44,518,044	    44,518,044	 100.00%
   1 KB 	        16,207	        16,207	    34,757,505	    34,757,505	 100.00%
 2.5 KB 	         4,183	         4,183	    15,894,549	    15,894,549	 100.00%
   5 KB 	           588	           588	     3,942,668	     3,942,668	 100.00%
  10 KB 	            28	            28	       353,549	       353,549	 100.00%

In this file the minimum sequence length is 101; the longest is 22181.

Over the past several days I used the latest Trinity version, trinityrnaseq-2.0.6, to assemble the raw data again (after trimming low-quality reads, of course). This time the length distribution of the file is:

Code:

kurban@kurban-X550VC:~/Downloads/bbmap$ sh stats.sh in=~/Desktop/data_from_server/2015_6_04_assembled_CD_and_CK/Trinity.fasta
stats.sh: 52: stats.sh: Bad substitution
stats.sh: 59: stats.sh: [[: not found
stats.sh: 59: stats.sh: [[: not found
stats.sh: 65: stats.sh: source: not found
stats.sh: 66: stats.sh: parseXmx: not found
A	C	G	T	N	IUPAC	Other	GC	GC_stdev
0.2932	0.2083	0.2114	0.2871	0.0000	0.0000	0.0000	0.4197	0.0823

Main genome scaffold total:         	56130
Main genome contig total:           	56130
Main genome scaffold sequence total:	57.963 MB
Main genome contig sequence total:  	57.963 MB  	0.000% gap
Main genome scaffold N/L50:         	9036/1.861 KB
Main genome contig N/L50:           	9036/1.861 KB
Max scaffold length:                	30.733 KB
Max contig length:                  	30.733 KB
Number of scaffolds > 50 KB:        	0
% main genome in scaffolds > 50 KB: 	0.00%


Minimum 	Number        	Number        	Total         	Total         	Scaffold
Scaffold	of            	of            	Scaffold      	Contig        	Contig  
Length  	Scaffolds     	Contigs       	Length        	Length        	Coverage
--------	--------------	--------------	--------------	--------------	--------
    All 	        56,130	        56,130	    57,962,594	    57,962,594	 100.00%
    100 	        56,130	        56,130	    57,962,594	    57,962,594	 100.00%
    250 	        50,921	        50,921	    56,731,956	    56,731,956	 100.00%
    500 	        29,025	        29,025	    49,248,962	    49,248,962	 100.00%
   1 KB 	        18,003	        18,003	    41,494,038	    41,494,038	 100.00%
 2.5 KB 	         5,541	         5,541	    21,499,015	    21,499,015	 100.00%
   5 KB 	           900	           900	     5,895,754	     5,895,754	 100.00%
  10 KB 	            35	            35	       466,389	       466,389	 100.00%
  25 KB 	             1	             1	        30,733	        30,733	 100.00%

In this second Trinity.fasta file the minimum sequence length is 224; the longest is 30733.
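
For anyone who wants to double-check the minimum and maximum lengths independently of stats.sh, here is a minimal Python sketch, assuming a plain (uncompressed) FASTA file. The function names `fasta_lengths` and `summarize` are made up for illustration, and the N50 here is the length-based definition (the length at which half of the total bases sit in sequences at least that long), which may be labelled differently in other tools.

```python
from io import StringIO


def fasta_lengths(handle):
    """Yield the length of each sequence found in a FASTA file handle."""
    length = None  # None until the first header line is seen
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def summarize(lengths):
    """Return count, min, max, total bases and length-based N50."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequences found")
    total = sum(ordered)
    running = 0
    n50 = 0
    for n in ordered:
        running += n
        if running * 2 >= total:  # reached half of all bases
            n50 = n
            break
    return {
        "count": len(ordered),
        "min": ordered[-1],
        "max": ordered[0],
        "total": total,
        "n50": n50,
    }


# Toy input standing in for a real file (hypothetical sequences).
toy = StringIO(">a\nACGTACGT\nAC\n>b\nACG\n>c\nACGTA\n")
print(summarize(fasta_lengths(toy)))
```

On a real assembly the same two functions can be pointed at an open file, for example `with open("Trinity.fasta") as fh: print(summarize(fasta_lengths(fh)))`.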

My questions are:
1. Why are the two assembly results different? For example, the older version assembled many sequences in the 101 to ~200 length range, while the shortest sequence assembled by the latest version of Trinity is 224.
2. Which Trinity.fasta file should I use in the downstream analysis, and why?

Could you please give a somewhat detailed explanation?
Thanks.
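
One way to put the two files on an equal footing before comparing them would be to apply the same minimum-length cutoff to both. Below is a rough Python sketch of that idea, assuming plain FASTA input. The helper names (`read_fasta`, `filter_min_length`, `write_fasta`) and the cutoff of 200 are illustrative assumptions, the cutoff chosen only because the newer file contains nothing shorter than 224.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_min_length(records, min_length):
    """Keep only records whose sequence has at least min_length bases."""
    return [(h, s) for h, s in records if len(s) >= min_length]


def write_fasta(records, handle, width=60):
    """Write records as FASTA, wrapping sequence lines at `width` bases."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


# Toy stand-in for the older assembly (hypothetical records, not real data).
old = StringIO(">x1\n" + "A" * 150 + "\n>x2\n" + "C" * 300 + "\n")
kept = filter_min_length(read_fasta(old), min_length=200)  # assumed cutoff

out = StringIO()
write_fasta(kept, out)
print(len(kept), "record(s) kept")
```

After filtering, the length statistics of the two files can be compared over the same length range, so short sequences present in only one file no longer dominate the counts.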
