Originally posted by gwilymh:
I am analyzing large datasets in R. To analyze data, my current practice is to import the entire dataset into the R workspace using the read.table() function. Rather than importing the entire dataset, however, I was wondering if it is possible to import, analyze and export each line of data individually so that the analysis would take up less computer memory.
Can this be done? And if so, how?
Code:
totlines <- 10000000  ## Number of lines in your big input. Get it from wc -l
skip <- 0
chunkLines <- 10000   ## No. of lines to read in one go. Set to 1 to really read one line at a time.
while (skip < totlines) {
    df <- read.table(myinput, skip = skip, nrows = chunkLines, stringsAsFactors = FALSE)
    skip <- skip + chunkLines
    ## [...do something with df...]
}
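One thing to keep in mind with the loop above: every read.table() call with skip= starts again from the top of the file, so later chunks get slower as skip grows. A possible variant, not part of the original snippet, keeps a single connection open so each call continues where the previous one stopped, and appends the result of each chunk to an output file. This is only a sketch under assumptions: the function name process_in_chunks and its arguments (infile, outfile, chunk_lines, fun) are made up for illustration, the input is assumed to have no header line, and the output is written tab-separated.

```r
## Sketch only: process_in_chunks() and its argument names are hypothetical.
## Assumes a whitespace/tab-delimited input file without a header line.
process_in_chunks <- function(infile, outfile, chunk_lines = 10000, fun = identity) {
    con <- file(infile, open = "r")   ## open once; reads continue from the current position
    on.exit(close(con))
    first <- TRUE
    repeat {
        ## read.table() raises an error when no lines are left on the connection.
        ## Here ANY error is treated as end of input, which also hides genuine
        ## parse errors; tighten this for real use.
        df <- tryCatch(
            read.table(con, nrows = chunk_lines, stringsAsFactors = FALSE),
            error = function(e) NULL
        )
        if (is.null(df) || nrow(df) == 0) break
        res <- fun(df)                ## analyze the current chunk
        ## Export: create the file on the first chunk, append afterwards.
        write.table(res, outfile, append = !first, col.names = FALSE,
                    row.names = FALSE, quote = FALSE, sep = "\t")
        first <- FALSE
        if (nrow(df) < chunk_lines) break   ## short chunk means the file is done
    }
    invisible(NULL)
}
```

It could be called as process_in_chunks("big_input.txt", "results.txt", chunk_lines = 10000, fun = function(d) d[d[[1]] > 0, ]), where the file names and the filter are placeholders. Note that read.table() guesses column types separately for each chunk, so passing colClasses explicitly may be safer if a column could look numeric in one chunk and character in another.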
A better alternative might be to use a package designed for dealing with data larger than memory; ff (http://cran.r-project.org/web/packages/ff/index.html) is one of them.
Hope this helps!
Dario
Importing and processing data in R line by line
I am analyzing large datasets in R. To analyze data, my current practice is to import the entire dataset into the R workspace using the read.table() function. Rather than importing the entire dataset, however, I was wondering if it is possible to import, analyze and export each line of data individually so that the analysis would take up less computer memory.
Can this be done? And if so, how?