I want to write 2TB of data into one file; in the future it might be a petabyte.
The data consists entirely of the character '1'. For example, 2TB of data would be "1111111111111......11111" (each byte is a '1').
Here is my approach:
File.open("data",File::RDWR||File::CREAT) do |file|
  2*1024*1024*1024*1024.times do
    file.write('1')
  end
end
That means file.write is called once per byte, 2TB times in total. From Ruby's point of view, is there a better way to implement it?
You have a few problems:
File::RDWR||File::CREAT always evaluates to File::RDWR. You mean File::RDWR|File::CREAT (| rather than ||).
2*1024*1024*1024*1024.times do runs the loop only 1024 times, because the method call binds more tightly than *; the loop's return value (1024) is then multiplied by the numbers on the left. You mean (2*1024*1024*1024*1024).times do.
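Both problems can be checked in a quick script. This is only a small sketch: the variable names and the small numbers (2 and 3 in place of the 1024s) are my own, chosen so the loop finishes instantly.

```ruby
# File::RDWR and File::CREAT are integer flags (exact values are platform-specific).
logical = File::RDWR || File::CREAT   # || returns its first truthy operand
bitwise = File::RDWR | File::CREAT    # | combines the flag bits

creat_dropped = (logical == File::RDWR)
creat_present = ((bitwise & File::CREAT) == File::CREAT)
puts creat_dropped   # true
puts creat_present   # true

# The method call binds tighter than *, so .times applies only to the 3.
unparenthesized_iterations = 0
result = 2 * 3.times { unparenthesized_iterations += 1 }  # Integer#times returns its receiver
puts unparenthesized_iterations   # 3
puts result                       # 6

# With parentheses the product is computed first, then looped over.
iterations = 0
(2 * 3).times { iterations += 1 }
puts iterations   # 6
```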
Regarding your question, I get a significant speedup by writing 1024 bytes at a time:
File.open("data",File::RDWR|File::CREAT) do |file|
  buf = "1" * 1024
  (2*1024*1024*1024).times do
    file.write(buf)
  end
end
You might experiment and find a better buffer size than 1024.
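
To experiment with buffer sizes, something along these lines might help. It is a rough sketch, not a tuned implementation: write_repeated, its chunk_size parameter, and the demo file names are hypothetical names of mine, and I open the file with File::WRONLY|File::CREAT|File::TRUNC (instead of RDWR) so a rerun does not leave stale bytes behind. The sizes are kept tiny for demonstration; timings will depend on your disk and OS.

```ruby
require "tmpdir"

# Hypothetical helper: write total_bytes copies of "1" to path,
# chunk_size bytes per write call, handling a final partial chunk.
def write_repeated(path, total_bytes, chunk_size: 1024 * 1024)
  buf = "1" * chunk_size
  full_chunks, remainder = total_bytes.divmod(chunk_size)
  File.open(path, File::WRONLY | File::CREAT | File::TRUNC) do |file|
    full_chunks.times { file.write(buf) }
    file.write(buf[0, remainder]) if remainder > 0
  end
end

# Correctness check on a size that is not a multiple of the chunk size.
demo_path = File.join(Dir.tmpdir, "ones_demo.dat")
write_repeated(demo_path, 10_000, chunk_size: 4096)
demo_size = File.size(demo_path)
demo_all_ones = (File.read(demo_path) == "1" * 10_000)
File.delete(demo_path)
puts demo_size       # 10000
puts demo_all_ones   # true

# Rough timing of a few chunk sizes on an 8 MiB file (assumed demo size).
timing_path = File.join(Dir.tmpdir, "ones_timing.dat")
[1024, 64 * 1024, 1024 * 1024].each do |size|
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  write_repeated(timing_path, 8 * 1024 * 1024, chunk_size: size)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  puts "chunk_size=#{size}: #{elapsed.round(4)}s"
end
File.delete(timing_path)
```

For the real job you would call write_repeated("data", 2 * 1024**4) with whichever chunk size measured best.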