I am working with large files and using MappedByteBuffer for read and write operations. There are a few things I am unsure about.
MappedByteBuffer buf = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, offset, size);
I know that a ByteBuffer's size is limited to Integer.MAX_VALUE, so how should I choose my mapping size? Should I use small pieces or go up to Integer.MAX_VALUE?
If I increase my mapping size, does my application's read and write performance increase as well?
As this size increases, does my memory usage increase along with it? I am asking because I create multiple files to read and write. So if one file allocates 2 GB of memory and I have 6 files, do I need 12 GB of memory, or is my idea completely wrong?
Is it related to JVM -Xmx or my physical memory?
This is my usage:
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;

List<MappedByteBuffer> mappings = new ArrayList<>();
int mSize = 25;
long MAPPING_SIZE = 1L << mSize; // 32 MiB per mapping
File file = File.createTempFile("test", ".dat");
RandomAccessFile raf = new RandomAccessFile(file, "rw");
ByteOrder byteOrder = ByteOrder.nativeOrder(); // e.g. LITTLE_ENDIAN on x86
try {
    long size = 8L * width * height; // width and height are defined elsewhere
    for (long offset = 0; offset < size; offset += MAPPING_SIZE) {
        long size2 = Math.min(size - offset, MAPPING_SIZE);
        MappedByteBuffer buf = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, offset, size2);
        buf.order(byteOrder);
        mappings.add(buf);
    }
} finally {
    raf.close(); // the mappings remain valid after the channel is closed
}
Short answer: yes, if you know your files are going to be quite a bit larger than 2 GiB. The only drawback is disk space usage: with a large increment, the amount of wasted disk space grows whenever size is not a multiple of MAPPING_SIZE.
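One thing that often trips people up with this chunked approach: a read or write at an absolute file position has to be routed to the right mapping. A minimal sketch, assuming MAPPING_SIZE is visible as a constant and is a multiple of 8 so aligned 8-byte values never straddle a chunk boundary (the helper names here are made up for illustration):

static long getLongAt(List<MappedByteBuffer> mappings, long pos) {
    int chunk = (int) (pos / MAPPING_SIZE);  // which mapping holds this position
    int within = (int) (pos % MAPPING_SIZE); // offset inside that mapping
    return mappings.get(chunk).getLong(within);
}

static void putLongAt(List<MappedByteBuffer> mappings, long pos, long value) {
    int chunk = (int) (pos / MAPPING_SIZE);
    int within = (int) (pos % MAPPING_SIZE);
    mappings.get(chunk).putLong(within, value);
}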
Only your virtual memory usage is increasing. Unless you are on a 32-bit machine this shouldn't be a problem: maximum virtual memory is 128 TiB per process on Linux, so you have some room to grow. If you need more virtual memory than that, you will need to go for another solution. Memory-mapped files use the page cache: the OS only loads the file into physical memory one page [1] at a time, as pages are touched, and unloads them again when available physical RAM gets tight.
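You can observe and nudge this behaviour from Java: MappedByteBuffer has load(), which touches every page so the OS faults the region into RAM up front, and isLoaded(), a best-effort hint about whether the mapping is resident. Neither affects correctness, only timing. A small sketch, reusing buf from the loop in the question:

System.out.println(buf.isLoaded()); // likely false right after mapping
buf.load();                         // pre-fault all pages into physical memory
System.out.println(buf.isLoaded()); // best-effort: true if the region is resident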
Nope, -Xmx has no effect here: mapped regions live outside the Java heap. See 2.
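If you want to convince yourself, here is a rough sketch (the file name and sizes are made up; run it with a deliberately small heap such as -Xmx64m):

long maxHeap = Runtime.getRuntime().maxMemory();
System.out.println("Max heap: " + maxHeap); // about 64 MiB with -Xmx64m
try (RandomAccessFile raf = new RandomAccessFile("big.dat", "rw")) {
    // Mapping 1 GiB succeeds even though it is far larger than the heap,
    // because the mapping consumes virtual address space, not heap memory.
    MappedByteBuffer buf = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 1L << 30);
    buf.putLong(0, 42L);
    long used = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
    System.out.println("Heap used after mapping: " + used); // still tiny
}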
For some extra resources, here is a pretty good summary of how the page cache works: Page Cache, the Affair Between Memory and Files
[1]: A page is an OS-level memory-management unit, generally 4 KiB.