.net arrays limits

Why is the max size of byte[] 2 GB - 57 B?


On my 64-bit machine, this C# code works:

new byte[2L * 1024 * 1024 * 1024 - 57]

but this one throws an OutOfMemoryException:

new byte[2L * 1024 * 1024 * 1024 - 56]

Why?

I understand that the maximum size of a managed object is 2 GB and that the array object I'm creating contains more than the bytes I want. Namely, there are 4 bytes (or 8?) for the syncblock number, 8 bytes for the MethodTable reference and 4 bytes for the size of the array. That's 24 bytes including padding, so why can't I allocate an array with 2G - 24 bytes? Is the maximum size really exactly 2 GB? If that's the case, what is the rest of the 2 GB used for?

Note: I don't actually need to allocate an array with 2 billion bytes. And even if I did, 56 bytes is negligible overhead. And I could easily work around the limit using a custom BigArray<T>.


Solution

  • You need 56 bytes of overhead. The maximum size is actually 2,147,483,648 - 1 minus 56, which is 2,147,483,591 bytes. That's why your minus 57 works and minus 56 does not.
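
    As a quick check of that arithmetic, here is a minimal sketch that only does the math and allocates nothing. The class and member names (ByteArrayLimit, ObservedOverhead, Fits) are made up for illustration, and the 56 is the empirically observed gap from this thread, not a documented constant:

    ```csharp
    using System;

    static class ByteArrayLimit
    {
        // Assumption: 56 is the gap observed in this thread, not an official figure.
        public const long ObservedOverhead = 56;

        // (2^31 - 1) - 56
        public static long MaxByteLength()
        {
            return int.MaxValue - ObservedOverhead;
        }

        // True when the requested length is within the observed maximum.
        public static bool Fits(long requestedLength)
        {
            return requestedLength <= MaxByteLength();
        }
    }

    class Program
    {
        static void Main()
        {
            long works = 2L * 1024 * 1024 * 1024 - 57; // expression from the question
            long fails = 2L * 1024 * 1024 * 1024 - 56; // expression from the question

            Console.WriteLine(ByteArrayLimit.MaxByteLength()); // 2147483591
            Console.WriteLine(ByteArrayLimit.Fits(works));     // True
            Console.WriteLine(ByteArrayLimit.Fits(fails));     // False
        }
    }
    ```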

    As Jon Skeet says here:

    However, in practical terms, I don't believe any implementation supports such huge arrays. The CLR has a per-object limit a bit short of 2GB, so even a byte array can't actually have 2147483648 elements. A bit of experimentation shows that on my box, the largest array you can create is new byte[2147483591]. (That's on the 64 bit .NET CLR; the version of Mono I've got installed chokes on that.)

    See also this InformIT article on the same subject. It provides the following code to demonstrate the maximum sizes and overhead:

    using System;
    using System.Runtime.InteropServices;

    class Program
    {
      static void Main(string[] args)
      {
        AllocateMaxSize<byte>();
        AllocateMaxSize<short>();
        AllocateMaxSize<int>();
        AllocateMaxSize<long>();
        AllocateMaxSize<object>();
      }
    
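      // Just under 2 GB: 2^31 - 1 bytes.
      const long twogigLimit = ((long)2 * 1024 * 1024 * 1024) - 1;
      static void AllocateMaxSize<T>()
      {
        int twogig = (int)twogigLimit;
        int num;
        Type tt = typeof(T);
        // Start from the largest element count that could fit in the limit.
        if (tt.IsValueType)
        {
          num = twogig / Marshal.SizeOf(typeof(T));
        }
        else
        {
          // Reference-type arrays store one pointer per element.
          num = twogig / IntPtr.Size;
        }
    
        // Shrink the request by one element until the allocation succeeds.
        T[] buff;
        bool success = false;
        do
        {
          try
          {
            buff = new T[num];
            success = true;
          }
          catch (OutOfMemoryException)
          {
            --num;
          }
        } while (!success);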
        Console.WriteLine("Maximum size of {0}[] is {1:N0} items.", typeof(T).ToString(), num);
      }
    }
    

    Finally, the article has this to say:

    If you do the math, you’ll see that the overhead for allocating an array is 56 bytes. There are some bytes left over at the end due to object sizes. For example, 268,435,448 64-bit numbers occupy 2,147,483,584 bytes. Adding the 56 byte array overhead gives you 2,147,483,640, leaving you 7 bytes short of 2 gigabytes.
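
    The article's arithmetic can be reproduced without allocating anything. The sketch below assumes the article's model (a limit of 2^31 - 1 bytes and a flat 56 bytes per array); ArrayBudget and its members are hypothetical names, and the numbers are what that model predicts, not measurements:

    ```csharp
    using System;

    static class ArrayBudget
    {
        const long Limit = int.MaxValue;   // 2^31 - 1 bytes
        const long ObservedOverhead = 56;  // assumption taken from the article

        // Largest element count whose payload plus overhead stays within the limit.
        public static long MaxElements(int elementSize)
        {
            return (Limit - ObservedOverhead) / elementSize;
        }

        // Bytes left unused below the limit for that element size.
        public static long BytesLeftOver(int elementSize)
        {
            return Limit - (MaxElements(elementSize) * elementSize + ObservedOverhead);
        }
    }

    class Program
    {
        static void Main()
        {
            foreach (int size in new[] { 1, 2, 4, 8 })
            {
                Console.WriteLine("{0}-byte elements: {1:N0} items, {2} bytes left over",
                    size, ArrayBudget.MaxElements(size), ArrayBudget.BytesLeftOver(size));
            }
        }
    }
    ```

    For 8-byte elements this gives 268,435,448 items with 7 bytes left over, matching the figures quoted above.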

    Edit:

    But wait, there's more!

    After looking around and talking with Jon Skeet, I was pointed to an article he wrote, "Of memory and strings". In that article he provides a table of sizes:

    Type            x86 size            x64 size
    object          12                  24
    object[]        16 + length * 4     32 + length * 8
    int[]           12 + length * 4     28 + length * 4
    byte[]          12 + length         24 + length
    string          14 + length * 2     26 + length * 2
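
    To make the table easier to play with, here is a rough sketch that encodes the array and string rows as formulas. ClrKind and SizeTable are hypothetical names; the constants are copied from the table, which reports measured values rather than a documented layout:

    ```csharp
    using System;

    enum ClrKind { ObjectArray, IntArray, ByteArray, String }

    static class SizeTable
    {
        // Returns the table's size formula for the given kind, length and platform.
        public static long Size(ClrKind kind, long length, bool x64)
        {
            switch (kind)
            {
                case ClrKind.ObjectArray: return x64 ? 32 + length * 8 : 16 + length * 4;
                case ClrKind.IntArray:    return x64 ? 28 + length * 4 : 12 + length * 4;
                case ClrKind.ByteArray:   return x64 ? 24 + length     : 12 + length;
                case ClrKind.String:      return x64 ? 26 + length * 2 : 14 + length * 2;
                default: throw new ArgumentOutOfRangeException(nameof(kind));
            }
        }
    }

    class Program
    {
        static void Main()
        {
            Console.WriteLine(SizeTable.Size(ClrKind.ByteArray, 100, true));  // 124
            Console.WriteLine(SizeTable.Size(ClrKind.String, 10, false));     // 34
        }
    }
    ```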
    

    Mr. Skeet goes on to say:

    You might be forgiven for looking at the numbers above and thinking that the "overhead" of an object is 12 bytes in x86 and 24 in x64... but that's not quite right.

    and this:

    • There's a "base" overhead of 8 bytes per object in x86 and 16 per object in x64... given that we can store an Int32 of "real" data in x86 and still have an object size of 12, and likewise we can store two Int32s of real data in x64 and still have an object [size of 24] in x64.

    • There's a "minimum" size of 12 bytes and 24 bytes respectively. In other words, you can't have a type which is just the overhead. Note how the "Empty" class takes up the same size as creating instances of Object... there's effectively some spare room, because the CLR doesn't like operating on an object with no data. (Note that a struct with no fields takes up space too, even for local variables.)

    • The x86 objects are padded to 4 byte boundaries; on x64 it's 8 bytes (just as before)
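
    Those three rules (base overhead, minimum size, padding) can be combined into one small formula. This is a simplified sketch with a hypothetical name (ObjectSizeModel); it treats the fields as a single byte count and ignores how the CLR actually lays out and packs individual fields:

    ```csharp
    using System;

    static class ObjectSizeModel
    {
        // Estimated size of a plain object holding fieldBytes of instance data.
        public static int Size(int fieldBytes, bool x64)
        {
            int baseOverhead = x64 ? 16 : 8;  // "base" overhead per object
            int minimum = x64 ? 24 : 12;      // "minimum" object size
            int alignment = x64 ? 8 : 4;      // padding boundary

            int raw = baseOverhead + fieldBytes;
            // Round up to the next multiple of the alignment.
            int padded = (raw + alignment - 1) / alignment * alignment;
            return Math.Max(minimum, padded);
        }
    }

    class Program
    {
        static void Main()
        {
            Console.WriteLine(ObjectSizeModel.Size(0, false)); // 12: empty class on x86
            Console.WriteLine(ObjectSizeModel.Size(4, false)); // 12: one Int32 on x86
            Console.WriteLine(ObjectSizeModel.Size(0, true));  // 24: empty class on x64
            Console.WriteLine(ObjectSizeModel.Size(8, true));  // 24: two Int32s on x64
        }
    }
    ```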

    And finally, when I asked Jon Skeet about this on another question, he responded (regarding the InformIT article I showed him):

    It looks like the article you're referring to is inferring the overhead just from the limit, which is silly IMO.

    So to answer your question, the actual overhead is 24 bytes with 32 bytes of spare room, from what I gather.
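
    Here is how I arrive at that split, as a small sketch. It assumes the 24-byte x64 byte[] header from Jon Skeet's table and the largest byte[] length he reported (2147483591), measured against a limit of 2^31 - 1 bytes; OverheadBreakdown is just an illustrative name:

    ```csharp
    using System;

    static class OverheadBreakdown
    {
        public const long LargestByteArray = 2147483591; // from the quoted experiment
        public const long HeaderX64 = 24;                // byte[] row of the table, x64

        // Total object size of the largest byte[] under the table's formula.
        public static long ObjectBytes()
        {
            return HeaderX64 + LargestByteArray;
        }

        // Bytes between that object and the 2^31 - 1 limit.
        public static long Spare()
        {
            return int.MaxValue - ObjectBytes();
        }
    }

    class Program
    {
        static void Main()
        {
            Console.WriteLine(OverheadBreakdown.ObjectBytes());                       // 2147483615
            Console.WriteLine(OverheadBreakdown.Spare());                             // 32
            Console.WriteLine(OverheadBreakdown.HeaderX64 + OverheadBreakdown.Spare()); // 56
        }
    }
    ```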