d dmd gdc

memcmp in DMD v.s GDC AND std.parallelism: parallel

I'm implementing a struct with a pointer to some manually managed memory. It all works great with DMD, but when I test it with GDC, it fails on the opEquals operator overload. I've narrowed it down to memcmp. In opEquals I compare the pointed to memory with memcmp, which behaves as I expect in DMD but fails with GDC.

If I go back and write the opEquals method by comparing each value stored in the manually managed memory 1 at a time using == on the builtin types, it works in both compilers. I prefer the memcmp route because it was shorter to write and seems like it should be faster (less indirection, iteration, etc).

Why? Is this a bug?

(My experience with C was 10 years ago, been using python/java since, I never had this kind of problem in C, but I didn't use it that much.)

Edit:

The memory I'm comparing represents a 2-D array of 'real' values, I just wanted it to be allocated in one chunk so I didn't have to deal with jagged arrays. I'll be using the structs a lot in tight loops. Basically I'm rolling my own matrix struct that will (eventually) cache some frequently used values (trace, determinant) and offers an alternate read only view into the transpose that doesn't require copying it. I plan to work with matrices of about 10x10 to about 1000x1000 (though not always square).

I also plan on implementing a version that allocates memory with the GC via a ubyte[] and profiling the two implementations.

Edit 2:

Ok, I tried a couple of things. I also have some parallel loops, and I had a hunch that might be the problem. So I added some version statements to make a parallel and non-parallel version. In order to get it to work with GDC, I had to use the non-parallel version AND change real to double.

All cases compiled under GDC. But the unit tests failed, not always consistently on the same line, but consistently at an opEquals call when I used real or parallel. In DMD all cases compiled and ran no problem.

Thanks,

Solution

real has a bit of a strange size: it is 80 bits of data, but if you check real.sizeof, you'll see it is bigger than that (at least on Linux, I think it is 10 bytes on Windows, I betcha you wouldn't see this bug there). The reason is to make sure it is aligned on a word boundary - a multiple of four bytes - for the processor to load more efficient in arrays.

The bytes between each data element are called padding, and their content is not always defined. I haven't confirmed this myself, but @jpf's comment on the question said the same thing my gut does, so I'm posting it as answer now.

The is operator in D does the same as memcmp(&data1, &data2, data.sizeof), so @jpf's comment and your memcmp would be the same thing. It checks the data AND the padding, whereas == only checks the data (and does a bit special for floating types btw because it also compares for NaN, so the exact bit pattern is important to those checks; actually, my first gut when I saw the question title was that it was NaN related! but not the case)

Anyway, apparently dmd initializes the padding bytes as well, whereas gdc doesn't, leaving it as garbage which doesn't always match.