cmmapmemory-alignment

Segmentation fault when access mapped memory


I'm writing a simple program that encrypts a file by xor'ing its bytes with a given key.

/* ... */

void encrypt_file_block(
    int fd, unsigned long *file_len, unsigned long curr_file_pos,
    unsigned int block_size, unsigned long long_encryption_key
)
{
    unsigned long *p;
    unsigned int i;
    int res;
    p = mmap(
        NULL, block_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, curr_file_pos
    );
    if (p == MAP_FAILED)
        print_diagnostic_msg_and_exit();
    for (
        i = 0;
        (i < block_size) && (*file_len);
        i+=sizeof(long_encryption_key), (*file_len)-=sizeof(long_encryption_key)
    )
        p[i] ^= long_encryption_key;
    res = msync(p, block_size, MS_ASYNC);
    map_error_handling(res);
    res = munmap(p, block_size);
    map_error_handling(res);
}

int main(int argc, char **argv)
{
    /* ... */
    page_size = getpagesize();
    /* minimal number greater than or equal to initial block size
    and multiple of page size */
    optimized_block_size = ((block_size - 1) / page_size + 1) * page_size;
    /* make sure the final size is a multiple of `unsigned long` */
    while (optimized_block_size % sizeof(unsigned long) != 0)
        optimized_block_size += page_size;
    for (; file_len; curr_file_pos += optimized_block_size) {
        encrypt_file_block(
            fd, &file_len, curr_file_pos,
            optimized_block_size, long_encryption_key
        );
    }
    /* ... */
    return 0;
}

At some random iteration, the p[i] ^= long_encryption_key; part causes a Segmentation fault.

The mmap arguments during its call detected by strace:

mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f1b0204b000

The valgrind diagnostic:

==115486== Process terminating with default action of signal 11 (SIGSEGV)
==115486==  Bad permissions for mapped region at address 0x402D000

ChatGPT suspected that the problem is caused by unaligned memory access while treating the memory returned by mmap as an array of unsigned long, and suggested switching to working on the mmap'ed memory as an array of char.

Indeed, this approach worked:

void encrypt_file_block(
    int fd, unsigned long *file_len, unsigned long curr_file_pos,
    unsigned int block_size, unsigned int encryption_key
)
{
    char *p;
    unsigned int i;
    int res;
    p = mmap(
        NULL, block_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, curr_file_pos
    );
    if (p == MAP_FAILED)
        print_diagnostic_msg_and_exit();
    for (i=0; (i < block_size) && (*file_len); i++, (*file_len)--)
        p[i] ^= ((char*)&encryption_key)[i % sizeof(unsigned int)];
    res = msync(p, block_size, MS_ASYNC);
    map_error_handling(res);
    res = munmap(p, block_size);
    map_error_handling(res);
}

int main(int argc, char **argv)
{
    /* ... */
    page_size = getpagesize();
    /* minimal number greater than or equal to initial block size
    and multiple of page size */
    optimized_block_size = ((block_size - 1) / page_size + 1) * page_size;
    /* make sure the final size is a multiple of `unsigned int` */
    while (optimized_block_size % sizeof(unsigned int) != 0)
        optimized_block_size += page_size;
    for (; file_len; curr_file_pos += optimized_block_size) {
        encrypt_file_block(
            fd, &file_len, curr_file_pos, optimized_block_size, encryption_key
        );
    }
    /* ... */
    return 0;
}

I wonder whether the ChatGPT solution is correct and complete, or whether it was just a coincidence and the real problem was somewhere else.

I've found no mention of possible memory alignment problems on the mmap man page or elsewhere on the Internet.

In my case, xor'ing char's is 8 times slower than xor'ing long's.


Solution

  • Your code failed because it overran the mapped memory.

    p = mmap(NULL, block_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, curr_file_pos); maps block_size bytes of memory.

    for (i = 0; (i < block_size) && (*file_len); i+=sizeof(long_encryption_key),… iterates i in units of bytes through values from 0 to nearly block_size.

    p[i] ^= long_encryption_key; attempts to operate on element i of an array starting at p. Because the type of p is unsigned long *, i in p[i] is taken as number of unsigned long, not a number of bytes.

    At some point in the loop, i grows so big that p[i] refers to an element that would be beyond the mapped memory. Once i gets too big, your code attempts to access unmapped memory and crashes.

    Changing p to char *p; fixed the code to access only block_size bytes.

    You could also have fixed this by changing p[i] ^= long_encryption_key; to p[i / sizeof *p] ^= long_encryption_key;, converting the subscript to a number of unsigned long instead of a number of bytes. Alternately, you could have changed the for loop to manage i in units of unsigned long instead of in bytes, as with for (i = 0; i < block_size / sizeof *p && *file_len); i += 1, *file_len -= sizeof *p).

    However, then you would have an issue of what to do if the file length were not a whole number of unsigned long. You would need to write additional code to handle a few remaining bytes.

    Additionally, you should be aware of issues about how types are represented in memory. In one C implementation, unsigned long may be stored in memory as four bytes with the most significant byte earliest in memory. Another C implementation might store unsigned long in memory as four bytes with the least significant byte earliest in memory. And another implementation might store unsigned long as eight bytes. These different C implementations would give different outputs for the same input to your routine.

    ChatGPT suspected that the problem is caused by unaligned memory access…

    It was wrong. The memory returned by mmap is well aligned, and your code did not do anything to access it with incorrect alignment. The problem was with the amount of memory being accessed, not its alignment.

    There are also technical issues with what types may be used to access in memory in C, but those were not a problem here because dynamically allocated memory may be accessed, initially, as any object type.