clangc11anonymous-struct

Nested Anonymous Structs in c11


I'm writing a CHIP-8 interpreter in c11 for fun, and I thought it would be cool to decode the opcodes using anonymous structures.

Ideally I'd have a type where if I have the opcode opcode_t code = {.bits = 0xABCD}

it should have the following properties:

code.I   == 0xA
code.X   == 0xB
code.Y   == 0xC
code.J   == 0xD
code.NNN == 0xBCD
code.KK  == 0xCD

The structure that I came up with is:

typedef union
{
    uint16_t bits : 16;
    struct
    {
        uint8_t I : 4;
        union
        {
            uint16_t NNN : 12;
            struct
            {
                uint8_t X : 4;
                union
                {
                    uint8_t KK : 8;
                    struct
                    {
                        uint8_t Y : 4;
                        uint8_t J : 4;
                    };
                };
            };
        };
    };
} opcode_t;

However, when I run the following code to test my struct

opcode_t test_opcode = { .bits = 0xABCD };

printf(
        "I = %x, X = %x, Y = %x, J = %x, NNN = %x, KK = %x \n",
        test_opcode.I,
        test_opcode.X,
        test_opcode.Y,
        test_opcode.J,
        test_opcode.NNN,
        test_opcode.KK
);

The output is I = d, X = 0, Y = 0, J = 0, NNN = 0, KK = 0

I am compiling this code in Apple LLVM version 8.1.0 (clang-802.0.42)

using the following CMakeLists.txt:

cmake_minimum_required(VERSION 3.9)

project (Chip8)

set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib)
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib)
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
set (CMAKE_CXX_STANDARD 11 REQUIRED)

find_package(Curses REQUIRED)
include_directories(${CURSES_INCLUDE_DIR}/src)

add_executable (Chip8 src/main.c src/Chip8State.c)

target_link_libraries(Chip8 ${CURSES_LIBRARIES})

Why is test_opcode.I == 0xD, and why are the rest of the members 0x0?

I'm assuming it's because I'm using uint8_t when I only need a 4 bit number, but I thought using a bitfield would solve that issue.

Is there a way I can modify my typedef to have the desired properties above?

(I understand I could use masking and bit shifting to get the desired values, I'm just think this syntax would be much nicer)

Thanks in advance!

EDIT: I changed my CMakeList to have set(CMAKE_C_STANDARD_REQUIRED 11) instead, as I meant to have a C project not c++, however my code still isn't working.


Solution

  • I'd skip everything called bit-fields, since they are non-standard and non-portable. What will happen when you use bit-fields on 8 or 16 bit stdint.h types, nobody knows. In addition you get padding issues because of the structs. And your code will be endianess-dependent. Overall a bad idea (but of course ok just for hobbyist purposes).

    Instead, I would just define the type as:

    typedef uint16_t opcode_t;
    

    And then cook up some access macros:

    #define I(op) ((op & 0xF000u) >> 12)
    #define X(op) ((op & 0x0F00u) >>  8)
    #define Y(op) ((op & 0x00F0u) >>  4)
    #define NNN(op) (op & 0x0FFFu)
    #define KK(op)  (op & 0x00FFu)
    

    This will translate to the best possible machine code and is 100% portable even across endianess.

    You can even invent some higher level macro for generic access and type safety:

    #define GET(op, type) _Generic(op, opcode_t: type(op))
    

    Full example:

    #include <stdint.h>
    #include <stdio.h>
    #include <inttypes.h>
    
    typedef uint16_t opcode_t;
    
    #define I(op) ((op & 0xF000u) >> 12)
    #define X(op) ((op & 0x0F00u) >>  8)
    #define Y(op) ((op & 0x00F0u) >>  4)
    #define NNN(op) (op & 0x0FFFu)
    #define KK(op)  (op & 0x00FFu)
    
    
    #define GET(op, type) _Generic(op, opcode_t: type(op))
    
    
    int main (void)
    {
      opcode_t op = 0xABCD;
    
      printf("I\t0x%"PRIX16 "\n", GET(op, I));
      printf("X\t0x%"PRIX16 "\n", GET(op, X));
      printf("Y\t0x%"PRIX16 "\n", GET(op, Y));
      printf("NNN\t0x%"PRIX16 "\n", GET(op, NNN));
      printf("KK\t0x%"PRIX16 "\n", GET(op, KK));
    }
    

    Output:

    I       0xA
    X       0xB
    Y       0xC
    NNN     0xBCD
    KK      0xCD