I've read a lot of articles and code but I still cannot get this to work. I read all 128 bytes of the header in my texture and then read 65536 bytes of compressed data for the actual texture (the texture's resolution is 256x256 and each compressed pixel uses 1 byte). I tried to write my own decompression algorithm with no success, then I decided to use someone else's, so I found this code here. These are the arguments I was trying to pass to it so it would decompress my DDS texture:
BlockDecompressImageDXT5(textureHeader.dwWidth, textureHeader.dwHeight, temp, packedData)
Note: textureHeader is a valid struct with the DDS texture's header data loaded into it, temp is an unsigned char array holding all the DDS data that was read from the DDS texture, and packedData is an unsigned long array I was expecting to receive the final decompressed data. In the code I linked, the RGBA channels for each pixel are packed by the PackRGBA function into packedData, one byte for each color. Before pointing D3D11_SUBRESOURCE_DATA's pSysMem at the texture's data, I distributed each byte of the unsigned long packedData into 4 separate unsigned chars in m_DDSData this way:
for (int i{ 0 }, iData{ 0 }; i < textureHeader.dwPitchOrLinearSize; i++, iData += 4) // dwPitchOrLinearSize is the size in bytes of the compressed data.
{
    m_DDSData[iData]     = ((packedData[i] << 24) >> 24); // first char receives the 1st byte, representing the red color.
    m_DDSData[iData + 1] = ((packedData[i] << 16) >> 24); // second char receives the 2nd byte, representing the green color.
    m_DDSData[iData + 2] = ((packedData[i] << 8) >> 24);  // third char receives the 3rd byte, representing the blue color.
    m_DDSData[iData + 3] = (packedData[i] >> 24);         // fourth char receives the 4th byte, representing the alpha color.
}
Note: m_DDSData should be the final data array that D3D11_SUBRESOURCE_DATA points to as the texture's data, but when I use it this is the kind of result I get: only a frame with random colors instead of my actual texture. I also have algorithms for other types of textures and they work properly, so I'm sure the problem is only with the DDS compressed format.
EDIT: Here is another example. This is a model of a chest, and the program should be rendering the chest's texture: https://prnt.sc/11b62b6
For a full description of the BC3 compression scheme, see Microsoft Docs. BC3 is just the modern name for DXT4/DXT5 compression a.k.a. S3TC. In short, it compresses a 4x4 block of pixels at a time into the following structures resulting in 16 bytes per block:
struct BC1
{
    uint16_t rgb[2]; // 565 colors
    uint32_t bitmap; // 2bpp rgb bitmap
};

static_assert(sizeof(BC1) == 8, "Mismatch block size");

struct BC3
{
    uint8_t alpha[2];  // alpha values
    uint8_t bitmap[6]; // 3bpp alpha bitmap
    BC1     bc1;       // BC1 rgb data
};

static_assert(sizeof(BC3) == 16, "Mismatch block size");
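As a sanity check on the sizes involved (the helper name below is mine, not from the question): a 256x256 BC3 texture is 64x64 blocks of 16 bytes each, which is exactly the 65536 bytes read after the 128-byte header.

#include <cstdint>
#include <cstdio>

// Minimal sketch: compressed size of one BC3 (DXT5) mip level.
// Width/height are rounded up to whole 4x4 blocks; each block is 16 bytes.
uint32_t BC3SizeInBytes(uint32_t width, uint32_t height)
{
    uint32_t blocksWide = (width + 3) / 4;
    uint32_t blocksHigh = (height + 3) / 4;
    return blocksWide * blocksHigh * 16; // sizeof(BC3)
}

int main()
{
    // 256x256 -> 64x64 blocks -> 65536 bytes, matching the data after the header.
    std::printf("%u\n", BC3SizeInBytes(256, 256));
}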
For the color portion, it's the same as the "BC1" a.k.a. DXT1 compressed block. This is pseudo-code, but should get the point across:
auto pBC = &pBC3->bc1;

clr0 = pBC->rgb[0]; // 5:6:5 RGB
clr0.a = 255;

clr1 = pBC->rgb[1]; // 5:6:5 RGB
clr1.a = 255;

clr2 = lerp(clr0, clr1, 1 / 3);
clr2.a = 255;

clr3 = lerp(clr0, clr1, 2 / 3);
clr3.a = 255;

uint32_t dw = pBC->bitmap;

for (size_t i = 0; i < NUM_PIXELS_PER_BLOCK; ++i, dw >>= 2)
{
    switch (dw & 3)
    {
    case 0: pColor[i] = clr0; break;
    case 1: pColor[i] = clr1; break;
    case 2: pColor[i] = clr2; break;
    case 3: pColor[i] = clr3; break;
    }
}
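A concrete (non-pseudo) version of the color portion could look like the sketch below, reusing the BC1 struct shown earlier. The 565-to-888 expansion and the integer 1/3 and 2/3 blends are my own illustrative choices, not code from the question or from DirectXTex:

#include <cstdint>

struct Color { uint8_t r, g, b, a; };

// Expand a packed 5:6:5 color to 8:8:8:8.
static Color Expand565(uint16_t c)
{
    Color out;
    out.r = uint8_t(((c >> 11) & 0x1F) * 255 / 31);
    out.g = uint8_t(((c >> 5) & 0x3F) * 255 / 63);
    out.b = uint8_t((c & 0x1F) * 255 / 31);
    out.a = 255;
    return out;
}

// Decode the BC1 color portion of a BC3 block into 16 opaque RGBA pixels.
static void DecodeBC1Colors(const BC1& bc1, Color pColor[16])
{
    Color clr0 = Expand565(bc1.rgb[0]);
    Color clr1 = Expand565(bc1.rgb[1]);

    // BC3 always uses the 4-color mode: 1/3 and 2/3 blends of the endpoints.
    Color clr2{ uint8_t((2 * clr0.r + clr1.r) / 3), uint8_t((2 * clr0.g + clr1.g) / 3),
                uint8_t((2 * clr0.b + clr1.b) / 3), 255 };
    Color clr3{ uint8_t((clr0.r + 2 * clr1.r) / 3), uint8_t((clr0.g + 2 * clr1.g) / 3),
                uint8_t((clr0.b + 2 * clr1.b) / 3), 255 };

    const Color palette[4] = { clr0, clr1, clr2, clr3 };

    uint32_t dw = bc1.bitmap;
    for (size_t i = 0; i < 16; ++i, dw >>= 2)
        pColor[i] = palette[dw & 3];
}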
Note while a BC3 contains a BC1 block, the decoding rules for BC1 are slightly modified. When decompressing BC1, you normally check the order of the colors as follows:
if (pBC->rgb[0] <= pBC->rgb[1])
{
    /* BC1 with 1-bit alpha */
    clr2 = lerp(clr0, clr1, 0.5);
    clr2.a = 255;

    clr3 = 0; // alpha of zero
}
BC2 and BC3 already include the alpha channel, so this extra logic is not used, and you always have 4 opaque colors.
For the alpha portion, BC3 uses two alpha values and then generates a look-up table based on those values:
alpha[0] = alpha0 = pBC3->alpha[0];
alpha[1] = alpha1 = pBC3->alpha[1];

if (alpha0 > alpha1)
{
    // 6 interpolated alpha values.
    alpha[2] = lerp(alpha0, alpha1, 1 / 7);
    alpha[3] = lerp(alpha0, alpha1, 2 / 7);
    alpha[4] = lerp(alpha0, alpha1, 3 / 7);
    alpha[5] = lerp(alpha0, alpha1, 4 / 7);
    alpha[6] = lerp(alpha0, alpha1, 5 / 7);
    alpha[7] = lerp(alpha0, alpha1, 6 / 7);
}
else
{
    // 4 interpolated alpha values.
    alpha[2] = lerp(alpha0, alpha1, 1 / 5);
    alpha[3] = lerp(alpha0, alpha1, 2 / 5);
    alpha[4] = lerp(alpha0, alpha1, 3 / 5);
    alpha[5] = lerp(alpha0, alpha1, 4 / 5);
    alpha[6] = 0;
    alpha[7] = 255;
}

uint32_t dw = uint32_t(pBC3->bitmap[0]) | uint32_t(pBC3->bitmap[1] << 8)
            | uint32_t(pBC3->bitmap[2] << 16);

for (size_t i = 0; i < 8; ++i, dw >>= 3)
    pColor[i].a = alpha[dw & 0x7];

dw = uint32_t(pBC3->bitmap[3]) | uint32_t(pBC3->bitmap[4] << 8)
   | uint32_t(pBC3->bitmap[5] << 16);

for (size_t i = 8; i < NUM_PIXELS_PER_BLOCK; ++i, dw >>= 3)
    pColor[i].a = alpha[dw & 0x7];
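If you want the interpolation in plain integer math (the lerp fractions above would truncate to zero if written literally in C++), the usual weighted-average form looks like the sketch below; the helper name is mine:

#include <cstdint>

// Build the 8-entry alpha palette for a BC3 block using integer weights.
// Equivalent to the lerp() calls above, but without floating point.
static void BuildBC3AlphaPalette(uint8_t alpha0, uint8_t alpha1, uint8_t alpha[8])
{
    alpha[0] = alpha0;
    alpha[1] = alpha1;

    if (alpha0 > alpha1)
    {
        // 6 interpolated values between the endpoints.
        for (int i = 1; i <= 6; ++i)
            alpha[i + 1] = uint8_t(((7 - i) * alpha0 + i * alpha1) / 7);
    }
    else
    {
        // 4 interpolated values, plus fully transparent and fully opaque.
        for (int i = 1; i <= 4; ++i)
            alpha[i + 1] = uint8_t(((5 - i) * alpha0 + i * alpha1) / 5);
        alpha[6] = 0;
        alpha[7] = 255;
    }
}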
DirectXTex includes functions for doing all the compression/decompression for all BC formats.
If you want to know what the pseudo-function lerp does, see wikipedia or HLSL docs.
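If you go the DirectXTex route mentioned above, CPU decompression is only a couple of calls. A minimal sketch (error handling and mipmap handling elided, and assuming the source file is a plain BC3 DDS):

#include <DirectXTex.h>

using namespace DirectX;

// Load a DDS file and decompress any BC-compressed images to 8:8:8:8.
ScratchImage LoadAndDecompress(const wchar_t* path)
{
    ScratchImage compressed;
    if (FAILED(LoadFromDDSFile(path, DDS_FLAGS_NONE, nullptr, compressed)))
        return {};

    ScratchImage decompressed;
    if (FAILED(Decompress(compressed.GetImages(), compressed.GetImageCount(),
                          compressed.GetMetadata(), DXGI_FORMAT_R8G8B8A8_UNORM,
                          decompressed)))
        return {};

    return decompressed; // pixels are now uncompressed RGBA
}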
If you are going to be rendering with Direct3D, you do not need to decompress the texture at all. All Direct3D hardware feature levels include support for BC1-BC3 texture compression. You just use the DXGI_FORMAT_BC3_UNORM format and create the texture as normal. Something like this:
D3D11_TEXTURE2D_DESC desc = {};
desc.Width = textureHeader.dwWidth;
desc.Height = textureHeader.dwHeight;
desc.MipLevels = desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_BC3_UNORM;
desc.SampleDesc.Count = 1;
desc.Usage = D3D11_USAGE_DEFAULT;
desc.BindFlags = D3D11_BIND_SHADER_RESOURCE;

D3D11_SUBRESOURCE_DATA initData = {};
initData.pSysMem = temp; // the compressed BC3 block data (not including the DDS header)
initData.SysMemPitch = 16 * (textureHeader.dwWidth / 4);
// For BC compressed textures pitch is the number of bytes in a ROW of blocks

Microsoft::WRL::ComPtr<ID3D11Texture2D> pTexture;
hr = device->CreateTexture2D( &desc, &initData, &pTexture );
if (FAILED(hr))
    // error
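To sample the texture in a shader you also need a shader resource view; a short sketch of that step (not part of the original snippet):

// Create a shader resource view over the BC3 texture so it can be bound to a shader.
D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc = {};
srvDesc.Format = DXGI_FORMAT_BC3_UNORM;
srvDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D;
srvDesc.Texture2D.MipLevels = 1;

Microsoft::WRL::ComPtr<ID3D11ShaderResourceView> pSRV;
hr = device->CreateShaderResourceView(pTexture.Get(), &srvDesc, &pSRV);
if (FAILED(hr))
    // error

You can also pass nullptr instead of &srvDesc to inherit the texture's format and mip count.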
For a full-featured DDS loader that supports arbitrary DXGI formats, mipmaps, texture arrays, volume maps, cubemaps, cubemap arrays, etc., see DDSTextureLoader. This code is included in DirectX Tool Kit for DX11 / DX12. There are standalone versions for DirectX 9, DirectX 10, and DirectX 11 in DirectXTex.
If you are loading legacy DDS files (i.e., those that do not map directly to DXGI formats), use the DDS functions in DirectXTex, which handle all the various pixel format conversions required (3:3:2, 3:3:2:8, 4:4, 8:8:8, P8, A8P8, etc.).
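For reference, with DDSTextureLoader from DirectX Tool Kit the whole load collapses to a single call. A sketch assuming you load from a file path rather than the in-memory buffer from the question (the filename "chest.dds" is just an example):

#include <DDSTextureLoader.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

ComPtr<ID3D11Resource> texture;
ComPtr<ID3D11ShaderResourceView> textureView;

// Loads the DDS (including BC3/DXT5 data) and creates both the resource and its SRV.
HRESULT hr = DirectX::CreateDDSTextureFromFile(device, L"chest.dds",
                                               texture.GetAddressOf(),
                                               textureView.GetAddressOf());
if (FAILED(hr))
    // error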