hashgraphicsvulkan

Why do we use hashes when comparing memory is better for things like pipeline state?


When people make a renderer using a graphics API such as Vulkan a common thing to do is to hash the pipeline object state so that you reuse the one already created instead of unnecessarily creating a new one if an identical one already exists. Hashing, although ideally they should yield different values for different byte/data arrays, are not guaranteed to be free of collisions, so that two different sets of bytes can yield the same hash value. Comparing the two byte arrays directly does not allow for a false positive in ensuring the data is identical. When sending data from one location to another, say over the internet, it's not possible to compare with the original/source data because you don't have it, however when creating a renderer and comparing two pipeline states you do have both sets of data, so why isn't it better to do a memory comparison instead of hashing?


Solution

  • The main optimization is finding the matching pipeline, not the cost of a single memory compare.

    If you rely on linear memory compares then you need to search through the whole list of available pipelines to find the pipeline that matches the current state. This is an O(num_of_pipelines) list search, with an O(sizeof_pipeline) memory compare per comparison. So total complexity is O(num_pipelines) * O(sizeof_pipeline).

    With a hash-map you pay O(sizeof_pipeline) cost to generate the initial hash, but with a good hash and hash map data structure you should get close to an O(1) lookup (if you have a good hash function and enough root bins). So cost is one hash generation and one comparison unless you have collisions, so "big O" cost is O(sizeof_pipeline).

    Given the number of states and shaders for a game is finite it's also often possible to prove that collisions won't happen for your game content set, so you can even avoid the actual byte-by-byte compare on hit most of the time.