c++ opengl matrix collada skeletal-animation

How do I calculate the start matrix for each bone (T-pose) using Collada and OpenGL?


I have already loaded the vertices, materials, normals, weights, joint IDs, the joints themselves and the parent/child info (hierarchy). I have also managed to render it all, and when I rotate or translate one of the joints, the children rotate with the parent. My problem is that the parent rotates around the wrong point or offset (hopefully you understand what I mean). This means that I've gotten the initial offsets wrong, right? To get the starting T-pose, I'm guessing I don't need rotation or translation, only the offset of the position of the joint, but I have no clue how to get it and have been stuck for ages. In the Collada file there is a transform for each joint. I've loaded that one too, but I don't know how to apply it correctly: my 3D model gets deformed and looks wrong. If you answer this question, please make it as if you were explaining it to a monkey (me), and step by step if possible. I'm unfamiliar with these bind and inverse bind terms, and very confused. I think that if I manage to get this, I'll eventually figure out the rest of skeletal animation myself, so it's just this little thing.


Solution

  • I've recently gotten bones, joints and nodes working, so I'll try to explain exactly how I achieved it. Do note, I am using Assimp to import my DAE files, but as far as I know, Assimp doesn't do any processing on the data, so this explanation should directly relate to the data in the Collada file.

    I'm just learning all this myself, so I may get things wrong. If I do, anyone, please tell me and I will update this answer accordingly.

    Semantics

    A mesh is a set of vertices, normals, texture coordinates and faces. The points stored in a mesh are in a bind pose, or a rest pose. This is often, but not always, a T-pose.

    A skin is a controller. It refers to a single mesh, and contains the list of bones that will modify that mesh (this is where the bones are stored). You can think of the skin element as the actual model (or part of the model) that will be rendered.

    Bones are stored as a flat list of names and associated matrices. There is no hierarchical data here, it is simply a flat list. The hierarchy is provided by the nodes that refer to the bones.

    A node, or joint, is a hierarchical data element. Nodes are stored in a hierarchy, with a parent node having zero or more child nodes. A node may be linked to zero or more bones, and may be linked to zero or more skins. There should only be one root node. A joint is the same as a node, so I will refer to joints as nodes.

    Do note that nodes and bones are separate. You do not modify a bone to animate your model. Instead, you modify a node, which gets applied to the bone when the model is rendered.
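
    The relationships above can be sketched as plain C++ structs. This is only an illustration: `Weight`, `Bone`, `Mesh`, `Node` and every field name are assumptions made for this sketch, not Collada or Assimp types, and the bone list is hung off the mesh purely for convenience.

    ```cpp
    #include <array>
    #include <cstddef>
    #include <string>
    #include <vector>

    using Mat4 = std::array<float, 16>;  // a 4x4 matrix stored as 16 floats

    // One influence of a bone on one vertex.
    struct Weight {
        std::size_t vertexIndex;   // index into the mesh's vertex array
        float transformWeight;     // how strongly the bone pulls this vertex
    };

    // Flat entry: no parent/child information lives here.
    struct Bone {
        std::string name;              // matched against Node::name
        Mat4 inverseBind{};            // read from the file
        Mat4 finalTransform{};         // calculated at render time
        std::vector<Weight> weights;
    };

    struct Mesh {
        std::vector<std::array<float, 3>> vertices;  // bind-pose positions
        std::vector<std::array<float, 3>> normals;
        std::vector<Bone> bones;                     // flat bone list
    };

    // Hierarchical element: this is what you modify to animate.
    struct Node {
        std::string name;
        Mat4 transform{};                      // read from the file
        Mat4 localTransform{};                 // calculated, accumulated from the root
        std::vector<std::size_t> meshIndices;  // meshes/skins rendered at this node
        std::vector<Node> children;
    };
    ```

    Keeping `Bone` free of parent or child pointers mirrors the point above: the hierarchy lives only in `Node`, and the two are tied together by name.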

    Skin

    A skin is the thing you will render. A skin always refers to one single mesh. You can have multiple skins in a DAE file, as part of the same model (or scene). Sometimes, a model will reuse meshes by transforming them. For instance, you may have a mesh for a single arm, and reuse that arm, mirrored, for the other side of the body. I believe that is what the bind_shape_matrix value of a skin is used for. So far, I haven't used this, and my matrices are always identity, so I cannot speak as to its usage.

    Bone

    A bone is what applies transformations to your model. You do not modify bones. Instead, you modify the nodes that control the bones. More on this later.

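    A bone consists of the following (these are the fields the algorithms below rely on):

      • a name, used to find the node with the same name
      • an InverseBind matrix, read from the file
      • a list of weights, each holding a vertex index and a weight
      • a FinalTransform matrix, calculated at render time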

    Node

    A node is a hierarchical data element, describing how the model gets transformed when rendered. You will always start with one root node, and travel down the node tree, applying transforms in sequence. I use a depth-first algorithm for this.

    The node tells how the model, skins, and bones should be transformed when rendering or animating.

    A node may refer to a skin. This means that skin will be used as part of the render for this model. If you see a node refer to a skin, it gets included when rendering.

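    A node consists of the following (these are the fields the algorithms below rely on):

      • a name
      • a Transform matrix, read from the file
      • a LocalTransform matrix, calculated at render time
      • a parent node (none for the root) and zero or more child nodes
      • references to the meshes or skins rendered at this node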

    GlobalInverseTransform Matrix

    The GlobalInverseTransform matrix is calculated by taking the Transform matrix of the root node (the first node in the hierarchy) and inverting it. Simple as that.
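
    In practice you would normally let a math library do the inversion (GLM, for example, has `glm::inverse`). If you want to see what is involved, here is a minimal Gauss-Jordan sketch. The `Mat4` alias and the `invert` function are names made up for this example.

    ```cpp
    #include <array>
    #include <cmath>
    #include <utility>

    using Mat4 = std::array<std::array<float, 4>, 4>;  // indexed [row][col]

    // Inverts a 4x4 matrix by Gauss-Jordan elimination with partial pivoting.
    // Returns false (and leaves `out` untouched) if the matrix is singular.
    bool invert(const Mat4& in, Mat4& out) {
        Mat4 a = in;
        Mat4 inv{};  // starts as all zeros
        for (int i = 0; i < 4; ++i) inv[i][i] = 1.0f;  // then becomes identity

        for (int col = 0; col < 4; ++col) {
            // Pick the row with the largest entry in this column as the pivot.
            int pivot = col;
            for (int row = col + 1; row < 4; ++row)
                if (std::fabs(a[row][col]) > std::fabs(a[pivot][col])) pivot = row;
            if (std::fabs(a[pivot][col]) < 1e-8f) return false;
            std::swap(a[col], a[pivot]);
            std::swap(inv[col], inv[pivot]);

            // Scale the pivot row so the pivot entry becomes 1.
            float d = a[col][col];
            for (int j = 0; j < 4; ++j) { a[col][j] /= d; inv[col][j] /= d; }

            // Clear this column in every other row.
            for (int row = 0; row < 4; ++row) {
                if (row == col) continue;
                float f = a[row][col];
                for (int j = 0; j < 4; ++j) {
                    a[row][j] -= f * a[col][j];
                    inv[row][j] -= f * inv[col][j];
                }
            }
        }
        out = inv;
        return true;
    }
    ```

    Calling `invert(rootTransform, globalInverse)` once after loading is enough, since the root's Transform from the file does not change unless you animate the root itself.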

    The Algorithm

    Now we can get to the good bits - the actual skinning and rendering.

    Calculating a node's LocalTransform

    Each node should have a matrix, called the LocalTransform matrix. This matrix isn't in the DAE file, but is calculated by your software. It is basically the accumulation of the Transform matrices of the node, and all its parents.

    First step is to traverse the node hierarchy.

    Start at the first node, and calculate the LocalTransform for the node, using the Transform matrix of the node, and the LocalTransform of the parent. If the node has no parent, use an identity matrix as the parent's LocalTransform matrix.

    Node.LocalTransform = ParentNode.LocalTransform * Node.Transform
    

    Repeat this process recursively for every child node in this node.
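
    The traversal can be sketched in C++ roughly as follows. The `Mat4` and `Node` types and the `updateLocalTransforms` function are hypothetical names for this example, and the matrix is a hand-rolled stand-in for whatever math library you use. The multiplication order shown assumes the column-vector convention common in OpenGL code.

    ```cpp
    #include <array>
    #include <vector>

    struct Mat4 {
        std::array<std::array<float, 4>, 4> m{};  // m[row][col], zero-initialized
    };

    Mat4 identity() {
        Mat4 r;
        for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0f;
        return r;
    }

    Mat4 translation(float x, float y, float z) {
        Mat4 r = identity();
        r.m[0][3] = x; r.m[1][3] = y; r.m[2][3] = z;
        return r;
    }

    Mat4 operator*(const Mat4& a, const Mat4& b) {
        Mat4 r;
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j)
                for (int k = 0; k < 4; ++k)
                    r.m[i][j] += a.m[i][k] * b.m[k][j];
        return r;
    }

    struct Node {
        Mat4 transform = identity();       // from the file (or from animation)
        Mat4 localTransform = identity();  // accumulated from the root down
        std::vector<Node> children;
    };

    // Depth-first walk: each node combines its parent's accumulated matrix
    // with its own Transform, then passes the result on to its children.
    void updateLocalTransforms(Node& node, const Mat4& parentLocal) {
        node.localTransform = parentLocal * node.transform;
        for (Node& child : node.children)
            updateLocalTransforms(child, node.localTransform);
    }
    ```

    You would start it with `updateLocalTransforms(root, identity())`, which covers the "no parent" case from the text.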

    Calculating a bone's FinalTransform matrix

    Just like a node, a bone should have a FinalTransform matrix. Again, this is not stored in the DAE file, it is calculated by your software as part of the render process.

    For each mesh used, for each bone in that mesh, apply the following algorithm:

    For each mesh used:
        For each bone in mesh:
            If a node with the same name exists:
                Bone.FinalTransform = Bone.InverseBind * Node.LocalTransform * GlobalInverseTransform
            Otherwise:
                Bone.FinalTransform = Bone.InverseBind * GlobalInverseTransform
    

    We now have the FinalTransform matrix for each bone in the model.
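
    Here is a rough C++ version of that loop. Note that it is written for the column-vector convention (vertex on the right, as is usual with OpenGL), so the factors appear in the opposite order from the pseudocode above. `Mat4`, `Bone` and `updateFinalTransforms` are names invented for this sketch, and the node lookup is assumed to be a simple name-to-matrix map filled in during the node traversal.

    ```cpp
    #include <array>
    #include <string>
    #include <unordered_map>
    #include <vector>

    struct Mat4 {
        std::array<std::array<float, 4>, 4> m{};  // m[row][col], zero-initialized
    };

    Mat4 identity() {
        Mat4 r;
        for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0f;
        return r;
    }

    Mat4 translation(float x, float y, float z) {
        Mat4 r = identity();
        r.m[0][3] = x; r.m[1][3] = y; r.m[2][3] = z;
        return r;
    }

    Mat4 operator*(const Mat4& a, const Mat4& b) {
        Mat4 r;
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j)
                for (int k = 0; k < 4; ++k)
                    r.m[i][j] += a.m[i][k] * b.m[k][j];
        return r;
    }

    struct Bone {
        std::string name;
        Mat4 inverseBind = identity();
        Mat4 finalTransform = identity();
    };

    // nodeLocal maps a node's name to its accumulated LocalTransform.
    void updateFinalTransforms(std::vector<Bone>& bones,
                               const std::unordered_map<std::string, Mat4>& nodeLocal,
                               const Mat4& globalInverse) {
        for (Bone& bone : bones) {
            auto it = nodeLocal.find(bone.name);
            if (it != nodeLocal.end())
                // Column vectors: inverseBind is applied to the vertex first.
                bone.finalTransform = globalInverse * it->second * bone.inverseBind;
            else
                bone.finalTransform = globalInverse * bone.inverseBind;
        }
    }
    ```

    A handy sanity check: if the node transforms in the file describe the bind pose and the root transform is identity, every FinalTransform should come out as (close to) identity before you animate anything, because the InverseBind matrix undoes the node's accumulated transform. If the un-animated model already looks deformed, the multiplication order or a missing transpose is a likely suspect.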

    Calculating a vertex's position

    Once we have all the bones calculated, we can then transform the mesh's points into their final render locations. This is the algorithm I use. This is not the "correct" way to do this, as it should be calculated by a vertex shader on-the-fly, but it works to demonstrate what's happening.

    From the root node:
        For each mesh referred to by node:
            Create an array to hold the transformed vertices, the same size as your source vertices array, filled with zeros.
            Create an array to hold the transformed normals, the same size as your source vertices array, filled with zeros (normals and vertices arrays should be the same length at the beginning).
    
            If the mesh has no bones:
                Copy source vertices and source normals to output arrays - mesh is not skinned
            Otherwise:
                For every bone in the mesh:
                    For every weight in the bone:
                        OutputVertexArray(Weight.VertexIndex) += Mesh.InputVertexArray(Weight.VertexIndex) * Bone.FinalTransform * Weight.TransformWeight
                        OutputNormalArray(Weight.VertexIndex) += Mesh.InputNormalArray(Weight.VertexIndex) * Bone.FinalTransform * Weight.TransformWeight (transform normals as directions, ignoring the translation part of the matrix)
                Normalize every entry of OutputNormalArray once all bones have been applied
            
            Render the mesh, using OutputVertexArray, OutputNormalArray, Mesh.InputTexCoordsArray and the mesh's face indices.
    
        Recursively call this process for each child node.
    

    This should get you a correctly rendered output.
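
    As a CPU-side illustration of the per-vertex step, here is a small C++ sketch. It sums the weighted contribution of every bone that touches a vertex and normalizes the normals at the end. All names (`Vec3`, `Mat4`, `Weight`, `Bone`, `skin`) are assumptions for this example, and it uses the column-vector convention, so the matrix goes on the left of the vertex.

    ```cpp
    #include <array>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    struct Vec3 { float x, y, z; };
    struct Mat4 { std::array<std::array<float, 4>, 4> m{}; };  // m[row][col]

    // Transforms a position (w = 1): the translation column applies.
    Vec3 transformPoint(const Mat4& a, const Vec3& v) {
        return Vec3{a.m[0][0] * v.x + a.m[0][1] * v.y + a.m[0][2] * v.z + a.m[0][3],
                    a.m[1][0] * v.x + a.m[1][1] * v.y + a.m[1][2] * v.z + a.m[1][3],
                    a.m[2][0] * v.x + a.m[2][1] * v.y + a.m[2][2] * v.z + a.m[2][3]};
    }

    // Transforms a direction (w = 0): the translation column is ignored.
    // Good enough for rotation and uniform scale, which is typical for bones.
    Vec3 transformDirection(const Mat4& a, const Vec3& v) {
        return Vec3{a.m[0][0] * v.x + a.m[0][1] * v.y + a.m[0][2] * v.z,
                    a.m[1][0] * v.x + a.m[1][1] * v.y + a.m[1][2] * v.z,
                    a.m[2][0] * v.x + a.m[2][1] * v.y + a.m[2][2] * v.z};
    }

    struct Weight { std::size_t vertexIndex; float transformWeight; };
    struct Bone { Mat4 finalTransform; std::vector<Weight> weights; };

    void skin(const std::vector<Vec3>& inPos, const std::vector<Vec3>& inNrm,
              const std::vector<Bone>& bones,
              std::vector<Vec3>& outPos, std::vector<Vec3>& outNrm) {
        if (bones.empty()) {  // mesh is not skinned: pass the data through
            outPos = inPos;
            outNrm = inNrm;
            return;
        }
        outPos.assign(inPos.size(), Vec3{0.0f, 0.0f, 0.0f});
        outNrm.assign(inNrm.size(), Vec3{0.0f, 0.0f, 0.0f});
        for (const Bone& bone : bones) {
            for (const Weight& w : bone.weights) {
                Vec3 p = transformPoint(bone.finalTransform, inPos[w.vertexIndex]);
                Vec3 n = transformDirection(bone.finalTransform, inNrm[w.vertexIndex]);
                Vec3& op = outPos[w.vertexIndex];
                Vec3& on = outNrm[w.vertexIndex];
                // Accumulate: a vertex may be influenced by several bones.
                op.x += p.x * w.transformWeight;
                op.y += p.y * w.transformWeight;
                op.z += p.z * w.transformWeight;
                on.x += n.x * w.transformWeight;
                on.y += n.y * w.transformWeight;
                on.z += n.z * w.transformWeight;
            }
        }
        for (Vec3& n : outNrm) {  // normalize once, after all bones are summed
            float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
            if (len > 0.0f) { n.x /= len; n.y /= len; n.z /= len; }
        }
    }
    ```

    A vertex shader does the same sum, just per vertex and with the bone matrices passed in as a uniform array.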

    Note that with this system, it is possible to re-use a mesh more than once.

    Animating

    Just a quick note on animating. I haven't done much with this, and Assimp hides much of the gory details of Collada (and introduces its own form of gore), but to use predefined animations from your file, you do some interpolation of translations, rotations and scales to come up with a matrix that represents a node's animated state at a single point in time.

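    Remember, matrix construction follows the TRS (translate, rotate, scale) convention: the three matrices are multiplied in that order. With column vectors (the usual OpenGL convention), this means a vertex is scaled first, then rotated, then translated.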

    AnimatedNodeTransform = TranslationMatrix * RotationMatrix * ScaleMatrix
    

    The generated matrix completely replaces the node's Transform matrix - it is not combined with the matrix.
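
    To make the composition concrete, here is a small C++ sketch that builds the matrix straight from an interpolated translation, a unit quaternion and a scale. It assumes the column-vector convention, and `Vec3`, `Quat`, `Mat4`, `lerp` and `composeTRS` are names invented for this example. Rotation keys would normally be interpolated with slerp or a normalized lerp, which is left out here.

    ```cpp
    #include <array>

    struct Vec3 { float x, y, z; };
    struct Quat { float w, x, y, z; };  // assumed to be unit length
    struct Mat4 { std::array<std::array<float, 4>, 4> m{}; };  // m[row][col]

    // Linear interpolation between two keyframe values, f in [0, 1].
    Vec3 lerp(const Vec3& a, const Vec3& b, float f) {
        return Vec3{a.x + (b.x - a.x) * f,
                    a.y + (b.y - a.y) * f,
                    a.z + (b.z - a.z) * f};
    }

    // Builds T * R * S directly: each rotation column is multiplied by the
    // matching scale factor, and the translation goes in the last column.
    Mat4 composeTRS(const Vec3& t, const Quat& q, const Vec3& s) {
        Mat4 r;
        r.m[0][0] = (1.0f - 2.0f * (q.y * q.y + q.z * q.z)) * s.x;
        r.m[1][0] = (2.0f * (q.x * q.y + q.w * q.z)) * s.x;
        r.m[2][0] = (2.0f * (q.x * q.z - q.w * q.y)) * s.x;

        r.m[0][1] = (2.0f * (q.x * q.y - q.w * q.z)) * s.y;
        r.m[1][1] = (1.0f - 2.0f * (q.x * q.x + q.z * q.z)) * s.y;
        r.m[2][1] = (2.0f * (q.y * q.z + q.w * q.x)) * s.y;

        r.m[0][2] = (2.0f * (q.x * q.z + q.w * q.y)) * s.z;
        r.m[1][2] = (2.0f * (q.y * q.z - q.w * q.x)) * s.z;
        r.m[2][2] = (1.0f - 2.0f * (q.x * q.x + q.y * q.y)) * s.z;

        r.m[0][3] = t.x;
        r.m[1][3] = t.y;
        r.m[2][3] = t.z;
        r.m[3][3] = 1.0f;
        return r;
    }
    ```

    The result would be assigned to the node's Transform before the node hierarchy is traversed for that frame.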

    I am still trying to work out how to perform on-the-fly animation (think Inverse Kinematics) correctly. For some models I try, it works great. I can apply a quaternion to the node’s Transform matrix and it will work. However, some other models will do strange things, like rotate the node around the origin, so I think I’m still missing something there. If I finally solve this, I will update this section to reflect what I discover.
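
    One thing worth checking for the "rotates around the origin" symptom, offered as a guess rather than a confirmed fix: which side of the node's Transform the extra rotation is multiplied on. With column vectors, `Transform * Rotation` rotates the node about its own origin (its pivot), while `Rotation * Transform` rotates it about its parent's origin. The sketch below shows the difference; all names in it are hypothetical.

    ```cpp
    #include <array>
    #include <cmath>

    struct Mat4 { std::array<std::array<float, 4>, 4> m{}; };  // m[row][col]

    Mat4 identity() {
        Mat4 r;
        for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0f;
        return r;
    }

    Mat4 translation(float x, float y, float z) {
        Mat4 r = identity();
        r.m[0][3] = x; r.m[1][3] = y; r.m[2][3] = z;
        return r;
    }

    // Counter-clockwise rotation about the Z axis, angle in radians.
    Mat4 rotationZ(float angle) {
        Mat4 r = identity();
        float c = std::cos(angle), s = std::sin(angle);
        r.m[0][0] = c; r.m[0][1] = -s;
        r.m[1][0] = s; r.m[1][1] = c;
        return r;
    }

    Mat4 operator*(const Mat4& a, const Mat4& b) {
        Mat4 r;
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j)
                for (int k = 0; k < 4; ++k)
                    r.m[i][j] += a.m[i][k] * b.m[k][j];
        return r;
    }

    // The joint stays where it is and everything below it swings around it.
    Mat4 rotateAboutOwnOrigin(const Mat4& nodeTransform, const Mat4& rotation) {
        return nodeTransform * rotation;
    }

    // The joint itself is swung around its parent's origin.
    Mat4 rotateAboutParentOrigin(const Mat4& nodeTransform, const Mat4& rotation) {
        return rotation * nodeTransform;
    }
    ```

    For a joint sitting at (1, 0, 0) relative to its parent and a 90 degree rotation about Z, the first function leaves the joint at (1, 0, 0), while the second moves it to roughly (0, 1, 0). In a row-vector setup such as Direct3D the two orders swap roles.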

    Hope this helps. If I've missed anything, or gotten anything wrong, anyone please feel free to correct me. I am only learning this stuff myself. If I notice any mistakes, I will edit the answer.

    Also, be aware that I use Direct3D, so my matrix multiplication order is probably reversed from yours. You will likely need to flip the multiplication order of some of the operations in my answer.