
Avateering with Kinect V2 – Joint Orientations

For my own learning I wanted to understand the process of using the Kinect V2 to drive the real-time movement of a character made in 3D modelling software. This post is the first part of that learning: taking the joint orientation data provided by the Kinect SDK and using it to position and rotate ‘bones’, which I will represent by rendering cubes since this is a very simple way to visualise the data. (I won’t cover smoothing the data or modelling/rigging in this post.) The result should be something similar to the Kinect Evolution Block Man demo, which can be found in the Kinect SDK Browser.

[Image: the Kinect Evolution Block Man demo]

To follow along you will need a working Kinect V2 sensor with USB adapter, a fairly high-specced machine running Windows 8.0/8.1 with USB 3.0 and a DirectX 11-compatible GPU, and the Kinect V2 SDK installed. Here are some instructions for setting up your environment.

To back up a little, there are two main ways to represent body data from the Kinect. The first is to use the absolute positions provided by the SDK, which are values in 3D camera space measured in metres; the other is to use the joint orientation data to rotate a hierarchy of bones. The latter is the one we will look at here. There is an advantage in using joint orientations: as long as your model has the same overall skeleton structure as the Kinect data, it doesn’t matter so much what the relative sizes of the bones are, which frees up the modelling constraints. The SDK has done the job of calculating the rotations from the absolute joint positions for us, so let’s explore how we can apply those orientations in code.
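For reference, reading one of those absolute positions is a one-line lookup once you have a tracked Body object (a minimal sketch; acquiring the body data itself is shown in the next section):

    // Camera-space positions are measured in metres, relative to the sensor
    auto headPos = body->Joints->Lookup(JointType::Head).Position;
    float x = headPos.X;    // grows to the sensor's left
    float y = headPos.Y;    // grows upwards
    float z = headPos.Z;    // grows out in the direction the sensor is facing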

Code

I am going to program this by starting with the DirectX and XAML C++ template in Visual Studio, which provides a basic DirectX 11 environment with XAML integration, basic shaders and a cube model described in code.

[Image: the DirectX and XAML C++ project template in Visual Studio]

Body Data

Let’s start by getting the body data into our program from the sensor. As always, we start by getting a KinectSensor object, which I will initialise in the Sample3DSceneRenderer class constructor, and then we open a BodyFrameReader on the BodyFrameSource, for which there is a handy property on the KinectSensor object. We hold the sensor object and the reader object as class variables as we don’t want those to fall out of scope. Additionally, we need to create a vector of type Body to store the data supplied by the sensor. Once we have the opened reader object we can use it to pull the latest frame of body data from within our render loop. I’m not modifying the structure of the project template, so I am using the Sample3DSceneRenderer class and inserting my code into the Render function. So, to initialise:

// Get the default sensor, open a reader on its body frame source,
// allocate storage for the body data and then open the sensor
_sensor = KinectSensor::GetDefault();
_reader = _sensor->BodyFrameSource->OpenReader();
_bodies = ref new Vector<Body^>(_sensor->BodyFrameSource->BodyCount);
_sensor->Open();

and from within the Render function:

{
    auto bodyFrame = _reader->AcquireLatestFrame();
    if (bodyFrame != nullptr)
    {
        bodyFrame->GetAndRefreshBodyData(_bodies);
        updated = true;
    }
}

Note the use of scoping to ensure that the body frame gets closed as soon as possible. Then we can write a loop like this to process the body data:

for (auto body : _bodies)
{
    if ((Body^)body == nullptr || !body->IsTracked)
        continue;

    // do stuff here…
}
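These snippets assume a few class members held by the Sample3DSceneRenderer; a minimal sketch of the declarations might look like this (the exact namespaces depend on how the project references the Kinect WinRT component):

    // Held as class members so the sensor and reader don't fall out of scope
    WindowsPreview::Kinect::KinectSensor^ _sensor;
    WindowsPreview::Kinect::BodyFrameReader^ _reader;
    Windows::Foundation::Collections::IVector<WindowsPreview::Kinect::Body^>^ _bodies;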

I read through quite a few discussions on the Kinect SDK forums here but I didn’t find anything that I felt provided a clear description of how you could use the joint orientations. The best source was the code for the Block Man demo since it worked well, but often key concepts and assumptions can get hidden inside working code, so I set about clearing that up in my mind. I felt that I needed a few additional things to help me explore the scenario: an orbit camera to allow orbiting and zooming in a scene, a floor plane grid and some positional markers. I find it really helpful to be able to explore a 3D scene from different angles and also to be able to draw markers at key locations.

[Image: the scene with floor plane grid and orbit camera]

Kinect Joint Hierarchy

The first subject to consider is how the Kinect joint hierarchy is constructed as it is not made explicit in the SDK. Each joint is identified by one of the following enum values:

[Image: the JointType enumeration]
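For reference, the JointType enumeration in the Kinect V2 SDK covers 25 joints, roughly as follows (from memory of the SDK headers; double-check against your install):

    enum class JointType
    {
        SpineBase = 0,     SpineMid = 1,      Neck = 2,          Head = 3,
        ShoulderLeft = 4,  ElbowLeft = 5,     WristLeft = 6,     HandLeft = 7,
        ShoulderRight = 8, ElbowRight = 9,    WristRight = 10,   HandRight = 11,
        HipLeft = 12,      KneeLeft = 13,     AnkleLeft = 14,    FootLeft = 15,
        HipRight = 16,     KneeRight = 17,    AnkleRight = 18,   FootRight = 19,
        SpineShoulder = 20,
        HandTipLeft = 21,  ThumbLeft = 22,    HandTipRight = 23, ThumbRight = 24
    };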

Starting with the SpineBase, which can be considered the root of the hierarchy, we end up with something like this:

[Image: the Kinect joint hierarchy]
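In outline (following the bone list used by the SDK samples; the exact parent/child relationships are built in the sample code), the hierarchy looks roughly like this:

    SpineBase -> SpineMid -> SpineShoulder -> Neck -> Head
    SpineShoulder -> ShoulderLeft -> ElbowLeft -> WristLeft -> HandLeft -> HandTipLeft (and ThumbLeft)
    SpineShoulder -> ShoulderRight -> ElbowRight -> WristRight -> HandRight -> HandTipRight (and ThumbRight)
    SpineBase -> HipLeft -> KneeLeft -> AnkleLeft -> FootLeft
    SpineBase -> HipRight -> KneeRight -> AnkleRight -> FootRight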

This corresponds to the following skeleton:

[Image: the Kinect skeleton joints]

I added some utility code to the project to represent this hierarchy: two functions, CreateBoneHierarchy and TraverseBoneHierarchy. The first creates an in-memory representation of the parent-child relationships between the joints, and the second does a depth-first traversal of the hierarchy, allowing a function/lambda to be called as each node is visited. I will use the traversal method to draw and transform each bone in the skeleton.
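The real implementations are in the sample project on GitHub; a minimal sketch of the idea, using a hypothetical BoneNode type, might look like this:

    #include <functional>
    #include <memory>
    #include <vector>

    // A hypothetical node type for the in-memory joint hierarchy
    struct BoneNode
    {
        JointType Joint;
        std::vector<std::shared_ptr<BoneNode>> Children;
    };

    // Build the parent-child relationships, rooted at SpineBase
    // (only a fragment shown; the remaining joints are added the same way)
    std::shared_ptr<BoneNode> CreateBoneHierarchy()
    {
        auto root = std::make_shared<BoneNode>();
        root->Joint = JointType::SpineBase;

        auto spineMid = std::make_shared<BoneNode>();
        spineMid->Joint = JointType::SpineMid;
        root->Children.push_back(spineMid);
        // ... spine, head, arms, hands and legs follow the same pattern
        return root;
    }

    // Depth-first traversal, calling the supplied function at each node
    void TraverseBoneHierarchy(const std::shared_ptr<BoneNode>& node,
                               const std::function<void(const std::shared_ptr<BoneNode>&)>& visit)
    {
        if (node == nullptr)
            return;
        visit(node);
        for (const auto& child : node->Children)
            TraverseBoneHierarchy(child, visit);
    }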

Bones

To draw each separate bone I modified the original cube model that was supplied with the default project template. I changed the coordinates so that one end of the cube was at the origin and the other was 4 units along the y-axis; when rendered without an additional transform it looks like the orange cube below, and when rotated 90 degrees it looks like the dark blue cube. The point is that the model is not centred on the origin, so it won’t rotate about its centre but about its end.

[Image: the bone cube at rest (orange) and rotated 90 degrees (dark blue)]
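For illustration, the modified vertex data might look something like this (the values are illustrative, based on the VertexPositionColor layout used by the template; the actual coordinates are in the sample code):

    // Cube stretched from the origin to 4 units along +Y so that it rotates about its end
    static const VertexPositionColor cubeVertices[] =
    {
        { XMFLOAT3(-0.5f, 0.0f, -0.5f), XMFLOAT3(0.0f, 0.0f, 0.0f) },
        { XMFLOAT3(-0.5f, 0.0f,  0.5f), XMFLOAT3(0.0f, 0.0f, 1.0f) },
        { XMFLOAT3(-0.5f, 4.0f, -0.5f), XMFLOAT3(0.0f, 1.0f, 0.0f) },
        { XMFLOAT3(-0.5f, 4.0f,  0.5f), XMFLOAT3(0.0f, 1.0f, 1.0f) },
        { XMFLOAT3( 0.5f, 0.0f, -0.5f), XMFLOAT3(1.0f, 0.0f, 0.0f) },
        { XMFLOAT3( 0.5f, 0.0f,  0.5f), XMFLOAT3(1.0f, 0.0f, 1.0f) },
        { XMFLOAT3( 0.5f, 4.0f, -0.5f), XMFLOAT3(1.0f, 1.0f, 0.0f) },
        { XMFLOAT3( 0.5f, 4.0f,  0.5f), XMFLOAT3(1.0f, 1.0f, 1.0f) },
    };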

Quaternions

I’m not going to delve into quaternions here; suffice it to say that they are a way to describe an orientation in 3D space and are used to avoid the gimbal-lock problems which arise from using Euler angles for rotation. They provide a great way to store and animate rotations, but ultimately they are converted back to matrix form, and your graphics programming environment most likely provides functions to do this. See this for further information.
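In DirectXMath, which the project template already uses, that conversion is a single call; the quaternion below is just an illustrative 90-degree rotation about the y-axis:

    // Build a quaternion for a 90-degree rotation about Y and convert it to a rotation matrix
    XMVECTOR q = XMQuaternionRotationAxis(XMVectorSet(0.0f, 1.0f, 0.0f, 0.0f), XM_PIDIV2);
    XMMATRIX rotation = XMMatrixRotationQuaternion(q);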

Transforming and Rendering

As we traverse the joint hierarchy we need to position our local origin at the end of the parent bone before we apply our local model matrix, which in turn applies the joint orientation rotation and also scales the cube according to which bone it represents. To illustrate this, let’s look at the first three bones drawn – note that I also draw a marker at each local origin.

[Image: the first three bones drawn, with markers at each local origin]

Here are the main steps in the render function in pseudocode:

// Look up the joint orientation data
orientation = body->JointOrientations->Lookup(jointType)

// Create a rotation matrix from the orientation quaternion
rotationMatrix = matrixFromQuaternion(orientation)

// Get our local origin (the transformed end of the parent bone)
origin = parent->transformed()

// Draw a marker transformed to the local origin
DrawMarker(origin)

// Create the model matrix (scale, rotate, then translate to the local origin)
model = scale * rotate * translateTo(origin)

// Transform the position of the child's local origin and store it
transformed = model * endOfBone

// Draw the bone
DrawBone(model)

In the actual code this is complicated a little by the following:

– The leaf joint orientations are set to zero, so the leaf bones just take their parent’s orientation – this is the same as in the Block Man implementation.

– If we are at the root we need to transform to the absolute position of the joint (this will position the whole body in camera space)

Here is the code I used for drawing each bone:

// Lookup the joint orientation for this joint
t->_orientation = body->JointOrientations->Lookup(t->JointType());

// If the orientation is zero use the parent orientation (some of the leaf joint
// orientations are zero)
JointOrientation orientation = t->_orientation;

auto v4 = XMFLOAT4(t->_orientation.Orientation.X,
    t->_orientation.Orientation.Y,
    t->_orientation.Orientation.Z,
    t->_orientation.Orientation.W);

auto parent = t->Parent();
if (XMVector4Equal(XMLoadFloat4(&v4), XMVectorZero()) && parent != nullptr)
{
    orientation = parent->_orientation;
}

// Create a rotation matrix from the orientation quaternion. If we are at the root start with
// a transform to take us to the absolute position of the whole body; if we are not at the
// root start with the parent's transform.
auto f4 = XMFLOAT4(orientation.Orientation.X, orientation.Orientation.Y,
                   orientation.Orientation.Z, orientation.Orientation.W);
auto rotMatrix = XMMatrixRotationQuaternion(XMLoadFloat4(&f4));
if (parent != nullptr)
{
    transformed = parent->_transformed;
}
else
{
    // We are at the root so transform to the absolute position (this transform will affect
    // all bones in the hierarchy; FACTOR scales the camera-space position into scene units)
    auto pos = body->Joints->Lookup(t->JointType()).Position;
    auto v3 = XMFLOAT3(FACTOR * pos.X, FACTOR * pos.Y, FACTOR * pos.Z);
    transformed = XMLoadFloat3(&v3);
}

// Convert the vector into a translation matrix and store it into the model matrix
auto translatedOrigin = XMMatrixTranslationFromVector(transformed);
XMStoreFloat4x4(&m_constantBufferData.model, XMMatrixTranspose(translatedOrigin));

// Draw a marker here so we can see that we are in the right place (this should be at the
// end of the parent bone)
DrawAxis(context, _axis.get());

// Scale the unit bone to this bone's length, rotate it by the joint orientation and
// translate it to the local origin
auto translated = XMMatrixTranslation(0.0f, boneLength, 0.0f);
auto scaleMat = XMMatrixScaling(1.0f, t->BoneLength(), 1.0f);
auto mat = scaleMat * rotMatrix * translatedOrigin;
XMStoreFloat4x4(&m_constantBufferData.model, XMMatrixTranspose(mat));

// Transform the far end of the bone and store it - this becomes the local origin
// for the child bones
auto f3 = XMFLOAT3(0.0f, boneLength, 0.0f);
t->_transformed = XMVector3TransformCoord(XMLoadFloat3(&f3), mat);

if (parent != nullptr)
{
    // draw…
    DrawBone(context, t->getColour());
}

and this shows the end result:

[Image: the final rendered skeleton]

The sample code for this post is on GitHub: https://github.com/peted70/kinectv2-avateer-jointorientations


6 thoughts on “Avateering with Kinect V2 – Joint Orientations”

  1. I haven’t been able to capture ‘no’ and ‘yes’ nods with Kinect V2 for Windows – it seems the sensor just doesn’t detect rotation of the head joint around the Y axis, nor tilting forward of the neck bone unless it’s very exaggerated. Can you suggest a way that these conversational gestures could be avateered?

    1. I’m fairly sure you can get data about head pivot rotation from the HD Face tracking data – a face would need to be detected for this to work.

  2. Hi Pete,

    I use Kinect v1 for avateering, but I find the bone orientations calculated by the SDK only guarantee the y direction of each bone (because the y direction is the bone’s direction), and the other directions are not accurate (they will have some kind of undesired rotation). This situation causes serious artifacts in the skinning results. Does Kinect v2 have the same problem (for example, when you just raise your arm with no rotation, the avatar being animated will raise its arm and also rotate it)?

    Thank you very much!

  3. Hi. Is there an app in the SDK suite for v2 that I can use to achieve the Block Man style images, or do I need to program it myself?
