3D FaceViewer


Introduction

This article puts together various technologies in facial image processing, 3D visualization, image manipulation and computer vision. It makes use of the 3D face mesh created by Kinect 2.0, the .NET classes in System.Windows.Media.Media3D, the AForge.NET Framework and OpenCV.

Background

Facial recognition is currently one of the most popular biometric identification methods. Face capture is as simple as taking a photo of a person's face. However, for the photo to be useful for facial recognition, certain facial image specifications must be met:

  • The face must be in frontal view
  • The image must not be stretched unevenly
  • The face should be evenly lit
  • The subject must have a neutral expression, with eyes open and mouth closed

Photos that meet these specifications qualify as suitable registrable photos. For use with an automated Facial Recognition System, these photos would be the source for facial feature file generation. The feature files are mostly unique to the source photo and store summarized facial features that can be used for comparison.

When two feature files match, there is a high likelihood that the subjects of the originating photos are the same person.

One of the main challenges facing a Facial Recognition System is the difficulty of getting good frontal images for matching. The camera should be suitably placed to get a full frontal view of the subject. However, people vary in height and in their inclination to look slightly off-frontal to the camera. The camera could be looking slightly sideways, top down, bottom up or at some angle such that the captured photo would not be ideally full frontal.

Depending on the Facial Recognition System, internal automated correction could be done by the system before the feature file is generated, but the matching score would be affected.

The other challenge is lighting. Most registered photos are of reasonably good quality, as they are captured in a mostly controlled environment such as a specialized photo booth. Photos captured for matching are, in most cases, taken in environments with changing lighting conditions, such as a Facial Door Access Unit near a window.

For testing the accuracy and reliability of a Facial Recognition System, test cases are needed with subjects having their faces captured at various angles and lighting conditions. These images are used to test against controlled images of the same subjects. It is a time-consuming and tedious process that requires the active participation of dedicated test subjects, or alternatively images captured from a live system.

The motivation behind this article is to come up with test cases that can be recreated from a single frontal image by varying the lighting conditions and camera angle.

The Technologies

Even with sophisticated commercial tools, creating a realistic 3D facial model is not a trivial task.

With the release of the Kinect for Xbox One, we found just the right technology to create a realistic 3D facial model without much difficulty. The Kinect 2.0 sensor is a highly sophisticated piece of equipment. It has a 1920x1080 Full HD camera that captures images of reasonably good quality. There is also an infrared depth sensor that can output depth images at 640x480 resolution, and the depth information from this sensor is probably the best currently available commercially. Basically, there are only two types of raw frames (each at 30 frames per second), one from the camera and one from the depth sensor. There are, however, other computed frames available (also at 30 frames per second): Body Index, Body, Face Basic and HDFace frames.

Especially interesting are the HDFace frames. These contain the 3D world coordinates of the tracked face(s). Each face has 1347 vertices, and every 3 connected points make up a 3D triangle surface. Kinect 2.0 uses a standard set of triangle indices referencing 2630 triangles. With 2630 surfaces, the face model is indeed realistic.

The 3D model created from the HDFace frame can be rendered by the .NET System.Windows.Media.Media3D classes, using MeshGeometry3D for modelling, Viewport3D for viewing in a WPF window, and PerspectiveCamera, AmbientLight and DirectionalLight for rendering the model under various lighting and view angles.

The AForge.NET classes provide the filters to process the source image, varying the brightness and contrast.

OpenCV provides the HaarClassifier for finding the face and eyes in input face images.

System.Drawing .NET classes are used for GDI+ image manipulation such as stretching and rotation.

System.Windows.Media classes are used for presentation in WPF window.

 

Mesh Files

The GeometryModel3D class for 3D rendering consists of 2 basic components:

  • GeometryModel3D.Geometry
  • GeometryModel3D.Material

The GeometryModel3D.Geometry class requires the following information to be defined:

  • Positions
  • TriangleIndices
  • TextureCoordinates

Positions consists of vertices defined in 3D world coordinates. Each vertex is referred to by its order in the Positions collection. For example, to define a cube, we need 8 vertices, one at each corner of the cube. The order in which we input the coordinates to the Positions collection is important.

Positions="-0.05,-0.1,0 0.05,-0.1,0 -0.05,0,0 0.05,0,0 -0.05,-0.1,-0.1 0.05,-0.1,-0.1 -0.05,0,-0.1 0.05,0,-0.1"

In the above example -0.05,-0.1,0 is the first vertex and will have index 0.

TriangleIndices refers to the list of indices in groups of 3 that define each surface in the 3D model.

TriangleIndices="0,1,2 1,3,2 0,2,4 2,6,4 2,3,6 3,7,6 3,1,5 3,5,7 0,5,1 0,4,5 6,5,4 6,7,5"/>

In the above example, we have defined 12 triangles, each with 3 vertices, collectively defining all 6 faces of the cube (2 triangles per face). The first 3 indices 0,1,2 refer to the Positions vertices with index 0, 1 and 2. Together they make up one surface of the 3D model. A programmatic equivalent of this cube mesh is sketched below.
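For readers who prefer code to XAML, here is a minimal sketch of building the same cube mesh programmatically (the method name BuildCubeMesh is illustrative and not part of the project's source):

using System.Windows.Media;
using System.Windows.Media.Media3D;

// Illustrative sketch: builds the same 8-vertex, 12-triangle cube in code.
static MeshGeometry3D BuildCubeMesh()
{
    var mesh = new MeshGeometry3D();

    // The same 8 corner vertices as the Positions attribute above (index 0..7).
    mesh.Positions = new Point3DCollection
    {
        new Point3D(-0.05, -0.1,  0.0), new Point3D(0.05, -0.1,  0.0),
        new Point3D(-0.05,  0.0,  0.0), new Point3D(0.05,  0.0,  0.0),
        new Point3D(-0.05, -0.1, -0.1), new Point3D(0.05, -0.1, -0.1),
        new Point3D(-0.05,  0.0, -0.1), new Point3D(0.05,  0.0, -0.1)
    };

    // 12 triangles (2 per cube face), referencing the vertex indices above.
    mesh.TriangleIndices = new Int32Collection
    {
        0,1,2, 1,3,2, 0,2,4, 2,6,4, 2,3,6, 3,7,6,
        3,1,5, 3,5,7, 0,5,1, 0,4,5, 6,5,4, 6,7,5
    };

    return mesh;
}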

For the surface to be rendered, we can define a painting brush using the GeometryModel3D.Material class. But the brush needs to know what texture/color to apply to the surface. This is the role of TextureCoordinates.

TextureCoordinates="1,1 0,0 0,0 1,1 1,1 0,0 0,0 1,1"

There are 8 coordinates above, each applied to one vertex of the cube. The first coordinate is (1,1). How does this define a color or texture? These numbers only make sense when we refer to the brush used to paint the surface.

For this example, we will refer to a LinearGradientBrush. This brush allows a gradient to be defined from a StartPoint to an EndPoint:

<LinearGradientBrush StartPoint="0,0" EndPoint="1,1"> <GradientStop Color="AliceBlue" Offset="0" /> <GradientStop Color="DarkBlue" Offset="1" /> </LinearGradientBrush>

In the example above, imagine an area 1 unit by 1 unit. The 2 opposite corners have coordinates (0,0) and (1,1). A value of 0 refers to the color AliceBlue, and 1 to DarkBlue. Any number in between is a shade between these 2 colors. Defining colors this way gives us a continuous gradient map: each point in the 1 x 1 area is mapped to a defined color.

Going back to our TextureCoordinate value (1,1) for vertex index 0 for the cube, the color would be DarkBlue based on this LinearGradientBrush brush.

Similarly, we can have an ImageBrush. An ImageBrush has an image source, and that image is used to define the texture/color. Again, the value range for the coordinates is (0,0) to (1,1). For instance, if the image size is 1920x1080, then (0,0) refers to point (0,0) on the image and (1,1) refers to (1920-1, 1080-1), the bottom-right corner of the image. A TextureCoordinates value of (0.5,0.5) would be mapped to (0.5x(1920-1), 0.5x(1080-1)). In this way, we can get the image point for any texture coordinate in the range (0,0) to (1,1).
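As an illustration of this mapping, a small helper along these lines converts a texture coordinate into a pixel position on the brush image (the helper name is mine, not from the project):

using System.Windows;

// Illustrative helper: maps a TextureCoordinates value in [0,1]x[0,1] to a pixel
// position on the brush image (e.g. 1920x1080), as described above.
static Point TextureToPixel(Point textureCoordinate, int imageWidth, int imageHeight)
{
    double px = textureCoordinate.X * (imageWidth - 1);   // (1,1) -> (1919, 1079)
    double py = textureCoordinate.Y * (imageHeight - 1);
    return new Point(px, py);
}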

The mesh file for the face model consists of 1347 Positions vertices, followed by 1347 TextureCoordinates points. The texture file is a 1920x1080 image that is used as the image source for the ImageBrush when rendering the 3D model.

The triangle indices defining all the surfaces in the face model are the same for all face models generated by Kinect 2.0. I have included these indices in the file tri_index.txt. There are 2630 surfaces defined by 7890 indices; a sketch of loading them follows.
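As a rough sketch, the indices could be loaded into an Int32Collection along these lines, assuming tri_index.txt stores plain integers separated by commas, spaces or newlines (check the file in the download for its exact layout):

using System;
using System.IO;
using System.Linq;
using System.Windows.Media;

// Sketch only: loads the 7890 triangle indices from tri_index.txt,
// assuming a simple delimited-integer layout.
static Int32Collection LoadTriangleIndices(string path)
{
    var separators = new[] { ',', ' ', '\t', '\r', '\n' };
    var indices = File.ReadAllText(path)
                      .Split(separators, StringSplitOptions.RemoveEmptyEntries)
                      .Select(int.Parse);
    return new Int32Collection(indices);   // expected count: 2630 * 3 = 7890
}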

There is also another file that lists the image coordinates of these face points:

  • RightEye
  • LeftEye
  • Nose
  • Mouth
  • Chin

The information in these files is all derived from data generated using the Kinect 2.0 sensor with the Kinect 2.0 SDK API. In this article, I will not be covering Kinect-specific areas. If you are interested, refer to the HDFace example in the Kinect 2.0 SDK.

The HDFace points are not very well documented, possibly because there are just too many of them; it would not be easy to come up with names for each of the 1347 points. However, from my investigation, the face points of interest to us are:

  • 328-1105 (right eye is between these 2 points)
  • 883-1092 (left eye is between these 2 points)            
  • 10 (center of base of upper lip)
  • 14 (nose base)
  • 0 (chin)

Face Images

The idea behind this article is based on the Kinect 2.0 SDK HDFace example. In that example, HDFace frames and Color frames are processed in sync. Using the Kinect 2.0 Coordinate Mapper, each HDFace point can be mapped to its corresponding coordinate in the Color frame, and the TextureCoordinates can be generated accurately on the fly using the Color frame as the image source for the ImageBrush.

I took snapshots of a set of synchronized HDFace and Color frames, and recorded the generated TextureCoordinates together with the standard TriangleIndices. This provides all the information needed to reproduce the 3D face model without the Kinect 2.0 sensor.

However, the model can only be used with that specific saved texture (Color frame). The Color frame is 1920x1080, but the face only occupies an area of about 500x500. If we can replace that area with another face, we could have that face rendered on the 3D face model!

This is like putting on a face mask. But it has to be accurately mounted. 

For precise replacement of the face area, we need to know the orientation of that face. Is it looking sideways, up or down? Are the eyes level and open? Is the mouth closed? Getting a replacement face with the same orientation would be difficult.

We need a standard for face images. The original Color frame recorded for each of the 3D models has the face in the standard specification. Essentially, it is the same standard adopted for ID photos: full frontal face, neutral expression, eyes open, mouth closed. The image resolution should be kept at about 400x400 to 800x800.

For the replacement face, we need to adhere to the same standard.

Face Fitting

Bad Fit

Good Fit

The original texture is 1920x1080, with the face located somewhere near the center. To replace that face, we need to anchor the new face at some invariant points. The points chosen should be at the main feature areas of the face. Most Facial Recognition Systems define the eyes, nose and mouth as important features; some also include the cheeks and eyebrows.

For this article, I identify 5 points: right eye, left eye, nose base, upper-lip base and chin.

The new face will not fit equally well to each of the face models, so we need a way to calculate the goodness of fit. The best fit is the one where the 5 points all align after a uniform horizontal and vertical stretching (i.e. a simple scaling). If we had sufficient face models, we might find one or more ideal fits; with just 6 face models, we may not get an ideal fit.

The face fitting algorithm (a code sketch of the first steps follows the list):

  • 1) Find the distance between the 2 eyes in the reference (original face) image
  • 2) Find the distance between the 2 eyes in the new face image
  • 3) Find the distance from the nose base to the mid-point of the eyes in the reference image
  • 4) Find the distance from the nose base to the mid-point of the eyes in the new image
  • 5) Stretch the new image horizontally by the factor obtained by dividing 1) by 2)
  • 6) Stretch the new image vertically by the factor obtained by dividing 3) by 4)
  • 7) For the new stretched image, find the vertical distance from the nose base to the upper-lip base
  • 8) For the reference image, find the vertical distance from the nose base to the upper-lip base
  • 9) For the new image, stretch (or compress) from the nose base down vertically by the factor obtained by dividing 8) by 7)
  • 10) Now the mouth is aligned. For the new re-stretched image, find the vertical distance from the upper-lip base to the chin
  • 11) For the reference image, find the vertical distance from the upper-lip base to the chin
  • 12) Final step: stretch (or compress) from the upper-lip base down vertically by the factor obtained by dividing 11) by 10)
  • Now all face points are aligned
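A minimal sketch of steps 1) to 6), using illustrative names for the detected face points; the project's own ratio calculation appears later in getBestFittingMesh under Code Highlights:

using System;
using System.Windows;

// Sketch of steps 1)-6): the uniform horizontal/vertical stretch factors for the new face.
// The Point arguments are the image coordinates of the detected eye and nose-base locations.
static void GetUniformStretchFactors(
    Point refRightEye, Point refLeftEye, Point refNoseBase,
    Point newRightEye, Point newLeftEye, Point newNoseBase,
    out double stretchX, out double stretchY)
{
    // Steps 1), 2), 5): eye-to-eye distance, reference divided by new.
    double refEyeDist = refLeftEye.X - refRightEye.X;
    double newEyeDist = newLeftEye.X - newRightEye.X;
    stretchX = refEyeDist / newEyeDist;

    // Steps 3), 4), 6): nose-base to eyes mid-point distance, reference divided by new.
    Point refMid = new Point((refLeftEye.X + refRightEye.X) / 2, (refLeftEye.Y + refRightEye.Y) / 2);
    Point newMid = new Point((newLeftEye.X + newRightEye.X) / 2, (newLeftEye.Y + newRightEye.Y) / 2);
    stretchY = Distance(refNoseBase, refMid) / Distance(newNoseBase, newMid);
}

static double Distance(Point a, Point b) =>
    Math.Sqrt((a.X - b.X) * (a.X - b.X) + (a.Y - b.Y) * (a.Y - b.Y));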

However, for some face models, the resulting new image may be significantly deformed; see the figure labeled Bad Fit above.

To measure the goodness of fit, I devised an error calculation based on the stretch factors. There are 4 stretch factors involved:

  • 1) factor1 = eye to eye
  • 2) factor2 = nose base to eyes mid-point
  • 3) factor3 = nose base to upper-lip base
  • 4) factor4 = upper-lip base to chin

For 1) and 2), we want these values to be as similar as possible, so we calculate the absolute value of (factor1-factor2)/factor1. Let's call it the eye-nose error.

For 3), we want to keep the factor as close to 1.00 as possible, so we use the absolute value of (factor3-1). Let's call it the nose-mouth error.

For 4), we also want to keep the factor as close to 1.00 as possible, so we use the absolute value of (factor4-1). Let's call it the mouth-chin error.

Since the eye-nose stretching operates on the entire face, its error is assigned the highest weightage. The nose-mouth stretching involves stretching from the nose down, while the mouth-chin stretching involves stretching only from the mouth down, so the nose-mouth error is assigned a weightage smaller than the eye-nose error but larger than the mouth-chin error.

The current weightage is 4 for eye-nose, 2 for nose-mouth and 1 for mouth-chin. A sketch of this weighted calculation follows.
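A minimal sketch of the weighted error, assuming the four stretch factors have already been computed as described above; the UI later displays each component error multiplied by 100:

using System;

// Sketch of the weighted fitting error described above.
static double FittingError(double factor1, double factor2, double factor3, double factor4)
{
    double eyeNoseError   = Math.Abs((factor1 - factor2) / factor1); // want factor1 ~ factor2
    double noseMouthError = Math.Abs(factor3 - 1.0);                 // want factor3 ~ 1
    double mouthChinError = Math.Abs(factor4 - 1.0);                 // want factor4 ~ 1

    // Weightage: 4 for eye-nose, 2 for nose-mouth, 1 for mouth-chin.
    return 4 * eyeNoseError + 2 * noseMouthError + 1 * mouthChinError;
}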

Camera, Lights, Action

Figure 1: The Setup

Figure 2: Lighting up a cube

Figure 3: View changes with Mesh Translation

Figure 4: View changes with Camera Rotation

The Xaml markup code for the Viewport3D set up:

<Viewport3D  HorizontalAlignment="Stretch" VerticalAlignment="Stretch" Width="Auto" Height="Auto" x:Name="viewport3d" RenderTransformOrigin="0.5,0.5" MouseDown="viewport3d_MouseDown" MouseRightButtonDown="viewport3d_MouseRightButtonDown" > <Viewport3D.RenderTransform> <ScaleTransform ScaleX="1" ScaleY="1"/> </Viewport3D.RenderTransform>  <Viewport3D.Camera>  <PerspectiveCamera
            Position = "0, -0.08, 0.5"
            LookDirection = "0, 0, -1"
            UpDirection = "0, 1, 0"
            FieldOfView = "70"> <PerspectiveCamera.Transform> <Transform3DGroup> <RotateTransform3D> <RotateTransform3D.Rotation> <AxisAngleRotation3D
                                Axis="0 1 0"
                                Angle="{Binding Value, ElementName=hscroll}" /> </RotateTransform3D.Rotation> </RotateTransform3D> <RotateTransform3D> <RotateTransform3D.Rotation> <AxisAngleRotation3D
                                Axis="1 0 0"
                                Angle="{Binding Value, ElementName=vscroll}" /> </RotateTransform3D.Rotation> </RotateTransform3D> <RotateTransform3D> <RotateTransform3D.Rotation> <AxisAngleRotation3D
                                Axis="0 0 1"
                                Angle="{Binding Value, ElementName=vscrollz}" /> </RotateTransform3D.Rotation> </RotateTransform3D> </Transform3DGroup> </PerspectiveCamera.Transform> </PerspectiveCamera> </Viewport3D.Camera>   <ModelVisual3D> <ModelVisual3D.Content> <Model3DGroup> <AmbientLight x:Name="amlight" Color="White"/>  <DirectionalLight x:Name="dirlight" Color="White" Direction="0,0,-0.5" > <DirectionalLight.Transform> <Transform3DGroup> <TranslateTransform3D OffsetZ="0" OffsetX="0" OffsetY="0"/> <ScaleTransform3D ScaleZ="1" ScaleY="1" ScaleX="1"/> <TranslateTransform3D OffsetZ="0" OffsetX="0" OffsetY="0"/> <TranslateTransform3D OffsetY="-0.042" OffsetX="0.469" OffsetZ="-0.103"/> </Transform3DGroup> </DirectionalLight.Transform> </DirectionalLight> </Model3DGroup> </ModelVisual3D.Content> </ModelVisual3D> <ModelVisual3D> <ModelVisual3D.Content> <GeometryModel3D>  <GeometryModel3D.Geometry> <MeshGeometry3D x:Name="theGeometry"
                        Positions="-0.05,-0.1,0 0.05,-0.1,0 -0.05,0,0 0.05,0,0 -0.05,-0.1,-0.1 0.05,-0.1,-0.1 -0.05,0,-0.1 0.05,0,-0.1"
                        TextureCoordinates="0,1 1,1 0,0 1,0 0,0 1,0 0,1 1,1"
                        TriangleIndices="0,1,2 1,3,2 0,2,4 2,6,4 2,3,6 3,7,6 3,1,5 3,5,7 0,5,1 0,4,5 6,5,4 6,7,5"/> </GeometryModel3D.Geometry>  <GeometryModel3D.Material> <MaterialGroup> <DiffuseMaterial x:Name="theMaterial"> <DiffuseMaterial.Brush> <LinearGradientBrush StartPoint="0,0" EndPoint="1,1"> <GradientStop Color="AliceBlue" Offset="0" /> <GradientStop Color="DarkBlue" Offset="1" /> </LinearGradientBrush> </DiffuseMaterial.Brush> </DiffuseMaterial> </MaterialGroup> </GeometryModel3D.Material> </GeometryModel3D> </ModelVisual3D.Content> </ModelVisual3D>
</Viewport3D>

The Viewport3D contains the following elements:

  • Viewport3D.Camera
  • ModelVisual3D

The Viewport3D.Camera contains PerspectiveCamera. The camera specifications are

<PerspectiveCamera Position = "0, -0.08, 0.5" LookDirection = "0, 0, -1" UpDirection = "0, 1, 0" FieldOfView = "70">

The camera is positioned at world coordinates (0, -0.08, 0.5); refer to Figure 1: The Setup. LookDirection (0,0,-1) means that the camera is looking in the negative Z direction.

This camera is set up with rotational transformations about the X, Y and Z axes:

<PerspectiveCamera.Transform> <Transform3DGroup> <RotateTransform3D> <RotateTransform3D.Rotation> <AxisAngleRotation3D
                    Axis="0 1 0"
                    Angle="{Binding Value, ElementName=hscroll}" /> </RotateTransform3D.Rotation> </RotateTransform3D> <RotateTransform3D> <RotateTransform3D.Rotation> <AxisAngleRotation3D
                    Axis="1 0 0"
                    Angle="{Binding Value, ElementName=vscroll}" /> </RotateTransform3D.Rotation> </RotateTransform3D> <RotateTransform3D> <RotateTransform3D.Rotation> <AxisAngleRotation3D
                    Axis="0 0 1"
                    Angle="{Binding Value, ElementName=vscrollz}" /> </RotateTransform3D.Rotation> </RotateTransform3D> </Transform3DGroup>
</PerspectiveCamera.Transform>

The angles of rotation are all bound to values from the ScrollBars: rotation about the X axis "1 0 0" is bound to vscroll, the Y axis "0 1 0" to hscroll, and the Z axis "0 0 1" to vscrollz. Valid angles range from -180 to 180 degrees. Scrolling these scrollbars rotates the camera, so the resulting view looks as though the viewed object has been rotated. See Figure 4: View changes with Camera Rotation.

There are 2 ModelVisual3D elements, each with its own ModelVisual3D.Content:

The first contains a Model3DGroup with the light sources. There are 2 light sources:

<AmbientLight x:Name="amlight" Color="White"/>
<DirectionalLight x:Name="dirlight" Color="White" Direction="0,0,-0.5" >

The default colors for the lights are all set to White, so all objects are illuminated by white light.

In the source code below, we change the colors of these lights based on values from the sliders:

private void sliderColor_ValueChanged(object sender, RoutedPropertyChangedEventArgs<double> e)
{
    if (sliderRed != null && sliderGreen != null && sliderBlue != null && sliderAmb != null)
    {
        Color color = Color.FromArgb(255, (byte)sliderRed.Value, (byte)sliderGreen.Value, (byte)sliderBlue.Value);
        if (labelColor != null)
        {
            labelColor.Content = color.ToString();
            labelColor.Background = new SolidColorBrush(color);
        }
        if (dirlight != null)
            dirlight.Color = color;
        Color amcolor = Color.FromArgb(255, (byte)sliderAmb.Value, (byte)sliderAmb.Value, (byte)sliderAmb.Value);
        if (amlight != null)
            amlight.Color = amcolor;
    }
}

We can also change the direction of the directional light:

void dispatcherTimer2_Tick(object sender, EventArgs e)
{
    var dir = dirlight.Direction;
    if (dir.Y > 5 || dir.Y < -5)
        deltaYdir = -1 * deltaYdir;
    dir.Y += deltaYdir;
    dirlight.Direction = new Vector3D(dir.X, dir.Y, dir.Z);
}

void dispatcherTimer_Tick(object sender, EventArgs e)
{
    var dir = dirlight.Direction;
    if (dir.X > 5 || dir.X < -5)
        deltaXdir = -1 * deltaXdir;
    dir.X += deltaXdir;
    dirlight.Direction = new Vector3D(dir.X, dir.Y, dir.Z);
}

Changing the lights' color and direction causes the object to be viewed with different colors and shades. See Figure 2: Lighting up a cube.

The other ModelVisual3D.Content contains the GeometryModel3D, with its GeometryModel3D.Geometry and GeometryModel3D.Material.

The GeometryModel3D.Geometry specifies the mesh details: Positions, TextureCoordinates and TriangleIndices. The GeometryModel3D.Material specifies the brush for rendering the object. The original object is a cube, and the brush a simple gradient map.

<GeometryModel3D.Geometry> <MeshGeometry3D x:Name="theGeometry"
        Positions="-0.05,-0.1,0 0.05,-0.1,0 -0.05,0,0 0.05,0,0 -0.05,-0.1,-0.1 0.05,-0.1,-0.1 -0.05,0,-0.1 0.05,0,-0.1"
        TextureCoordinates="0,1 1,1 0,0 1,0 0,0 1,0 0,1 1,1"
        TriangleIndices="0,1,2 1,3,2 0,2,4 2,6,4 2,3,6 3,7,6 3,1,5 3,5,7 0,5,1 0,4,5 6,5,4 6,7,5"/>
</GeometryModel3D.Geometry> 
<GeometryModel3D.Material> <MaterialGroup> <DiffuseMaterial x:Name="theMaterial"> <DiffuseMaterial.Brush> <LinearGradientBrush StartPoint="0,0" EndPoint="1,1"> <GradientStop Color="AliceBlue" Offset="0" /> <GradientStop Color="DarkBlue" Offset="1" /> </LinearGradientBrush> </DiffuseMaterial.Brush> </DiffuseMaterial> </MaterialGroup>
</GeometryModel3D.Material>

To change the position of the mesh, we perform translations on its vertices:

private void UpdateMesh(float offsetx, float offsety, float offsetz)
{
    var vertices = orgmeshpos;
    for (int i = 0; i < vertices.Count; i++)
    {
        var vert = vertices[i];
        vert.Z += offsetz;
        vert.Y += offsety;
        vert.X += offsetx;
        this.theGeometry.Positions[i] = new Point3D(vert.X, vert.Y, vert.Z);
    }
}

We load the startup positions (stored in orgmeshpos) and then apply the translations set via the X, Y and Z sliders. See Figure 3: View changes with Mesh Translation.

 

User Interface

At startup, the UI is as shown in Figure 5.

Figure 5: Startup

This is the front view of the startup cube. You can slide on the Camera Rotational sliders located at the left, right and bottom to get a different view of the cube.

Right click on the cube to toggle it between a face cube and a gradient color cube.

Figure 6: Face cube

Figure 6 shows the face cube rendered by an ImageBrush. You can use the Translation sliders to change the position of the cube, and changing the values of the Camera Rotation sliders causes the cube to be viewed at different angles. The ambient light shade and directional light color settings change the color and intensity of the lights, causing the object to appear lit differently. The direction of the directional light can be changed by clicking on the direction buttons (one of them captioned ^--v): the direction then changes continuously, casting different shades and shadows on the object. Clicking on a button again locks the direction of the light at that moment. Right-click these buttons to reset the direction of the lights.

Figure 7: Face Model

There are 6 face models to choose from. Click on any of the 6 face images on the right. The selected face model is loaded at the default position, but with the current camera rotation settings. To return to the default position and camera rotation settings, click the Reset button, or double-click on the rendered face model.

To take a snapshot of current view of the model, click on the Snap button. A snapshot would be recorded, stored in memory, and displayed within the vertical column on the left. The column displays the last 6 snapped images. To scroll for other images, move the mouse over any of the images and roll the mouse wheel.

Click on the image to view/save it to file.

When the face model is loaded, the texture file for the model would be displayed at the left top corner. Click on the texture image to select and load a new face file.

Figure 8: Face Fitting

When a face image is selected and loaded, the program attempts to locate the eyes. Eye detection is done using the OpenCV HaarClassifier. Note that at the center of the active face feature locator (the red circle) there is a small box; this is for pinpointing the location of the feature point. Also, at the top left there is a magnifier showing the content inside the active face feature locator.

The 5 points to locate are the 2 eyes, nose base, upper-lip base and chin. To fine-tune a locator, click to select it, then use the arrow keys while watching the content at the center of the magnifier to precisely locate the face point.

Check the Aligned Eye checkbox if the input face image's eyes are not quite level. Note that most people are not able to keep their eyes absolutely level in a frontal photo.

Click on the Best Fit button to get the face model that best fits the new face. Click Update to use the currently selected face model.

  

Figure 9: Face Fitting Evaluation

After you have chosen and updated the face model with the new face, take note of the following:

  • Top right image: Stretched face to be used as texture
  • Bottom right: The aligned fitting of the stretched face to the face model
  • Fitting Error: shows 4 numbers

For good fitting, the stretched face should not be too deformed. See the figure labeled Bad Fit earlier in the article for an example of a badly fitted face image.

The Fitting Error helps you to re-adjust the face points. Take for instance the value -10:15:-10:80. In this case the eye-nose error = -10, the nose-mouth error = 15, the mouth-chin error = -10, and the overall error, calculated with the assigned weightage of 4 for eye-nose, 2 for nose-mouth and 1 for mouth-chin (using absolute values), is 4x10 + 2x15 + 1x10 = 80.

To compensate for a negative eye-nose error, bring the 2 eye locators closer together and/or lower the nose locator. Likewise, to compensate for a positive eye-nose error, move the eye locators further apart and/or bring the nose locator higher.

To compensate for a negative nose-mouth error, move the mouth and nose locators further apart. For a positive error, bring these locators closer together.

To compensate for a negative mouth-chin error, move the mouth and chin locators further apart. For a positive error, bring these locators closer together.

In our example with the value -10:15:-10:80, we move the eye locators closer together and the nose locator down to compensate for the -10 eye-nose error. For the nose-mouth error of 15, we bring the mouth locator up, and for the mouth-chin error of -10, we bring the chin locator further down.

Note that relocating the face point locators to compensate for the fitting error results in a stretched face image with proportions more similar to the original image, but if there are too many corrections, some face points may not align with the corresponding face points on the face model. Click the Update button and then check the alignment-fitting image at the bottom right.

This is an iterative process and may take some practice to become proficient at. Some face images may simply not fit satisfactorily at all if their face point configuration is significantly out of proportion compared with any of our 6 face models. Most faces that I have tried can somehow be fitted to a maximum overall error of 100.

After face fitting, you can perform translations and camera rotations to get the desired view. Then click the Snap button to take a snapshot.

To remove the grid lines on the face, uncheck the ShowGrid checkbox on the top right.

Sometimes, the face image cannot totally cover the face model, especially around the edges of the face.

Figure 10: Face texture insufficient

Figure 10 shows that we are not able to render the side of the face effectively as there is insufficient face texture. The edge of the face is distorted because it is using part of the ear and hair for rendering. To handle such cases, I have devised a method to patch the sides of the new face with face texture from nearer the cheek and the side of the eyes. Uncheck the No-Stretching checkbox at the top left to enable this feature.

 

Figure 11: Patched Face

Figure 11 shows that the side of the face has been patched by texture extended from the inner part of the face.

 

Code Highlights

OpenCV: Finding the face and eyes in C# without EmguCV. Originally I wanted to use EmguCV, but its footprint is just too large and not ideal for distribution with this article. The code here uses a wrapper for OpenCV 2.2; the code for the wrapper is in DetectFace.cs. The code below makes use of methods in this wrapper to do face and eye detection. The wrapper code is modified from detectface.cs at https://gist.github.com/zrxq/1115520/fc3bbdb8589eba5fc243fb42a1964e8697c70319

public static void FindFaceAndEyes(BitmapSource srcimage, out System.Drawing.Rectangle facerect, out System.Drawing.Rectangle[] eyesrect)
{
    String faceFileName = AppDomain.CurrentDomain.BaseDirectory + "haarcascade_frontalface_alt2.xml";
    String eyeFileName = AppDomain.CurrentDomain.BaseDirectory + "haarcascade_eye.xml";
    IntelImage _img = CDetectFace.CreateIntelImageFromBitmapSource(srcimage);
    using (HaarClassifier haarface = new HaarClassifier(faceFileName))
    using (HaarClassifier haareye = new HaarClassifier(eyeFileName))
    {
        var faces = haarface.DetectObjects(_img.IplImage());
        if (faces.Count > 0)
        {
            var face = faces.ElementAt(0);
            facerect = new System.Drawing.Rectangle(face.x, face.y, face.width, face.height);
            int x = face.x, y = face.y, h0 = face.height, w0 = face.width;
            System.Drawing.Rectangle temprect = new System.Drawing.Rectangle(x, y, w0, 5 * h0 / 8);
            System.Drawing.Bitmap bm_current = CDetectFace.ToBitmap(_img.IplImageStruc(), false);
            System.Drawing.Bitmap bm_eyes = bm_current.cropAtRect(temprect);
            bm_eyes.Save(AppDomain.CurrentDomain.BaseDirectory + "temp\\~eye.bmp", System.Drawing.Imaging.ImageFormat.Bmp);
            IntelImage image_eyes = CDetectFace.CreateIntelImageFromBitmap(bm_eyes);
            IntPtr p_eq_img_eyes = CDetectFace.HistEqualize(image_eyes);
            var eyes = haareye.DetectObjects(p_eq_img_eyes);
            NativeMethods.cvReleaseImage(ref p_eq_img_eyes);
            image_eyes.Dispose();
            image_eyes = null;
            bm_eyes.Dispose();
            if (eyes.Count > 0)
            {
                eyesrect = new System.Drawing.Rectangle[eyes.Count];
                for (int i = 0; i < eyesrect.Length; i++)
                {
                    var eye = eyes.ElementAt(i);
                    eyesrect[i] = new System.Drawing.Rectangle(eye.x, eye.y, eye.width, eye.height);
                }
            }
            else
                eyesrect = null;
        }
        else
        {
            facerect = System.Drawing.Rectangle.Empty;
            eyesrect = null;
        }
    }
    _img.Dispose();
}

WPF and GDI+ conversion. The WPF System.Windows.Media classes are ideal for presentation, but they are not as flexible when it comes to image manipulation. Drawing on a System.Drawing.Bitmap is easier than drawing on a System.Windows.Media.ImageSource. Thus, for bitmap manipulation, I convert the WPF BitmapSource to a System.Drawing.Bitmap, and for presentation in WPF, I convert back from System.Drawing.Bitmap to BitmapSource.

public static System.Windows.Media.Imaging.BitmapImage Bitmap2BitmapImage(System.Drawing.Bitmap bitmap)
{
    System.Drawing.Image img = new System.Drawing.Bitmap(bitmap);
    ((System.Drawing.Bitmap)img).SetResolution(96, 96);
    MemoryStream ms = new MemoryStream();
    img.Save(ms, System.Drawing.Imaging.ImageFormat.Png);
    img.Dispose();
    img = null;
    ms.Seek(0, SeekOrigin.Begin);
    BitmapImage bi = new BitmapImage();
    bi.BeginInit();
    bi.StreamSource = ms;
    bi.EndInit();
    bi.Freeze();
    return bi;
}

public static System.Drawing.Bitmap BitmapImage2Bitmap(BitmapSource bitmapImage)
{
    using (MemoryStream outStream = new MemoryStream())
    {
        BitmapEncoder enc = new PngBitmapEncoder();
        enc.Frames.Add(BitmapFrame.Create(bitmapImage));
        enc.Save(outStream);
        System.Drawing.Bitmap tempbm = new System.Drawing.Bitmap(outStream);
        tempbm.SetResolution(96, 96);
        System.Drawing.Bitmap bm = new System.Drawing.Bitmap(tempbm);
        tempbm.Dispose();
        tempbm = null;
        return bm;
    }
}

Snapping from the viewport. The RenderTargetBitmap class is useful for grabbing an image from the Viewport3D, although the entire viewport is grabbed. Nonetheless, we can isolate the object, as most of the snapped viewport consists of transparent pixels. A bounding rectangle for the non-transparent pixels is found, the bounds are adjusted to add some margin, and we crop from the grabbed RenderTargetBitmap using the CroppedBitmap class. We then use the FormatConvertedBitmap class to convert the final image to Rgb24 format, which is the standard used by most image processing software, including our OpenCV wrapper.

var viewport = this.viewport3d;
var renderTargetBitmap = new RenderTargetBitmap((int)(((int)viewport.ActualWidth + 3) / 4 * 4), (int)viewport.ActualHeight, 96, 96, PixelFormats.Pbgra32);
renderTargetBitmap.Render(viewport);
byte[] b = new byte[(int)renderTargetBitmap.Height * (int)renderTargetBitmap.Width * 4];
int stride = ((int)renderTargetBitmap.Width) * 4;
renderTargetBitmap.CopyPixels(b, stride, 0);
int x = 0, y = 0, minx = 99999, maxx = 0, miny = 99999, maxy = 0;
for (int i = 0; i < b.Length; i = i + 4)
{
    y = i / stride;
    x = (i % stride) / 4;
    if (b[i + 3] == 0)
    {
        b[i] = 255; b[i + 1] = 255; b[i + 2] = 255;
    }
    else
    {
        if (x > maxx) maxx = x;
        if (x < minx) minx = x;
        if (y > maxy) maxy = y;
        if (y < miny) miny = y;
    }
}
BitmapSource image = BitmapSource.Create((int)renderTargetBitmap.Width, (int)renderTargetBitmap.Height, 96, 96, PixelFormats.Bgra32, null, b, stride);
int cropx = minx - 20;
if (cropx < 0) cropx = 0;
int cropy = miny - 20;
if (cropy < 0) cropy = 0;
int cropwidth = (((maxx - cropx + 20 + 1) + 3) / 4) * 4;
int cropheight = maxy - cropy + 20 + 1;
int excessx = cropwidth + cropx - image.PixelWidth;
int excessy = cropheight + cropy - image.PixelHeight;
if (excessx < 0) excessx = 0;
if (excessy < 0) excessy = 0;
excessx = ((excessx + 3) / 4) * 4;
CroppedBitmap crop;
try
{
    crop = new CroppedBitmap(image, new Int32Rect(cropx, cropy, cropwidth - excessx, cropheight - excessy));
}
catch
{
    return;
}
var destbmp = new FormatConvertedBitmap();
destbmp.BeginInit();
destbmp.DestinationFormat = PixelFormats.Rgb24;
destbmp.Source = crop;
destbmp.EndInit();

Save Image and Background. Window2 implements a generic window to display and save images. It consists of a Grid (TopGrid) containing an Image (Image1). The code retrieves the image source of the TopGrid background and draws the image source from Image1 onto it. For such overlaying, both the background image and the foreground image must support transparency.

int imagewidth = (int)Image1.Source.Width;
int imageheight = (int)Image1.Source.Height;
System.Drawing.Bitmap bm = null;
if (SourceBrushImage == null)
    bm = new System.Drawing.Bitmap(imagewidth, imageheight, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
else
{
    ImageBrush ib = (ImageBrush)(TopGrid.Background);
    BitmapSource ibimgsrc = ib.ImageSource as BitmapSource;
    bm = CCommon.BitmapImage2Bitmap(ibimgsrc);
}
System.Drawing.Graphics gbm = System.Drawing.Graphics.FromImage(bm);
if (SourceBrushImage == null)
    gbm.Clear(System.Drawing.Color.AliceBlue);
System.Drawing.Bitmap bm2 = CCommon.BitmapImage2Bitmap(Image1.Source as BitmapSource);
gbm.DrawImage(bm2, 0, 0);
gbm.Dispose();
bm.Save(filename, System.Drawing.Imaging.ImageFormat.Jpeg);

AForge Brightness and Contrast. When we adjust the brightness and/or contrast, we retrieve the original unfiltered image (i.e. one not yet processed by any image filter) and apply the brightness and contrast filters to it in order: brightness first, and then the contrast filter on the resulting image.

if (((System.Windows.Controls.Slider)sender).Name == "sliderBrightness" ||
    ((System.Windows.Controls.Slider)sender).Name == "sliderContrast")
{
    if (colorbitmap == null) return;
    System.Drawing.Bitmap bm = CCommon.BitmapImage2Bitmap((BitmapImage)colorbitmap);
    AForge.Imaging.Filters.BrightnessCorrection filterB = new AForge.Imaging.Filters.BrightnessCorrection();
    AForge.Imaging.Filters.ContrastCorrection filterC = new AForge.Imaging.Filters.ContrastCorrection();
    filterB.AdjustValue = (int)sliderBrightness.Value;
    filterC.Factor = (int)sliderContrast.Value;
    bm = filterB.Apply(bm);
    bm = filterC.Apply(bm);
    BitmapImage bitmapimage = CCommon.Bitmap2BitmapImage(bm);
    theMaterial.Brush = new ImageBrush(bitmapimage) { ViewportUnits = BrushMappingMode.Absolute };
}

Getting the Best Fitting Face Model. The algorithm for face fitting was covered earlier. Here we work through the ratios, without actually performing the image manipulation on the new face, to find the fitting error, and then pick the face model with the least error.

public string getBestFittingMesh(string filename) { FeaturePointType righteyeNew = new FeaturePointType(); FeaturePointType lefteyeNew = new FeaturePointType(); FeaturePointType noseNew = new FeaturePointType(); FeaturePointType mouthNew = new FeaturePointType(); FeaturePointType chinNew = new FeaturePointType(); for (int i = 0; i < _imagefacepoints.Count; i++) { FeaturePointType fp = new FeaturePointType(); fp.desp = _imagefacepoints[i].desp; fp.pt = _imagefacepoints[i].pt; switch (fp.desp) { case "RightEye1": righteyeNew = fp; break; case "LeftEye1": lefteyeNew = fp; break; case "Nose1": noseNew = fp; break; case "Mouth3": mouthNew = fp; break; case "Chin1": chinNew = fp; break; } } if (_degPreRotate != 0) { righteyeNew = rotateFeaturePoint(righteyeNew, _degPreRotate); lefteyeNew = rotateFeaturePoint(lefteyeNew, _degPreRotate); noseNew = rotateFeaturePoint(noseNew, _degPreRotate); mouthNew = rotateFeaturePoint(mouthNew, _degPreRotate); chinNew = rotateFeaturePoint(chinNew, _degPreRotate); } int eyedistNew = (int)(lefteyeNew.pt.X - righteyeNew.pt.X); FeaturePointType righteyeRef = new FeaturePointType(); FeaturePointType lefteyeRef = new FeaturePointType(); FeaturePointType noseRef = new FeaturePointType(); FeaturePointType mouthRef = new FeaturePointType(); FeaturePointType chinRef = new FeaturePointType(); string[] meshinfofiles = Directory.GetFiles(AppDomain.CurrentDomain.BaseDirectory + "mesh\\","*.info.txt"); List<Tuple<string,string, double>> listerr = new List<Tuple<string,string, double>>(); foreach(var infofilename in meshinfofiles) { using (var file = File.OpenText(infofilename)) { string s = file.ReadToEnd(); var lines = s.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries); for (int i = 0; i < lines.Length; i++) { var parts = lines[i].Split('='); FeaturePointType fp = new FeaturePointType(); fp.desp = parts[0]; fp.pt = ExtractPoint(parts[1]); switch (fp.desp) { case "RightEye1": righteyeRef = fp; break; case "LeftEye1": lefteyeRef = fp; break; case "Nose1": noseRef = fp; break; case "Mouth3": mouthRef = fp; break; case "Chin1": chinRef = fp; break; } } } double x0Ref = (lefteyeRef.pt.X + righteyeRef.pt.X) / 2; double y0Ref = (lefteyeRef.pt.Y + righteyeRef.pt.Y) / 2; double x0New = (lefteyeNew.pt.X + righteyeNew.pt.X) / 2; double y0New = (lefteyeNew.pt.Y + righteyeNew.pt.Y) / 2; int eyedistRef = (int)(lefteyeRef.pt.X - righteyeRef.pt.X); double noselengthNew = Math.Sqrt((noseNew.pt.X - x0New) * (noseNew.pt.X - x0New) + (noseNew.pt.Y - y0New) * (noseNew.pt.Y - y0New)); double noselengthRef = Math.Sqrt((noseRef.pt.X - x0Ref) * (noseRef.pt.X - x0Ref) + (noseRef.pt.Y - y0Ref) * (noseRef.pt.Y - y0Ref)); double ratiox = (double)eyedistRef / (double)eyedistNew; double ratioy = noselengthRef / noselengthNew; double errFitting = (ratiox - ratioy) / ratiox; Point newptNose = new Point(noseNew.pt.X * ratiox, noseNew.pt.Y * ratioy); Point newptMouth = new Point(mouthNew.pt.X * ratiox, mouthNew.pt.Y * ratioy); double mouthDistRef = mouthRef.pt.Y - noseRef.pt.Y; double mouthDistNew = newptMouth.Y - newptNose.Y; double ratioy2 = mouthDistRef / mouthDistNew; double errFitting1 = (1 - ratioy2); Point newptChin = new Point(chinNew.pt.X * ratiox, chinNew.pt.Y * ratioy); double chinDistRef = chinRef.pt.Y - mouthRef.pt.Y; double chinDistNew = newptChin.Y - newptMouth.Y; double ratioy3 = chinDistRef / chinDistNew; double errFitting2 = (1 - ratioy3); double score = Math.Abs(errFitting)*4+ Math.Abs(errFitting1)*2+ Math.Abs(errFitting2); string fittingerr = 
(int)(errFitting*100)+":"+ (int)(errFitting1*100) +":"+ (int)(errFitting2*100); Tuple<string,string,double> tp=new Tuple<string,string,double> (infofilename,fittingerr,score); listerr.Add(tp); } var sortedlist = listerr.OrderBy(o => o.Item3).ToList(); string selected=sortedlist[0].Item1; var v=selected.Split('\\'); var v2 = v[v.Length - 1].Split('.'); string meshname = v2[0].Replace("mesh",""); return meshname ; }

Magnifier implementation using WriteableBitmap. This is a simple but very useful implementation of a magnifier. The idea is that a WPF Image with Auto width and height stretches when its container (Grid/Window) is resized. In the XAML file, we define the window size as 50x50 for Window3:

<Window x:Class="ThreeDFaces.Window3"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Height="50" Width="50" WindowStyle="None"
        PreviewMouseLeftButtonDown="Window_PreviewMouseLeftButtonDown"
        PreviewMouseMove="Window_PreviewMouseMove"
        Loaded="Window_Loaded" SizeChanged="Window_SizeChanged">
    <Grid Name="MainGrid">
        <Image Name="Image1" HorizontalAlignment="Left" Height="Auto" VerticalAlignment="Top" Width="Auto"/>
    </Grid>
</Window>

winMagnifier is a reference to Window3. When we first create a new Window3, we initialize the image source of Image1 in winMagnifier to a WriteableBitmap (50x50 in size).

_wbitmap = new WriteableBitmap(50, 50, 96, 96, PixelFormats.Bgra32, null);
winMagnifier = new Window3();
winMagnifier.Image1.Source = _wbitmap;
UpdateMagnifier(0, 0);
winMagnifier.Owner = this;
winMagnifier.Show();

When the window is loaded, we resize it to 150x150, so the image looks like it is being magnified 3x. We also have to keep the aspect ratio so that the window is not stretched unevenly: a timer ensures that the window width equals the window height whenever the window is resized.

private void Window_Loaded(object sender, RoutedEventArgs e)
{
    this.Width = 150;
    this.Height = 150;
    _resizeTimer.Tick += _resizeTimer_Tick;
}

void _resizeTimer_Tick(object sender, EventArgs e)
{
    _resizeTimer.IsEnabled = false;
    if (bHeightChanged)
        this.Width = this.Height;
    else
        this.Height = this.Width;
}

private void Window_SizeChanged(object sender, SizeChangedEventArgs e)
{
    Size oldsize = e.PreviousSize;
    Size newsize = e.NewSize;
    bHeightChanged = ((int)oldsize.Height) == ((int)newsize.Height) ? false : true;
    _resizeTimer.IsEnabled = true;
    _resizeTimer.Stop();
    _resizeTimer.Start();
}

In the calling procedure, we have a method that updates the WriteableBitmap, which is the source for Image1 in winMagnifier. In this way, the content of winMagnifier changes each time we call UpdateMagnifier:

private void UpdateMagnifier(int x, int y)
{
    try
    {
        BitmapImage bmi = Image1.Source as BitmapImage;
        int byteperpixel = (bmi.Format.BitsPerPixel + 7) / 8;
        int stride = bmi.PixelWidth * byteperpixel;
        byte[] _buffer = new byte[50 * stride];
        bmi.CopyPixels(new Int32Rect(x, y, 50, 50), _buffer, stride, 0);
        for (int i = 0; i < 50; i++)
            for (int k = 0; k < 2; k++)
                _buffer[24 * stride + i * byteperpixel + k] = 255;
        for (int j = 0; j < 50; j++)
            for (int k = 0; k < 2; k++)
            {
                _buffer[j * stride + 24 * byteperpixel + k] = 255;
            }
        _wbitmap.WritePixels(new Int32Rect(0, 0, 50, 50), _buffer, stride, 0);
    }
    catch { }
}

Demo

For crime investigation: getting the side profile from a previous frontal image to match a side view of another image of the same person.

William Shakespeare mask fitting with all our 6 face models:

Just for Fun with shades, colors and orientation:

Notes

Faces with open mouth showing teeth do not show well on side view. It would seem that the teeth are protruding from the mouth.

Faces with eyes that are significantly not level do not work well.

I have included some images in the \Images directory for testing. The sources for these pictures are:

https://www.nist.gov/itl/iad/image-group/special-database-32-multiple-encounter-dataset-meds

http://www.shakespearemagazine.com

http://bigthink.com/words-of-wisdom/albert-einstein

http://pngimg.com/img/people/face

http://weknowyourdreams.com/people.html

Updates

From Version 2.0 onwards, you can process images with multiple faces.

When an image with multiple faces is loaded, a face selection window pops up with all detected faces marked with red rectangles. Select a face by double-clicking inside its rectangle. The face selection window is then minimized and the selected face appears in the face fitting window. If you minimize the face selection window without making any selection, the first face detected is automatically selected. If you close the face selection window, the entire multiple-faces image appears in the face fitting window.

If you want to select another face from the multiple-faces image, just restore the face selection window from the taskbar and make another selection; you do not have to reload the multiple-faces image from file.

I have also included a test file with many faces in the \GroupImages directory.

In Version 2.2, you can create animated gif files for the model with the camera rotating left and right about the y axis.

  

The encoder for the animated gif is from the Code Project article: NGif, Animated GIF Encoder for .NET.

In Version 2.3, you can do Face Left/Right Mapping. Right-click on the face model in the viewport and a context menu appears for you to select 1) Right Mapping, 2) Left Mapping or 3) No Mapping.

If you select Left Mapping, the Left Face (i.e. the side of the face that appears on the right side of your screen) is replaced with the Right Face. Likewise, Right Mapping replaces the Right Face with the Left Face. See the figure below:

For the face mapping to be done correctly, the stretched face must be aligned such that the eyes are level and the nose base is aligned with the nose-base marker on the face-mesh superimposed image. See Figure below:

For fine adjustments, you can move the eyes and nose markers in the Face Fitting window incrementally and observe the changes in the face-mesh superimposed image. 

History

28 May 2017: Version 2.3

New Feature: Face Left/Right Mapping

7 May 2017: Version 2.2

New Feature: Creation of animated gif

Bug Fixes: Fix inconsistent WPF controls refresh/update

5 May 2017: Version 2.1

New Feature: Caching of multiple faces, more accurate eye detection for multiple faces images

Bug Fixes: Disposing of GDI+ Bitmaps after use to clear memory

2 May 2017: Version 2.0

New Feature: Processing of Multiple faces image file

30 April 2017: Version 1.2A

New Features:

  • Caching of facial points for face files
  • More accurate eye detection
  • Auto crop of face for better display on fitting window

28 April 2017: Version 1.2

For the article:

  • Proofreading and fixing typos
  • Included more information on correcting the fitting error

27 April 2017: Version 1.2

Bug Fixes: 

  • Added a third parameter to the cvSaveImage function in the OpenCV wrapper to match the original specification in the OpenCV .h file
  • Improved eye detection by fixing the implementation of Histogram Equalization

26 April 2017: Version 1.1

Bug Fixes:

  • Include validation on input face points and selected face file.
  • Allows 8 bit image files

24 April 2017: Version 1.0

First Release
