University of California, Santa Barbara
CS290 Imaging: Final Project

by Svetlin (Alex) Bostandjiev, Liubov Kovaleva, Gargi Shah

Automatic Caricature Generator

Our program takes an image of a person's face and produces a caricature, a cartoon-like sketch of that face. We split the project by main facial feature: eyes and eyebrows, mouth and nose, and hair.

The long-term goal of this project is to develop robust algorithms that detect facial features with no user interaction.

Results





Eyes and Eyebrows

We use our own algorithms to detect the eyes and eyebrows. The algorithms are meant to be simple and easy to implement. Here are some results.

Initialization

The input can be an RGB color JPEG image of any size.

We ask the user to mark the two pupils, each located at the very center of the iris (the colored part of the eye). The accuracy of the eye and eyebrow feature-point detection depends on how precisely the user marks the pupils.

Once the pupils are marked, we focus only on the part of the image that contains the face. We crop the original image and resize the cropped part to 600x800 pixels (width x height). The horizontal distance between the pupils determines where the image is cropped: in the new image the pupils have coordinates (200, 400) and (400, 400), so the new image's width is 3 times the distance between the pupils. Because the pupils are fixed at known locations in the new image, we can make reasonable assumptions about the size of facial features in pixels. For example, we assume that the iris has a diameter of at least 20 pixels, which is true for over 99% of all people.
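The crop geometry follows directly from the fixed pupil positions. Below is a minimal Python sketch (the project itself used Matlab); the function names are hypothetical, and it assumes the two pupils are roughly level so that only their horizontal distance sets the scale.

```python
def crop_box_from_pupils(left_pupil, right_pupil):
    """Return (x0, y0, width, height, scale) of the crop in source pixels.

    Hypothetical helper: assumes the pupils are roughly level, so only
    their horizontal distance determines the scale.
    """
    out_w, out_h = 600, 800              # size of the normalized image
    out_left_x, out_right_x = 200, 400   # pupil columns in the output
    out_y = 400                          # pupil row in the output

    (x1, y1), (x2, y2) = left_pupil, right_pupil
    d = x2 - x1                          # pupil distance in the source image
    if d <= 0:
        raise ValueError("left pupil must be to the left of the right pupil")

    scale = (out_right_x - out_left_x) / d   # output pixels per source pixel
    x0 = x1 - out_left_x / scale             # left edge of the crop
    y0 = (y1 + y2) / 2 - out_y / scale       # top edge of the crop
    return x0, y0, out_w / scale, out_h / scale, scale


def to_output(point, box):
    """Map a source-image point into the 600x800 image."""
    x0, y0, _, _, scale = box
    return (point[0] - x0) * scale, (point[1] - y0) * scale


box = crop_box_from_pupils((100, 150), (150, 150))
print(box)                          # (50.0, 50.0, 150.0, 200.0, 4.0)
print(to_output((100, 150), box))   # (200.0, 400.0)
print(to_output((150, 150), box))   # (400.0, 400.0)
```

The crop box can extend past the borders of the source image when the face is close to an edge; a real implementation would need to pad or clamp in that case.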

We also convert the color image to grayscale so that our algorithms work with a single intensity channel.
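As an illustration, a grayscale conversion can be sketched in a few lines of Python. The report does not say which conversion was used; the luminance weights below (0.299, 0.587, 0.114) are an assumption, chosen because they are the standard ITU-R BT.601 weights.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of luminance values.

    The weights are the ITU-R BT.601 luma coefficients; using them here
    is an assumption, not something stated in the report.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


image = [[(255, 255, 255), (0, 0, 0)],
         [(255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(image)   # 2x2 list of floats, white brightest
```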

Eye

Feature detection

The five feature points that we have chosen to define the shape of the eye are: the pupil, and the leftmost, rightmost, top, and bottom points of the eye. We already know the position of the pupil.

Top and bottom

To find the top and bottom points of the eye, we first calculate grayscale eye and skin colors. We obtain the eye color by averaging pixel intensities around the pupils, and the skin color by averaging intensities of pixels located on the forehead. Then we move up and down from the pupil, pixel by pixel, until we hit a pixel whose intensity is higher than a threshold determined by the eye and skin colors. This works well except for dark-skinned people, as in the image on the right, and sometimes when the person is wearing makeup or when the brows are too close to the eyes.
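The vertical scan can be sketched as follows in Python (not the project's Matlab code). The report only says the threshold is determined by the eye and skin colors; taking a point between the two levels, controlled by a hypothetical `mix` parameter, is an assumption. The image is indexed as `gray[row][column]`.

```python
def mean_intensity(gray, points):
    """Average grayscale value over a list of (x, y) pixel positions."""
    return sum(gray[y][x] for x, y in points) / len(points)


def find_top_bottom(gray, pupil, eye_level, skin_level, mix=0.5):
    """Return the top and bottom rows of the eye in the pupil's column.

    The threshold sits between the eye and skin levels; mix=0.5 (the
    midpoint) is an assumed value, not taken from the report.
    """
    x, y = pupil
    threshold = eye_level + mix * (skin_level - eye_level)

    top = y
    while top > 0 and gray[top - 1][x] <= threshold:
        top -= 1                      # keep moving up while still "eye"
    bottom = y
    while bottom < len(gray) - 1 and gray[bottom + 1][x] <= threshold:
        bottom += 1                   # keep moving down while still "eye"
    return top, bottom


# A one-pixel-wide test image: bright skin around a dark eye.
column = [200, 200, 200, 60, 50, 40, 50, 60, 200, 200, 200]
gray = [[v] for v in column]
eye = mean_intensity(gray, [(0, 4), (0, 5), (0, 6)])
skin = mean_intensity(gray, [(0, 0), (0, 1)])
print(find_top_bottom(gray, (0, 5), eye, skin))   # (3, 7)
```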

Left and right

Our algorithm for detecting the leftmost and rightmost points of the eye relies on the Matlab "edge" function. Edge takes a grayscale image I as its input and returns a binary image BW of the same size as I, with 1's where the function finds edges in I and 0's elsewhere. Edge supports six different edge-finding methods; we used the Sobel method. A description of the method can be found here. It returns edges at the points where the gradient of I is maximum. The function call we used is:

	BW = edge(I,'sobel',thresh)
with thresh = 0.05. This threshold value gives us a good binary image to work with. Cases in which it does not produce good edges include low-resolution images (under 100x100) and a dark shadow on one side of the face.

Once we have the binary image, we use our "detect" function to find the leftmost and rightmost points of the eye. The detect function looks at a vertical line that spans the rows between the top and bottom points of the eye. The length of the vertical line can be adjusted for better results. The line scans from the pupil to the left or right until it contains only black pixels, that is, until it reaches an area with no prominent edges. We assume that the eye ends there.

Then we back up a couple of pixels toward the pupil and look at the vertical line at that position. We apply our "averageColor" function to every pixel on this line, and the point with the highest average color determines the height of the eye's leftmost or rightmost point. The averageColor function calculates the average color of a circle centered at a given point. We use a circle instead of a single pixel to reduce errors due to outliers.

This algorithm works well for people of any skin color. It does not work well when the person is wearing dark makeup or when there is a dark shadow on the side of the forehead.
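The scan described above can be sketched in Python as follows. This is not the project's "detect" or "averageColor" code: the function names, the `back_off` and `radius` defaults, and the choice of which image is averaged are all assumptions made for illustration. Images are indexed as `image[row][column]`, and `edges` holds 0/1 values like the BW output of edge.

```python
def scan_for_eye_end(edges, pupil_x, top, bottom, step):
    """Move a vertical line (rows top..bottom) away from the pupil in
    direction `step` (-1 = left, +1 = right) until it covers no edge pixel."""
    width = len(edges[0])
    x = pupil_x
    while 0 <= x + step < width:
        x += step
        if not any(edges[y][x] for y in range(top, bottom + 1)):
            return x
    return x   # reached the image border without finding an empty column


def average_color(image, cx, cy, radius):
    """Mean value of the pixels inside a disk centered at (cx, cy)."""
    total, count = 0.0, 0
    for y in range(max(0, cy - radius), min(len(image), cy + radius + 1)):
        for x in range(max(0, cx - radius), min(len(image[0]), cx + radius + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                total += image[y][x]
                count += 1
    return total / count


def eye_corner(edges, image, pupil_x, top, bottom, step, back_off=2, radius=1):
    """Return (x, y) of the eye corner on the side given by `step`."""
    end_x = scan_for_eye_end(edges, pupil_x, top, bottom, step)
    x = end_x - step * back_off          # back up toward the pupil
    y = max(range(top, bottom + 1),
            key=lambda row: average_color(image, x, row, radius))
    return x, y


# Tiny example: a horizontal edge on row 2, columns 2..7.
edges = [[0] * 10 for _ in range(5)]
for col in range(2, 8):
    edges[2][col] = 1
print(eye_corner(edges, edges, 5, 1, 3, +1))   # (6, 2)
print(eye_corner(edges, edges, 5, 1, 3, -1))   # (3, 2)
```

Averaging over a disk rather than reading a single pixel is what makes the chosen height robust to isolated outlier pixels.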

Displaying

The caricature version of the eye consists of an outside shape, an iris, and a pupil. The first two are drawn using the functions "pchip" and "splinetx" that use cubic interpolation. The iris's color depends on the RGB color of the iris in the original image.
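To illustrate the kind of curve this produces, here is a Python sketch of shape-preserving piecewise cubic Hermite interpolation, the idea behind pchip. It follows the Fritsch-Carlson slope rule but is written from scratch for illustration, so it should not be read as the Matlab source or as the project's drawing code; the function names are hypothetical, and `xs` must be strictly increasing.

```python
import bisect


def _sign(v):
    return (v > 0) - (v < 0)


def _end_slope(h0, h1, del0, del1):
    """One-sided three-point slope estimate, clipped to preserve shape."""
    d = ((2 * h0 + h1) * del0 - h0 * del1) / (h0 + h1)
    if _sign(d) != _sign(del0):
        return 0.0
    if _sign(del0) != _sign(del1) and abs(d) > abs(3 * del0):
        return 3 * del0
    return d


def hermite_slopes(xs, ys):
    """Slopes at the knots, plus interval widths and secant slopes."""
    n = len(xs)
    h = [xs[k + 1] - xs[k] for k in range(n - 1)]
    delta = [(ys[k + 1] - ys[k]) / h[k] for k in range(n - 1)]
    if n == 2:
        return [delta[0], delta[0]], h, delta
    d = [0.0] * n
    for k in range(1, n - 1):
        if delta[k - 1] * delta[k] > 0:       # same direction on both sides
            w1 = 2 * h[k] + h[k - 1]
            w2 = h[k] + 2 * h[k - 1]
            d[k] = (w1 + w2) / (w1 / delta[k - 1] + w2 / delta[k])
        # otherwise the knot is a local extremum and the slope stays 0
    d[0] = _end_slope(h[0], h[1], delta[0], delta[1])
    d[-1] = _end_slope(h[-1], h[-2], delta[-1], delta[-2])
    return d, h, delta


def interpolate(xs, ys, x):
    """Evaluate the interpolant at x, with xs[0] <= x <= xs[-1]."""
    d, h, delta = hermite_slopes(xs, ys)
    k = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
    s = x - xs[k]
    c = (3 * delta[k] - 2 * d[k] - d[k + 1]) / h[k]
    b = (d[k] - 2 * delta[k] + d[k + 1]) / h[k] ** 2
    return ys[k] + s * (d[k] + s * (c + s * b))


# Upper eyelid through the left corner, top point, and right corner
# (image coordinates, so the top point has the smallest y).
xs, ys = [0, 10, 20], [5, 0, 5]
print(interpolate(xs, ys, 5))    # 1.25
```

Because the slope at the top point is set to zero, the curve never overshoots above the detected top of the eye, which is the property that makes this family of interpolants convenient for outlines.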

Eyebrow

Feature detection

The algorithm for detecting the brows relies on the fact that the eyebrows are positioned above the eyes. It looks at a rectangular region above the eye and finds the points that are darker than a certain threshold. The threshold is a function of the skin color and the horizontal position of the pixel. The latter is important because in most images the sides of the head appear darker due to the curvature of the head, and thus the brows are harder to distinguish there.

For each vertical line of pixels in the rectangular region we create a vector of the vertical positions of the points that are darker than the threshold. We say that part of the brow is located in this vertical line if we have detected enough dark pixels (at least 5). We find the top and bottom of the brow in this vertical line by taking the min and max y values of a middle percentile range of all y values in the line. We do not take the absolute min and max because we sometimes have outliers, due to wrinkles or makeup for example. We then discard the brow information at this vertical line if the distance between the min and the max is too big (more than 25 pixels). As noted earlier, we can make assumptions about the sizes of facial features because of the way we crop and resize the image. Finally, the brow is composed of the top and bottom points at all vertical lines.

This method works best for dark brows on light skin. It does not work well for light brows on light skin. Also, because of the shadow on the side of the head and because the person's hair sometimes falls close to the brows, we sometimes get wrong information about the outer ends of the brows, as in the image on the right.
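The per-column test described above can be sketched in Python as follows. The limits of 5 dark pixels and 25 pixels of height come from the text; the function name and the `trim` fraction (how much of each end of the sorted y values is dropped) are assumptions, since the report only says "some middle percentile range". The position-dependent threshold is passed in per column rather than computed here.

```python
def brow_extent_in_column(column, y_offset, threshold,
                          min_dark=5, trim=0.1, max_height=25):
    """Return (top, bottom) of the brow in one pixel column, or None.

    column    : grayscale values from top to bottom of the search region
    y_offset  : image row of column[0]
    threshold : pixels darker than this count as brow candidates
    trim      : assumed fraction of candidates dropped at each end
    """
    if not 0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    dark_rows = [y_offset + i for i, v in enumerate(column) if v < threshold]
    if len(dark_rows) < min_dark:
        return None                       # not enough dark pixels
    k = int(len(dark_rows) * trim)
    kept = dark_rows[k:len(dark_rows) - k]    # middle range, outliers dropped
    top, bottom = kept[0], kept[-1]
    if bottom - top > max_height:
        return None                       # too tall to be a brow
    return top, bottom


# Skin (200) with a brow on rows 10..19 and one stray dark pixel (a wrinkle).
column = [200] * 40
for i in range(10, 20):
    column[i] = 50
column[35] = 50
print(brow_extent_in_column(column, 100, 125))   # (111, 119)
```

Collecting the returned pairs over all columns of the search region gives the top and bottom brow outlines.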

Displaying

The caricature version of the brow consists of two curves. One represents the top edge of the brow and the other represents the bottom edge.

Future work

Features we wanted to implement but ran out of time:

Mouth and Nose

In our caricature creation, we first find several points on the mouth and nose outlines, then distort them according to the average face values, interpolate those points, and draw the caricature of these parts of the face. The color of the mouth is also approximated. The following paragraphs explain some details of this process.
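The report does not give the distortion formula. One common way to distort feature points relative to an average face is to scale each point's offset from the corresponding average point; the Python sketch below shows that approach purely as an assumption, with a hypothetical function name and exaggeration factor.

```python
def exaggerate(points, mean_points, factor=1.5):
    """Push each (x, y) point away from its average-face counterpart.

    factor > 1 exaggerates the difference, factor = 1 leaves the points
    unchanged. The formula is an assumed stand-in for the unspecified
    distortion step.
    """
    return [(mx + factor * (px - mx), my + factor * (py - my))
            for (px, py), (mx, my) in zip(points, mean_points)]


mouth = [(12, 10), (20, 14)]
average_mouth = [(10, 10), (20, 12)]
print(exaggerate(mouth, average_mouth))   # [(13.0, 10.0), (20.0, 15.0)]
```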

Finding the outline points:

Distortion

Interpolation

Mouth Color

Hair

Hair is an integral part of someone's appearance. A caricature does not look natural without a realistic-looking hair style. Human hair is a very complex visual pattern, yet artists' drawings are concise and manage to capture the essence of how hair is perceived with a few strokes.

Hair, however, cannot be handled in the same way as the face.

Hair is decomposed into frequency bands. The low frequency band contains the shading information whereas the high frequency band contains the texture information. Decomposition into these bands is done using the Discrete Wavelet Transform for 2-D images (on the intensity image).
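A one-level 2-D wavelet decomposition can be sketched in Python as follows. The report does not say which wavelet was used; the Haar wavelet below is an assumption chosen because it is the simplest, and the function name and sign conventions are illustrative. The approximation band carries the low-frequency (shading) content and the three detail bands carry the high-frequency (texture) content.

```python
def haar_dwt2(image):
    """One level of a 2-D Haar transform on a list-of-rows intensity image.

    Returns (approx, horizontal, vertical, diagonal), each half the size
    of the input. Image dimensions must be even.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            p, q = image[r][c], image[r][c + 1]          # top pair
            s, t = image[r + 1][c], image[r + 1][c + 1]  # bottom pair
            a_row.append((p + q + s + t) / 2)   # local average (shading)
            h_row.append((p + q - s - t) / 2)   # change between rows
            v_row.append((p - q + s - t) / 2)   # change between columns
            d_row.append((p - q - s + t) / 2)   # diagonal change
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag


image = [[0, 0, 0, 0],
         [8, 8, 8, 8],
         [10, 10, 10, 10],
         [10, 10, 10, 10]]
approx, horiz, vert, diag = haar_dwt2(image)
print(approx)   # [[8.0, 8.0], [20.0, 20.0]]
print(horiz)    # [[-8.0, -8.0], [0.0, 0.0]]
```

In the example, the flat lower half of the image produces no detail coefficients, while the step between the first two rows shows up only in the horizontal detail band.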

For this to work, we need only the hair region of the image. To detect the hair region, the algorithm used is:

Requirements of the image

It should not be too large (write the size restriction).

Limitations

For correct hair construction, the ideal image has the following properties:

(The program doesn't work correctly unless the following conditions are met. Sometimes, even when the conditions are met, there are artifacts.)

Future work

As of now, this project detects the outline of the hair (for most images). However, this outline is not sufficient for a caricature. The texture and shading information from the Discrete Wavelet Transform has to be manipulated to get the appropriate artistic strokes. It is interesting to note that a person can have multiple caricatures, and often a precise representation of the photo is not necessary for a viewer to recognize the person. Incomplete information may also be sufficient.

References

Matlab functions
http://www.mathworks.com/moler/ncmfilelist.html

Facial feature detection:
http://research.microsoft.com/~jgemmell/pubs/WaveBaseTR-2002-05.pdf
http://vismod.media.mit.edu/tech-reports/TR-401/node2.html

Caricature generation:
http://cgm.cs.mcgill.ca/~godfried/student_projects/garton_caricature/
http://imsc.usc.edu/research/project/facecomp/facecomp_nsf8.pdf
http://www.stat.ucla.edu/~hchen/projects/projects.htm
http://www.stat.ucla.edu/~hchen/projects/iccv2001-sketch.pdf