University of California in Santa Barbara
CS290 Imaging: Final Project
by Svetlin (Alex) Bostandjiev, Liubov Kovaleva, Gargi Shah
Our program takes an image of a person's face and creates a caricature: a cartoon-like sketch of the face. We split the project into parts by facial feature: the eyes, the mouth and nose, and the hair.
The long-term goal of this project is to come up with robust algorithms for detecting facial features with no user interaction.
The input can be an RGB color JPEG image of any size.
We ask the user to mark the location of each pupil, at the very center of the iris (the colored part of the eye). The accuracy of the eye and eyebrow feature point detection depends on how precisely the user marks the pupils.
Once the pupils are marked, we want to focus only on the part of the image where the face is located. We therefore crop the original image and resize the cropped part to a resolution of 600x800. The horizontal distance between the pupils determines where the image is cropped: in the new 600x800 image the pupils have coordinates (200, 400) and (400, 400), so the new image's width is 3 times the distance between the pupils. Because the pupils are fixed at known locations in the new image, we can make reasonable assumptions about the size of facial features in pixels. For example, we can assume that the iris has a diameter of at least 20 pixels, which is true for over 99% of all people.
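The crop geometry described above can be sketched in a few lines. This is an illustrative Python sketch, not the authors' Matlab code: the function name `crop_box` and its parameters are hypothetical, and it assumes the pupils are roughly level, so that only their horizontal distance matters.

```python
def crop_box(left_pupil, right_pupil, out_size=(600, 800),
             out_left=(200, 400), out_right=(400, 400)):
    """Return (x0, y0, width, height) of the crop rectangle in the original image.

    Resizing that rectangle to out_size places the pupils at out_left and
    out_right. The rectangle may extend past the image borders; handling
    that (for example by padding) is left out of this sketch.
    """
    (lx, ly), (rx, ry) = left_pupil, right_pupil
    distance = rx - lx
    if distance <= 0:
        raise ValueError("left_pupil must lie to the left of right_pupil")
    # Original-image pixels per output pixel.
    scale = distance / (out_right[0] - out_left[0])
    x0 = lx - out_left[0] * scale
    y0 = (ly + ry) / 2 - out_left[1] * scale
    return x0, y0, out_size[0] * scale, out_size[1] * scale


# Pupils 100 pixels apart give a crop 300 pixels wide (3 times the distance).
box = crop_box((310, 420), (410, 420))
```

With these defaults the crop is always 3 pupil distances wide and 4 pupil distances tall, which is what keeps feature sizes comparable across input images.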
We also convert the color image into a grayscale image, so that our algorithms only have to work with a single intensity channel.
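As a rough illustration of this step, the Python sketch below converts RGB pixels to grayscale intensities. The text does not say which conversion was used; the weights here are the common ITU-R BT.601 luma weights and are an assumption, as is the helper name `to_gray`.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples (0..255) to rows of grayscale intensities."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


gray = to_gray([[(255, 255, 255), (0, 0, 0)],
                [(255, 0, 0), (0, 0, 255)]])
```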
The five feature points that we have chosen to define the shape of the eye are: the pupil, and the leftmost, rightmost, top, and bottom points of the eye. We already know the position of the pupil.
To find the top and bottom points of the eye, we first calculate the grayscale eye and skin colors. We obtain the eye color by averaging pixel intensities around the pupils, and the skin color by averaging the intensities of pixels located on the forehead. Then we move up and down from the pupil, pixel by pixel, until we hit a pixel whose intensity is higher than a threshold value determined by the eye and skin colors. This works well except in the case of dark-skinned people, as in the image on the right, and sometimes when the person is wearing makeup or when the brows are too close to the eyes.
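The vertical scan can be sketched as follows. This is an illustrative Python sketch rather than the authors' code: the function name `eye_top_bottom` is hypothetical, and because the text does not give the exact threshold rule, the midpoint between the eye and skin intensities is used here as an assumption.

```python
def eye_top_bottom(gray, pupil, eye_level, skin_level, max_steps=100):
    """Return (top_y, bottom_y): the last rows above and below the pupil
    that are still darker than the threshold.

    gray is a list of rows of intensities; pupil is (x, y).
    """
    x, y = pupil
    # Assumed rule: halfway between the eye and skin intensities.
    threshold = (eye_level + skin_level) / 2

    def scan(step):
        current = y
        for _ in range(max_steps):
            nxt = current + step
            # Stop at the image border or at the first bright (skin) pixel.
            if nxt < 0 or nxt >= len(gray) or gray[nxt][x] > threshold:
                break
            current = nxt
        return current

    return scan(-1), scan(+1)


# A single image column: bright skin, dark eye, bright skin.
column = [[200], [200], [190], [80], [60], [50], [70], [180], [200]]
top, bottom = eye_top_bottom(column, pupil=(0, 5), eye_level=60, skin_level=200)
```

The failure cases mentioned above follow directly from this rule: when the skin is about as dark as the eye, no pixel exceeds the threshold and the scan runs on past the eyelid.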
Our algorithm for detecting the leftmost and rightmost points of the eye relies on the Matlab "edge" function. Edge takes a grayscale image I as its input and returns a binary image BW of the same size as I, with 1's where the function finds edges in I and 0's elsewhere. Edge supports six different edge-finding methods; we used the Sobel method, which returns edges at the points where the gradient of I is maximum. The function call we used is
BW = edge(I,'sobel',thresh)
with thresh = 0.05. This threshold value gives us a good binary image to work with. Cases in which it does not produce good edges include low-resolution images (under 100x100) and a dark shadow on one side of the face.
Once we have the binary image, we use our "detect" function to find the leftmost and rightmost points of the eye. The detect function looks at a vertical line situated between the top and bottom points of the eye. The size of the vertical line can be adjusted for better results. The vertical line scans from the pupil to the left or right until it contains only black pixels, that is, until it hits an area with no prominent edges. We assume that the eye ends there. Then we back up a couple of pixels toward the pupil and look at the vertical line situated there. We apply our "averageColor" function to every pixel on this line. The point with the highest average color determines the height of the eye's leftmost or rightmost point. The averageColor function calculates the average color of a circle centered at a given point. We use a circle instead of a single pixel in order to reduce errors due to outliers.
This algorithm works well for people of any skin color. It does not work well when the person is wearing dark makeup or when there is a dark shadow on the side of the forehead.
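The scan performed by "detect" and the disc average performed by "averageColor" can be sketched as below. This is an illustrative Python sketch, not the authors' Matlab functions: the names `average_color` and `detect_corner`, the `backup` and `radius` parameters, and their defaults are hypothetical. The text does not state which image the average is taken over; this sketch averages over the binary edge image as an assumption.

```python
def average_color(img, cx, cy, radius=2):
    """Mean value of img over a disc of the given radius centered at (cx, cy)."""
    total, count = 0, 0
    for y in range(max(0, cy - radius), min(len(img), cy + radius + 1)):
        for x in range(max(0, cx - radius), min(len(img[0]), cx + radius + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                total += img[y][x]
                count += 1
    return total / count


def detect_corner(bw, pupil_x, top_y, bottom_y, direction, backup=2, radius=2):
    """Estimate an eye corner; direction is -1 (scan left) or +1 (scan right)."""
    width = len(bw[0])
    x = pupil_x
    # Move the vertical line outward until it contains no edge pixels.
    while 0 <= x < width and any(bw[y][x] for y in range(top_y, bottom_y + 1)):
        x += direction
    # Back up a couple of pixels toward the pupil, staying inside the image.
    x = min(max(x - direction * backup, 0), width - 1)
    # The height with the highest disc average gives the corner's height.
    best_y = max(range(top_y, bottom_y + 1),
                 key=lambda y: average_color(bw, x, y, radius))
    return x, best_y


# Toy edge image: two eyelid edges (rows 2 and 4) meeting at corners on row 3.
bw = [[0] * 12 for _ in range(7)]
for col in range(3, 10):
    bw[2][col] = bw[4][col] = 1
bw[3][2] = bw[3][10] = 1

left_corner = detect_corner(bw, 6, 1, 5, direction=-1, radius=1)
right_corner = detect_corner(bw, 6, 1, 5, direction=+1, radius=1)
```

Averaging over a disc rather than reading one pixel means a single stray edge pixel cannot decide the corner height on its own.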
In our caricature creation, we first find several points on the mouth and nose outlines, then distort them according to the average face values, interpolate those points, and finally draw the caricature of these parts of the face. The color of the mouth is also approximated. The following paragraphs will explain some details about this process.
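One common way to distort feature points relative to an average face is to exaggerate each point's deviation from the corresponding average point. The text does not spell out the distortion rule, so the Python sketch below is an assumption, and the function name `exaggerate` and the `factor` parameter are hypothetical.

```python
def exaggerate(points, average_points, factor=1.5):
    """Push each (x, y) point away from its average-face counterpart.

    factor = 1 leaves the points unchanged; larger values exaggerate more.
    """
    return [(ax + factor * (px - ax), ay + factor * (py - ay))
            for (px, py), (ax, ay) in zip(points, average_points)]


# A mouth slightly wider than average becomes noticeably wider.
mouth = [(10, 20), (30, 20)]
average_mouth = [(12, 20), (28, 20)]
wide_mouth = exaggerate(mouth, average_mouth, factor=2.0)
```

The exaggerated points would then be interpolated into a smooth outline before drawing, as described above.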
Hair is an integral part of someone's appearance, and a caricature does not look natural without a realistic-looking hair style. Human hair is a very complex visual pattern, yet artists' drawings are concise and manage to capture the essence of how hair is perceived with only a few strokes.
Hair, however, cannot be handled in the same way as the face.
Hair is decomposed into frequency bands. The low frequency band contains the shading information whereas the high frequency band contains the texture information. Decomposition into these bands is done using the Discrete Wavelet Transform for 2-D images (on the intensity image).
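To make the band decomposition concrete, the Python sketch below computes one level of a 2-D wavelet transform on a small intensity image. The text does not say which wavelet was used; the Haar wavelet here is an assumption chosen for brevity, and the function name `haar_dwt2` is hypothetical.

```python
def haar_dwt2(img):
    """One level of an orthonormal 2-D Haar transform.

    img is a list of rows with even width and height. Returns four
    half-size bands: (approx, detail_cols, detail_rows, detail_diag).
    approx is the low-frequency band; the three detail bands hold the
    high-frequency differences across columns, across rows, and diagonally.
    """
    approx, detail_cols, detail_rows, detail_diag = [], [], [], []
    for y in range(0, len(img), 2):
        row_a, row_c, row_r, row_d = [], [], [], []
        for x in range(0, len(img[0]), 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            row_a.append((a + b + c + d) / 2)
            row_c.append((a - b + c - d) / 2)
            row_r.append((a + b - c - d) / 2)
            row_d.append((a - b - c + d) / 2)
        approx.append(row_a)
        detail_cols.append(row_c)
        detail_rows.append(row_r)
        detail_diag.append(row_d)
    return approx, detail_cols, detail_rows, detail_diag


bands = haar_dwt2([[1, 2], [3, 4]])
flat = haar_dwt2([[10, 10], [10, 10]])
```

A uniformly shaded patch ends up entirely in the low-frequency band, while fine strand-like variation shows up in the detail bands, which matches the shading and texture split described above.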
For this to work, we need to isolate the hair region of the image. The algorithm used to detect the hair region is:
(The program does not work correctly unless the following conditions are met. Sometimes, even when the conditions are met, there are artifacts.)
Matlab functions
http://www.mathworks.com/moler/ncmfilelist.html
Facial feature detection:
http://research.microsoft.com/~jgemmell/pubs/WaveBaseTR-2002-05.pdf
http://vismod.media.mit.edu/tech-reports/TR-401/node2.html
Caricature generation:
http://cgm.cs.mcgill.ca/~godfried/student_projects/garton_caricature/
http://imsc.usc.edu/research/project/facecomp/facecomp_nsf8.pdf
http://www.stat.ucla.edu/~hchen/projects/projects.htm
http://www.stat.ucla.edu/~hchen/projects/iccv2001-sketch.pdf