This post is the first in a series that briefly explains some of the capabilities of OpenCV, an open-source computer vision library. The series grew out of my attempt to build a perceptual interface with OpenCV.

OpenCV comes with over 500 functions that cover many areas of vision, and its goal is to provide a simple-to-use computer vision infrastructure for building fairly sophisticated vision applications quickly. The library is written in C and C++ and runs under Linux, Windows and Mac OS X. There is active development on interfaces for Python, Ruby, Matlab, and other languages.

How to find faces

Finding faces means finding complex objects, so OpenCV uses a statistical model (often called a classifier) that is trained to find the object we are looking for. The training uses a set of images divided into “positive” and “negative” samples. The positive samples are instances of the object class of interest; the negative samples are images that don’t contain the object of interest.

When the statistical model is trained with these images, a set of features is extracted and the distinctive features that can be used to classify the object are selected. The result is a detector whose set of features makes it possible to find the desired object. The detector used by OpenCV is called a Haar cascade classifier and uses simple rectangular features, called Haar features. The word “cascade” means that the resulting classifier consists of several simpler classifiers (stages) that are applied one after another to a region of interest until at some stage the candidate is rejected or all stages are passed. These classifiers use Haar features such as these:

[Image: haarclassifiers]

The feature used in a particular classifier is specified by its shape, its position within the region of interest, and its scale. For a basic understanding, the idea is that these features encode the existence of oriented contrasts between regions in the image, so a set of these features can be used to encode the contrasts exhibited by a human face and their spatial relationships.
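
To make these two ideas concrete (a Haar feature as a contrast between adjacent rectangles, and a cascade that rejects a candidate at the first failing stage), here is a minimal sketch in plain C++. It is not OpenCV's implementation: GrayImage, IntegralImage, edgeFeature, Stage and passesCascade are hypothetical names introduced for illustration, and real stages combine many weighted features with trained thresholds.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Grayscale image stored row by row; a hypothetical stand-in for cv::Mat.
struct GrayImage {
    int width;
    int height;
    std::vector<int> pixels;  // width * height values in 0..255
};

// Summed-area table with an extra zero row and column, so that the sum of
// any rectangle can be read with four lookups.
class IntegralImage {
public:
    explicit IntegralImage(const GrayImage& img)
        : stride_(img.width + 1),
          sums_(static_cast<std::size_t>(img.width + 1) * (img.height + 1), 0) {
        for (int y = 1; y <= img.height; ++y) {
            for (int x = 1; x <= img.width; ++x) {
                const std::size_t src =
                    static_cast<std::size_t>(y - 1) * img.width + (x - 1);
                sums_[at(x, y)] = img.pixels[src]
                                + sums_[at(x - 1, y)] + sums_[at(x, y - 1)]
                                - sums_[at(x - 1, y - 1)];
            }
        }
    }

    // Sum of the pixels in the rectangle [x, x + w) x [y, y + h).
    long long rectSum(int x, int y, int w, int h) const {
        return sums_[at(x + w, y + h)] - sums_[at(x, y + h)]
             - sums_[at(x + w, y)] + sums_[at(x, y)];
    }

private:
    std::size_t at(int x, int y) const {
        return static_cast<std::size_t>(y) * stride_ + x;
    }
    int stride_;
    std::vector<long long> sums_;
};

// Two-rectangle Haar-like feature: sum of the left half minus sum of the
// right half. A large absolute value indicates a vertical edge.
long long edgeFeature(const IntegralImage& ii, int x, int y, int w, int h) {
    const int half = w / 2;
    return ii.rectSum(x, y, half, h) - ii.rectSum(x + half, y, half, h);
}

// A stage answers "could the window at (x, y) still be the object?".
using Stage = std::function<bool(const IntegralImage&, int, int)>;

// Apply the stages in order and stop at the first rejection.
bool passesCascade(const std::vector<Stage>& stages,
                   const IntegralImage& ii, int x, int y) {
    for (const Stage& stage : stages) {
        if (!stage(ii, x, y)) {
            return false;  // rejected early, later stages are never evaluated
        }
    }
    return true;  // all stages passed
}
```

Because most windows in a typical image are rejected by the first cheap stages, only a few candidates ever reach the more expensive ones.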

Basically, the process of face detection slides a “search window” through the image, checking whether each image region can be considered a “face object” or not. The detector assumes a fixed scale for the object, but since a face in an image can differ from the assumed scale, the search window goes through the image several times, searching for the object across a range of sizes.

[Image: windowsearch]
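
As a rough sketch of the multi-scale search, the following plain C++ lists the window sizes that would be tried for a given scale factor and counts how many positions a window of a given size can take. The names WindowSize, searchWindowSizes and countPositions are hypothetical, and the base window size and step are assumptions; OpenCV's own loop differs in its details.

```cpp
#include <cmath>
#include <vector>

struct WindowSize {
    int width;
    int height;
};

// Window sizes tried by the search: start from the size the detector was
// trained on (baseWidth x baseHeight, an assumption here) and grow it by
// scaleFactor until it no longer fits into the image.
std::vector<WindowSize> searchWindowSizes(int imageWidth, int imageHeight,
                                          int baseWidth, int baseHeight,
                                          double scaleFactor) {
    std::vector<WindowSize> sizes;
    if (scaleFactor <= 1.0 || baseWidth <= 0 || baseHeight <= 0) {
        return sizes;  // the loop below would never terminate
    }
    for (double factor = 1.0;; factor *= scaleFactor) {
        const int w = static_cast<int>(std::lround(baseWidth * factor));
        const int h = static_cast<int>(std::lround(baseHeight * factor));
        if (w > imageWidth || h > imageHeight) {
            break;
        }
        sizes.push_back(WindowSize{w, h});
    }
    return sizes;
}

// Number of positions a window can take when it moves `step` pixels at a
// time in both directions.
long long countPositions(int imageWidth, int imageHeight,
                         WindowSize window, int step) {
    if (step <= 0 || window.width > imageWidth || window.height > imageHeight) {
        return 0;
    }
    const long long nx = (imageWidth - window.width) / step + 1;
    const long long ny = (imageHeight - window.height) / step + 1;
    return nx * ny;
}
```

For a 100x100 image, a 20x20 base window and a scale factor of 2.0, the sizes are 20, 40 and 80 pixels, and the 20x20 window alone has 81 * 81 = 6561 positions at a step of one pixel. This is why rejecting windows cheaply matters so much.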

The latest OpenCV download offers a complete set of classifiers as XML files, which include, among others, classifiers for frontal faces, profile faces, eyes, mouth, nose, upper body and lower body.

If you have downloaded OpenCV 2.0, you will find a nice example to start playing with. It is located at [installation_directory]/OpenCV2.0/samples/c/facedetect.exe, and the source code is at [installation_directory]/OpenCV2.0/samples/c/facedetect.c.

For beginners, the code is not as simple as it could be, so before digging into it, I recommend looking at the face detection page in the OpenCV wiki. Still, it is worth adding some comments to the original example, so let's have a look at the function detectAndDraw in facedetect.c, where everything related to face detection happens:

void detectAndDraw( Mat& img, CascadeClassifier& cascade, CascadeClassifier& nestedCascade, double scale)
{
    int i = 0;
    double t = 0;
    vector<Rect> faces;
    const static Scalar colors[] =  { CV_RGB(0,0,255),CV_RGB(0,128,255),CV_RGB(0,255,255),CV_RGB(0,255,0), CV_RGB(255,128,0),CV_RGB(255,255,0),CV_RGB(255,0,0),CV_RGB(255,0,255)} ;
    Mat gray, smallImg( cvRound (img.rows/scale), cvRound(img.cols/scale), CV_8UC1 );

The classifier works on grayscale images, so the incoming BGR image img is converted to grayscale and then shrunk by the factor scale (with scale set to 1 the size is unchanged).

    cvtColor( img, gray, CV_BGR2GRAY );
    resize( gray, smallImg, smallImg.size(), 0, 0, INTER_LINEAR );
    equalizeHist( smallImg, smallImg );

Next, the histogram is equalized, which spreads out the intensity values (the brightness) of the image over the available range and so improves its contrast. This image helps to illustrate the idea:

[Image: histogram equalization example]
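
To give an idea of what equalization does to the pixel values, here is a minimal sketch of the textbook method in plain C++: build the histogram, accumulate it into a cumulative distribution, and use that as a lookup table stretched to 0..255. The function name equalizeHistogram is hypothetical, and this is not taken from OpenCV's source, whose rounding may differ.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Textbook histogram equalization for 8-bit gray values.
std::vector<std::uint8_t> equalizeHistogram(const std::vector<std::uint8_t>& pixels) {
    if (pixels.empty()) {
        return {};
    }

    // 1. Histogram: how often each gray level occurs.
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t p : pixels) {
        ++hist[p];
    }

    // 2. Cumulative distribution, and its value at the darkest level present.
    std::array<std::size_t, 256> cdf{};
    std::size_t running = 0;
    std::size_t cdfMin = 0;
    for (std::size_t i = 0; i < 256; ++i) {
        running += hist[i];
        cdf[i] = running;
        if (cdfMin == 0 && hist[i] != 0) {
            cdfMin = running;
        }
    }

    const std::size_t total = pixels.size();
    if (total == cdfMin) {
        return pixels;  // only one gray level, nothing to spread out
    }

    // 3. Lookup table: darkest level maps to 0, brightest to 255.
    std::array<std::uint8_t, 256> lut{};
    for (std::size_t i = 0; i < 256; ++i) {
        if (cdf[i] >= cdfMin) {
            const double v = (cdf[i] - cdfMin) * 255.0 / (total - cdfMin);
            lut[i] = static_cast<std::uint8_t>(std::lround(v));
        }
    }

    // 4. Remap every pixel through the table.
    std::vector<std::uint8_t> result;
    result.reserve(pixels.size());
    for (std::uint8_t p : pixels) {
        result.push_back(lut[p]);
    }
    return result;
}
```

A low-contrast input such as {50, 50, 60, 70, 80} becomes {0, 0, 85, 170, 255}: the same ordering of brightness, but spread over the whole range.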

Now it's time to detect the faces in the input image, which is exactly what the function detectMultiScale does. It returns the detected objects as a list of rectangles; in this case, the vector faces stores the returned data. The function also takes some parameters that configure the detection:

  • scaleFactor (1.1 here): how much the image size is reduced at each image scale
  • minNeighbors (2 here): how many neighbors each candidate rectangle must have to be retained
  • flags: here CV_HAAR_SCALE_IMAGE, which tells the algorithm to scale the image rather than the detector
  • minSize (30x30 here): the minimum possible object size; smaller objects are ignored

    t = (double)cvGetTickCount();
    cascade.detectMultiScale( smallImg, faces,
        1.1, 2, 0
        //|CV_HAAR_FIND_BIGGEST_OBJECT
        //|CV_HAAR_DO_ROUGH_SEARCH
        |CV_HAAR_SCALE_IMAGE
        ,
        Size(30, 30) );
    t = (double)cvGetTickCount() - t;
    printf( "detection time = %g ms\n", t/((double)cvGetTickFrequency()*1000.) );
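
The role of minNeighbors is easier to see with a small sketch. A real face usually triggers several overlapping detections at neighboring positions and scales, while a false alarm tends to stand alone, so grouping similar rectangles and dropping small groups filters out noise. The plain C++ below is a simplified illustration with hypothetical names (Box, similarBoxes, groupBoxes); the similarity test and the "at least minNeighbors members" rule are assumptions and may not match OpenCV's exact thresholds.

```cpp
#include <algorithm>
#include <cstdlib>
#include <numeric>
#include <vector>

struct Box {
    int x;
    int y;
    int width;
    int height;
};

// Two boxes are "similar" when all four edges lie within a tolerance that
// is proportional to the box size.
bool similarBoxes(const Box& a, const Box& b, double eps) {
    const double delta = eps * 0.5 *
        (std::min(a.width, b.width) + std::min(a.height, b.height));
    return std::abs(a.x - b.x) <= delta &&
           std::abs(a.y - b.y) <= delta &&
           std::abs(a.x + a.width - b.x - b.width) <= delta &&
           std::abs(a.y + a.height - b.y - b.height) <= delta;
}

// Union-find helper: representative of the cluster that contains i.
static int findRoot(std::vector<int>& parent, int i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];  // path halving
        i = parent[i];
    }
    return i;
}

// Cluster similar boxes and return the average box of every cluster that
// has at least minNeighbors members (a simplification, see above).
std::vector<Box> groupBoxes(const std::vector<Box>& boxes,
                            int minNeighbors, double eps = 0.2) {
    const int n = static_cast<int>(boxes.size());
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);

    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            if (similarBoxes(boxes[i], boxes[j], eps)) {
                const int ri = findRoot(parent, i);
                const int rj = findRoot(parent, j);
                parent[ri] = rj;
            }
        }
    }

    std::vector<Box> sum(n, Box{0, 0, 0, 0});
    std::vector<int> count(n, 0);
    for (int i = 0; i < n; ++i) {
        const int r = findRoot(parent, i);
        sum[r].x += boxes[i].x;
        sum[r].y += boxes[i].y;
        sum[r].width += boxes[i].width;
        sum[r].height += boxes[i].height;
        ++count[r];
    }

    std::vector<Box> result;
    for (int r = 0; r < n; ++r) {
        if (count[r] > 0 && count[r] >= minNeighbors) {
            result.push_back(Box{sum[r].x / count[r], sum[r].y / count[r],
                                 sum[r].width / count[r], sum[r].height / count[r]});
        }
    }
    return result;
}
```

With three nearly identical candidates and one isolated candidate, a threshold of 2 keeps only the averaged rectangle of the group of three, whereas a threshold of 1 also keeps the isolated one. Raising the value trades missed faces for fewer false detections.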

From here on, everything is pretty straightforward: we loop through the detected faces and draw a circle at each face location.

    for( vector<Rect>::const_iterator r = faces.begin(); r != faces.end(); r++, i++ )
    {
        Mat smallImgROI;
        vector<Rect> nestedObjects;
        Point center;
        Scalar color = colors[i%8];
        int radius;
        center.x = cvRound((r->x + r->width*0.5)*scale);
        center.y = cvRound((r->y + r->height*0.5)*scale);
        radius = cvRound((r->width + r->height)*0.25*scale);
        circle( img, center, radius, color, 3, 8, 0 );

Here we check whether a second classifier has been loaded. In our case, facedetect.c loads a second classifier, which has been trained to detect eyes, including eyes behind eyeglasses. So we repeat the process explained above, but instead of looking for faces in the whole input image, we look for eyes only in the regions of the image that were classified as faces.

        if( nestedCascade.empty() )
            continue;
        smallImgROI = smallImg(*r);
        nestedCascade.detectMultiScale( smallImgROI, nestedObjects,
            1.1, 2, 0
            //|CV_HAAR_FIND_BIGGEST_OBJECT
            //|CV_HAAR_DO_ROUGH_SEARCH
            //|CV_HAAR_DO_CANNY_PRUNING
            |CV_HAAR_SCALE_IMAGE
            ,
            Size(30, 30) );
        for( vector<Rect>::const_iterator nr = nestedObjects.begin(); nr != nestedObjects.end(); nr++ )
        {
            center.x = cvRound((r->x + nr->x + nr->width*0.5)*scale);
            center.y = cvRound((r->y + nr->y + nr->height*0.5)*scale);
            radius = cvRound((nr->width + nr->height)*0.25*scale);
            circle( img, center, radius, color, 3, 8, 0 );
        }
    }  
    cv::imshow( "result", img );    
}
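
The coordinate arithmetic in the two loops above can be isolated in a small sketch. Detection runs on the shrunken image, and the nested detection runs on a sub-region of it, so each result has to be shifted by the face's offset and multiplied by scale before it is drawn on the original image. Region, Circle, faceCircle and nestedCircle are hypothetical names, and std::lround stands in for cvRound, which may round exact halves differently.

```cpp
#include <cmath>

struct Region {
    int x;
    int y;
    int width;
    int height;
};

struct Circle {
    int centerX;
    int centerY;
    int radius;
};

// Circle around a face found in the shrunken image, expressed in the
// coordinates of the original image.
Circle faceCircle(const Region& face, double scale) {
    Circle c;
    c.centerX = static_cast<int>(std::lround((face.x + face.width * 0.5) * scale));
    c.centerY = static_cast<int>(std::lround((face.y + face.height * 0.5) * scale));
    c.radius = static_cast<int>(std::lround((face.width + face.height) * 0.25 * scale));
    return c;
}

// Circle around an object found inside the face region. The nested
// rectangle is relative to the face's top-left corner, so that offset is
// added before scaling.
Circle nestedCircle(const Region& face, const Region& nested, double scale) {
    Circle c;
    c.centerX = static_cast<int>(
        std::lround((face.x + nested.x + nested.width * 0.5) * scale));
    c.centerY = static_cast<int>(
        std::lround((face.y + nested.y + nested.height * 0.5) * scale));
    c.radius = static_cast<int>(
        std::lround((nested.width + nested.height) * 0.25 * scale));
    return c;
}
```

For example, with scale 2.0 a face at (40, 30) of size 60x60 in the small image gives a circle centered at (140, 120) with radius 60 in the original, and an eye at (10, 15) of size 20x20 inside that face gives a circle centered at (120, 110) with radius 20.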