face — Face Detection#

This module provides a few utilities for doing face detection on image files. Its main purpose is to crop the images that the presenter provides, turning them into square headshots that can be displayed on the system screens.

More specifically, crop_face() is the main entry point of the module: it wraps everything into a common interface and processes a given image file into the corresponding cropped headshot. Internally, the process happens in two steps:

  • first, the actual face detection is performed, leveraging the facilities provided by the opencv library; note that the actual opencv calls are wrapped in the thin layer run_face_detection(), which unifies the interface across different face-detection models and transforms the output into a list of Box objects;

  • then the best candidate box is selected and post-processed into the final square to be used for cropping, and the actual headshot is generated and saved to the output file.

Available models#

The module currently supports two different face-detection models (a cascade classifier and YuNet), which can be selected via the model parameter of crop_face() or, equivalently, via the same option of the main command-line interface.

If you process lots of images in batch, you will quickly find that no single setup handles all the edge cases without manual intervention. (Especially when a presenter sends you a close-up picture taken in Antarctica, from very far away, with glasses and helmet on, and a polar bear in the background.)

When things do not quite work out as expected, your best bet is to switch between the two models and/or change the min_fractional_area parameter, which controls the minimum size of the box containing the detected face. You can also run the program in interactive mode (with the --interactive flag) to get some insight into the face-detection process and understand better what is going wrong.

Warning

Each face-detection model has its own parameters, but these are not yet exposed through the public API. This might be implemented in the future, if it turns out to be useful.

Module documentation#

Face-detection and cropping facilities.

class easely.face.FaceDetection(*values)[source]#

Small Enum class with the available face-detection models.

CASCADE = 'cascade'#
YUNET = 'yunet'#
easely.face._read_image(file_path: str | Path) → ndarray[source]#

Run cv2.imread() on a given file path.

This is a generic helper function for all the opencv face-detection algorithms.

Arguments#

file_path : PathLike

The path to the image file.

Returns#

np.ndarray

The image as a NumPy array.

class easely.face.Box(x0: int, y0: int, width: int, height: int, fractional_area: float, score: float = 1.0)[source]#

Wrapper around the Rectangle class, representing a bounding box from face detection.

In addition to the basic rectangle properties, this container keeps track of all the stuff that we need in order to sort the face-detection candidates and select the best one (e.g., the fractional area within the original image, and any score metrics from the face-detection algorithm itself).

Arguments#

x0 : int

The x coordinate of the top-left corner of the rectangle.

y0 : int

The y coordinate of the top-left corner of the rectangle.

width : int

The width of the rectangle.

height : int

The height of the rectangle.

fractional_area : float

The area of the rectangle as a fraction of the original image area.

score : float

The confidence score of the face detection, if available (1.0 if not).

fractional_area: float#
score: float = 1.0#
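A container along these lines can be sketched as a small dataclass (a hypothetical reimplementation for illustration; from_detection is a made-up stand-in for the per-model constructors documented below):

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Candidate face bounding box from a detection run."""

    x0: int
    y0: int
    width: int
    height: int
    fractional_area: float
    score: float = 1.0

    @classmethod
    def from_detection(cls, x0, y0, width, height, original_area, score=1.0):
        # The fractional area lets us filter out implausibly small faces
        # independently of the absolute image resolution.
        fractional_area = width * height / original_area
        return cls(x0, y0, width, height, fractional_area, score)


# A 100x100 face box in a 1000x1000 image covers 1% of the area.
box = Box.from_detection(10, 20, 100, 100, original_area=1_000_000)
```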
classmethod from_cascade(data: Tuple[float, float, float, float], original_area: int) → Box[source]#

Create a Box object from the output of the cascade face-detection model.

The cascade model returns rectangles in the form of (x, y, width, height) tuples, and this method is meant to convert them into Box objects, by calculating the corresponding fractional area and setting the score to 1.0 (since the cascade model does not provide a confidence score).

Arguments#

data : tuple

The output of the cascade face-detection model.

original_area : int

The area of the original image in pixels.

Returns#

Box

A Box object corresponding to the given cascade output.

classmethod from_yunet(data: ndarray, original_area: int) → Box[source]#

Create a Box object from the output of the YuNet face-detection model.

The YuNet model returns rectangles in the form of (x, y, width, height, score) tuples, and this method is meant to convert them into Box objects, by calculating the corresponding fractional area and setting the score to the value provided by the model.

Arguments#

data : np.ndarray

The output of the YuNet face-detection model.

original_area : int

The area of the original image in pixels.

Returns#

Box

A Box object corresponding to the given YuNet output.

quality() → float[source]#

Empirical quality factor for sorting the candidate face-detection boxes.
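The exact metric is empirical; one plausible form (an assumption for illustration, not necessarily the module's actual formula) is to weight the detection score by the fractional area, so that large, confident boxes sort ahead of small, marginal ones:

```python
def quality(fractional_area: float, score: float) -> float:
    # Hypothetical quality factor: a confident detection covering a
    # larger fraction of the image beats a small, marginal one.
    return score * fractional_area


candidates = [
    {"fractional_area": 0.01, "score": 0.9},
    {"fractional_area": 0.20, "score": 0.8},
]
best = max(candidates, key=lambda c: quality(c["fractional_area"], c["score"]))
```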

easely.face.run_cascade(file_path: str | Path, min_fractional_area: float = 0.02, scale_factor: float = 1.1, min_neighbors: int = 2) → List[Box][source]#

Minimal wrapper around the standard opencv face detection; see, e.g., https://www.datacamp.com/tutorial/face-detection-python-opencv

Internally this creates a cv2.CascadeClassifier object based on a suitable model file for face detection, and runs a detectMultiScale call with the proper parameters. The output rectangles containing the candidate faces, which opencv returns as simple (x, y, width, height) tuples, are converted into Box objects, and the list of boxes is sorted by area from the smallest to the largest to help with the selection process downstream.

Note that this produces squares (apparently the way the default model we are using was trained) that are only big enough to cover the visible part of the face. If you use this to crop a large image to the person's face, you will very likely want to add some padding on all four sides, and especially on the top, which empirically seems to be the most overlooked part of the face.

The min_neighbors parameter has an important effect on the results and should be set on a case-by-case basis. The cascade classifier slides a window across the image, and initially it will capture a large number of false positives. This parameter specifies the number of neighboring rectangles that need to be identified for an object to be considered a valid detection: a value of 0 disables the filter altogether and will likely return an enormous number of (possibly overlapping) rectangles, and small values will yield comparatively more false positives. I would say 2 is the absolute minimum one might consider using, and something around 5 is closer to what is commonly found in tutorials online.

Parameters#

file_path : PathLike

The path to the input image file.

min_fractional_area : float

The minimum area of the output rectangle as a fraction of the original image area. Objects smaller than that are ignored. This is converted internally to an actual size in pixels and passed as the minSize parameter to the detectMultiScale call.

scale_factor : float

Parameter specifying how much the image size is reduced at each image scale (passed along verbatim as scaleFactor to the detectMultiScale call).

min_neighbors : int

Parameter specifying how many neighbors each candidate rectangle should have to retain it (passed along verbatim as minNeighbors to the detectMultiScale call).

Returns#

list[Box]

The list of Box objects containing the face candidates.

easely.face.run_yunet(file_path: str | Path, score_threshold: float = 0.7, nms_threshold: float = 0.3, top_k: int = 5000) → List[Box][source]#

Run the YuNet face detection model.

The YuNet output is a numpy array where each candidate face is represented by a 15-element vector:

  • 0-1: x, y of bbox top left corner

  • 2-3: width, height of bbox

  • 4-5: x, y of right eye

  • 6-7: x, y of left eye

  • 8-9: x, y of nose tip

  • 10-11: x, y of right corner of mouth

  • 12-13: x, y of left corner of mouth

  • 14: face score

We are basically interested in the first four values for the bounding box, and the last one for the confidence score.

See https://docs.opencv.org/4.x/df/d20/classcv_1_1FaceDetectorYN.html for more information. You can retrieve the model file from https://github.com/opencv/opencv_zoo/tree/main/models/face_detection_yunet
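Given a raw 15-element detection vector of this form, pulling out the bounding box and the score reduces to simple slicing (a sketch; the landmark fields are simply ignored here):

```python
import numpy as np


def parse_yunet_row(row: np.ndarray) -> tuple:
    """Split a 15-element YuNet detection into (x, y, w, h, score)."""
    x, y, w, h = (int(v) for v in row[:4])
    score = float(row[14])  # elements 4-13 are the five facial landmarks
    return x, y, w, h, score


# A synthetic detection row: bbox (10, 20, 100, 120), score 0.95.
row = np.array([10., 20., 100., 120.] + [0.] * 10 + [0.95])
x, y, w, h, score = parse_yunet_row(row)
```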

Arguments#

file_path : PathLike

The path to the input image file.

score_threshold : float

The confidence score threshold for the face detection (0–1). This simply filters out all the candidates below the given threshold.

nms_threshold : float

The non-maximum suppression threshold for the face detection. This is meant to remove duplicate overlapping boxes for the same face, which are commonly predicted by the model. Lower values (e.g. 0.3) imply a more aggressive removal, with fewer duplicates, while higher values (e.g. 0.7) keep more boxes, at the risk of duplicates.

top_k : int

The maximum number of candidates to consider before non-maximum suppression. The model may produce thousands of raw detections, and top_k keeps only the best K by score before suppression. Lower values result in shorter running times, but you might miss faces in crowded scenes; higher is safer, but slower on average.

Returns#

list[Box]

The list of Box objects containing the face candidates.
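The non-maximum suppression controlled by nms_threshold can be illustrated with a small greedy, IoU-based implementation (a generic sketch of the technique; YuNet performs this step internally):

```python
def iou(a: tuple, b: tuple) -> float:
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax0, ay0, aw, ah = a
    bx0, by0, bw, bh = b
    ix0, iy0 = max(ax0, bx0), max(ay0, by0)
    ix1, iy1 = min(ax0 + aw, bx0 + bw), min(ay0 + ah, by0 + bh)
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def nms(boxes: list, scores: list, threshold: float = 0.3) -> list:
    """Greedy NMS: keep the best-scoring box, drop the ones overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= threshold for j in kept):
            kept.append(i)
    return kept


# The first two boxes heavily overlap: the lower-scored one is dropped.
boxes = [(0, 0, 100, 100), (10, 10, 100, 100), (300, 300, 50, 50)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, threshold=0.3)
```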

easely.face.run_face_detection(file_path: str | Path, model: FaceDetection, min_fractional_area: float = 0.02, **kwargs) → List[Box][source]#

Run the face detection on the input image, with the specified model and parameters.

This is designed to wrap the actual face-detection algorithms implemented in opencv and provide a single, unified interface to be used by the rest of the codebase. We assume that any worker function is wrapped to return a list of Box objects, which we then filter here to eliminate candidates with a fractional area smaller than the specified threshold, and sort based on the overall quality.
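The dispatch-filter-sort logic described here might be sketched as follows, with the detector functions injected as plain callables (hypothetical stand-ins for run_cascade() and run_yunet(), using dicts in place of Box objects):

```python
from enum import Enum


class FaceDetection(Enum):
    CASCADE = "cascade"
    YUNET = "yunet"


def run_face_detection(file_path, model, detectors, quality,
                       min_fractional_area=0.02, **kwargs):
    """Dispatch to the chosen detector, then filter and sort candidates.

    Each detector is a callable returning a list of dicts with at least
    a 'fractional_area' key; `quality` maps a candidate to a sort key.
    """
    boxes = detectors[model](file_path, **kwargs)
    # Drop candidates that are too small relative to the image.
    boxes = [b for b in boxes if b["fractional_area"] >= min_fractional_area]
    # Best candidate first.
    return sorted(boxes, key=quality, reverse=True)


# A fake detector, for illustration only.
detectors = {FaceDetection.CASCADE: lambda path: [
    {"fractional_area": 0.001, "score": 0.9},   # too small: filtered out
    {"fractional_area": 0.10, "score": 0.8},
    {"fractional_area": 0.05, "score": 0.99},
]}
result = run_face_detection(
    "dummy.jpg", FaceDetection.CASCADE, detectors,
    quality=lambda b: b["fractional_area"] * b["score"])
```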

Arguments#

file_path : PathLike

The path to the input image file.

model : FaceDetection

The face-detection model to use. This is an Enum with the available models, and it is meant to be extended in the future as we add more models.

min_fractional_area : float

The minimum area of the detected face bounding box as a fraction of the original image area. Objects smaller than that are ignored.

kwargs

Optional keyword arguments to be passed to the actual face-detection function, depending on the model. See the documentation of the specific functions for details on what parameters are accepted.

easely.face.refine_rectangle(rectangle: Rectangle, image_width: int, image_height: int, horizontal_padding: float = 0.5, top_scale_factor: float = 1.25) → Rectangle[source]#

Massage a given rectangle to make it suitable for cropping a face out of an image.

This is used to transform the candidate rectangle containing the face returned by opencv into a proper bounding box to be cropped off the original image, which in general we would like to be significantly larger than the face-detection output. The process takes place in two steps: first we pad the original rectangle based on the input parameters, and then we make the necessary modifications, if any, for the final rectangle to fit within the original image.

The rule of thumb is: if the overall dimensions of the padded rectangle fit in the original image, we keep its width and height and apply the smallest possible shift to the origin so that the cropping area does not extend outside the image. When the padded rectangle is too big for the original image, instead, we resort to the largest square that can be embedded in the image itself, approximately centered on the initial rectangle. (The comments in the code might give a firmer grasp of what is actually happening behind the scenes.)
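The two-step pad-then-fit logic might be sketched as follows, using plain (x0, y0, width, height) tuples instead of Rectangle objects (a hypothetical reimplementation under the rule of thumb stated above, not the module's actual code):

```python
def refine_rectangle(x0, y0, width, height, image_width, image_height,
                     horizontal_padding=0.5, top_scale_factor=1.25):
    """Pad a face box into a square crop area that fits within the image."""
    # Step 1: pad, using the equivalent square side as the padding unit,
    # with extra room on top (faces tend to get clipped there).
    side = int((width * height) ** 0.5)
    pad = int(horizontal_padding * side)
    top_pad = int(top_scale_factor * pad)
    new_x0 = x0 - pad
    new_y0 = y0 - top_pad
    new_w = width + 2 * pad
    new_h = height + pad + top_pad
    # Force a square crop area.
    new_side = max(new_w, new_h)
    # Step 2: make the square fit within the image.
    if new_side <= min(image_width, image_height):
        # Keep the size, shift the origin back inside the image.
        new_x0 = min(max(new_x0, 0), image_width - new_side)
        new_y0 = min(max(new_y0, 0), image_height - new_side)
    else:
        # Too big: fall back to the largest square embeddable in the
        # image, roughly centered on the padded rectangle.
        new_side = min(image_width, image_height)
        cx = new_x0 + new_w // 2
        cy = new_y0 + new_h // 2
        new_x0 = min(max(cx - new_side // 2, 0), image_width - new_side)
        new_y0 = min(max(cy - new_side // 2, 0), image_height - new_side)
    return new_x0, new_y0, new_side, new_side


# A 100x100 face in a 1000x800 image: the padded square fits as-is.
crop = refine_rectangle(400, 300, 100, 100, 1000, 800)
```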

Parameters#

rectangle : Rectangle

The original rectangle returned by the face-detection stage.

image_width : int

The width of the original image.

image_height : int

The height of the original image.

horizontal_padding : float

The horizontal padding, on either side, in units of the equivalent square side of the rectangle.

top_scale_factor : float

The ratio between the pad on the top and that on the right/left.

Returns#

Rectangle

A new Rectangle object, ready for cropping.

easely.face.crop_face(file_path: str | Path, output_file_path: str | Path, size: int, circular_mask: bool = False, model: FaceDetection = FaceDetection.CASCADE, min_fractional_area: float = 0.02, detect_kwargs: dict = None, horizontal_padding: float = 0.5, top_scale_factor: float = 1.25, interactive: bool = False, overwrite: bool = False) → str | Path[source]#

Produce a square, cropped version of the input image, suitable for use as a headshot (i.e., cropped around the face of the person in the image).

This runs a simple face detection based on opencv, and then adapts the best candidate bounding box to produce the final square cropping area.

Arguments#

file_path : PathLike

The path to the input image file.

output_file_path : PathLike

The path where the cropped image will be saved.

size : int

The size (side, in pixels) of the square output image.

circular_mask : bool, optional

Whether to apply a circular mask to the output image.

model : FaceDetection

The face-detection model to use.

min_fractional_area : float

The minimum area of the detected face bounding box as a fraction of the original image area. Objects smaller than that are ignored.

detect_kwargs : dict, optional

Optional keyword arguments to be passed to the face-detection function.

horizontal_padding : float, optional

The amount of horizontal padding to be applied to the detected face bounding box.

top_scale_factor : float, optional

The scale factor for the top padding relative to the horizontal padding.

interactive : bool, optional

Whether to display the image with bounding boxes for debugging.

overwrite : bool, optional

Whether to overwrite the output file if it already exists.

Returns#

PathLike

The path to the cropped image file, if it was actually created/overwritten, or None otherwise.