face — Face Detection#
This module provides a few utilities for face detection on image files, with the main purpose of cropping the images provided by presenters into square headshots that can be displayed on the system screens.
More specifically, crop_face() is the main function of the module: it wraps everything into a common interface and processes a given image file into the corresponding cropped headshot. Internally, the process happens in two steps:
- first, the actual face detection is performed, leveraging the facilities provided by the opencv library; the actual opencv calls are wrapped in the thin layer run_face_detection(), which unifies the interface across different face-detection models and transforms the output into a list of Box objects;
- then, the best candidate box is selected and post-processed into the final square to be used for cropping, and the actual headshot is generated and saved to the output file.
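The selection step is not spelled out above; as a minimal sketch, it might look like the following (the ranking rule is an assumption, and the Box dataclass is only a stand-in mirroring the fields documented later on this page):

```python
from dataclasses import dataclass

@dataclass
class Box:
    """Minimal stand-in for easely.face.Box, with the fields documented below."""
    x0: int
    y0: int
    width: int
    height: int
    fractional_area: float
    score: float = 1.0

def select_best(boxes):
    # Assumed ranking: highest detection score first, larger fractional
    # area as a tie-breaker (the actual rule may differ).
    return max(boxes, key=lambda b: (b.score, b.fractional_area))
```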
Available models#
The module currently supports two different face-detection models, which can be
selected via the model parameter of crop_face() (or, equivalently, via the same option in the main command-line interface):
- CASCADE, which is the traditional Haar-cascade-based face-detection model provided by opencv; this is the default model, and it is very fast (see https://docs.opencv.org/4.x/db/d28/tutorial_cascade_classifier.html for more information);
- YUNET, which is a more modern face-detection model based on a convolutional neural network; this is somewhat slower, but performs better in some cases (see https://github.com/opencv/opencv_zoo/tree/main/models/face_detection_yunet for more information).
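Since FaceDetection is a plain Enum (see the module documentation below), a model can be selected from its string value, e.g. when parsing a command-line option:

```python
from enum import Enum

class FaceDetection(Enum):
    """Mirrors the Enum documented below."""
    CASCADE = "cascade"
    YUNET = "yunet"

# Look up a model from the string value of a command-line option.
model = FaceDetection("yunet")
```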
If you process lots of images in batch, you will realize that there isn't a single setup that handles all the edge cases without manual intervention. (Especially when a presenter sends you a close-up picture taken in Antarctica, from very far away, with glasses and helmet on, and a polar bear in the background.)
In cases where things do not quite work out as expected, your best bet is to switch between the two models and/or change the min_fractional_area parameter, which controls the minimum size of the box containing the detected face. You can run the tool in interactive mode (with the --interactive flag) to get some insight into the face-detection process and better understand what is going wrong.
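For batch processing, a simple fallback loop over the two models is one way to cut down on manual intervention. This is a hypothetical helper, not part of the module, with detect standing in for run_face_detection():

```python
def detect_with_fallback(detect, file_path, models, min_fractional_area=0.02):
    """Try each model in turn and return the first non-empty candidate list.

    `detect` stands in for run_face_detection(file_path, model, ...)."""
    for model in models:
        boxes = detect(file_path, model, min_fractional_area=min_fractional_area)
        if boxes:
            return model, boxes
    # No model found anything: time to reach for --interactive.
    return None, []
```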
Warning
Each face-detection model has its own parameters, but these are not yet exposed through the public API. We could implement that in the future, if it turns out that this would be useful.
Module documentation#
Face-detection and cropping facilities.
- class easely.face.FaceDetection(*values)[source]#
Small Enum class with the available face-detection models.
- CASCADE = 'cascade'#
- YUNET = 'yunet'#
- easely.face._read_image(file_path: str | Path) ndarray[source]#
Run cv2.imread() on a given file path.
This is a generic helper function for all the opencv face-detection algorithms.
Arguments#
- file_pathPathLike
The path to the image file.
Returns#
- np.ndarray
The image as a NumPy array.
- class easely.face.Box(x0: int, y0: int, width: int, height: int, fractional_area: float, score: float = 1.0)[source]#
Wrapper around the Rectangle class, representing a bounding box from face detection.
In addition to the basic rectangle properties, this container keeps track of all the stuff that we need in order to sort the face-detection candidates and select the best one (e.g., the fractional area within the original image, and any score metrics from the face-detection algorithm itself).
Arguments#
- x0int
The x coordinate of the top-left corner of the rectangle.
- y0int
The y coordinate of the top-left corner of the rectangle.
- widthint
The width of the rectangle.
- heightint
The height of the rectangle.
- fractional_areafloat
The area of the rectangle as a fraction of the original image area.
- scorefloat
The confidence score of the face detection, if available (1.0 if not).
- fractional_area: float#
- score: float = 1.0#
- classmethod from_cascade(data: Tuple[float, float, float, float], original_area: int) Box[source]#
Create a Box object from the output of the cascade face-detection model.
The cascade model returns rectangles in the form of (x, y, width, height) tuples, and this method is meant to convert them into Box objects, by calculating the corresponding fractional area and setting the score to 1.0 (since the cascade model does not provide a confidence score).
Arguments#
- datatuple
The output of the cascade face-detection model.
- original_areaint
The area of the original image in pixels.
Returns#
- Box
A Box object corresponding to the given cascade output.
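Under the stated contract, from_cascade() amounts to the following sketch (the dataclass is a cut-down stand-in for the real Box, and the int conversion is an assumption):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Box:
    x0: int
    y0: int
    width: int
    height: int
    fractional_area: float
    score: float = 1.0

    @classmethod
    def from_cascade(cls, data: Tuple[float, float, float, float], original_area: int) -> "Box":
        # detectMultiScale returns (x, y, width, height); the cascade model
        # provides no confidence score, so it defaults to 1.0.
        x, y, w, h = (int(value) for value in data)
        return cls(x, y, w, h, w * h / original_area)
```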
- classmethod from_yunet(data: ndarray, original_area: int) Box[source]#
Create a Box object from the output of the YuNet face-detection model.
The YuNet model returns rectangles in the form of (x, y, width, height, score) tuples, and this method is meant to convert them into Box objects, by calculating the corresponding fractional area and setting the score to the value provided by the model.
Arguments#
- datanp.ndarray
The output of the YuNet face-detection model.
- original_areaint
The area of the original image in pixels.
Returns#
- Box
A Box object corresponding to the given YuNet output.
- easely.face.run_cascade(file_path: str | Path, min_fractional_area: float = 0.02, scale_factor: float = 1.1, min_neighbors: int = 2) List[Box][source]#
Minimal wrapper around the standard opencv face detection; see, e.g., https://www.datacamp.com/tutorial/face-detection-python-opencv
Internally this creates a cv2.CascadeClassifier object based on a suitable model file for face detection, and runs a detectMultiScale call with the proper parameters. The output rectangles containing the candidate faces, which opencv returns as simple (x, y, width, height) tuples, are converted into Box objects, and the list of boxes is sorted by area from smallest to largest to help with the selection process downstream.
Note that this produces squares (since, apparently, this is the way the default model we are using was trained) that are only big enough to cover the visible part of the face; if you use this to crop a large image down to a person's face, you will very likely want to add some padding on the four sides, and especially on the top, which empirically seems to be the most overlooked part of the face.
The min_neighbors parameter has an important effect on the results and should be set on a case-by-case basis. The cascade classifier applies a sliding window to the image, and initially captures a large number of false positives. This parameter specifies the number of neighboring rectangles that need to be identified for an object to be considered a valid detection: a value of 0 makes no sense, and will likely return an enormous number of (possibly overlapping) rectangles; small values will yield comparatively more false positives. I would say 2 is the absolute minimum one might consider using, and something around 5 is closer to what is commonly found in tutorials online.
Parameters#
- file_pathPathLike
The path to input image file.
- min_fractional_areafloat
The minimum area of the output rectangle as a fraction of the original image area. Objects smaller than that are ignored. This is converted internally to an actual size in pixels and passed as the minSize parameter to the detectMultiScale call.
- scale_factorfloat
Parameter specifying how much the image size is reduced at each image scale (passed along verbatim as scaleFactor to the detectMultiScale call).
- min_neighborsint
Parameter specifying how many neighbors each candidate rectangle should have to retain it (passed along verbatim as minNeighbors to the detectMultiScale call).
Returns#
- list[Box]
The list of Box objects containing the face candidates.
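The conversion from min_fractional_area to the minSize pixel argument is not spelled out above; one plausible implementation (an assumption, not necessarily what the module does) is the side of the square whose area is the requested fraction of the image area:

```python
import math

def min_size_pixels(image_width, image_height, min_fractional_area):
    """Assumed conversion from a fractional area to the (width, height)
    tuple passed to detectMultiScale as minSize."""
    side = int(math.sqrt(image_width * image_height * min_fractional_area))
    return (side, side)
```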
- easely.face.run_yunet(file_path: str | Path, score_threshold: float = 0.7, nms_threshold: float = 0.3, top_k: int = 5000) List[Box][source]#
Run the YuNet face detection model.
The YuNet output is a numpy array where each candidate face is represented by a 15-element vector:
0-1: x, y of bbox top left corner
2-3: width, height of bbox
4-5: x, y of right eye (blue point in the example image)
6-7: x, y of left eye (red point in the example image)
8-9: x, y of nose tip (green point in the example image)
10-11: x, y of right corner of mouth (pink point in the example image)
12-13: x, y of left corner of mouth (yellow point in the example image)
14: face score
We are basically interested in the first four values for the bounding box, and the last one for the confidence score.
See https://docs.opencv.org/4.x/df/d20/classcv_1_1FaceDetectorYN.html for more information. You can retrieve the model file from https://github.com/opencv/opencv_zoo/tree/main/models/face_detection_yunet
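Given that layout, extracting what run_yunet() needs from a single detection row can be sketched as follows (a hypothetical helper, for illustration only):

```python
def parse_yunet_row(row, original_area):
    """Pull the bounding box and confidence score out of one 15-element
    YuNet detection vector, per the layout documented above; the ten
    landmark values (indices 4-13) are ignored here."""
    x, y, w, h = (int(value) for value in row[:4])
    score = float(row[14])
    return x, y, w, h, w * h / original_area, score
```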
Arguments#
- file_pathPathLike
The path to the input image file.
- score_thresholdfloat
The confidence score threshold for the face detection (0–1). This simply filters out all the candidates below a given threshold.
- nms_thresholdfloat
The non-maximum suppression threshold for the face detection. This is meant to remove duplicate overlapping boxes for the same face, which the model commonly predicts. Lower values (e.g., 0.3) imply a more aggressive removal, with fewer duplicates, while higher values (e.g., 0.7) keep more boxes, at the risk of duplicates.
- top_kint
The maximum number of candidates to consider before non-maximum suppression. The model may produce thousands of raw detections, and top_k keeps only the best K by score before suppression. Lower values result in shorter running times, but you might miss faces in crowded scenes; higher values are safer, but slower on average.
Returns#
- list[Box]
The list of Box objects containing the face candidates.
- easely.face.run_face_detection(file_path: str | Path, model: FaceDetection, min_fractional_area: float = 0.02, **kwargs) List[Box][source]#
Run the face detection on the input image, with the specified model and parameters.
This is designed to wrap the actual face-detection algorithms implemented in opencv and provide a single, unified interface to be used by the rest of the codebase. We assume that any worker function is wrapped to return a list of Box objects, which we then filter here to eliminate candidates with a fractional area smaller than the specified threshold, and sort based on the overall quality.
Arguments#
- file_pathPathLike
The path to input image file.
- modelFaceDetection
The face-detection model to use. This is an Enum with the available models, and it is meant to be extended in the future as we add more models.
- min_fractional_areafloat
The minimum area of the detected face bounding box as a fraction of the original image area. Objects smaller than that are ignored.
- kwargs
Optional keyword arguments to be passed to the actual face-detection function, depending on the model. See the documentation of the specific functions for details on what parameters are accepted.
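The filtering and sorting described above can be sketched as follows (the namedtuple is a stand-in for Box, and the exact ranking used for "overall quality" is an assumption):

```python
from collections import namedtuple

# Minimal stand-in for easely.face.Box, with just the fields needed here.
Box = namedtuple("Box", "x0 y0 width height fractional_area score")

def filter_and_sort(boxes, min_fractional_area=0.02):
    """Drop candidates below the fractional-area threshold, then sort by
    quality (assumed here: descending score, fractional area as tie-breaker)."""
    kept = [box for box in boxes if box.fractional_area >= min_fractional_area]
    return sorted(kept, key=lambda box: (box.score, box.fractional_area), reverse=True)
```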
- easely.face.refine_rectangle(rectangle: Rectangle, image_width: int, image_height: int, horizontal_padding: float = 0.5, top_scale_factor: float = 1.25) Rectangle[source]#
Massage a given rectangle to make it suitable for cropping a face out of an image.
This is used to transform the candidate rectangle containing the face returned by opencv into a proper bounding box to be cropped off the original image, which in general we would like to be significantly larger than the face-detection output. The process takes place in two steps: first we pad the original rectangle based on the input parameters, and then we make the modifications necessary, if any, for the final rectangle to fit within the original image. The rule of thumb is that if the overall dimensions of the padded rectangle fit in the original image, we keep its width and height and apply the smallest possible shift to the origin so that the cropping area does not extend outside the image. When the padded rectangle is too big for the original image, instead, we resort to the largest square that can be embedded in the image itself, approximately centered on the initial rectangle. (The comments in the code might give the reader a firmer grasp of what is actually happening behind the scenes.)
Parameters#
- rectangleRectangle
The original rectangle returned by the face-detection stage.
- image_widthint
The width of the original image.
- image_heightint
The height of the original image.
- horizontal_paddingfloat
The horizontal padding, on either side, in units of the equivalent square side of the rectangle.
- top_scale_factorfloat
The ratio between the pad on the top and that on the right/left.
Returns#
- Rectangle
A new Rectangle object, ready for cropping.
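The padding step described above can be sketched as follows (only the padding; the shifting/clamping logic is omitted, and making the bottom pad equal to the side pad is an assumption):

```python
import math

def pad_rectangle(x0, y0, width, height, horizontal_padding=0.5, top_scale_factor=1.25):
    """Pad a face-detection rectangle; the pad unit is the side of the
    square with the same area as the rectangle, and the top pad is
    top_scale_factor times the left/right pad."""
    side = math.sqrt(width * height)   # equivalent square side
    pad = horizontal_padding * side    # left, right (and, assumed, bottom)
    top_pad = top_scale_factor * pad   # extra room above the face
    return (x0 - pad, y0 - top_pad, width + 2 * pad, height + pad + top_pad)
```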
- easely.face.crop_face(file_path: str | Path, output_file_path: str | Path, size: int, circular_mask: bool = False, model: FaceDetection = FaceDetection.CASCADE, min_fractional_area: float = 0.02, detect_kwargs: dict = None, horizontal_padding: float = 0.5, top_scale_factor: float = 1.25, interactive: bool = False, overwrite: bool = False) str | Path[source]#
Produce a square, cropped version of the input image, suitable for use as a headshot (i.e., cropped around the face of the person in the image).
This runs a simple face detection based on opencv, and then adapts the best candidate bounding box to produce the final square cropping area.
Arguments#
- file_pathPathLike
The path to the input image file.
- output_file_pathPathLike
The path where the cropped image will be saved.
- sizeint
The size of the output image (square).
- circular_maskbool, optional
Whether to apply a circular mask to the output image.
- modelFaceDetection
The face-detection model to use.
- min_fractional_areafloat
The minimum area of the detected face bounding box as a fraction of the original image area. Objects smaller than that are ignored.
- detect_kwargsdict, optional
Optional keyword arguments to be passed to the face detection function.
- horizontal_paddingfloat, optional
The amount of horizontal padding to be applied to the detected face bounding box.
- top_scale_factorfloat, optional
The scale factor for the top padding relative to the horizontal padding.
- interactivebool, optional
Whether to display the image with bounding boxes for debugging.
- overwritebool, optional
Whether to overwrite the output file if it already exists.
Returns#
- PathLike
The path to the cropped image file, if it was actually created/overwritten, or None otherwise.
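The return contract — the output path on success, None if nothing was written — suggests a guard on the overwrite flag; here is a sketch of that logic (an assumption about the implementation, not the actual code):

```python
from pathlib import Path

def should_write(output_file_path, overwrite):
    """Return the output path if crop_face() would (re)write the file,
    None if an existing file is left alone because overwrite is False."""
    path = Path(output_file_path)
    if path.exists() and not overwrite:
        return None
    return path
```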