Application of Facial Attribute Recognition in Different Domains

Akshat Dubey
8 min read · Jul 13, 2021

The global facial recognition market was valued at USD 3.86 billion in 2020 and is expanding at a compound annual growth rate (CAGR) of 15.4% from 2021 to 2028. The use of this technology is growing rapidly. Biometric technologies are widely used to improve security and can be found in a variety of applications, including access control, attendance tracking, and security and surveillance. [Source]

What is facial attribute recognition?

Facial attribute recognition falls under the category of biometrics and is a type of image recognition technology. It has been adopted worldwide by banks, restricted-access facilities, and other organizations for security purposes. The technology involves:

  • A digital camera detects and captures faces.
  • The detected faces are provided as input to a CNN-based neural network.
  • The neural network then extracts the attributes, which are matched against the attributes stored in a secure database.

Utilization of facial attribute recognition in the industry


Many organizations are integrating facial recognition technologies into their day-to-day operations to increase security and efficiency. Here, we list the most common use cases of this technology across different domains.

Facial attribute recognition in different domains

In the field of security and surveillance

Law enforcement agencies are rapidly adopting this technology for enhanced security and surveillance. NtechLab, a Russian company that develops artificial intelligence algorithms, has provided the Moscow police with facial recognition cameras. The police use these cameras to search for suspects via live face detection, and the software automatically notifies them when there is a high-probability match.


Facial recognition techniques are being used by law enforcement authorities to locate missing children and reveal the identities of criminals.


Additionally, airports are increasingly adopting facial recognition technology at security checkpoints, as people are less inclined to commit crimes when they are monitored by security systems. As a result, this technology reduces the likelihood of crime in public places.

In the field of real estate

Mitsui Fudosan Co., Ltd., a Japan-based real-estate company, adopted a facial recognition service from NEC Corporation, an electronics and information technology company, in 2020. NEC offers a smart hospitality service that uses facial recognition to give customers a secure and safe stay at hotels. Mitsui Fudosan uses the service for check-in, entering rooms, and initiating cashless payments.

In the field of retail and e-commerce


In June 2019, Alibaba Group Holding Ltd. and Bestore Co Ltd. collaborated to integrate facial recognition technology into Bestore's shops. Bestore, a snack manufacturer, lets customers pay via a face-scanning tablet developed by Alibaba. Customers who have visited a store before have a face ID created, and on their next visit the shop assistant can serve them based on their past data. The collaboration helped increase sales and deliver a more customer-focused experience.

Importance of facial recognition

In the American market

North America dominated the market in 2020, contributing 37% of total global revenue. This large contribution was driven by the rapid adoption of facial recognition in the domain of security and surveillance. Departments such as homeland security, justice, and defense in the US are boosting the overall growth. One of the most trusted vendors of biometric products in the US market is MorphoTrust, which has developed facial recognition products for federal law enforcement agencies.

In the Asian market

In April 2018, the police in the union territory of Delhi, India, successfully used a facial recognition system to identify children who had been kidnapped or lost. Facial recognition systems are rapidly being adopted by Indian law enforcement agencies. Several countries in the Asia-Pacific region are working to develop electronic identification systems for their citizens. Initiatives like the e-KTP project in Indonesia and the UIDAI project in India are opening up new market penetration potential in the region.

How is deep learning used for facial attribute recognition?

Facial attribute recognition takes an image as input and passes it through a CNN-based architecture that performs feature extraction, allowing the neural network to recognize facial attributes. The images are first converted into arrays, which are rescaled and normalized so that the model generalizes better in the real world. We tried to detect the attributes of the left eye, right eye, nose, and lips.

We have successfully identified facial attributes such as the eyes, nose, lips, and face.

A Convolutional Neural Network (CNN) based model was trained and used to predict the facial attributes. The last layers of the CNN were used for regression, since our task was to predict 10 coordinates of the facial attributes.

Convolutional Neural Network (CNN)

Convolutional Neural Networks are the neural networks that power the majority of image, speech, and audio-signal tasks. A CNN mainly consists of three types of layers:

  • Convolutional Layer
  • Pooling Layer
  • Fully Connected Layer

A convolutional network's initial layer is a convolutional layer. Further convolutional or pooling layers can follow, and the fully connected layer comes last. The CNN becomes more sophisticated with each layer, recognizing larger regions of the image; earlier layers concentrate on basic elements such as colors and edges.

As the image data passes through the CNN layers, the network begins to detect larger components or shapes of the object, eventually identifying the desired object.

A basic CNN model. [Source]
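To make the layer ordering concrete, here is a minimal, illustrative Keras stack (not the exact model we train later; the input shape and filter counts are placeholders):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Convolution layers learn local patterns, pooling layers shrink the feature
# maps, and the fully connected layers at the end combine everything.
toy_cnn = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(96, 96, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10),  # e.g. 10 regression outputs (5 keypoints x 2 coordinates)
])
toy_cnn.summary()
```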

Let’s get started with the code

The notebook is linked here: [Source]. To follow the blog completely, I suggest you fork the notebook.

1. We will first load all the important libraries. For deep learning-related tasks we will be using TensorFlow 2.x, and for image operations we will be using the Pillow (PIL) library.
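The exact import cell lives in the linked notebook; a typical set for this pipeline would look roughly like this:

```python
import os

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image

import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.model_selection import train_test_split

print(tf.__version__)  # expecting a 2.x release
```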

2. Here we define the path to the dataset. For demo purposes we take only a subset of this huge dataset: 10,000 images are selected, and the images are rescaled as well.
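A sketch of this configuration step; the directory name, CSV name, and target size below are placeholders rather than the notebook's actual values:

```python
# Hypothetical paths -- point these at wherever the dataset lives on your machine.
IMAGE_DIR = "data/images"
KEYPOINTS_CSV = "data/facial_keypoints.csv"

NUM_IMAGES = 10_000    # subset of the full dataset used in this post
IMG_SIZE = (96, 96)    # every image is rescaled to this resolution

image_files = sorted(os.listdir(IMAGE_DIR))[:NUM_IMAGES]
print(f"Using {len(image_files)} images resized to {IMG_SIZE}")
```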

3. Now, we will load the key points file which contains the coordinates of the facial attributes.
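Assuming the key points ship as a CSV with one row per image and ten coordinate columns (x and y for the left eye, right eye, nose, and the two mouth corners), loading it could look like:

```python
# One row per image: an image identifier plus 10 coordinate columns (5 keypoints x 2).
keypoints_df = pd.read_csv(KEYPOINTS_CSV)
print(keypoints_df.shape)
print(keypoints_df.head())
```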

4. We need to convert the images into arrays so that they can be fed into the model for training, validation, testing, and prediction. Hence, we convert the images into arrays and divide the pixel values by 255 to normalize them.
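A minimal helper for this step might look as follows (it builds on the paths defined above):

```python
def load_image_as_array(filename):
    """Open an image, resize it, and return a float array scaled to [0, 1]."""
    img = Image.open(os.path.join(IMAGE_DIR, filename)).convert("RGB")
    img = img.resize(IMG_SIZE)
    return np.asarray(img, dtype=np.float32) / 255.0
```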

5. This is an example image from our dataset.

6. Reading all the images stored in the dataset (10000 images) and converting them into arrays.
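Continuing the sketch, stacking the whole subset into one array can be as simple as:

```python
# Stack every image into a single (N, H, W, 3) array ready for the model.
X = np.stack([load_image_as_array(f) for f in image_files])
print(X.shape)  # e.g. (10000, 96, 96, 3)
```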

7. Now, we define a function to retrieve the key point coordinates of the facial attributes for each image.
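One possible version of such a helper, assuming hypothetical column names in the key points CSV (an `image_id` column plus ten coordinate columns):

```python
# Hypothetical column names; the real CSV may label the keypoints differently.
KEYPOINT_COLUMNS = [
    "lefteye_x", "lefteye_y", "righteye_x", "righteye_y",
    "nose_x", "nose_y", "leftmouth_x", "leftmouth_y",
    "rightmouth_x", "rightmouth_y",
]

def get_keypoints(image_id):
    """Return the 10 keypoint coordinates for a given image as a flat array."""
    row = keypoints_df.loc[keypoints_df["image_id"] == image_id].iloc[0]
    return row[KEYPOINT_COLUMNS].to_numpy(dtype=np.float32)
```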

8. Writing a function that will plot the key points on the image of the faces.
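A simple Matplotlib-based sketch of such a plotting function:

```python
def plot_keypoints(image_array, keypoints):
    """Overlay the 5 (x, y) keypoints on top of the face image."""
    plt.imshow(image_array)
    xs, ys = keypoints[0::2], keypoints[1::2]  # even indices are x, odd are y
    plt.scatter(xs, ys, c="red", s=20)
    plt.axis("off")
    plt.show()
```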

9. Here is an example output of the above function.

10. We will be loading the CSV file which contains the coordinates of the key points.

11. Since we rescaled the images earlier and the original key point dataset contains coordinates for the original images, we need to rescale the key point coordinates too. Hence, we create a function to rescale the key point coordinates to match the rescaled images.
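A possible implementation, assuming `original_size` is the (width, height) of the raw image before rescaling:

```python
def rescale_keypoints(keypoints, original_size, new_size=IMG_SIZE):
    """Map keypoints from the original image resolution to the resized one."""
    orig_w, orig_h = original_size
    new_w, new_h = new_size
    scaled = keypoints.copy()
    scaled[0::2] = keypoints[0::2] * (new_w / orig_w)   # x coordinates
    scaled[1::2] = keypoints[1::2] * (new_h / orig_h)   # y coordinates
    return scaled
```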

12. Here is the output of the above function.

13. Splitting the data into train and test sets. Further, we divide the test set into test and validation data.
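Using scikit-learn's train_test_split, this step could look like the following; the original image size is a placeholder, and we assume the CSV's image IDs match the file names:

```python
# y: one row of 10 rescaled keypoint values per image, built with the helpers above.
# (178, 218) is a placeholder for the raw image resolution of the dataset.
y = np.stack([
    rescale_keypoints(get_keypoints(f), original_size=(178, 218))
    for f in image_files
])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Carve a validation set out of the held-out portion.
X_val, X_test, y_val, y_test = train_test_split(X_test, y_test, test_size=0.5, random_state=42)
print(X_train.shape, X_val.shape, X_test.shape)
```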

14. Next, we create a CNN-based neural network from scratch using TensorFlow 2.x. The model performs regression and outputs the values of the key point coordinates.
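The notebook's exact architecture isn't reproduced here, but one plausible layout for such a regression CNN is:

```python
def build_model(input_shape=(96, 96, 3)):
    """CNN that regresses 10 keypoint coordinates from a face image."""
    return models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dense(10),  # linear output: 5 keypoints x (x, y)
    ])

model = build_model()
model.summary()
```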

15. Finally, we compile and train the model. The loss function used for this task was mean squared error, with the RMSProp optimizer. The model was trained for 50 epochs with a batch size of 4.
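Putting the stated settings together (MSE loss, RMSProp optimizer, 50 epochs, batch size 4), the compile-and-train step might look like:

```python
model.compile(optimizer=tf.keras.optimizers.RMSprop(), loss="mse", metrics=["mae"])

history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=50,
    batch_size=4,
)
```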

16. Writing a function to perform the predictions.
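A minimal prediction helper could be:

```python
def predict_keypoints(model, image_array):
    """Run the model on a single image and return the 10 predicted coordinates."""
    batch = np.expand_dims(image_array, axis=0)  # add the batch dimension
    return model.predict(batch)[0]
```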

17. Now, we are defining a function to draw the key points on the image of the faces.
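Reusing the helpers sketched above, drawing the predicted key points might look like:

```python
def show_prediction(model, image_array):
    """Draw the model's predicted keypoints on a face image."""
    predicted = predict_keypoints(model, image_array)
    plot_keypoints(image_array, predicted)

show_prediction(model, X_test[0])
```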

18. Here is a prediction from our model. To make the model more accurate, you can increase the size of the training set, increase the quality of the images, and train for more epochs with a larger batch size if you have enough computing resources.

About the writer

Hey guys, congratulations on making it to the end! I am Akshat Dubey and I work as a Data Science Intern at Labellerr. I am a fourth-year student pursuing an Integrated Master of Science in Mathematics and Computing at the Birla Institute of Technology, Mesra, Ranchi. My course focuses on the application of mathematics in the field of artificial intelligence. I am a Kaggle Master well versed in Machine Learning, Deep Learning, Computer Vision, and Natural Language Processing. My primary interests involve the application of artificial intelligence in healthcare and retail. To connect with me, you can click on the following links:
Kaggle: https://www.kaggle.com/akshat0007/
Linkedin: https://www.linkedin.com/in/akshat0007/
Github: https://github.com/dubeyakshat07

Connect with the Labellerr Team:
Website: https://www.labellerr.com/

Originally published at https://blog.labellerr.com on July 13, 2021.

