Decision Support Through Deep Learning: Application To Image Classification and Recognition

Glory Ndele Wum Edmond; Simon Ntumba Badibanga

doi:10.17577/IJERTV11IS050074

Volume 11, Issue 05 (May 2022)


Open Access
Authors : Glory Ndele Wum Edmond , Simon Ntumba Badibanga
Paper ID : IJERTV11IS050074
Volume & Issue : Volume 11, Issue 05 (May 2022)
Published (First Online): 09-06-2022
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Glory Ndele Wum Edmond (1), Simon Ntumba Badibanga (2).

(1) Senior lecturer, Department of Mathematics and Computer Science

(2) Professor, Department of Mathematics and Computer Science.

Department of Mathematics and Computer Science, Faculty of Sciences, University of Kinshasa, Kinshasa, Democratic Republic of Congo. February 2022

Abstract:-By extracting the relevant features with a multilayer perceptron preceded by four convolution layers, we built our convolutional neural network (CNN) model for facial recognition, using an existing CNN architecture whose parameters are trained by the gradient backpropagation algorithm. On a database of 1672 images, split into 80% (1337 training images) and 20% (335 test images), the model predicted with a reliability of 99.8% while limiting the error to 0.02%. The model was evaluated over 20 epochs.

Keywords :- MLP, CNN, Neurons, Deep Learning, ReLU, Flatten, Softmax.

INTRODUCTION

Deep learning has revolutionized machine learning in recent years. While the first striking results were obtained mainly in image analysis, current work in deep learning now covers all types of data and almost all types of processing. Its impact on data science and knowledge extraction is considerable.

Observation bases characterize a particular domain (animals, fruit, patients, genes, ...) and are grouped into several classes. Automatic image classification is an application of pattern recognition that consists of automatically assigning an image to a class using a classification system. [2, 6, 8]

The problem addressed in this research is to recognize images, restructure them, and apply techniques for searching training and test images in order to facilitate decision-making, using a multilayer perceptron model of a convolutional neural network.

This problem leads to the hypothesis that images can be classified into several classes to allow better processing, which helps deep learning optimization in image search: recognizing images similar to a query image and achieving good average performance over all images.

The objective of this research is to facilitate image search in automatic classification through automatable methods that allow a machine to evolve through a learning process and thus perform tasks that are difficult or impossible to carry out by more conventional algorithmic means.

Using the analytical method, which made it possible to analyze the different convolutional neural network (CNN) models, a type of deep neural network, together with the documentary technique, we were able to build this model: a multilayer perceptron trained by the gradient backpropagation algorithm on an existing CNN architecture.

MACHINE LEARNING

Machine learning is a technique through which a machine acquires new knowledge for future use. It is a field of data science, made up of two types of learning (supervised and unsupervised), in which a machine is defined from a set of example pairs. This machine can be created through a decision tree, a neural network, a support vector machine (SVM), a Bayesian network or a random forest. [2,3,5,6]

2.1.2. Hierarchical classification.

Hierarchical classification is one of the unsupervised learning methods. It comprises two main families of methods, namely ascending hierarchical classification and descending hierarchical classification. Individuals are represented by a tree structure called the dendrogram, or by a nested partition. [1,5,7]

Dendrogram over eight individuals 1, ..., 8 with nested classes: class 1 = (1, 2), class 2 = (3, 4), class 3 = (5, 6), class 4 = (1, 2, 3, 4), class 5 = (1, ..., 6), class 6 = (1, ..., 7), class 7 = (1, ..., 8).

Figure 1 : Dendrogram

Figure 4 : Supervised learning (input data, supervisor, classifier, estimated class)

DEEP LEARNING (CNN)

Deep learning networks, and convolutional neural networks in particular, are very similar to ordinary neural networks. They exploit one important characteristic of images, namely the spatial arrangement of the samples. They consist successively of convolution layers, pooling layers, and fully connected layers.

The term deep learning refers to the many layers that must be learned during training. Convolutional neural networks are not directly inspired by biology and rely on learning algorithms that can differ fundamentally from those of biological brains. However, they learn internal representations that strongly resemble what we imagine the representations of the visual cortex to be.

Consider the classical architecture of a convolutional neural network: an image is provided as input and convolved with filters (first convolution layer), and the resulting activation maps are pooled and concatenated.

3.1. Layers of Deep Learning

The major layers in a convolutional neural network are: (Convolution layer, Pooling layer, and Fully Connected Layers). [10,12,14]
1. Convolutional Layers

  Convolution layers are a set of filters that are learned during training. The size and number of these filters are defined a priori. [14]
2. Pooling Layers

  Pooling layers are predefined functions that reduce the number of parameters to learn in later layers while expanding the receptive field. They operate independently at each depth of the network and have no weights to train. One of the classic operations is the maximum function: within a neighbourhood of N pixels, only the maximum is retained in the pooling layer.
3. Fully Connected Layers

  The neurons in these layers are connected to all the neurons in the previous activation maps.

In short, the components of a convolutional neural network are generally:
- Layers of filters convolved over the different channels;
- Pooling: maximum value (max pool) or average (avg pool) in a given window of the convolved output;
- Transfer functions: ReLU, etc.
- Near the output, fully connected layers (as in a multilayer perceptron).

To properly build a deep learning application, the following scheme should be respected:

Figure 5 : Diagram of Deep learning (steps: specify the problem well, set up a database, construct or adjust the network architecture, use and tune the learning algorithm, estimate the performance)

3.4. Convolutional operation

We denote by $X \star f$ the convolution of $X$ by $f$. At a pixel of coordinates $(i, j)$, with $i \in [0, h-1]$ and $j \in [0, w-1]$, where $h$ and $w$ are the height and width of $X$, it is defined by:

\[
(X \star f)[i, j] = \sum_{k=-2}^{2} \sum_{l=-2}^{2} X[i+k,\, j+l] \; f[k+2,\, l+2]
\]

If $(i+k, j+l)$ falls outside the image $X$ (for example $i+k < 0$), we take $X[i+k, j+l] = 0$; this is called "zero padding". Other paddings are possible, for example taking the value of the nearest pixel. The calculation can also be restricted to the pixels $[i, j]$ for which $(i+k, j+l)$ always lies inside the image; the output is then smaller than the input. [7,8,10]
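
The convolution with zero padding described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the implementation used in the paper: the function name `convolve2d_zero_pad`, the 3x3 test image and the all-ones filter are hypothetical, and the kernel radius is taken from the filter size (radius 2 corresponds to the 5x5 filter of the formula).

```python
def convolve2d_zero_pad(image, kernel):
    """Convolve `image` with an odd-sized square `kernel`, using zero padding.

    Implements (X * f)[i, j] = sum_k sum_l X[i+k, j+l] * f[k+r, l+r],
    where r is the kernel radius (r = 2 for a 5x5 filter).
    """
    height, width = len(image), len(image[0])
    radius = len(kernel) // 2
    output = [[0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0
            for k in range(-radius, radius + 1):
                for l in range(-radius, radius + 1):
                    row, col = i + k, j + l
                    # Zero padding: pixels outside the image count as 0.
                    if 0 <= row < height and 0 <= col < width:
                        total += image[row][col] * kernel[k + radius][l + radius]
            output[i][j] = total
    return output


# Hypothetical 3x3 image and 3x3 filter of ones (sums each neighbourhood).
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 1, 1],
          [1, 1, 1],
          [1, 1, 1]]
result = convolve2d_zero_pad(image, kernel)
```

Because of the zero padding, the output keeps the size of the input; border pixels simply sum fewer non-zero neighbours than the central one.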
The filter f is also called the kernel. In a convolution neuron the filters are not chosen but learned, because they are trainable parameters of the network.

For an input $X$ with $c$ channels: when $c = 1$ the image is grey-level, and when $c = 3$ it is a colour image; in the intermediate layers, $c$ corresponds to the number of filters of the previous layer. A convolution neuron applies its learned filter across these $c$ channels.


Pooling is an operation used to reduce the dimension; it looks for the coarser details of larger structures in the image. Max pooling of size l takes the maximum element of each sub-array of size l; sum pooling of size l sums all the elements of each sub-array of size l.

Pooling applies directly to square matrices; for rectangular matrices, an echelon-reduction algorithm must be used to make them square.
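
The pooling operations can be sketched as follows in plain Python. This is a minimal illustration rather than the paper's code: the helper names `pool2d` and `mean` are hypothetical, and the 4x4 matrix is the starting matrix used in the paper's pooling examples, split into non-overlapping 2x2 regions.

```python
def pool2d(matrix, size, reduce_fn):
    """Apply `reduce_fn` to each non-overlapping size x size block of `matrix`."""
    rows, cols = len(matrix), len(matrix[0])
    pooled = []
    for top in range(0, rows - size + 1, size):
        pooled_row = []
        for left in range(0, cols - size + 1, size):
            # Collect the values of the current block.
            block = [matrix[top + di][left + dj]
                     for di in range(size) for dj in range(size)]
            pooled_row.append(reduce_fn(block))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


start = [[20, 74, 17, 52],
         [15, 35, 21, 30],
         [26, 34, 15, 25],
         [12, 60, 40, 20]]

max_pooled = pool2d(start, 2, max)    # [[74, 52], [60, 40]]
sum_pooled = pool2d(start, 2, sum)    # [[144, 120], [132, 100]]
mean_pooled = pool2d(start, 2, mean)  # [[36.0, 30.0], [33.0, 25.0]]
```

Passing the reduction as a function keeps the block-splitting logic in one place, so max, sum and mean pooling differ only in the last argument.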

In general, mean pooling calculates the sum of all the values in a region and divides it by the number of values, to obtain the representative average of that batch of pixels:

\[
\begin{bmatrix} 20 & 74 & 17 & 52 \\ 15 & 35 & 21 & 30 \\ 26 & 34 & 15 & 25 \\ 12 & 60 & 40 & 20 \end{bmatrix}
\longrightarrow
\begin{bmatrix} 36 & 30 \\ 33 & 25 \end{bmatrix}
\]

By applying mean pooling, the starting matrix is divided into square regions of order 2; for each region we computed the average of its values to form our mean pooling matrix.

Sum pooling computes the sum of all the values in each region:

\[
\begin{bmatrix} 20 & 74 & 17 & 52 \\ 15 & 35 & 21 & 30 \\ 26 & 34 & 15 & 25 \\ 12 & 60 & 40 & 20 \end{bmatrix}
\longrightarrow
\begin{bmatrix} 144 & 120 \\ 132 & 100 \end{bmatrix}
\]

By applying sum pooling, the starting matrix is divided into square regions of order 2; for each region we calculated the sum of all the values to form our sum pooling matrix.

Figure 6. Flattening

3.4.1. Matrix echelon algorithm (reduction to reduced echelon form).

Let $M \in \mathcal{M}_{m,n}(K)$; there exist an echelon matrix $H \in \mathcal{M}_{m,n}(K)$ and a matrix $V \in GL_n(K)$ such that $H = M V$. This reduced echelon form is obtained by elementary operations on the columns, more precisely by the following algorithm:

1. Input: $M \in \mathcal{M}_{m,n}(K)$
2. Initialization: $H = M$, $V = I_n$ (the unit matrix of order $n$), $j = 1$
3. Main loop: for $i = 1$ to $m$:
  a) Finding a pivot: if $H_{i,j} = 0$, then if there is $k > j$ with $H_{i,k} \neq 0$:
    {
    i. $X = P(n)$, the permutation matrix of order $n$ that exchanges columns $j$ and $k$
    ii. $V = V.X$
    iii. $H = H.X$
    }
    Otherwise, go to the next iteration.
  b) Set the pivot to 1 (divide column $j$ by $H_{i,j}$).
  c) Reduce and echelon: set to 0 the other coefficients of the pivot row, for $s$ ranging from 1 to $j - 1$ then from $j + 1$ to $n$, loop
    {
    i. $X = T_{j,s}(-H_{i,s})$, the transvection matrix that adds $-H_{i,s}$ times column $j$ to column $s$
    ii. $H = H.X$
    iii. $V = V.X$
    }
  d) $j = j + 1$
  e) Exit when $j = n$
  End loop.
4. Output: $H$, the echelon form of $M$, and $V$ such that $H = M.V$
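
The column reduction of Section 3.4.1 can be sketched in plain Python with exact fractions. This is an illustrative reading of the algorithm, not the authors' code: the function name `column_echelon` and its helper names are hypothetical, indices are 0-based, and the loop stops once no pivot column remains. Instead of building the elementary matrices X explicitly, the sketch applies the equivalent column operation directly to both H and V.

```python
from fractions import Fraction


def column_echelon(matrix):
    """Return (H, V) with H = M.V and H in reduced column echelon form."""
    m, n = len(matrix), len(matrix[0])
    H = [[Fraction(x) for x in row] for row in matrix]
    V = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]

    def swap_cols(A, a, b):
        for row in A:
            row[a], row[b] = row[b], row[a]

    def scale_col(A, a, factor):
        for row in A:
            row[a] *= factor

    def add_col(A, src, dst, factor):
        # column dst += factor * column src
        for row in A:
            row[dst] += factor * row[src]

    j = 0  # current pivot column (0-based)
    for i in range(m):
        if j >= n:
            break
        # a) find a pivot in row i, at column j or to its right
        k = next((c for c in range(j, n) if H[i][c] != 0), None)
        if k is None:
            continue  # no pivot in this row: move to the next row
        if k != j:
            swap_cols(H, j, k)
            swap_cols(V, j, k)
        # b) set the pivot to 1
        inverse = 1 / H[i][j]
        scale_col(H, j, inverse)
        scale_col(V, j, inverse)
        # c) clear the other coefficients of the pivot row
        for s in range(n):
            if s != j and H[i][s] != 0:
                factor = -H[i][s]
                add_col(H, j, s, factor)
                add_col(V, j, s, factor)
        j += 1
    return H, V


M = [[1, 2],
     [3, 4]]
H, V = column_echelon(M)
```

For an invertible matrix such as this M, the reduced form H is the identity and V is the inverse of M, which gives a quick way to check that H = M.V holds.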
In the case of our research, we apply max pooling for square matrices as defined in point 3.4.

An illustrative example of max pooling is given below.

Max pooling (max subsampling operation) takes a region and outputs its maximum. [12,13,14]

By applying flattening with the Flatten operation, the initial matrices form a column vector.

IMAGE RECOGNITION

A convolutional network (ConvNet) is a stack of these layers, which induces local properties of invariance by translation. These properties are essential for recognizing characters and, more generally, images that can be seen from different angles. It is in this area that the most spectacular results were obtained, and the name deep learning was put forward to accompany the growing success and the associated hype.

\[
\begin{bmatrix} 20 & 74 & 17 & 52 \\ 15 & 35 & 21 & 30 \\ 26 & 34 & 15 & 25 \\ 12 & 60 & 40 & 20 \end{bmatrix}
\longrightarrow
\begin{bmatrix} 74 & 52 \\ 60 & 40 \end{bmatrix}
\]

By applying max pooling, the starting matrix is divided into square regions of order 2; for each region we retrieved the maximum value to form our max pooling matrix.

Figure 7. Shape recognition steps (physical world, coding, preprocessing, analysis, learning, decision, interpretation).

The physical world: presents the object in its real environment and in its normal form, before any processing.
Coding: consists in observing the shape of the environment in analog form and representing it in discrete form so that it can be processed by the system. The object is encoded as binary sequences.
Preprocessing: consists of standardizing the coded information to keep only the information essential to the system.
Analysis: extracts the clues that characterize the represented object, in order to establish the parameters on which the learning will be based.
Learning: consists in memorizing and exploiting the knowledge resulting from the parameters of the analysis. Once the model has been trained, new data are used for the test that leads to optimization (prediction). It is during this stage that we find out in practice with what coefficient the new model predicts.
Decision: this is the stage of image recognition, that is, the stage at which the system establishes a definitive classification. This step is also the optimization step, because it is here that the object is exactly defined.
Interpretation: this is the prediction on the image obtained, that is, accurately predicting the object. [3,6,11,14]

Principles and experiments of the facial recognition system.

The facial recognition problem is stated as follows: given an image of a face, find the identity of the corresponding person. Face recognition is part of the field of pattern recognition, whose purpose is to classify objects of interest into a number of categories or classes. Objects of interest are usually called patterns, and in our case they are feature vectors. The classes here represent the different people, since the classification procedure in our case is applied to feature vectors.

Recognition is the core of this system: it compares the feature vector of the input face with those in the database. Since we want to model a function of the human brain and we have a classification problem, we chose to implement a neural network, which simulates the biological neural network. In the recognition phase of our system we used two types of neural network: the first is a multilayer perceptron (MLP), and the second is a convolutional neural network.
Structure of the application.


V. Application.

Face Database.

Generally, databases are adapted to the needs of a few specific recognition algorithms. Our database consists of 1672 face images, which we divided into two classes: [2,4,13,14]
- The "target" class, the person Intouche, containing 1113 images.
- The "non-target" class, containing 559 face images of people other than Intouche.
  
Separation of Databases.

The implementation of our facial recognition system required two image databases: one for training and the other for testing the effectiveness of the trained model. [4,11]

Training images.

Of the 1672 images in the base, we reserved 80% (1337 images) for training.

Test images.

Of the 1672 images in the base, we took the remaining 20% (335 images) for testing.
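
The split sizes can be checked with a short Python sketch. The paper does not say how images were assigned to each set, so the seeded random shuffle and the function name `train_test_split_indices` are assumptions made for illustration; rounding the training share down is what reproduces the reported 1337 and 335.

```python
import random


def train_test_split_indices(total, train_fraction, seed=0):
    """Shuffle image indices and cut them into training and test lists.

    The training size is rounded down, so 80% of 1672 gives 1337.
    """
    indices = list(range(total))
    random.Random(seed).shuffle(indices)  # assumed: seeded random assignment
    cut = int(total * train_fraction)
    return indices[:cut], indices[cut:]


total_images = 1113 + 559  # target class + non-target class
train_idx, test_idx = train_test_split_indices(total_images, 0.8)
```

The two index lists are disjoint and together cover every image, so no test image is seen during training.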

Figure 8. Structure of image recognition

Figure 8 shows the image used in our application to test our convolutional neural network.

Application architecture.

The architecture we used is the multilayer perceptron of our convolutional neural network.

This architecture uses four convolution layers, through which the image is filtered before its pixel matrix is vectorized (flattened). This allowed a good prediction: with these four convolutional layers the model predicts with 99.8% reliability and reduces the error to 0.02%. These results are presented in the prediction curves. [3,10,13,14]

Figure 9. Architecture of our CNN
Backpropagation algorithm.

The backpropagation algorithm used in this application is executed in three necessary steps:
1. Resizing the input images to the 50x50x1 format;
2. Building a CNN structure with four convolutional layers, each associated with a ReLU correction layer: the first layer uses a depth of 64 filters, the second 32, the third 16 and the last 8, with a kernel of size 3×3. Max pooling of size 2×2 is applied after two convolutional layers.
3. After all the features have been extracted, the Flatten operation flattens the resulting feature maps.
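
The three steps above can be traced numerically with a small Python sketch that follows the feature-map size and counts the convolution parameters. Several points are assumptions, since the paper does not state them: unpadded ("valid") convolutions, 2×2 pooling placed after the second and fourth convolution layers, and the helper names `conv_params` and `trace_network`. The classification head after Flatten is not described, so the sketch covers the convolution layers only and does not attempt to reproduce the model's reported total parameter count.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a 2D convolution layer."""
    return kernel * kernel * in_channels * filters + filters


def trace_network(input_size=50, channels=1, filters=(64, 32, 16, 8),
                  kernel=3, pool=2, pool_after=(2, 4), same_padding=False):
    """Return (spatial size, channels, convolution parameter count)."""
    size, total = input_size, 0
    for index, n_filters in enumerate(filters, start=1):
        total += conv_params(kernel, channels, n_filters)
        if not same_padding:
            size -= kernel - 1  # assumed 'valid' convolution: no padding
        channels = n_filters
        if index in pool_after:  # assumed pooling positions
            size //= pool
    return size, channels, total


size, channels, conv_total = trace_network()
flattened = size * size * channels  # length of the vector produced by Flatten
```

Under these assumptions the four convolution layers hold 640, 18,464, 4,624 and 1,160 parameters respectively; the parameter counts do not depend on the padding choice, whereas the flattened length does.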
Sequential model testing

Figure 10. Structure of the implementation of the CNN model architecture sequentially.

As we can see, the sequential model has four convolution layers using the ReLU activation function, and the total number of parameters of the model is 57,440. We then predict on the training data to see how the model behaves over 20 epochs.

Figure 11. The result found when evaluating the CNN model with 20 epochs.

As stated before, with 1337 training images the model needs 20 epochs to predict with a coefficient of 99.8%, so we concluded that our model optimizes the learning.
Overview of the GUI

Figure 12. Testing the application with a Java interface.

The GUI was created in JavaScript to test the model.

With this interface, we confirm the result of our model, which validates the result of the research.
Presentation of results.

Figure 13. Increased accuracy with the number of epochs.

Figure 14. Reduction of errors with the number of epochs.

Analyzing the results in Figures 13 and 14, we find that the training and validation accuracy increases with the number of epochs and then falls again, which means that once the accuracy has plateaued the model no longer learns new information. If the accuracy decreases, the model needs more information to learn, and the number of epochs must therefore be increased, and vice versa. Similarly, the training and validation error decreases with the number of epochs.

VI. CONCLUSION.

In summary, image classification is an important task in computer vision, object recognition and machine learning. The objective of this work was to build an application that classifies an image database into a set of classes in order to recognize the objects in the images, in this case for facial recognition. To carry out this classification with deep learning, we used the learning method that has shown its performance in recent years and chose the gradient backpropagation algorithm as the classification method; this choice is justified by the simplicity and efficiency of the method.

Using the Python language in the Anaconda environment (TensorFlow), we were able to implement our model, which also runs through an application created in JavaScript on NetBeans 8.4. Each time, the parameters of the model are trained with the new data. The results obtained during the test phase confirm the effectiveness of our approach.

REFERENCES.

[1] Antoine Cornuéjols, Laurent Miclet, Yves Kodratoff, Apprentissage Artificiel : Concepts et Algorithmes, deuxième tirage, Eyrolles, 2003.

[2] Achraf Cherti, Jargon Informatique, Logiciel, Version 1.3.6, Avril 2006.

[4] Gerard Swinnen, Apprendre à Programmer en Python 3, 2010.

[5] Karpathy A., Convolutional neural networks for visual recognition. Neural networks, 2016.

[7] Richard O. Duda, Peter E. Hart, David G. Stork, Pattern Classification, Wiley-Interscience, 2001.

[8] Thibault Allançon, Introduction à l'apprentissage artificiel, Paris, 2016.

[9] V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag (200)

[10] Lee, H., Grosse, R., Ranganath, R., and Ng, A.Y. (2009). Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In Proceedings of the 26th Annual International Conference on Machine Learning, pp. 609-616. ACM.

[12] Samuel, A. L. (2000). Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 44 (1.2) : 206-226.

[13] S. DIB, Identification des individus multimodals : application sur les images du visage, Thèse de magister, Université de Mohamed Boudiaf, Oran, 2015.

[14] M. LEMMOUCHI, Identification des visages humains par réseaux de neurones, Thèse de magister, Université de Batna 2, 2013.
