Detectron2 Object Detection & Manipulating Images using Cartoonization

Allena Venkata Sai Abhishek; Sonali Kotni

doi:10.17577/IJERTV10IS080122

Volume 10, Issue 08 (August 2021)

Open Access
Authors : Allena Venkata Sai Abhishek , Sonali Kotni
Paper ID : IJERTV10IS080122
Volume & Issue : Volume 10, Issue 08 (August 2021)
Published (First Online): 26-08-2021
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

Allena Venkata Sai Abhishek

Dept of Computer Science and Engineering GITAM University Visakhapatnam, India

Sonali Kotni

Dept of Computer Science and Engineering GITAM University Visakhapatnam, India

Abstract: Autonomous vehicles are increasing rapidly in number. They are classified into levels according to their degree of autonomy: at lower levels the driver retains more control over the vehicle, whereas a fully automated vehicle, of the kind Tesla is pursuing, is expected to have full control over driving functions. The underlying technologies work together to determine the vehicle's position and its proximity to everything around it. These vehicles are popular because they offer many advantages to the people who use them. We use the Facebook AI Research (FAIR) software system that implements object detection algorithms and supports advanced object detection with fast training; the original Detectron was built on the Caffe2 deep learning framework, while its successor Detectron2 is built on PyTorch. We also manipulate images to derive insights that address the issues companies face when moving from research to production. We implemented Detectron2 object detection for faster detection and labeling of objects, and we manipulated images using cartoonization.

Keywords: Mask R-CNN; RetinaNet; Faster R-CNN; RPN; Fast R-CNN; R-FCN; Classification; Deep Learning; Grayscale

INTRODUCTION

In this form of automation, information is gathered by the on-board sensors without any external communication [1]. Moreover, automated vehicles can communicate with one another and share information about their environment. We use the Facebook AI Research (FAIR) software system that implements object detection algorithms; the original Detectron was built on the Caffe2 deep learning framework and supports advanced object detection with fast training.

The goal of Detectron is to provide a high-quality, high-performance codebase for object detection research. It is designed to be flexible in order to support rapid implementation and evaluation of novel research.
DATA COLLECTION

The input data for our model is an image, supplied in either JPEG or PNG format. We import the libraries needed to upload the image, and the uploaded image is then manipulated using various cartoonization techniques.

Fig. 1. Input Image
PROPOSED MODEL

The proposed model takes images in PNG or JPEG format as input. Among the libraries used, Detectron2 provides fast object detection through algorithms such as Mask R-CNN, RetinaNet, Faster R-CNN, RPN, Fast R-CNN, and R-FCN. The image passes through the backbone and the proposal generator, the proposed regions are cropped and warped, and the box, mask, keypoint, DensePose, and semantic segmentation heads are applied; their outputs are combined to generate labels, and each detected object is marked with a bounding box [4]. We also manipulate images by grayscaling, cartoonizing, and applying bilateral and Gaussian filtering, to derive insights that address the issues companies face when shifting from research to production.

Fig. 2. Proposed model framework
METHODOLOGY

The Detectron2 framework is first obtained from its GitHub repository using git.
1. Installing & Importing the required Dependencies:
  
  We install and import the following dependencies:
  1. pyyaml
  2. CUDA
  3. Torch
  4. Torchvision
  5. detectron2 logger
  6. Numpy
  7. JSON
  8. OpenCV
  9. Random
  10. detectron2 utilities
    
    Fig. 3. Dependencies Installed & Imported
2. Uploading an Image:
  
  We define a read_file function to read the file. The input to our model is an image in either JPEG or PNG format. Once the libraries needed for uploading are imported, the image is uploaded and later manipulated using various cartoonization techniques.
  
  Fig. 4. Image uploading
  
  Fig. 5. Image that has been uploaded
3. Detecting & Labelling the Image:
  
  We detect and label the objects in the image. The image passes through the backbone and the proposal generator, the proposed regions are cropped and warped, and the box, mask, keypoint, DensePose, and semantic segmentation heads are applied; their outputs are combined to generate labels, and each object is detected with a bounding box. A mask is applied to the image and the key points are found [2]. DensePose and semantic segmentation are then used, and the image is finally displayed with labeled boxes.
  1. We obtain the image either from an upload or from the MS-COCO dataset.
  2. We create a Detectron2 configuration and a Detectron2 DefaultPredictor to run inference on a particular image.
  3. Model-specific configuration (for example, TensorMask) is added at this point when the model being run is not in Detectron2's core library.
  4. We set a score threshold for the model.
  5. We select a model from Detectron2's model zoo.
  6. The final step is to visualize the processed image.
  Fig. 6. Detecting & Labeling of Image
  
  Fig. 7. Detecting & Labeling of Image Computations
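The configuration, prediction, and visualization steps listed above can be sketched with Detectron2's public API. This is a minimal sketch under stated assumptions, not the authors' exact code: the function name detect_and_label, the Mask R-CNN config file, the 0.5 score threshold, and the CPU device are illustrative choices that the text does not specify. The Detectron2 imports sit inside the function so that the definition loads even where Detectron2 is not installed; calling it requires Detectron2 and downloads the model weights.

```python
def detect_and_label(
    image_bgr,
    score_threshold=0.5,  # assumed value; the text only says a threshold is set
    config_name="COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml",  # assumed model
    device="cpu",  # assumed; use "cuda" when a GPU is available
):
    """Run a model-zoo detector on a BGR image and draw the labeled predictions."""
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.data import MetadataCatalog
    from detectron2.engine import DefaultPredictor
    from detectron2.utils.visualizer import Visualizer

    # Create a configuration and load the model definition from the model zoo.
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(config_name))
    # Keep only detections whose score is at least the threshold.
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_threshold
    # Pretrained weights for the same model-zoo entry.
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
    cfg.MODEL.DEVICE = device

    # Run inference on the single image.
    predictor = DefaultPredictor(cfg)
    outputs = predictor(image_bgr)

    # Visualizer expects RGB input, so reverse the channel order going in and out.
    metadata = MetadataCatalog.get(cfg.DATASETS.TRAIN[0])
    visualizer = Visualizer(image_bgr[:, :, ::-1], metadata, scale=1.2)
    drawn = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
    return outputs, drawn.get_image()[:, :, ::-1]
```

The returned outputs["instances"] object carries the predicted boxes, classes, scores, and masks, and the second return value is the annotated image in BGR order.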
4. Manipulating the Image:
We take the input image and manipulate it with the following techniques to derive insights [3]. An image can be manipulated in many ways. Here we cartoonize it, which adds a cartoon effect, and filter it with various filters. The steps are briefly described below.
1. Creating an edge mask function
  
  An image is first uploaded from the device or taken from a dataset, and an edge mask is then created. The main consideration when producing an edge mask is the thickness of the edges. The cv2.adaptiveThreshold() function is used to identify the edges: it computes the threshold for small regions of the image, so different parts of the same image get different thresholds. The result highlights the black edges around the objects in the image.
2. Converting into grayscale
  
  Next, the image is converted to grayscale, so that each pixel is represented by a single intensity between black and white. During grayscaling, noise in the image is suppressed to reduce the number of unwanted detected edges. The line size of the edges is set through cv2.adaptiveThreshold(); a larger line size gives thicker highlighted borders.
  
  Fig. 9. Converting into grayscale
3. Reducing color palette
  
  Color quantization reduces the number of colors in the image and gives it a cartoon effect. To present the output with a limited number of colors, quantization is performed with the K-means clustering method. K-means is an unsupervised machine learning algorithm for clustering: K is the number of clusters, and "means" refers to the cluster centers, each of which is the mean of the points assigned to it. The number of colors in the output image is set by the value of K. For the present image, the number of colors is reduced to 9.
  
  Fig. 10. Reducing color palette
4. Bilateral Filtering
  
  The bilateral filter is the next approach, used to reduce noise in the image. It smooths the image while keeping edges sharp. Each pixel value is replaced by a weighted average of neighboring pixel values. To retain edges, the weights depend not only on the spatial distance between pixels but also on the difference in their intensities, so pixels on the other side of an edge contribute little.
  
  For bilateral filtering, there are three key parameters:
  - d: diameter of each pixel neighborhood.
  - sigmaColor: a higher value means that colors farther apart within the pixel neighborhood are blended together, resulting in larger regions of semi-equal color.
  - sigmaSpace: a higher value means that pixels farther apart influence each other, as long as their colors are close enough.
  
  Fig. 8. Creating an edge mask function
  
  Fig. 11. Bilateral Filtering
5. Combining edge mask with the colored image – Adding Cartoon Effect
  
  Finally, the edge mask is combined with the color-processed image using the cv2.bitwise_and function, which performs a bitwise operation on the image to produce the output. The result is the cartoon version of the input image.
  
  Fig. 12. Combining edge mask with the colored image – Adding Cartoon Effect
6. Filtering the Image
Apart from the bilateral filter, Gaussian blur, sharpening, and mean blur kernels can be used to filter an image.

Gaussian blurring: This approach uses a Gaussian filter, which computes a weighted average. The Gaussian blur weights pixel values according to their distance from the center of the kernel, so pixels farther from the center have less effect on the weighted average.

Median blurring: Each pixel in the source image is replaced by the median value of the pixels in the kernel region.

Sharpening: A 2D convolution kernel can be used to sharpen an image. A custom 2D kernel is created first, and the convolution is then applied to the image with the cv2.filter2D() function.

Fig. 13. Filtering the Image
RESULTS

We successfully detected the objects in the images and labeled them according to the predefined dataset, and we manipulated the images as shown below:

Fig. 14. Detecting & labeling of the objects

Fig. 15. Converting into grayscale

Fig. 16. Color Quantization

Fig. 17. Bilateral Filtering

Fig. 18. Adding Cartoon Effect

Fig. 19. Filtering the Image using Gaussian, sharpen and Mean Blur filters
CONCLUSION

In this study, we fine-tuned a framework built around a leading model for object detection applications. We implemented advanced object detection with fast training using the FAIR software system, which implements object detection algorithms such as Mask R-CNN, RetinaNet, Faster R-CNN, RPN, Fast R-CNN, and R-FCN; the original Detectron uses the Caffe2 deep learning framework, while Detectron2 is built on PyTorch. We also manipulated images by grayscaling, cartoonizing, and applying bilateral and Gaussian filtering, to derive insights that address the issues companies face when moving from research to production.
REFERENCES

[1] https://analyticsvidhya.com/blog/2018/01/facebook-launched-detectron-platform-object-detection-research/
[2] https://towardsdatascience.com/image-labelling-using-facebooks-detectron-4931e30c4d0c
[3] Archana B. Patankar; Purnima A. Kubde; Ankita Karia (Aug. 2016). Image cartoonization methods. IEEE.
[4] Vung Pham; Chau Pham; Tommy Dang (2020). Road Damage Detection and Classification with Detectron2 and Faster R-CNN. 20511275, IEEE.
