Performance Analysis of Yolo Versions V5 and V7 for Disease Detection in Plants

Sajitha P; Asiya Beevi S; Devadutt M B; Muhammed Ismail Z

doi:10.17577/IJERTV13IS030030

Volume 13, Issue 03 (March 2024)

Published (First Online): 14-03-2024
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Sajitha P

Dept. of Electronics and Communication

Ace College of Engineering, (Affiliated to APJKTU) Thiruvananthapuram, India

Devadutt M B

Dept. of Electronics and Communication

Ace College of Engineering, (Affiliated to APJKTU) Thiruvananthapuram, India

Asiya Beevi S

Dept. of Electronics and Communication

Ace College of Engineering, (Affiliated to APJKTU) Thiruvananthapuram, India

Muhammed Ismail Z

Dept. of Electronics and Communication

Ace College of Engineering, (Affiliated to APJKTU) Thiruvananthapuram, India

Abstract- AI has seen increasing use in agriculture lately, aiding in cultivating healthier crops, managing pests, monitoring soil and growing conditions, analyzing data for farmers, and improving other aspects of the food supply chain. Monitoring plants from an early stage is crucial to prevent diseases, but relying on the naked eye is time-consuming and prone to inaccuracies. Therefore, there is a growing need for modern technologies like AI for disease detection. This study explores the use of transfer learning and the YOLO algorithm for detecting various diseases across different crops. The YOLO algorithm employs convolutional neural networks for real-time object detection by dividing images into grids and using each cell to detect objects. A comparison of YOLOv7 and YOLOv5 shows that YOLOv7 requires more processing power during training and performs less consistently than YOLOv5, which delivers better results.

Keywords- Artificial Intelligence, Deep Learning, plant disease, YOLO, CNN, real-time object detection.

I. INTRODUCTION

In a number of countries, such as India, agriculture is an important contributor to national income. Plant diseases and harmful insects are among the main challenges facing this sector. As agriculture accounts for around 70% of the country's GDP, it plays a key role in India's economic development [1]. Environmental conditions such as temperature, humidity, and rainfall vary across India because of its diverse terrain, so a wide variety of crops is grown across the country. Pests destroy these crops and cause significant damage to farming communities.

Crop damage results from outbreaks of diseases, insects, nematodes, and weeds. Accurately identifying diseases requires a dataset with a wide range of images. Over the past years, advanced deep learning has increasingly aided both in identifying diseases and in evaluating their severity [2]. Computer vision and soft computing methods can automate the detection of plant diseases from leaf images, since plant leaves often show early signs of illness. Monitoring crops for diseases from their initial growth stages through harvest is crucial for optimal results. This research customizes a deep learning model for this task using transfer learning and deep feature extraction. Deep neural networks (DNNs) have demonstrated notable performance in classification tasks. Object detection algorithms are typically categorized as classification-based (two-stage detectors) or regression-based (one-stage detectors). Two-stage detectors outperform one-stage detectors in accuracy, but at the expense of slower output speed [1]. These systems operate autonomously and have the potential to improve farming accuracy and productivity without human intervention.

II. METHODOLOGY

A. Dataset

The dataset used to train the models was sourced from the Plant Doc dataset repository. The images cover 28 plant varieties in both healthy and diseased conditions. The Plant Doc dataset comprises 2598 images categorized into 30 classes representing various types of plant diseases. For the experiments, the images were split in a ratio of 75% for training, 15% for validation, and 10% for testing. The Tomato Plant Leaf category had the largest number of images, totaling 801, while Soybean had the smallest, with only 15 images.

Figure 1: Sample images of diseased leaves

B. Model Architecture

The system development process, illustrated in Fig. 2, encompasses data collection and preparation, data segmentation, model training and evaluation, and finally model testing and deployment.

Figure 2: Systematic flow chart

The model analyzes the entire image with a single neural network and divides it into sections to predict bounding boxes for its components.

C. Image Preprocessing

Before feeding images from the Plant Doc dataset to YOLOv5, we first resized and cropped them. We then used the mosaic data augmentation method to enlarge the image set: after resizing, random scaling, random cropping, and random arrangement are applied. This not only expands the dataset but also improves the detection of smaller targets. Preprocessing also speeds up training and brings the images to the required color scale, size, and so on.

Random Scaling: the scale of each image is adjusted randomly within a defined range, as a form of image data augmentation.
Random Cropping: a random subset of the original image is generated as a data augmentation method. Specifically, a random image is selected from the chosen categories and a random portion of it is extracted.
Random Arrangement: the images are placed without a specific plan or pre-established order.
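
The three augmentation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: images are plain 2-D lists of pixel values, the function names (`random_scale`, `random_crop`, `mosaic`), the 0.5 to 1.5 scale range, and the 64-pixel canvas are hypothetical choices made for the example, and the bounding-box labels that a real detector pipeline must transform together with the pixels are omitted.

```python
import random


def random_scale(img, rng, lo=0.5, hi=1.5):
    """Resize a 2-D pixel grid by a random factor (nearest-neighbour sampling)."""
    h, w = len(img), len(img[0])
    s = rng.uniform(lo, hi)
    nh, nw = max(1, int(h * s)), max(1, int(w * s))
    return [[img[r * h // nh][c * w // nw] for c in range(nw)] for r in range(nh)]


def random_crop(img, rng, out_h, out_w, fill=0):
    """Cut a random out_h x out_w window; pad with `fill` if img is too small."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, max(0, h - out_h))
    left = rng.randint(0, max(0, w - out_w))
    return [[img[r][c] if r < h and c < w else fill
             for c in range(left, left + out_w)]
            for r in range(top, top + out_h)]


def mosaic(images, rng, size=64):
    """Tile four augmented images into one size x size grid in random order.

    `size` is assumed to be even so that the four tiles cover the canvas.
    """
    if len(images) != 4:
        raise ValueError("mosaic expects exactly four images")
    half = size // 2
    order = list(range(4))
    rng.shuffle(order)  # random arrangement of the four tiles
    tiles = [random_crop(random_scale(images[i], rng), rng, half, half)
             for i in order]
    top = [tiles[0][r] + tiles[1][r] for r in range(half)]
    bottom = [tiles[2][r] + tiles[3][r] for r in range(half)]
    return top + bottom
```

Calling `mosaic(four_images, random.Random(0))` returns a 64 x 64 grid built from four randomly scaled, cropped, and ordered tiles. Passing a seeded `random.Random` keeps the augmentation reproducible between runs.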

Training Model
- YOLOv5:

  YOLOv5 performs object recognition in a single stage by dividing the image into a grid, with CSPDarknet53 as its foundation. It offers high accuracy and speed in identifying objects, particularly in real-time scenarios, compared with other algorithms. Input images are first partitioned into S × S grid cells, and only the cell containing the midpoint of a bounding box is responsible for detecting that object. The confidence score identifies the areas of concern on the leaves, indicating where disease is present. The main parts of YOLOv5 are:
  1. Model Backbone
  2. Model Head
  The model backbone extracts essential features from the input images. To reduce the number of network parameters while retaining key details of the input image, it relies on cross-stage partial (CSP) bottleneck blocks. YOLOv5 also includes an efficient feature pyramid network (FPN) module that strengthens the bottom-up pathway.

  Fig. 3 shows the architecture of YOLOv5. The model comprises three primary elements: a backbone, a feature extraction module shaped like an hourglass, and a spatial pyramid pooling module. These elements work together to extract features at different levels. The neck then gathers features from multiple backbone layers and generates cross-stage features. The detection head combines these features with anchor results at various scales into final output vectors containing bounding boxes, confidence scores, and class probabilities. These output vectors are aggregated to yield the final detection boxes.

  Figure 3: Architecture of YOLOv5
- YOLOv7:

  YOLOv7 achieves a faster inference speed than YOLOv5, at 114 FPS. Through architectural enhancements, YOLOv7 is significantly faster while also improving accuracy. It offers a more robust and faster network architecture, with highly efficient methods for integrating features and achieving accurate object detection. One of its key advantages is that it can run on significantly less expensive computing hardware than other deep learning alternatives. Designed for quicker training on small datasets, YOLOv7 does not require pre-trained weights. The main element of the YOLOv7 architecture is ELAN.

  ELAN serves as a computational block and forms the foundation of YOLOv7. Factors contributing to its speed and accuracy include:
  - memory access cost
  - I/O channel ratio
  - gradient path
  - activations

  The E-ELAN architecture significantly enhances the functionality of the framework, boosting efficiency.

  Figure 4: Architecture of YOLOv7
Model Evaluation

To assess the efficiency and performance of each trained model, mean Average Precision (mAP) is used as the criterion for selecting the best model. The model's mAP is computed after training and measures its detection accuracy. Historically, mAP was calculated as follows:

$$\text{mAP} = \frac{1}{Q} \sum_{q=1}^{Q} \text{AP}(q)$$

Here AP(q) is the average precision for query q, and Q is the total number of queries in the set. Computing the AP for each query and averaging the results gives the mAP, which indicates how effectively the model performs. The average precision score is obtained by dividing the total AP across all classes and IoU thresholds by the total number of detection instances.
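
The averaging described above can be illustrated with a short Python sketch. It is an assumption-laden example, not the evaluation code used in this study: each detection is assumed to have already been matched against the ground truth at some IoU threshold and flagged as a true or false positive, AP is taken as the area under the interpolated precision-recall curve (one common convention), and the function names `average_precision` and `mean_average_precision` are hypothetical.

```python
def average_precision(scored_hits, num_ground_truth):
    """AP for one class (query).

    scored_hits: list of (confidence, is_true_positive), one per detection.
    num_ground_truth: number of ground-truth boxes of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    tp = 0
    precisions, recalls = [], []
    for rank, (_, hit) in enumerate(ranked, start=1):
        tp += int(hit)
        precisions.append(tp / rank)
        recalls.append(tp / num_ground_truth)
    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """mAP: the mean of AP(q) over all Q classes in `per_class`.

    per_class maps a class name to (scored_hits, num_ground_truth).
    """
    aps = [average_precision(hits, n_gt) for hits, n_gt in per_class.values()]
    return sum(aps) / len(aps)
```

For instance, a class with two ground-truth boxes and ranked detections hit, miss, hit has an AP of about 0.833. Averaging it with a second class whose single detection is correct (AP of 1.0) gives an mAP of about 0.917.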

III. OBSERVATIONS

The YOLOv5 and YOLOv7 models were each trained for 50 epochs. The training dataset comprises a total of 2598 images, with 100 images in each of the test and validation folders. On average, YOLOv7 took approximately 2 hours and 33 minutes to train, while YOLOv5 required about 1 hour and 3 minutes. After training, we evaluated the best weights of both models by running inference on 100 images. YOLOv7 completed the task in 1.797 seconds, while YOLOv5 took 2.34 seconds, indicating that YOLOv7 has a faster inference speed than YOLOv5.
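
To put these timings on a common scale, the short Python sketch below converts them to images per second and to a training-time difference. It uses only the figures reported above. The variable and function names are illustrative, and the throughput assumes the reported times cover the full 100-image run.

```python
def images_per_second(num_images, seconds):
    """Throughput of an inference run."""
    return num_images / seconds


def to_minutes(hours, minutes):
    """Convert an hours-and-minutes duration to minutes."""
    return hours * 60 + minutes


NUM_IMAGES = 100
inference_seconds = {"YOLOv5": 2.34, "YOLOv7": 1.797}
train_minutes = {"YOLOv5": to_minutes(1, 3), "YOLOv7": to_minutes(2, 33)}

throughput = {name: images_per_second(NUM_IMAGES, s)
              for name, s in inference_seconds.items()}
speedup = inference_seconds["YOLOv5"] / inference_seconds["YOLOv7"]
extra_training = train_minutes["YOLOv7"] - train_minutes["YOLOv5"]

for name in throughput:
    print(f"{name}: {throughput[name]:.1f} images/s, "
          f"trained in {train_minutes[name]} min")
print(f"YOLOv7 inference speed-up: {speedup:.2f}x; "
      f"extra training time: {extra_training} min")
```

On these numbers YOLOv5 processes about 42.7 images per second and YOLOv7 about 55.6, a speed-up of roughly 1.30x, while YOLOv7 needs 90 more minutes of training.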

Figure 7: Observation of diseased leaves

Table 1: Comparison of YOLO versions

V. CONCLUSION

This paper examines the performance of YOLOv5 and YOLOv7, both single-stage object detectors, on the Plant Doc dataset. Our findings show that YOLOv7 demands significantly more processing power than YOLOv5 during training. Despite being trained with the same parameters on the same dataset, YOLOv7 takes 1 hour and 30 minutes longer than YOLOv5. Additionally, YOLOv7 exhibits more fluctuation in mean Average Precision (mAP) from epoch to epoch, whereas YOLOv5 consistently outperforms it. Both models, trained on 35 classes with a low ratio of classes to training data size, demonstrate strong performance. This study highlights the need to extend the analysis conducted here toward a unified multiple-object detection model capable of identifying both plant diseases and pests within a single image. Our future plans involve developing a comprehensive hybrid model that can detect both plant diseases and pests using a single algorithm. We also aim to deploy this hybrid model for real-world use, allowing users to interact with the system to detect plant diseases and pests in the field.

VI. REFERENCES

& Technology – Signal and Information Processing (ICETET – SIP), Nagpur, India, 2023, pp. 1-6, doi: 10.1109/ICETET-SIP58143.2023.10151484.

R. R. Maaliw, "Deep Learning Technique Detection for Cotton and Leaf Classification Using the YOLO Algorithm," 2022 International Conference on Smart Information Systems and Technologies (SIST), Nur-Sultan, Kazakhstan, 2022, pp. 1-6, doi: 10.1109/SIST54437.2022.9945757.

January 22, 2021.

1-9, 29 June 2021.

August 2021.

IEEE Xplore.