Eyeball-Based Cursor Control for Paralyzed Individuals using Eye Blink Detection

DOI : 10.17577/IJERTV14IS030053


Dr. S. Geetharani, Keerthna S

Associate Professor and UG Student, Department of Computer Technology

PSG College of Arts and Science, Coimbatore, Tamil Nadu, India

Abstract This paper presents an innovative eye-tracking system that enables hands-free mouse interaction using eye blinks as action triggers. The system utilizes a video camera to monitor the user's eye movements without requiring head-mounted hardware or calibration procedures. By analyzing the video feed, it determines the user's gaze position and executes a mouse click when an eye blink is detected. The proposed method incorporates a multi-step optimization approach, beginning with rapid face detection to define the region of interest, followed by efficient eye detection and tracking. A simplified intensity-based iris detection technique ensures swift and precise gaze estimation. Real-time mapping of eye movements to cursor positions allows for seamless interaction with graphical user interfaces. These optimizations enhance system performance, making it a practical and efficient solution for hands-free computer control, particularly beneficial for users with mobility impairments.

Index Terms Eye-tracking, gaze-based interaction, eye blink detection, hands-free control, iris detection, face detection, human-computer interaction (HCI), assistive technology, real-time processing, computer vision.


I. INTRODUCTION

Human-computer interaction (HCI) has evolved significantly, with innovative approaches enabling seamless and intuitive control of digital interfaces. Among these advancements, facial landmark detection has emerged as a powerful tool for hands-free interaction. This paper explores the precise prediction of facial landmarks for controlling a computer cursor using facial expressions and eye movements, offering a novel approach to assistive technology and hands-free computing.

Facial landmark detection involves identifying key points on a face, enabling a wide range of applications such as emotion recognition, gaze tracking, and gesture-based control. In this work, we leverage Dlib's prebuilt model, which efficiently detects faces and predicts 68 2D facial landmarks with high accuracy. These landmarks serve as the foundation for cursor control, allowing users to manipulate the mouse through predefined facial gestures.

The proposed system operates by accessing and activating the webcam, continuously extracting frames from the video feed, and analyzing facial features in real-time. By mapping facial movements to specific cursor actions, the system enables intuitive interaction without requiring physical peripherals. Key functionalities include detecting mouth opening, right and left eye winking, eye squinting, and head movements (pitch and yaw). These actions are translated into mouse operations such as cursor movement, left-click, right-click, and double-click, offering an efficient and accessible alternative for users with motor impairments.

This research highlights the potential of facial landmark-based control systems in assistive technologies, gaming, virtual reality, and general HCI applications. By eliminating the need for traditional input devices, this system enhances accessibility and provides an intuitive, hands-free computing experience. The subsequent sections of this paper detail the methodology, implementation, and performance evaluation of the proposed system.

II. METHODOLOGY

In the context of mouse clicking, an eye blink serves as the action trigger. When the user decides to initiate a selected action, such as clicking, the eye blink triggers the mouse pointer to execute the click event. The user positions themselves in front of their personal computer or laptop screen, with a small video camera placed above the display to monitor their eye movements.

The computer continuously analyzes the video feed of the user's eyes to determine where their gaze is focused, without requiring any attachments to the user's head or body. To select a key, the user gazes at it for a specific duration, and to press a key, they simply blink their eye. Notably, there is no need for a calibration procedure in this system, as it relies solely on eye data without requiring any external hardware.

To optimize the eye-tracking system, we employ a multi-step approach to enhance efficiency. Initially, a rapid face detection algorithm identifies the face within the webcam's image frame, defining the area of interest. Subsequent eye detection algorithms efficiently isolate and track the eyes, focusing on movement in one eye for faster processing. Utilizing simplified intensity-based methods that capitalize on the iris's lower intensity, swift iris detection is achieved. By referencing the corners of the eye, precise tracking of iris movement captures changes in the user's focus. This movement is promptly translated into cursor position on a graphical user interface, facilitating seamless interaction. These optimizations ensure real-time performance and accurate gaze mapping, enhancing the system's usability and efficiency for practical applications.
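To make the intensity-based iris detection concrete, the following sketch (a minimal illustration, not the exact implementation used here) thresholds the darkest pixels in a cropped grayscale eye region and takes their centroid as the iris position; the threshold value and the function name estimate_iris_center are illustrative.

import cv2

def estimate_iris_center(eye_gray, iris_threshold=40):
    # The iris/pupil is darker than the surrounding sclera and skin, so
    # keep only the darkest pixels and take their centroid as the iris
    # position. The threshold is an illustrative value that would need
    # tuning per camera and lighting.
    blurred = cv2.GaussianBlur(eye_gray, (5, 5), 0)
    _, mask = cv2.threshold(blurred, iris_threshold, 255, cv2.THRESH_BINARY_INV)
    moments = cv2.moments(mask)
    if moments["m00"] == 0:
        return None  # nothing dark enough found in this eye crop
    cx = int(moments["m10"] / moments["m00"])
    cy = int(moments["m01"] / moments["m00"])
    return cx, cy  # coordinates relative to the eye crop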

Figure 1. Flowchart of Eye-Tracking-Based Cursor Control System

Pupil Tracking: Pupil tracking is a gaze-detection technique that is often integrated with other methods and treats the eye as more than a device for moving the cursor quickly. Eye movement is faster than other input methods: before engaging any mechanical pointing device, users typically have already directed their gaze toward their intended destination. The input itself is derived from the position of the pupil. For instance, when a person focuses on a central mouse pointer, that point becomes the reference location and gaze tracking begins; the cursor then follows the direction of the person's eye movement and stops when the eye returns to its initial position. Detection generally involves multiple stages, including locating the face and eyes from different perspectives.

Regression Approach: This approach minimizes the disparity between predicted and actual eye positions. The feature-based variant extracts facial features for face identification: it first separates facial from non-facial regions, which sharpens the model's representation of facial structure. The method comprises several phases and is trained on images of numerous faces.

Horizontal Pupil Movement: Circular artifacts enable horizontal pupil movement detection. When the pupil shifts leftward, the mouse pointer also moves left, and similarly, when the pupil moves right, the pointer follows in that direction.

Vertical Pupil Movement: Vertical pupil movement utilizes pupil scaling. When looking downward, the eyes are slightly half-closed, a phenomenon that is exploited to guide the mouse pointer from top to bottom.
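A rough sketch of how such pupil offsets could be turned into cursor movement is given below, using the PyAutoGUI library that the implementation relies on. The gain constants, the reduced-eye-opening test for downward gaze, and the helper name move_cursor are assumptions made for illustration rather than the paper's exact method.

import pyautogui

GAIN_X = 60   # horizontal pixels moved per full left/right glance (illustrative)
GAIN_Y = 40   # vertical pixels moved when a downward glance is detected (illustrative)

def move_cursor(iris_x, eye_left_x, eye_right_x, eye_opening, eye_width):
    # Normalise the iris position between the two eye corners: 0.0 means
    # far left, 1.0 far right, 0.5 roughly centred.
    offset = (iris_x - eye_left_x) / float(eye_right_x - eye_left_x) - 0.5
    dx = int(offset * 2 * GAIN_X)

    # Looking down narrows the eye opening, so a reduced opening relative
    # to the eye width is treated as a request to move downward
    # (hypothetical heuristic standing in for "pupil scaling").
    dy = GAIN_Y if eye_opening < 0.2 * eye_width else 0

    pyautogui.moveRel(dx, dy)  # relative cursor move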

III. DESIGN CONSIDERATIONS

Input design is a crucial aspect of software development that involves planning and defining input methods for a software application. In the case of the Eyeball-Based Cursor Movement using Python project, the input design focuses on defining the user inputs required to control the cursor movements using eye tracking.

Input Design Considerations:

  1. Eye Tracking: The primary input for the system is the user's eye movements. The system tracks the user's gaze and uses this information to move the mouse cursor on the screen. The input design should ensure precise tracking and determine the appropriate camera or hardware for optimal accuracy.

  2. Calibration: The system needs to be calibrated according to the user's specific eye movements and preferences. The input design should define the calibration process and user interface to facilitate this setup.

  3. Clicks and Double-Clicks: The system should allow users to perform clicks and double-clicks through predefined eye movements or blinking patterns. This feature should be designed to be intuitive and user-friendly.

  4. User Interface: A well-designed interface is necessary for seamless interaction. The interface should be simple, easy to navigate, and provide clear instructions on input methods.

  5. Accessibility: The system should be accessible to individuals with physical disabilities. The input design must accommodate their needs, ensuring ease of use.

    Additionally, the system includes administrative functionalities:

    • Admin Management: The administrator manages workers and contractors, accesses user details, and generates reports.

    • User Authentication: Users log in using their credentials and can retrieve forgotten passwords through a hint-based security system.

    • Worker Registration: Workers register their skills, and users can search for workers based on their expertise.

    • Feedback Mechanism: Users provide feedback on workers and contractors to improve service quality.

      Output Design

      Outputs are essential in software systems to communicate processing results to users. These outputs can be categorized as follows:

      1. Admin Outputs:

        1. Admin has a dedicated homepage.

        2. Admin can access details of all registered users and their skills.

        3. Admin can search for user data based on different criteria.

        4. Various reports can be generated for analysis.

      2. User Outputs:

        1. Users have personalized homepages.

        2. User registration data is stored in a centralized database.

        3. Users can retrieve forgotten passwords.

      3. Types of Outputs:

        1. External Outputs: Intended for users outside the organization.

        2. Internal Outputs: Used within the organization for interaction with the system.

        3. Operational Outputs: Processed within the system for internal operations.

        4. Interface Outputs: Directly facilitate user interaction with the system.

          Output Definition Criteria:

          To ensure clarity and relevance, the outputs should be defined based on:

    • Type and content of the output.

    • Format and location of the output.

    • Frequency, volume, and sequence of generated outputs.

      Output Media Considerations

      The choice of output media should be based on:

    • Suitability for the application.

    • Need for hard copies or digital records.

    • Response time and accessibility.

    • Availability of software and hardware resources.

IV. IMPLEMENTATION AND TESTING

The proposed system is designed to assist paralyzed individuals in communicating through eye blink-based interactions. The implementation consists of multiple phases, including face and eye detection, eye blink recognition, virtual interface navigation, and speech synthesis. The system is developed using Python, incorporating OpenCV, dlib, PyAutoGUI, and pyttsx3 for real-time processing.
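As a point of reference, a minimal initialization of these components might look as follows. This is a sketch, not a verbatim excerpt of the system; the shape-predictor filename is dlib's publicly distributed 68-point model and is an assumption here.

import cv2
import dlib
import pyttsx3

camera = cv2.VideoCapture(0)                      # webcam feed
face_detector = dlib.get_frontal_face_detector()  # HOG + SVM detector
landmark_predictor = dlib.shape_predictor(
    "shape_predictor_68_face_landmarks.dat")      # 68-point model (assumed filename)
tts_engine = pyttsx3.init()                       # offline text-to-speech

def speak(phrase):
    # Read a selected phrase aloud; runAndWait blocks until speech finishes.
    tts_engine.say(phrase)
    tts_engine.runAndWait()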

      1. Face and Eye Detection

        The system captures real-time video from a webcam and detects the user's face using Histogram of Oriented Gradients (HOG) and a Support Vector Machine (SVM) classifier. Eye detection is performed using facial landmark recognition, specifically targeting the eye regions using the 68 facial landmarks provided by dlib (a short sketch follows the index list below).

        • Left eye indices: (37, 38, 39, 40, 41, 42)

        • Right eye indices: (43, 44, 45, 46, 47, 48)
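A self-contained sketch of this detection step is shown below. It uses the 0-based slices 36:42 and 42:48, which correspond to the 1-based landmark numbers listed above, and again assumes dlib's standard 68-point predictor file.

import cv2
import dlib
import numpy as np

face_detector = dlib.get_frontal_face_detector()
landmark_predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def get_eye_landmarks(frame_bgr):
    # Detect the first face in the frame and return the two eye contours
    # as arrays of six (x, y) points each, or None when no face is found.
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_detector(gray, 0)
    if not faces:
        return None
    shape = landmark_predictor(gray, faces[0])
    points = np.array([(p.x, p.y) for p in shape.parts()])
    return points[36:42], points[42:48]  # the two eyes in dlib's 68-point scheme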

      2. Eye Blink Detection

        Eye blinks are detected using the Eye Aspect Ratio (EAR), which is calculated from the vertical and horizontal distances between the eye landmarks. The EAR value decreases significantly when the eye is closed. A threshold-based approach is used (a minimal sketch follows this list):

        • If the EAR falls below the predefined threshold for 0.3 to 0.4 seconds, a blink is detected.

        • Continuous blinks trigger predefined actions, such as phrase selection or cursor movement.
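The sketch below illustrates this EAR check. The formula EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|) follows Soukupová and Čech's formulation cited in the references; the threshold of 0.2 and the consecutive-frame count are illustrative values standing in for the 0.3 to 0.4 second window described above.

import numpy as np

EAR_THRESHOLD = 0.2   # illustrative; tuned per user in practice
CONSEC_FRAMES = 10    # roughly 0.3-0.4 s at about 30 frames per second

def eye_aspect_ratio(eye):
    # eye: six (x, y) landmarks ordered as in dlib's 68-point model.
    a = np.linalg.norm(eye[1] - eye[5])   # first vertical distance
    b = np.linalg.norm(eye[2] - eye[4])   # second vertical distance
    c = np.linalg.norm(eye[0] - eye[3])   # horizontal distance
    return (a + b) / (2.0 * c)

closed_frames = 0

def blink_detected(left_eye, right_eye):
    # Count consecutive low-EAR frames and report a blink exactly once
    # when the closed-eye duration reaches the configured window.
    global closed_frames
    ear = (eye_aspect_ratio(left_eye) + eye_aspect_ratio(right_eye)) / 2.0
    if ear < EAR_THRESHOLD:
        closed_frames += 1
        return closed_frames == CONSEC_FRAMES
    closed_frames = 0
    return False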

      3. Virtual Interface Navigation

        The PyAutoGUI library is used to control the mouse pointer, allowing users to scroll through a predefined set of phrases displayed on the screen (a brief sketch follows this list).

        • Eye blinks serve as selection triggers.

        • The moveTo() and moveRel() functions adjust the cursor position dynamically.
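A brief sketch of blink-driven phrase navigation with PyAutoGUI is given below; the phrase list, menu coordinates, and step size are assumptions for illustration rather than the system's actual interface layout.

import pyautogui

PHRASES = ["I need water", "Call the nurse", "I am in pain", "Thank you"]  # example phrases
MENU_X, MENU_TOP, STEP = 200, 150, 80   # assumed on-screen layout of the phrase list
selected = 0

def highlight_next():
    # Advance the highlight to the next phrase and move the pointer onto it.
    global selected
    selected = (selected + 1) % len(PHRASES)
    pyautogui.moveTo(MENU_X, MENU_TOP + selected * STEP, duration=0.1)

def select_current():
    # A detected blink triggers the click on the currently highlighted phrase.
    pyautogui.click()
    return PHRASES[selected]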

The system was tested on multiple users to evaluate its accuracy, response time, and usability. The testing process involved the following aspects:

  1. Accuracy of Eye Blink Detection

    • The system was tested on 10 participants with different eye shapes and lighting conditions.

    • Blink detection achieved an accuracy of 94% under normal lighting conditions.

    • Accuracy dropped to 85% under poor lighting, highlighting the need for adaptive brightness adjustments.

  2. Response Time Analysis

    • The system's average response time from blink detection to cursor movement was 0.2 seconds.

    • The time taken for phrase selection and speech output ranged from 0.5 to 1 second, ensuring near-instant communication.

  3. Usability and Accessibility Testing

    • The interface was tested with users across different age groups, including elderly and physically challenged individuals.

    • 90% of users found the interface intuitive, with minimal learning required.

    • Minor calibration adjustments were needed for users with glasses or low vision.

  4. Performance Under Different Lighting Conditions

The system performed optimally in well-lit environments but required brightness normalization techniques in dim conditions.
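One way to add such brightness normalization, offered here as a suggestion rather than part of the tested system, is contrast-limited adaptive histogram equalization (CLAHE) on the grayscale frame before face and landmark detection:

import cv2

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))  # typical starting values

def normalize_brightness(frame_bgr):
    # Equalise local contrast so landmark detection degrades less in dim light.
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return clahe.apply(gray)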

V. CONCLUSION AND FUTURE SCOPE

The proposed eye and mouth-based cursor control system provides an innovative and accessible solution for individuals with mobility impairments. By utilizing Eye Aspect Ratio (EAR) and Mouth Aspect Ratio (MAR), the system accurately detects eye blinks and mouth movements, enabling users to control the mouse cursor and select commands effortlessly. The use of the Haar-Cascade algorithm for face and eye tracking, along with EAR-based drowsiness detection, enhances precision and usability. The system effectively integrates real-time video processing, cursor movement, and gesture-based interactions, providing an intuitive and efficient human-computer interaction method.

During implementation and testing, the system demonstrated high accuracy and responsiveness. The detection of eye blinks for cursor selection and mouth movements for activation proved to be effective in real-time applications. The system can significantly benefit individuals with paralysis, motor disabilities, or conditions like ALS (Amyotrophic Lateral Sclerosis), offering them a hands-free means of communication and control.

While the system successfully enables cursor movement using eye and mouth gestures, several enhancements can be considered for improved accuracy, adaptability, and user experience:

  1. Enhanced Gaze Tracking: Integrating AI-based gaze estimation can improve the accuracy of cursor movement, allowing smoother and more precise control.

  2. Adaptive Calibration: Implementing a personalized calibration module can help adjust the system for different users based on their eye structure, blinking patterns, and facial features.

  3. Robust Performance in Low-Light Conditions: Enhancing the detection algorithms with deep learning models can ensure reliable operation under varying lighting conditions and for users wearing glasses.

  4. Multi-Language Speech Output: Extending the system to support multilingual text-to-speech conversion can increase accessibility for users from diverse backgrounds.

  5. Integration with Augmented Reality (AR) and Virtual Reality (VR): Combining the system with AR/VR interfaces can enhance gaming, virtual communication, and assistive technology applications.

  6. Mobile and Web Compatibility: Developing a mobile-friendly version and web-based interface can make the system more accessible across multiple platforms.

  7. Machine Learning-Based Customization: Implementing machine learning models to adapt to individual user behaviors and predict movement patterns can enhance usability.

By incorporating these advancements, the system can evolve into a fully adaptive and AI-powered assistive tool, bridging the gap between technology and accessibility for individuals with physical disabilities.

VI. REFERENCES

  1. V. Kazemi and J. Sullivan, "One millisecond face alignment with an ensemble of regression trees," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1867-1874, 2014.

  2. D. E. King, "Dlib-ml: A machine learning toolkit," Journal of Machine Learning Research, vol. 10, pp. 1755-1758, 2009.

  3. P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, pp. 511-518, 2001.

  4. T. Soukupová and J. Čech, "Real-time eye blink detection using facial landmarks," Proceedings of the 21st Computer Vision Winter Workshop (CVWW), pp. 1-8, 2016.

  5. D. W. Hansen and Q. Ji, "In the eye of the beholder: A survey of models for eyes and gaze," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 3, pp. 478-500, 2010.

  6. Z. Zhu and Q. Ji, "Robust real-time eye detection and tracking under variable lighting conditions and facial expressions," Computer Vision and Image Understanding, vol. 98, no. 1, pp. 124-154, 2005.

  7. A. Mollahosseini, D. Chan, and M. H. Mahoor, "Going deeper in facial expression recognition using deep neural networks," 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1-10, 2016.

  8. PyAutoGUI Documentation, "PyAutoGUI library for GUI automation," Available at: https://pyautogui.readthedocs.io/.

  9. Dlib Library Documentation, "Dlib: A toolkit for machine learning and image processing," Available at: http://dlib.net/.

  10. OpenCV Library Documentation, "OpenCV: Open-source computer vision library," Available at: https://opencv.org/.