Real-Time Gaze Estimation Using Webcam-Based CNN Models for Human-Computer Interactions

Visal Vidhya, Diego Resende Faria

Research output: Contribution to journal › Article › peer-review

Abstract

Gaze tracking and estimation are essential for understanding human behavior and enhancing human–computer interactions. This study introduces an innovative, cost-effective solution for real-time gaze tracking using a standard webcam, providing a practical alternative to conventional methods that rely on expensive infrared (IR) cameras. Traditional approaches, such as Pupil Center Corneal Reflection (PCCR), require IR cameras to capture corneal reflections and iris glints, demanding high-resolution images and controlled environments. In contrast, the proposed method utilizes a convolutional neural network (CNN) trained on webcam-captured images to achieve precise gaze estimation. The developed deep learning model achieves a mean squared error (MSE) of 0.0112 and an accuracy of 90.98% through a novel trajectory-based accuracy evaluation system. This system involves an animation of a ball moving across the screen, with the user’s gaze following the ball’s motion. Accuracy is determined by calculating the proportion of gaze points falling within a predefined threshold based on the ball’s radius, ensuring a comprehensive evaluation of the system’s performance across all screen regions. Data collection is both simplified and effective, capturing images of the user’s right eye while they focus on the screen. Additionally, the system includes advanced gaze analysis tools, such as heat maps, gaze fixation tracking, and blink rate monitoring, which are all integrated into an intuitive user interface. The robustness of this approach is further enhanced by incorporating Google’s Mediapipe model for facial landmark detection, improving accuracy and reliability. The evaluation results demonstrate that the proposed method delivers high-accuracy gaze prediction without the need for expensive equipment, making it a practical and accessible solution for diverse applications in human–computer interactions and behavioral research.
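
The trajectory-based accuracy described in the abstract (the share of gaze points that fall within a threshold derived from the ball's radius) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of normalized screen coordinates, the one-gaze-sample-per-frame pairing, and the choice of threshold value are all assumptions made for illustration, and the MSE here is averaged over every coordinate, which may differ from the paper's definition.

```python
import math

def trajectory_accuracy(gaze_points, ball_centers, threshold):
    """Fraction of gaze points lying within `threshold` of the ball centre.

    Assumes one predicted gaze point per animation frame, paired by index
    with the ball centre for that frame (hypothetical pairing scheme).
    """
    if len(gaze_points) != len(ball_centers):
        raise ValueError("need one gaze point per ball position")
    if not gaze_points:
        return 0.0
    hits = sum(
        1
        for (gx, gy), (bx, by) in zip(gaze_points, ball_centers)
        if math.hypot(gx - bx, gy - by) <= threshold
    )
    return hits / len(gaze_points)

def mean_squared_error(predicted, target):
    """Mean of squared per-coordinate errors over all (x, y) pairs."""
    squared = [
        (p - t) ** 2
        for pred_pt, true_pt in zip(predicted, target)
        for p, t in zip(pred_pt, true_pt)
    ]
    return sum(squared) / len(squared)

# Illustrative data only: a ball moving left to right at mid-height,
# in normalized screen coordinates (assumed range 0..1).
ball = [(0.1, 0.5), (0.3, 0.5), (0.5, 0.5), (0.7, 0.5)]
gaze = [(0.12, 0.5), (0.3, 0.53), (0.65, 0.5), (0.7, 0.5)]

accuracy = trajectory_accuracy(gaze, ball, threshold=0.05)
mse = mean_squared_error(gaze, ball)
print(f"accuracy={accuracy:.2%}, mse={mse:.6f}")
```

In this toy run three of the four gaze points land within the threshold, so the accuracy is 75%. Because the ball sweeps across the screen, the same proportion computed over a full trajectory reflects performance in every screen region rather than at a few fixed calibration targets.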
Original language: English
Article number: 57
Pages (from-to): 1-27
Number of pages: 27
Journal: Computers
Volume: 14
Issue number: 2
Early online date: 10 Feb 2025
Publication status: Published - 10 Feb 2025

Keywords

  • CNN
  • eye tracking
  • gaze estimation
