Fall Detection Analysis With Machine Learning

An Introduction to Real-Time Fall Detection Using Computer Vision and Machine Learning

Leonardo Gamboa Uribe
8 min read · Feb 5, 2023

Accidental falls are the second leading cause of unintentional injury death worldwide. Every year, an estimated 684,000 people die from falls, more than 80% of them in low- and middle-income countries. Adults over the age of 60 suffer the greatest number of fatal falls [1]. Patients with Alzheimer’s, Parkinson’s, and other pathologies that involve cognitive impairment are also affected.

Events of this kind have been studied extensively in Artificial Intelligence research. This article focuses on analyzing the data provided by Computer Vision, which helps us identify specific behaviors or postures characteristic of a possible fall [2].

Steps to estimate a fall. Source referenced [1]

Computer vision detects patterns in the human body and represents them as points, or coordinates, of the human skeleton. These points make it easier to extract important features of body symmetry.

To facilitate the task of pattern detection, we need a pre-trained machine learning model that detects the joints of the body in an image or video. In this article, we will use Google’s MediaPipe Pose, which is based on the BlazePose convolutional neural network [3] and provides extraction and tracking of the skeleton data and its movement.

These features help predict human posture: the vertical angle of the trunk between the hip and the shoulder, the gait speed, and the width-height ratio of the rectangle that bounds the human body.

Reality or fiction? In the following image (left), we can see a scene from the movie Mission: Impossible - Rogue Nation, where gait analysis is used as a biometric authentication method. The image on the right shows a basic example of the tracking produced by the pre-trained BlazePose model [3]. We will develop this model in my next article.

Source: Image (left) scene from Mission: Impossible - Rogue Nation (Paramount Pictures). Image (right) by the author.

Video pose estimation plays a critical role in overlaying digital content and information on the physical world in augmented reality, in sign language recognition, in full-body gesture control, in quantifying physical exercises and, in our case, in fall detection. Video pose estimation for accident prevention is particularly challenging because of the wide variety of possible poses and the numerous degrees of freedom [3], and it has to work regardless of the type of clothing the person wears.

Key points detected using Mediapipe code
Source: Image by the author

Google’s MediaPipe Pose uses a two-step ML solution (detector and tracker), which has proven to be effective in detecting points in the human body.


The detector is inspired by the BlazeFace model [3] and is used as a person detector. It predicts virtual key points that describe the center, rotation, and scale of the human body as a circle. The tracker then predicts the pose landmarks and the segmentation mask within the region of interest (ROI), using the ROI-cropped frame as input [3].

Inspired by Leonardo’s Vitruvian Man, the detector predicts the midpoint of a person’s hips, the radius of a circle that circumscribes the entire person, and the angle of inclination of the line connecting the midpoints of the shoulders and hips [3].
Leonardo’s Vitruvian Man. Source: Internet.
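To make these three quantities concrete, here is a minimal sketch of their geometric definitions. It is not the BlazePose detector itself, which regresses these values with a neural network; the function name `alignment_circle` and the choice to measure the tilt against the upward vertical in image coordinates (y grows downward) are assumptions made for illustration.

```python
import math


def alignment_circle(keypoints, left_hip, right_hip, left_shoulder, right_shoulder):
    """Return (hip_midpoint, radius, tilt_degrees) for 2D (x, y) key points.

    Illustrative geometry only: the real detector predicts these values
    directly instead of computing them from known key points.
    """
    hip_mid = ((left_hip[0] + right_hip[0]) / 2, (left_hip[1] + right_hip[1]) / 2)
    shoulder_mid = (
        (left_shoulder[0] + right_shoulder[0]) / 2,
        (left_shoulder[1] + right_shoulder[1]) / 2,
    )
    # Radius of the circle centered on the hips that encloses every key point.
    radius = max(math.dist(hip_mid, p) for p in keypoints)
    # Inclination of the hip-to-shoulder line, 0 degrees when upright.
    dx = shoulder_mid[0] - hip_mid[0]
    dy = hip_mid[1] - shoulder_mid[1]  # image y axis points downward
    tilt = math.degrees(math.atan2(dx, dy))
    return hip_mid, radius, tilt
```

For an upright person the tilt is close to 0 degrees, and the circle reaches the farthest key point from the hips, usually the head or the feet.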


In this way, Google’s MediaPipe Pose returns 2D and 3D coordinates (key points). The 2D key points are used to calculate the gait speed, the skeletal posture, and the width-height ratio of the body’s bounding rectangle, while the 3D key points are used to calculate the trunk angle and, in our case, to predict a fall event [3].
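As an illustration of the width-height ratio, the sketch below computes it from the bounding rectangle of a set of 2D key points. The function name `width_height_ratio` is hypothetical, and the points are assumed to be in pixel coordinates already (normalized coordinates would first need to be multiplied by the image width and height).

```python
def width_height_ratio(points):
    """Width-height ratio of the rectangle that bounds 2D (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    if height == 0:
        # Degenerate case: all points lie on one horizontal line.
        return float("inf")
    return width / height
```

A standing person produces a tall, narrow rectangle (ratio well below 1), while a person lying on the floor produces a wide, flat one (ratio above 1).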

The 3D key points are an array of 33 objects (landmarks), each with x, y, and z coordinates expressed in meters. The values on each axis are normalized to the range -1 to 1, and the origin of the 3D space is the center of the hips (0, 0, 0) [7].

Pose landmarks (left). Source referenced [4]. Key points detected using MediaPipe.
Source: Image by the author
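The structure of this array can be sketched as follows. The helper reads the `pose_world_landmarks.landmark` list that the MediaPipe Pose Python solution returns; `fake_results` is a hypothetical stand-in built with `SimpleNamespace` so that the example runs without a camera or the MediaPipe package, and the helper names are assumptions.

```python
from types import SimpleNamespace

LEFT_HIP, RIGHT_HIP = 23, 24  # MediaPipe Pose landmark indices


def world_points(results):
    """Return the world landmarks as (x, y, z) tuples, or [] if no pose was found."""
    landmarks = getattr(results, "pose_world_landmarks", None)
    if landmarks is None:
        return []
    return [(lm.x, lm.y, lm.z) for lm in landmarks.landmark]


def hip_center(points):
    """Midpoint of the left and right hip, expected near the origin."""
    left, right = points[LEFT_HIP], points[RIGHT_HIP]
    return tuple((a + b) / 2 for a, b in zip(left, right))


def fake_results():
    """Hypothetical stand-in for a MediaPipe result, used only for this demo."""
    lms = [SimpleNamespace(x=0.0, y=0.0, z=0.0) for _ in range(33)]
    lms[LEFT_HIP] = SimpleNamespace(x=0.1, y=0.0, z=0.0)
    lms[RIGHT_HIP] = SimpleNamespace(x=-0.1, y=0.0, z=0.0)
    return SimpleNamespace(pose_world_landmarks=SimpleNamespace(landmark=lms))
```

With a real video frame, `results` would come from the `process` method of the MediaPipe Pose solution, and the same helpers would apply unchanged.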

Approaches that allow us to predict a fall

To detect a potential fall, we can use different approaches. To stay practical and convey the idea, we will explain the one that is simplest to represent with Google’s MediaPipe Pose.

The trunk angle is the angle between the trunk and the vertical plane through the center of mass of the body (hip). Trunk control is an important feature in the assessment of functional gait balance while walking, as it shows the ability of the trunk muscles to maintain an upright or neutral position, support body weight, and move selectively to keep the center of vision and the center of gravity over the base of support. In other words, trunk angle has a significant relationship with trunk balance during functional activities and contributes to the probability of falling [10].

Angle between shoulder and hip points. Source referenced [4]

Typically, when the individual is standing, the body is upright and the angle between the vertical line through the center of gravity and the shoulder position is approximately 0 degrees; when the individual walks, the angle is between 45 and 90 degrees; when the individual is falling, the angle is greater than 90 degrees.
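These thresholds can be sketched in a few lines. The function names are hypothetical, the points are assumed to be 2D image coordinates (y grows downward), and treating every angle below 45 degrees as "standing" is an assumption, since the text only gives approximately 0 degrees for that posture.

```python
import math


def trunk_angle(shoulder_mid, hip_mid):
    """Angle in degrees between the hip-to-shoulder line and the upward vertical."""
    dx = shoulder_mid[0] - hip_mid[0]
    dy = hip_mid[1] - shoulder_mid[1]  # positive when the shoulders are above the hips
    return math.degrees(math.atan2(abs(dx), dy))


def classify_posture(angle):
    """Map a trunk angle to the three postures described in the text."""
    if angle > 90:
        return "falling"
    if angle >= 45:
        return "walking"
    return "standing"  # assumption: anything below 45 degrees counts as upright
```

The angle ranges from 0 degrees (shoulders directly above the hips) through 90 degrees (trunk horizontal) to 180 degrees (shoulders directly below the hips).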

As we can see in the following animation (made with Google’s MediaPipe Pose), it is possible to obtain the positions of the shoulder (A), hip (B), and knee (C) joints. Averaging the left and right joints gives the midpoints, from which we can calculate the hip angle. For example, an angle of 0º means that the body is fully upright, while at an angle of more than 90º we assume that there may be a risk of falling for an older person [6].

Angle between the three points A, B and C. Source: Image by the author
Calculating the angle between the three points A, B and C. Source referenced [6]
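The three-point calculation can be sketched as below. The interior angle at B is 180º when shoulder, hip, and knee are aligned, so the sketch also reports the deviation from a straight line (180º minus the interior angle) to match the convention above in which 0º means fully upright. That reading of the convention, and the function names, are assumptions.

```python
import math


def midpoint(p, q):
    """Average of a left and a right joint given as (x, y) points."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def joint_angle(a, b, c):
    """Interior angle at b, in degrees, formed by the points a-b-c."""
    angle_ba = math.atan2(a[1] - b[1], a[0] - b[0])
    angle_bc = math.atan2(c[1] - b[1], c[0] - b[0])
    angle = abs(math.degrees(angle_bc - angle_ba))
    return 360 - angle if angle > 180 else angle


def bend_from_upright(shoulder, hip, knee):
    """Deviation from a straight shoulder-hip-knee line; 0 means fully upright."""
    return 180 - joint_angle(shoulder, hip, knee)
```

Here A, B, and C would be the shoulder, hip, and knee midpoints obtained with `midpoint` from the left and right landmarks.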

Gait speed, the width-height ratio of the body’s bounding rectangle, and balance are only mentioned in this article, but evaluating them is a vital step in identifying whether a person is at increased risk of falls (Peel et al., 2013). Research has shown a strong association between gait speed, balance, and falls, so gait speed as a single factor is sufficient to identify whether a person is at high risk of falling.
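Although gait speed is not developed further here, a rough estimate can be sketched from the displacement of the hip midpoint across consecutive frames. The function name `gait_speed` and the assumption of a constant frame rate are illustrative; the result is in position units per second (pixels per second, unless the coordinates are calibrated to meters).

```python
import math


def gait_speed(hip_positions, fps):
    """Average speed of the hip midpoint over a sequence of frames.

    hip_positions: one (x, y) point per frame, in frame order.
    fps: frames per second of the video (assumed constant).
    """
    if len(hip_positions) < 2:
        return 0.0
    # Total path length travelled between consecutive frames.
    path = sum(math.dist(p, q) for p, q in zip(hip_positions, hip_positions[1:]))
    duration = (len(hip_positions) - 1) / fps
    return path / duration
```

In practice the positions would be smoothed first, because landmark jitter between frames inflates the measured path length.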

Let’s CODE

Having explained how the images are processed and why the skeleton key points that MediaPipe provides are important, we will now develop PHASE 1: a very simple example that will help us understand and consolidate the concepts.

To reduce the computational time and increase the accuracy of the prediction, we will select only key points that are highly related to the behavior of the fall, which are: the head, shoulders, hip, knee, ankle, and feet. The X and Y axes of the chosen skeletal points will be used for the different fall prediction methods.
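This selection can be sketched as a lookup table of landmark indices. The indices follow the MediaPipe Pose landmark numbering; the names `FALL_KEYPOINTS` and `select_xy`, and the use of the nose to represent the head, are assumptions for illustration.

```python
# Subset of the 33 MediaPipe Pose landmarks related to fall behavior.
FALL_KEYPOINTS = {
    "nose": 0,  # stands in for the head
    "left_shoulder": 11,
    "right_shoulder": 12,
    "left_hip": 23,
    "right_hip": 24,
    "left_knee": 25,
    "right_knee": 26,
    "left_ankle": 27,
    "right_ankle": 28,
    "left_foot_index": 31,
    "right_foot_index": 32,
}


def select_xy(landmarks):
    """Keep only the X and Y values of the selected key points.

    landmarks: sequence of 33 (x, y, z) tuples in MediaPipe order.
    """
    return {name: (landmarks[i][0], landmarks[i][1]) for name, i in FALL_KEYPOINTS.items()}
```

Working with 11 named points instead of 33 indexed ones also makes the fall rules easier to read.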

The following code was used to generate the (outdoor) images you have seen in this article, as well as the fall detector shown in the animation on the right below.

In lines 39 and 40, we mark with a yellow circle the joints that we are going to check: the shoulder (key point 12) and the knee (key point 26).

In line 34, we check whether the Y coordinate of the shoulder (key point 12) is within 50 points of the Y coordinate of the knee (key point 26); if so, we report “Fall Detected”.

As the example shows, there are many alternatives: we can check shoulder and ankle, head and knee, and so on, and we can also include the calculation of the angle of inclination. As you can see, there is a lot left to do.
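The rule itself can be sketched on its own. This is not the original listing that the line numbers above refer to; it is a minimal reconstruction that assumes normalized landmarks (values between 0 and 1) scaled to pixels by the image height, with a hypothetical function name `is_fall`.

```python
RIGHT_SHOULDER, RIGHT_KNEE = 12, 26  # key points used by the rule


def is_fall(landmarks, image_height, threshold_px=50):
    """Return True when shoulder and knee are at almost the same height.

    landmarks: sequence of 33 (x, y) or (x, y, z) tuples with normalized values.
    image_height: height of the frame in pixels.
    threshold_px: maximum vertical distance, 50 points as in the text.
    """
    shoulder_y = landmarks[RIGHT_SHOULDER][1] * image_height
    knee_y = landmarks[RIGHT_KNEE][1] * image_height
    return abs(shoulder_y - knee_y) < threshold_px


def label(landmarks, image_height):
    """Text to overlay on the frame, or an empty string when no fall is seen."""
    return "Fall Detected" if is_fall(landmarks, image_height) else ""
```

A fixed pixel threshold depends on how far the person is from the camera, so a threshold relative to the body height would likely be more robust.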

TO BE CONTINUED

The next articles:

PHASE 2: we will include deep neural networks (LSTM, GRU, CNN) and the attention mechanism to predict a fall based on the study of the balance of the human body. We will either use existing data sets [9] that label different situations, such as normal position and fall, or generate them ourselves from the data provided by MediaPipe (landmarks x, y, z).

PHASE 3: we will implement the solution in an Edge Computing environment on a Raspberry Pi 4, the device on which the tests shown in this article were carried out. Alert notifications will be processed in a Kafka queue and later integrated with WhatsApp or Telegram.

You can see the setup of Apache Kafka on a Raspberry Pi 4 in my article: https://medium.com/@novagenio/truth-1-apache-kafka-on-raspberry-pi-working-b43b9e2a8120

Conclusions

The use of computer vision based on pre-trained machine learning models, such as Google’s MediaPipe Pose, allows us to perform many different studies of the human body, and allows us to easily create applications that need real-time data. The possible areas of application are extensive, especially in the health area, in the prevention of accidents and injuries.

Additional Resources

Google’s MediaPipe Pose install: https://google.github.io/mediapipe/getting_started/install.html

References

[1] Falls. (2021, April 26). https://www.who.int/news-room/fact-sheets/detail/falls

[2] Lau, X. T. L. C. (n.d.). Fall Detection and Motion Analysis Using Visual Approaches. IJTech - International Journal of Technology. https://ijtech.eng.ui.ac.id/article/view/5840

[3] On-device, Real-time Body Pose Tracking with MediaPipe BlazePose. (2020, August 13). https://ai.googleblog.com/2020/08/on-device-real-time-body-pose-tracking.html

[4] Meel, V. (2023, January 1). What is the COCO Dataset? What you need to know in 2023. viso.ai. https://viso.ai/computer-vision/coco-dataset/

[5] Eversberg, L. (2023, January 14). Detecting Bad Posture With Machine Learning. Towards AI. Medium. https://pub.towardsai.net/detecting-bad-posture-with-machine-learning-be4b9de763d0

[6] MediaPipe in Python. (n.d.). mediapipe. https://google.github.io/mediapipe/getting_started/python

[7] Pose. (n.d.). mediapipe. https://google.github.io/mediapipe/solutions/pose.html

[8] MediaPipe. (n.d.). CodePen. https://codepen.io/mediapipe/pen/jOMbvxw

[9] https://google.github.io/mediapipe/solutions/media_sequence.html

[10] Acasio, J. C., Butowicz, C. M., Golyski, P. R., Nussbaum, M. A., & Hendershot, B. D. (2018). Associations between trunk postural control in walking and unstable sitting at various levels of task demand. Journal of Biomechanics, 75, 181–185. https://doi.org/10.1016/j.jbiomech.2018.05.006

[11] S. (2022, September 7). Human Pose Estimation 이란? (2022). SuperMemi’s Study. https://supermemi.tistory.com/entry/Human-Pose-Estimation-%EC%9D%B4%EB%9E%80-2022


Written by Leonardo Gamboa Uribe

PhD in AI student, Master's degree in Artificial Intelligence, Computer Science Engineering, Digital Transformation Expert
