Vital signs monitoring is pivotal not only in clinical settings but also in home environments. Remote monitoring devices, systems, and services are emerging because vital signs must be tracked on a daily basis. Different types of sensors can be used to monitor breathing patterns and respiratory rate. However, the latter remains the least measured vital sign in several scenarios due to the intrusiveness of most adopted sensors. In this paper, we propose an inexpensive, off-the-shelf, and contactless measuring system for respiration signals that takes the pit of the neck as the region of interest. The system analyses video recorded by a single RGB camera and extracts the respiratory pattern from intensity variations of reflected light at the level of the collar bones and above the sternum. Breath-by-breath respiratory rate is then estimated from the processed breathing pattern. In addition, the effect of image resolution on monitoring breathing patterns and respiratory rate has been investigated. The proposed system was tested on twelve healthy volunteers (males and females) during quiet breathing at different sensor resolutions (i.e., HD 720, PAL, WVGA, VGA, SVGA, and NTSC). Signals collected with the proposed system have been compared against a reference signal in both the frequency domain and time domain. With the HD 720 resolution, frequency domain analysis showed perfect agreement between the average breathing frequency values gathered by the proposed measuring system and the reference instrument. An average mean absolute error (MAE) of 0.55 breaths/min was found in breath-by-breath monitoring in the time domain, while Bland-Altman analysis showed a bias of −0.03 ± 1.78 breaths/min. Even with a lower camera resolution setting (i.e., NTSC), the system demonstrated good performance (MAE of 1.53 breaths/min, bias of −0.06 ± 2.08 breaths/min) for contactless monitoring of both breathing pattern and breath-by-breath respiratory rate over time.
Accurate measurement and monitoring of physiological parameters, such as body temperature, heart rate, respiratory patterns, and, above all, the respiration rate, play a crucial role in a wide range of applications in healthcare and sport activities [
Temporal changes of physiological parameters can indicate relevant variations of the physiological status of the subject. Among the wide range of parameters which can be measured in clinical settings, the respiratory rate is the most crucial vital sign to detect early changes in the health status of critically ill patients. In the clinical setting, respiratory rate is typically collected by operators at regular intervals (i.e., every 8–10 hours), while it is often neglected in home-monitored people (i.e., telemonitoring and telerehabilitation). However, the respiratory rate has been demonstrated to be a significant and sensitive clinical predictor for serious adverse events; its value increases during exacerbation of COPD [
Conventional techniques for measuring respiration parameters require sensors in contact with the subject. Measuring techniques based on the monitoring of several parameters sampled from inspiratory and/or expiratory flow (e.g., temperature, RH, CO2, and flow) are widely used. [
For this reason, solutions, including commercial ones, have been designed and tested that are based on the analysis of the sound recorded around the person, on temperature map changes captured by thermal cameras, and on depth map changes due to breathing. However, they suffer from high cost, the need for specialized personnel, and sometimes a low signal-to-noise ratio. Optical motion capture systems have gained greater interest in the field of respiratory monitoring in both research and clinical scenarios [
Different types of cameras have been used to measure physiological parameters, including heart rate and respiratory rate, by adopting specific camera sensor technologies, working principles, or signal processing procedures. Two main methods have been used, based on remote photoplethysmography and body motion estimation.
Several attempts have been made to extract respiratory features from video frames recording breathing-related movements of the thorax [
Different approaches have also been used to postprocess the pixel data and extract the respiration-related signal from such videos by the subtraction of two consecutive images [
Moreover, a large number of studies do not report the details of the camera adopted, and there is considerable variability in the camera resolutions used in such studies (i.e., from 640 × 480 [
Despite the large number of studies adopting video cameras for respiratory monitoring purposes, there is a lack of results about the validity and accuracy of such methods in practice, since the majority of these studies present proofs of concept or preliminary tests and report no error metrics [
In this paper, we present a single-camera video-based respiratory monitoring system based on the selection of the pit of the neck area. The aim of the present study is threefold: (
The respiratory pattern is estimated by analyzing the intensity of reflected light at the level of the pit of the neck. Experimental trials are presented to test the measuring system in real practice, that is, in the monitoring of the breathing pattern of twelve healthy subjects at a self-paced breathing rate. Lastly, an analysis of the performance of the proposed measuring system at different camera resolutions is presented and discussed.
The proposed measuring system is composed of a hardware module for data recording and preprocessing and software for respiratory pattern extraction.
A video recorded with a CCD camera is considered as a series of
In this work, we used the built-in CCD RGB webcam (iSight camera) from a laptop (MacBook Pro, Apple Inc.). Images were recorded at 24-bit RGB with three channels, 8 bits per channel. An
The video is collected at a set frame rate of 30 Hz, which is enough to discretize the breathing movements that commonly occur up to 60 breaths per minute, equal to 1 Hz.
The proposed system needs to collect an RGB video of a person seated in front of the camera (see Figure
Schematic representation of the setup used to record the video and the pressure at the level of the nose. The volunteer is seated in the position shown in the figure at a distance between the camera and chest wall of around 1.5 m. The red highlighted area on the chest is the area recorded by the video. The area used to extract the breathing is the ROI of size
The script automatically delineates ROI that consists of a rectangular region with dimensions
To extract the respiratory pattern from the video, the selected ROI is first split into the red, green, and blue channels (Figure
Therefore,
(a) Example of a trend of
To extract the respiratory pattern from the number of trends
Figure
The filter is an infinite impulse response (IIR) design: a 3rd-order Butterworth digital filter. The transfer function is expressed in terms of
At that point, the signal
Normalized signal
Example of the normalized
The breathing rate can be extracted from
In the frequency domain, the breathing rate can be identified via power spectral density (PSD) estimate. The PSD estimation aims to assess the spectral density of a signal from a sequence of time samples of the same signal (finite set of data). PSD is useful in signal detection, classification, and tracking for detecting any periodicities in the data, by observing peaks at the frequencies corresponding to these periodicities [
The main approaches for frequency analysis consist of parametric methods (such as AR, ARMA) and nonparametric methods (window methods). Here, we focus on a nonparametric method.
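As an illustration of a nonparametric estimate, the sketch below computes a plain periodogram with a direct DFT and reads the respiratory rate off the dominant peak of the normalized spectrum. It is a minimal, dependency-free sketch and not the authors' implementation. The choice of estimator, the function names `periodogram` and `dominant_rate_bpm`, the 0.05–2 Hz search band, and the synthetic test signal are all assumptions made for the example.

```python
import cmath
import math

def periodogram(x, fs, f_max=2.0):
    """Return (freqs, psd) of the mean-removed signal x for 0 < f <= f_max (Hz)."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]           # remove the DC component
    k_max = int(f_max * n / fs)          # highest DFT bin of interest
    freqs, psd = [], []
    for k in range(1, k_max + 1):
        w = -2j * math.pi * k / n
        X = sum(v * cmath.exp(w * i) for i, v in enumerate(xc))
        freqs.append(k * fs / n)
        # Absolute scaling is irrelevant here because the PSD is normalized later.
        psd.append(abs(X) ** 2 / (fs * n))
    return freqs, psd

def dominant_rate_bpm(x, fs, f_min=0.05, f_max=2.0):
    """Respiratory rate (breaths/min) from the dominant peak of the normalized PSD."""
    freqs, psd = periodogram(x, fs, f_max)
    peak = max(psd)
    norm = [p / peak for p in psd]       # normalized PSD, maximum equal to 1
    candidates = [i for i, f in enumerate(freqs) if f >= f_min]
    best = max(candidates, key=lambda i: norm[i])
    return 60.0 * freqs[best]

# Hypothetical test signal: 60 s of a 0.25 Hz sinusoid sampled at 30 Hz.
fs = 30.0
n = int(60 * fs)
x = [math.sin(2 * math.pi * 0.25 * i / fs) for i in range(n)]
rate = dominant_rate_bpm(x, fs)          # 0.25 Hz corresponds to 15 breaths/min
```

The frequency resolution of such an estimate is fs/N, so a 60 s recording resolves the rate in steps of 1 breath/min. In practice an FFT-based routine would replace the explicit DFT loop.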
Let
Let
When using this method, the most pronounced maximum frequency peaks of the spectrum identify the periodicity of the signal. Each spectrum obtained with PSD describes how the power of the
Unlike frequency-domain analysis, time-domain analysis requires specific points on the signal to be identified. Different approaches can be used, based on the detection of maximum and minimum points as well as on the identification of zero-crossing points. We used a method based on both these approaches, split into two steps. In the first step, the algorithm identifies the zero-crossing points on the video signal. It allows determining the onset of each respiratory cycle, characterized by a positive-going zero-crossing value as
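The first step of the time-domain analysis, detecting positive-going zero crossings and turning the interval between consecutive onsets into a breath-by-breath rate, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the linear interpolation of the crossing instant, and the synthetic signal are not taken from the paper, and the second step of the authors' algorithm is not reproduced.

```python
import math

def positive_zero_crossings(signal, fs):
    """Times (s) where the signal goes from <= 0 to > 0 (linearly interpolated)."""
    onsets = []
    for i in range(1, len(signal)):
        a, b = signal[i - 1], signal[i]
        if a <= 0.0 < b:
            frac = -a / (b - a)          # fraction of a sample until the crossing
            onsets.append((i - 1 + frac) / fs)
    return onsets

def breath_by_breath_rate(onsets):
    """Respiratory rate (breaths/min) for each pair of consecutive onsets."""
    return [60.0 / (t1 - t0) for t0, t1 in zip(onsets, onsets[1:])]

# Hypothetical zero-mean breathing signal: 0.2 Hz for 30 s, sampled at 30 Hz.
fs = 30.0
signal = [math.sin(2 * math.pi * 0.2 * i / fs - 0.5) for i in range(int(30 * fs))]
onsets = positive_zero_crossings(signal, fs)
rates = breath_by_breath_rate(onsets)    # each breath lasts 5 s, i.e. 12 breaths/min
```

The sketch assumes a filtered, zero-mean signal; on noisy data, spurious crossings near zero would have to be rejected, for example by imposing a minimum breath duration.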
Our dataset consists of recordings of 12 participants (six males, six females, mean age 25 ± 3 years, mean height 163 ± 8 cm, mean weight 58 ± 9 kg). All the participants provided their informed consent. Each participant was invited to sit on a chair in front of the RGB camera at a distance of about 1.5 m (see Figure
At the same time, the pressure drop (ΔP), which occurs during exhalation/inhalation phases of respiration, was collected by a differential pressure sensor [
Example of trends relative to (a) pressure drop ΔP recorded by the differential pressure sensor, (b) integrated and normalized ΔP signal used to extract reference pattern and respiratory rate values, and (c)
Flowchart presenting all the steps carried out to extract the respiratory pattern and then the respiratory rate in both the frequency domain and time domain.
Then, we carried out a standard cumulative trapezoidal numerical integration of the ΔP signal over time (i.e., integrated ΔP) to provide a smooth signal for further analysis and to emphasize the maximum and minimum peaks on the signal (see Figure
Afterwards, such integrated ΔP has been filtered using a bandpass Butterworth digital filter in the frequency range 0.05–2 Hz and normalized following the formula in (
An example, obtained from one volunteer, of the ΔP trend collected by the pressure sensor, the normalized and integrated ΔP signal, and the
Signals obtained from the measuring system have been compared to the reference signals in terms of similarity of curves and respiratory rate values. The similarity of the frequency content of the signals and the average respiratory rate values have been investigated from the normalized PSD.
The similarity between signals has been evaluated by overlaying the two normalized PSDs, taking that of the reference instrument as the reference PSD. From the dominant frequency peak, the average respiratory rate value can be extracted. From average values of breathing rate, the accuracy (expressed in %) of the proposed method can be calculated as
Additionally, the breath-by-breath respiratory rate values have been compared between instruments by extracting such values with the time-domain analysis. To compare the values gathered by the reference instrument and computed by the video-based method, we calculate the mean absolute error (MAE) of breaths per minute as
Additionally, the strength of association between the breath-by-breath values collected with the proposed method and those collected by the reference instrument was evaluated with the Spearman correlation coefficient. Then, the slope
To investigate the influence of camera sensor resolution on the accuracy of the proposed measuring system, we postprocessed the videos to reduce the resolution of each frame. This postprocessing was carried out in MATLAB. Bicubic interpolation was used for interpolating data points on a two-dimensional regular grid (sensor matrix). With this method, the output pixel value is a weighted average of pixels in the nearest 4-by-4 neighborhood.
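To make the "weighted average of the nearest 4-by-4 neighborhood" concrete, the sketch below implements bicubic resampling with the Keys cubic convolution kernel. It is a simplified stand-in for the MATLAB postprocessing, not a reproduction of it. The kernel parameter a = −0.5, the border handling by edge replication, and the function names are assumptions, and the sketch omits the antialiasing that MATLAB's `imresize` applies by default when shrinking an image.

```python
import math

def cubic_weight(d, a=-0.5):
    """Keys cubic convolution kernel evaluated at distance d (in pixels)."""
    d = abs(d)
    if d <= 1.0:
        return (a + 2.0) * d ** 3 - (a + 3.0) * d ** 2 + 1.0
    if d < 2.0:
        return a * d ** 3 - 5.0 * a * d ** 2 + 8.0 * a * d - 4.0 * a
    return 0.0

def bicubic_sample(image, y, x):
    """Interpolate image (a list of rows) at the fractional position (y, x)."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for j in range(y0 - 1, y0 + 3):          # 4 rows of the neighborhood
        wy = cubic_weight(y - j)
        jj = min(max(j, 0), h - 1)           # replicate the border (assumption)
        for i in range(x0 - 1, x0 + 3):      # 4 columns of the neighborhood
            wx = cubic_weight(x - i)
            ii = min(max(i, 0), w - 1)
            value += wy * wx * image[jj][ii]
    return value

def resize_bicubic(image, out_h, out_w):
    """Resample image to out_h x out_w, aligning pixel centers."""
    h, w = len(image), len(image[0])
    sy, sx = h / out_h, w / out_w
    return [[bicubic_sample(image, (r + 0.5) * sy - 0.5, (c + 0.5) * sx - 0.5)
             for c in range(out_w)]
            for r in range(out_h)]

# Hypothetical 8 x 8 single-channel frame with a linear intensity ramp.
image = [[float(r + c) for c in range(8)] for r in range(8)]
small = resize_bicubic(image, 4, 4)
```

The 16 weights always sum to one, so flat regions keep their intensity; some weights are negative, which is why bicubic resampling can slightly overshoot near sharp edges.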
We investigated the performance of 6 camera sensor resolutions (including the resolution of the original video, HD 720), namely HD 720, PAL, WVGA, VGA, NTSC, and SVGA, since they can be considered the most common resolutions of commercial built-in webcams. They are characterized by three different aspect ratios (i.e., 4 : 3, 5 : 3, and 16 : 9). Attributes such as sensor’s size and number of
Size of the ROI as a function of the CCD camera setting:
Aspect ratio | Resolution (sensor setting) | ROI width [px] | ROI height [px] | Number of pixels in the ROI [px] |
---|---|---|---|---|
16 : 9 | 1280 × 720 (HD 720) | 384 | 216 | 82944 |
16 : 9 | 1024 × 576 (PAL) | 306 | 172 | 52632 |
16 : 9 | 854 × 480 (WVGA) | 256 | 144 | 36864 |
5 : 3 | 800 × 480 (VGA) | 240 | 120 | 28800 |
4 : 3 | 640 × 480 (NTSC) | 192 | 144 | 27648 |
4 : 3 | 800 × 600 (SVGA) | 240 | 180 | 43200 |
Since the ROI size is linked with the maximum size of
The results obtained from the proposed measuring system are compared to the reference ones. The analysis is carried out separately in the frequency domain and in the time domain.
In the frequency domain, we computed the normalized PSD obtained in each trial. Average breathing rate is calculated indirectly by taking the maximum peak of the normalized PSD plot. The values for each volunteer are reported in Table
Dominant peak of the normalized PSD from the reference signal and the video signal (
Quantity | Signal | Trial 1 | Trial 2 | Trial 3 | Trial 4 | Trial 5 | Trial 6 | Trial 7 | Trial 8 | Trial 9 | Trial 10 | Trial 11 | Trial 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Dominant peak frequency (Hz) | Reference | 0.27 | 0.23 | 0.12 | 0.22 | 0.24 | 0.22 | 0.25 | 0.20 | 0.21 | 0.17 | 0.31 | 0.39 |
Dominant peak frequency (Hz) | Video | 0.27 | 0.23 | 0.12 | 0.22 | 0.24 | 0.22 | 0.25 | 0.20 | 0.21 | 0.17 | 0.31 | 0.39 |
Estimated respiratory rate (breaths/min) | Reference | 16.11 | 13.92 | 7.32 | 13.18 | 13.18 | 13.20 | 15.01 | 12.09 | 12.82 | 10.25 | 18.31 | 23.44 |
Estimated respiratory rate (breaths/min) | Video | 16.11 | 13.92 | 7.32 | 13.18 | 13.18 | 13.20 | 13.20 | 12.09 | 12.82 | 10.25 | 18.31 | 23.44 |
The similarities between normalized PSD obtained with the reference signal and the
Power spectral density estimates for each camera sensor-resolution signal and reference instrument, for each volunteer.
The analysis in the time domain provides additional information compared to the analysis in the frequency domain with normalized PSD, for example, the breath-by-breath respiratory rate values. An average MAE value of 0.55 breaths/min was found, while the maximum value was 1.23 breaths/min. To quantify the uncertainty around the estimate of the mean, we use the standard error (SE), from which a confidence interval can be derived: the 95% confidence interval was calculated as 1.96 × SE. The computed 1.96 × SE value was always below 0.45 breaths/min.
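The MAE and its 1.96 × SE interval can be computed as in the following sketch. The function name `mae_with_ci` and the four breath-by-breath values are hypothetical and are not data from the study. The sketch also assumes that SE is the sample standard deviation of the absolute errors divided by the square root of the number of breaths.

```python
import math
import statistics

def mae_with_ci(video_bpm, reference_bpm):
    """Return (MAE, 1.96 * SE) for paired breath-by-breath rates in breaths/min."""
    abs_err = [abs(v - r) for v, r in zip(video_bpm, reference_bpm)]
    mae = statistics.mean(abs_err)
    # Assumption: SE is the standard error of the mean absolute error.
    se = statistics.stdev(abs_err) / math.sqrt(len(abs_err))
    return mae, 1.96 * se

# Hypothetical paired values for one short trial (not data from the study).
reference = [12.0, 12.5, 13.0, 12.2]
video = [12.4, 12.3, 13.5, 12.2]
mae, half_width = mae_with_ci(video, reference)   # MAE is 0.275 breaths/min here
```

With so few breaths the interval is wide; it shrinks with the square root of the number of breaths analysed.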
By considering all the breaths collected in the 12 trials (
Table
Breathing frequency extracted with the PSD analysis from the 5 sensor settings and from the reference signal.
Trial | Respiratory rate (breaths/min) | |||||
---|---|---|---|---|---|---|
Dominant peak frequency (Hz) | ||||||
Reference | PAL | WVGA | VGA | NTSC | SVGA | |
1 | 16.11 | 16.11 | 16.11 | 16.11 | 16.11 | 16.11 |
0.27 | 0.27 | 0.27 | 0.27 | 0.27 | 0.27 | |
2 | 13.92 | 14.28 | 13.18 | 13.18 | 13.92 | 14.28 |
0.23 | 0.24 | 0.22 | 0.22 | 0.23 | 0.24 | |
3 | 7.32 | 7.32 | 7.32 | 7.32 | 7.32 | 7.32 |
0.12 | 0.12 | 0.12 | 0.12 | 0.12 | 0.12 | |
4 | 13.18 | 13.18 | 13.18 | 13.18 | 13.18 | 13.18 |
0.22 | 0.22 | 0.22 | 0.22 | 0.22 | 0.22 | |
5 | 14.65 | 14.65 | 14.65 | 14.65 | 14.65 | 14.65 |
0.24 | 0.24 | 0.24 | 0.24 | 0.24 | 0.24 | |
6 | 13.18 | 12.45 | 12.45 | 12.45 | 12.45 | 12.45 |
0.22 | 0.21 | 0.21 | 0.21 | 0.21 | 0.21 | |
7 | 15.01 | 15.01 | 15.01 | 15.01 | 15.01 | 15.01 |
0.25 | 0.25 | 0.25 | 0.25 | 0.25 | 0.25 | |
8 | 12.09 | 11.72 | 11.72 | 11.72 | 12.09 | 11.72 |
0.20 | 0.20 | 0.20 | 0.20 | 0.20 | 0.20 | |
9 | 12.82 | 12.82 | 12.82 | 12.82 | 12.82 | 12.82 |
0.21 | 0.21 | 0.21 | 0.21 | 0.21 | 0.21 | |
10 | 10.25 | 10.99 | 10.99 | 10.99 | 10.99 | 10.99 |
0.17 | 0.18 | 0.18 | 0.18 | 0.18 | 0.18 | |
11 | 18.31 | 18.31 | 25.63 | 25.63 | 25.63 | 18.31 |
0.31 | 0.31 | 0.43 | 0.43 | 0.43 | 0.31 | |
12 | 23.44 | 23.44 | 23.44 | 23.44 | 23.44 | 23.44 |
0.39 | 0.39 | 0.39 | 0.39 | 0.39 | 0.39 |
For each subject, the normalized PSDs obtained at the different resolutions are very similar in shape and dominant peaks. In all cases except trials 1 and 12, there is one dominant peak which is distinctly sharper than the surrounding peaks. In trials 1 and 12, the presence of several peaks highlights several changes in breathing rate during the data collection. An example of these differences in the time domain and frequency domain is reported in Figure
Signals obtained from the video recordings at different resolutions during 80 s of data collection. The first subplot shows an irregular breathing pattern during a self-paced breathing trial, and the second one a regular pattern. Differences in the frequency domain can be appreciated from the normalized power spectra.
The results obtained in the time domain are reported in Figure
Bar plot showing mean absolute error (MAE) as coloured bars and 1.96 × standard error (SE) values as error bars. Each colour corresponds to one of the 5 investigated sensor settings.
Results of the Bland-Altman analysis (MOD ± LOAs), the slope (
Sensor setting | MOD ± LOAs (breaths/min) | Slope | Correlation coefficient |
---|---|---|---|
PAL | −0.04 ± 1.91 | 1.001 | 0.97 |
WVGA | −0.04 ± 1.76 | 1.001 | 0.97 |
VGA | −0.04 ± 1.84 | 1.001 | 0.97 |
NTSC | −0.06 ± 2.08 | 1.002 | 0.95 |
SVGA | −0.06 ± 2.05 | 1.002 | 0.96 |
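The MOD ± LOAs figures in the table follow the usual Bland-Altman construction, which can be sketched as below. The function name `bland_altman`, the sign convention (video minus reference), and the four paired values are assumptions for illustration only, not data from the study.

```python
import statistics

def bland_altman(video_bpm, reference_bpm):
    """Return (MOD, LOA half-width): mean of differences and 1.96 * their SD."""
    # Assumption: differences are taken as video minus reference.
    diffs = [v - r for v, r in zip(video_bpm, reference_bpm)]
    mod = statistics.mean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)
    return mod, half_width

# Hypothetical paired breath-by-breath rates (breaths/min).
reference = [12.0, 12.5, 13.0, 12.2]
video = [12.4, 12.3, 13.5, 12.2]
mod, half_width = bland_altman(video, reference)
lower, upper = mod - half_width, mod + half_width   # limits of agreement
```

Unlike the MAE, the MOD keeps the sign of each difference, so positive and negative errors can cancel; this is why a bias close to zero can coexist with limits of agreement of about ±2 breaths/min.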
Figure
(a) Bland-Altman graph and correlation analysis using the
Within the wide spectrum of physiological measurements that are useful for clinical assessment, respiratory rate plays a crucial role. In particular, in some conditions it must be monitored continuously, for instance when patients are in a clinical setting (e.g., intensive care unit) or need their physiological data monitored at home (e.g., telemonitoring, telerehabilitation).
The use of unobtrusive solutions is widespread in respiratory monitoring. Optical technologies can allow nonintrusive and low-cost monitoring of respiratory patterns. Different solutions have been proposed based upon photo-reflective markers and frame subtraction. Although relatively new techniques based on the analysis of video collected by digital cameras have been shown to be promising for respiratory monitoring, most of them monitor only the average respiration rate.
In this paper, we present a single-camera video-based respiratory monitoring system based on the selection of a small skin area near the base of the neck. The proposed method for extracting the respiratory pattern and the corresponding respiratory rate consists of three steps: (
Since the proposed method can work with very different built-in RGB cameras (webcams) available in most laptops, we have investigated the influence of sensor resolution (from HD 720 to NTSC) on the respiratory pattern and respiratory rate values extracted from the video signal. The method has been tested on 12 participants wearing T-shirts or sweaters during data collection in an unstructured environment. The postprocessed pressure drop signal collected at the nose was used as the reference signal in this work. Computed error measurements are on par with those reported in the literature [
Results show excellent performance of the method with the use of HD resolution (HD 720), with an accuracy equal to 100% in the estimation of average breathing rate from the frequency-domain analysis. Additionally, PSD spectra demonstrated the similarity of all the breathing patterns collected at the different resolutions when compared to the frequency content of the reference signal. At these resolutions, the accuracy in the estimate of the average respiratory rate from the spectra is lower, down to 94.9%. Despite the excellent results obtained in the frequency domain, further developments may be devoted to testing parametric methods to estimate the PSD, for example, AR methods, given the periodicity of the respiratory signal [
In the calculation of breath-by-breath respiratory rate, the HD 720 camera setting shows the best results in terms of MAE (average value of 0.55 breaths/min) and SE. Additionally, in this case, the method shows a bias of −0.03 ± 1.78 breaths/min in the calculation of breath-by-breath respiratory rate when compared to the reference values. With lower resolution (NTSC), the dispersion of the data is slightly higher (LOAs are wider, ±2.08 breaths/min), while the MOD value is comparable. These biases are comparable to those obtained [
By analyzing more than 200 breaths (from 12 volunteers), sensor resolution seems to influence the accuracy of the proposed method. NTSC resolution (the ROI area is one third of the HD 720 area) shows the worst results, with an accuracy of 95.6% in the estimation of average breathing rate and an MAE of 1.45 breaths/min. In the estimation of the breath-by-breath parameter, the correlation coefficient is 0.95 with a bias of −0.06 ± 2.08 breaths/min. These values can be compared with the respiratory rate bias obtained with other approaches, such as Doppler radar (via fast Fourier transform) using transmitter and receiver antennas, when compared to a respiration strap [
Reference raw data and videos are available from the corresponding author upon request.
The authors declare that there are no conflicts of interest regarding the publication of this paper.
The authors would like to thank Sara Iacoponi and Giuseppe Tardi for their helpful contribution to data collection and volunteers’ enrollment. Additionally, the authors would like to thank all the volunteers who devoted their spare time to this study. Daniel Simões Lopes is thankful for the financial support given by the Portuguese Foundation for Science and Technology, namely, for the postdoctoral grant SFRH/BPD/97449/2013 and the Portuguese funds with reference UID/CEC/50021/2013 and IT-MEDEX PTDC/EEI-SII/6038/2014.