Cost-Effective and Ultraportable Smartphone-Based Vision System for Structural Deflection Monitoring

This work demonstrates the viability of using a smartphone-based vision system to monitor the deflection of engineering structures subjected to external loadings. The video images of a test structure recorded by a smartphone camera are processed using a state-of-the-art subset-based digital image correlation (DIC) algorithm to extract the vertical image displacement of discrete calculation points defined on the test object. The measured vertical image displacement can be converted to deflection (vertical displacement) by easy-to-implement scaling factor determination approaches. For accuracy validation, laboratory experiments were performed on a cantilever beam subjected to external loadings. The deflection and natural frequency of the test cantilever beam measured by the proposed smartphone-based vision system were compared with those measured by conventional dial gauges and a dynamic strain gauge. The relative errors were estimated as 1% and 0.15% for deflection and natural frequency, respectively. Outdoor bridge deflection monitoring tests were also carried out on an overpass with a subway passing over it, and the measured deflection-time curves agree well with the actual situation. The cost-effective, ultraportable, and easy-to-use smartphone-based vision system not only greatly decreases the hardware investment and complexity of deflection measurement system construction, but also increases the convenience and efficiency of deflection monitoring of engineering structures.


Introduction
Deflection measurement is an essential part of the design, ongoing safety assessment, and maintenance of various civil infrastructures and engineering structures. For instance, bridge deflection reflects the vertical integral stiffness of the bridge, which is an important indicator in bridge safety evaluation and must remain within acceptable limits during the service life. It is therefore crucial to monitor bridge deflection accurately and efficiently to prevent unexpected collapse and deterioration. In practice, structural deflection can be measured by traditional contact-based methods, such as dial gauges, linear variable differential transformers (LVDTs), and accelerometers [1][2][3]. These contact-type pointwise sensors, however, must be manually installed on and detached from discrete measurement points on the test objects, which can be time-consuming, labor-intensive, expensive, and cumbersome.
As an emerging noncontact method, vision-based or image-based displacement sensor systems [4,5] offer a promising alternative to conventional contact-type sensors due to their distinctive features of remote and real-time measurement. In addition, compared to other noncontact-type (e.g., GPS, laser vibrometer, and radar interferometry system) displacement sensors [6][7][8][9], vision-based displacement sensors possess some significant advantages, including low cost, ease of operation, and flexibility to extract structural displacements at multiple points from a single-camera measurement. In past decades, with the continuous improvement of digital cameras and video image processing technology, vision-based deflection measurement techniques have gained increasing attention and widespread application [10][11][12][13][14][15][16][17][18][19][20].
The implementation of a structural displacement measurement using vision-based sensors basically involves four consecutive steps: (1) video image acquisition, (2) scaling factor determination, (3) motion tracking, and (4) physical displacement calculation. Since the pioneering work carried out by Stephen et al. for measuring the deck displacement of the Humber Bridge in the UK using a motion tracking algorithm [10], significant advances have been made to this technique by introducing various new imaging devices, such as high-speed industrial cameras [16] and UAVs [21]. Also, both feature-based [22,23] and template-based (or intensity-based) [10] motion tracking algorithms have been employed in vision-based measurement methods to detect structure motion with subpixel accuracy [18,24]. It should also be mentioned that the displacement measurement accuracy of vision-based methods is vulnerable to ambient interferences (e.g., ambient vibration and temperature variation) [25][26][27]. To quantify and correct the measurement errors due to ambient interferences, various approaches have been adopted [26,27]. However, most of these vision-based bridge deflection measurement methods use industrial-grade high-speed cameras, which makes the system construction costly and complex. For this reason, the use of ubiquitous smartphone cameras, which are cost-effective, easy-to-use, and ultraportable, offers attractive advantages in portability, simplicity, and cost, and can thus greatly benefit research and educational efforts in resource-limited universities and institutes.
In recent years, many smartphones have been equipped with high-performance processors and high-resolution cameras that allow the capture of high-quality images and videos. Many smartphones can record 4K video at 60 frames per second (f/s) (e.g., the nova series, Huawei Technologies Co., Ltd., China), while some can shoot 1080p video at 240 f/s (e.g., iPhone XS, Apple Inc., USA; OnePlus 7 Pro, OnePlus Technology (Shenzhen) Co., Ltd., China). This high-resolution, high-frame-rate image acquisition capability enables a smartphone to be applied to the vision-based motion tracking of fluids [28] and the deformation measurement of materials and structures under static and/or dynamic loads [29][30][31]. For example, Yu and Pan [29] first established a low-cost 2D DIC system using a camera phone as the image recording device and found that accurate DIC measurements can be implemented with a common camera phone given adequate correction. Further, Yu et al. [30] established a cost-effective and portable smartphone-based stereo-DIC system using a single smartphone and a 3D-printed four-mirror adaptor for pseudostereo imaging, and investigated the influence of the self-heating effect of the smartphone on 3D-DIC measurements. Recently, Kromanis and coworkers [31] investigated the feasibility of using smartphones to measure structural deformation in the laboratory and demonstrated that smartphone technologies can provide accurate information about structural deformations when coupled with suitable image processing software.
In this work, we present a smartphone-based deflection measurement method that combines a low-cost smartphone camera, deflection measurement software (Video Deflectometer Software V2016, written by Long Tian), and a proper scaling factor determination method. The present method can be used for accurate static and dynamic deflection measurements of engineering structures in both laboratory and outdoor environments. It not only greatly reduces the cost and complexity of existing vision-based displacement measurement systems but also offers high measurement accuracy and strong practicability. In the remainder of this work, we first introduce the principles and implementation procedures of structural displacement measurement using the proposed smartphone-based method. For accuracy validation, laboratory tests of a cantilever beam subjected to static and impulse loadings were carried out. To demonstrate the practicality of the proposed method for deflection measurement under nonlaboratory conditions, outdoor bridge deflection monitoring tests were also performed on an overpass with a subway passing over it.

Deflection Monitoring Using a Smartphone Camera and DIC
The proposed smartphone-based bridge deflection measurement method is shown in Figure 1. Basically, the measurement comprises four steps: (1) video image capture, (2) scaling factor determination, (3) video image displacement tracking, and (4) conversion from image displacement to physical displacement. First, instead of using an industrial video camera, the proposed bridge deflection measurement method adopts a smartphone camera for video recording. The smartphone should be mounted tightly on a fixed platform (e.g., a tripod) during the recording of video images of a test structure. After calibrating the scaling factor of the smartphone camera using the methods described below, the obtained video images can be processed using the deflection measurement software, which is based on a state-of-the-art DIC algorithm, to extract the image displacement of each measurement point specified on the test structure. As a subset-based (intensity-based) image registration or matching algorithm aimed at detecting the same physical points in different images, the DIC algorithm is capable of tracking the motions of specific parts or components (e.g., bolts, holes, and corners) on a bridge. However, if those components or sections cannot provide sufficient intensity variations, which can be quantified by a metric known as the sum of square of subset intensity gradients (SSSIG) [32], artificial markers or LED lamps should be mounted on the test structure. As the two critical issues in the proposed smartphone-based bridge deflection measurement method, image displacement tracking with the DIC algorithm and scaling factor determination are introduced in detail below.
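The SSSIG metric mentioned above can be computed directly from the intensity gradients of a candidate subset. The following sketch (not part of the original software; the central-difference gradient and the screening idea are illustrative assumptions consistent with [32]) shows how a textureless region would be flagged as unsuitable for tracking:

```python
import numpy as np

def sssig(subset):
    """Sum of square of subset intensity gradients (SSSIG) in x and y.

    Subsets whose SSSIG falls below an empirical threshold lack the
    intensity variation needed for reliable DIC tracking.
    """
    gy, gx = np.gradient(subset.astype(float))  # gradients along rows, cols
    return float(np.sum(gx**2)), float(np.sum(gy**2))

# A flat (textureless) subset versus a randomly speckled one:
flat = np.full((31, 31), 128.0)
rng = np.random.default_rng(0)
speckle = rng.integers(0, 256, size=(31, 31)).astype(float)

print(sssig(flat))     # (0.0, 0.0) -- no texture, unsuitable for tracking
print(sssig(speckle))  # large positive values -- sufficient variation
```

In practice, the threshold separating "trackable" from "untrackable" subsets would be tuned to the noise level of the smartphone camera.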
Displacement Tracking Using DIC.
The video recorded by a smartphone can be considered a sequence of images and thus processed by an advanced DIC algorithm [33] to extract displacements at interrogated points. Usually, the image of the test sample recorded before loading is selected as the reference image, and the images collected after loading are taken as the deformed images. First, one or more discrete points are specified as measurement points in the reference image.
To accurately track the displacement of these measurement points in the deformed images, an advanced subset-based local DIC method using the state-of-the-art inverse compositional Gauss-Newton (IC-GN) algorithm [33] is employed in the deflection measurement software used in this work. Specifically, for each measurement point, a square subset of (2M + 1) × (2N + 1) pixels centered at the calculation point is selected as the reference subset. Note that the subset size can be adaptively defined according to local image intensity gradients to ensure that there are sufficient intensity variations in each interrogated subset. The positions of these reference subsets are then tracked in the deformed images by the IC-GN algorithm with subpixel accuracy, and the image displacement of each point is obtained by computing the coordinate differences between the target subset center and the reference subset center. More details regarding the IC-GN algorithm can be found in our previous publications [18,33].
Although the accuracy and efficiency of the IC-GN algorithm have been well demonstrated in previous works [33], the following two practical issues, i.e., the selection of appropriate shape functions and temporal initial guess between consecutive images, should be carefully addressed to achieve high-efficiency and high-accuracy displacement tracking.
(1) Shape function: the simplest but practical zero-order shape function, which allows only rigid-body translation of the target subset, should be used. For remote measurement of large engineering structures, the local deformation within a target subset can be well approximated by a translation, and the use of overmatched higher-order shape functions (e.g., first-order shape functions with six parameters), which carry a heavier computational cost, has not been shown to improve displacement measurement accuracy for remote bridge deflection measurement.

(2) Initial guess: as a nonlinear local optimization algorithm, the IC-GN algorithm requires an initial guess of the deformation parameters very close to the true values to determine the subpixel motion of each calculation point. At the initial stage of motion tracking, the initial guess of the displacement vector for each point is set to zero. Since the incremental displacement of a measurement point between consecutive video frames recorded at a high frame rate is generally smaller than a pixel, a temporal initial guess transfer scheme [34], which uses the calculated displacement vectors of each calculation point in the previous two frames, is used to predict its initial guess for the current frame.

Figure 1: Bridge deflection measurement system using the smartphone-based method.
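The temporal initial guess transfer can be illustrated by a minimal constant-velocity extrapolation from the two previous frames; the exact scheme of [34] may differ in detail, so this is only a sketch:

```python
import numpy as np

def predict_initial_guess(d_prev2, d_prev1):
    """Predict a point's current-frame displacement from its displacements
    in the previous two frames (constant-velocity extrapolation), to seed
    the IC-GN iteration.

    d_prev2, d_prev1: (u, v) displacement vectors (pixels) in frames k-2 and k-1.
    """
    d_prev2 = np.asarray(d_prev2, dtype=float)
    d_prev1 = np.asarray(d_prev1, dtype=float)
    return d_prev1 + (d_prev1 - d_prev2)

# A point moving steadily downward by 0.4 px/frame:
print(predict_initial_guess((0.0, 0.8), (0.0, 1.2)))  # predicts (0.0, 1.6)
```

Because interframe motion at a high frame rate is subpixel, such a prediction typically lands well within the convergence radius of the IC-GN solver.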

Journal of Sensors
If the final correlation coefficient falls below a preset threshold (set to 0.8 in this work), a simple integer displacement search performed in the spatial domain can be used to recalculate the integer displacement of the point.
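Such a brute-force integer-pixel search can be sketched as follows. The subset half-width, search radius, and the zero-mean normalized cross-correlation (ZNCC) criterion are illustrative assumptions, not the exact implementation of the deflection measurement software:

```python
import numpy as np

def zncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-size subsets."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a**2).sum() * (b**2).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def integer_search(ref_img, cur_img, center, half=15, radius=10):
    """Slide the reference subset over a (2*radius+1)^2 window in the
    current image and keep the integer displacement with the ZNCC peak."""
    x, y = center
    ref = ref_img[y - half:y + half + 1, x - half:x + half + 1]
    best, best_d = -1.0, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cur = cur_img[y + dy - half:y + dy + half + 1,
                          x + dx - half:x + dx + half + 1]
            c = zncc(ref, cur)
            if c > best:
                best, best_d = c, (dx, dy)
    return best_d, best

# Synthetic check: shift a speckle image by (+3, -2) pixels and recover it.
rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(120, 120)).astype(float)
shifted = np.roll(np.roll(img, -2, axis=0), 3, axis=1)
d, c = integer_search(img, shifted, center=(60, 60))
print(d, round(c, 3))  # (3, -2) with ZNCC close to 1.0
```

The integer result is then refined to subpixel accuracy by the IC-GN iteration described above.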

Scaling Factor Determination.
To obtain structural vertical displacements from the recorded video images, the relationship or scaling factor (SF) between the image displacement v (in units of pixels) and the physical displacement V (in units of mm) of a test object should be established for each measurement point.
In practice, two cases need to be considered according to the pitch angle of the camera sensor. In the first case, the sensor target of the phone camera is assumed to be vertically placed, as shown in Figure 2(a). The scaling factor of each measurement point on the image plane can then be determined using either of the following two approaches [20]: (1) based on the known geometry of an object close to the measurement point, whose physical dimension L_object (in mm) on the object surface and corresponding image dimension l_image (in pixels) are known, i.e., SF = L_object / l_image, as shown in Equation (3); and (2) based on the intrinsic parameters of the camera (l_ps: the physical size of a pixel on the camera sensor, in mm/pixel; f: the focal length of the camera lens, in mm) and the extrinsic parameter D (the distance between the optical center of the camera and the measurement point, in mm), i.e., SF = D · l_ps / f, as shown in Equation (4).
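The two vertical-sensor scaling factor formulas follow directly from the pinhole camera model and can be written as simple helpers; the numerical values in the usage example are hypothetical, not measurements from this work:

```python
def sf_known_geometry(L_object_mm, l_image_px):
    """Equation (3): SF = L_object / l_image (mm/pixel), from the known
    physical dimension of an object close to the measurement point."""
    return L_object_mm / l_image_px

def sf_camera_parameters(D_mm, l_ps_mm_per_px, f_mm):
    """Equation (4): SF = D * l_ps / f (mm/pixel), from the camera
    intrinsics (pixel pitch l_ps, focal length f) and the working
    distance D to the measurement point."""
    return D_mm * l_ps_mm_per_px / f_mm

# Hypothetical values: a 100 mm ruler imaged as 500 pixels, and a camera
# with 0.0012 mm pixel pitch and 4.8 mm focal length at D = 1 m.
print(sf_known_geometry(100.0, 500))              # 0.2 mm/pixel
print(sf_camera_parameters(1000.0, 0.0012, 4.8))  # 0.25 mm/pixel
```

Both approaches give a per-point scaling factor, so points at different distances from the camera receive different factors.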
In the second case, the sensor target of the phone camera is not vertically placed and has a pitch angle with respect to the horizontal plane, as shown in Figure 2(b). In this case, the scaling factor of each measurement point can be determined using the known intrinsic parameters of the phone camera and the measured extrinsic parameters (e.g., β and D, measured using a laser rangefinder or other available tools), as shown in Equation (5). For more detailed information about scaling factor determination in this case, readers can refer to [18].
Laboratory Experiments

Static Load Experiment.
The test beam was fixed at one end to a support column (620 mm high) using nine screws to form a cantilever structure. A hook, located about 56 mm from the free end of the cantilever beam, was used to suspend loads. Since there was no obvious texture feature on the cantilever beam, three "crosshair" artificial marks drawn on label paper were pasted on the cantilever beam as measurement targets, located 770 mm, 550 mm, and 275 mm from the fixed end of the beam, respectively. At the same time, three dial gauges were installed at the positions corresponding to the three measurement targets. The smartphone (nova 3, Huawei, spatial resolution: 3840 × 2160 pixels, frame rate: 30 f/s) was mounted vertically about 1 m from the cantilever beam using a tripod designed for mobile phones. The load was applied manually by adding or removing weights at the free end of the cantilever beam: weights were added in the sequence 9.8 N, 19.6 N, 4.9 N, and 4.9 N, and removed in the sequence 4.9 N, 4.9 N, 19.6 N, and 9.8 N. The complete loading and unloading process was recorded by the smartphone.
The scaling factors of the three targets were 0.2205 mm/pixel, 0.2195 mm/pixel, and 0.2189 mm/pixel, respectively. Since the sensor target of the smartphone camera was vertically placed, they were estimated using Equation (3) from the fixed part of the dial gauges, with a physical dimension (L_object) of 155 mm and corresponding image dimensions (l_image) of 703, 706, and 708 pixels. The image displacements of the three targets were obtained simultaneously by processing the recorded video images with the deflection measurement software and were then converted to deflections using these scaling factors. In addition, the readings of the three dial gauges could be taken every 3 seconds from the recorded video images, instead of being recorded frequently during the operation, which greatly reduced the operation time and simplified the procedure. The deflections of the three targets measured using the smartphone-based method and the data read directly from the dial gauges are shown in Figure 4.
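The scaling-factor arithmetic above can be checked directly via Equation (3); the 10-pixel displacement in the last line is a hypothetical value used only to illustrate the pixel-to-millimeter conversion:

```python
L_mm = 155.0            # physical length of the fixed part of each dial gauge
l_px = [703, 706, 708]  # measured image lengths for targets 1-3 (pixels)

scaling_factors = [round(L_mm / l, 4) for l in l_px]
print(scaling_factors)  # [0.2205, 0.2195, 0.2189]

# Converting a tracked image displacement to deflection, e.g., a
# hypothetical 10-pixel vertical image displacement at target 1:
print(round(scaling_factors[0] * 10.0, 3))  # 2.205 mm
```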
As can be seen in Figure 4, the deflections of the three targets measured using the smartphone-based method are basically consistent with the data read from the dial gauges. Table 1 (SP for smartphone and DG for dial gauge) also gives the average displacement values of each loading and unloading stage measured by the smartphone-based method and the dial gauges. Since the deflection values are almost zero before loading and after the fourth unloading stage, the relative errors of these two stages are not meaningful and are marked with an asterisk (*). For all other stages, the relative errors are within 1%. These results show that the smartphone-based method provides accurate displacement measurements under static loading.

Dynamic Load Experiment.
A dynamic load experiment was also carried out in the laboratory using the same cantilever beam and measurement targets. As shown in Figure 5, a resistance strain gauge was installed at target 2, and a wireless dynamic strain collector (Donghua DH5908L) with a 1 kHz acquisition frequency was connected to the strain gauge to acquire strain data at target 2. A load consisting of one 19.6 N and two 4.9 N weights was suspended at the free end of the cantilever beam. When the cantilever beam was at rest, the string suspending the load was cut, allowing the cantilever beam to vibrate freely.
The smartphone (OnePlus 7 Pro, spatial resolution: 1920 × 1080 pixels, frame rate: 240 f/s), placed about 1 m from the cantilever beam, was used to record the free vibration of the cantilever beam, while a portable computer received strain data from the dynamic strain collector in real time. For the smartphone-based method, the same calibration method as in the static load experiment was used to obtain the scaling factors of the three measurement targets: 0.5312 mm/pixel, 0.5325 mm/pixel, and 0.5352 mm/pixel, respectively. The video recorded by the smartphone was postprocessed with the deflection measurement software to simultaneously extract the image displacements of the three measurement targets, which were then converted to deflections using the determined scaling factors; in contrast, the dynamic strain collector could only collect strain data at a single measurement point. Note that the deflection-time curve was plotted with 240 data points per second, in accordance with the 240 f/s image capture frame rate of the smartphone-based method, while the strain-time curve was plotted with 1000 data points per second, since the acquisition frequency of the dynamic strain collector is 1 kHz.

Figure 6 shows the deflection calculated by the smartphone-based method and the strain data collected by the dynamic strain collector. It can be seen from Figure 6(a) that the smartphone-based method yields a clean deflection-time vibration curve of the cantilever beam, and the dynamic strain collector with 1 kHz acquisition frequency also obtained a good strain-time curve, as shown in Figure 6(b). Furthermore, the deflection and strain data were converted from the time domain to the frequency domain using the fast Fourier transform (FFT) in MATLAB to obtain power spectral density (PSD) values for each measurement method. The PSD-frequency curves were then plotted using the PSD values and the corresponding frequencies of the sampling points, as shown in Figure 7.
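The FFT-based PSD peak extraction can be sketched in Python with a basic periodogram; the 11.32 Hz frequency and the damping of the synthetic free-vibration signal below are illustrative assumptions, not measured data from this work:

```python
import numpy as np

fs = 240.0                    # smartphone frame rate (f/s)
t = np.arange(0, 10, 1 / fs)  # 10 s of free vibration (N = 2400 samples)
f_true = 11.32                # assumed first natural frequency (Hz)

# Synthetic damped free-vibration deflection signal (mm):
u = 5.0 * np.exp(-0.2 * t) * np.cos(2 * np.pi * f_true * t)

# One-sided periodogram (a basic PSD estimate):
U = np.fft.rfft(u)
psd = (np.abs(U) ** 2) / (fs * len(u))
freqs = np.fft.rfftfreq(len(u), 1 / fs)

f_peak = freqs[np.argmax(psd)]
print(f_peak)  # peak near 11.3 Hz; frequency resolution is fs/N = 0.1 Hz
```

A longer recording narrows the frequency bins (resolution fs/N), which is why frequency estimates from the 240 f/s video and the 1 kHz strain record can differ slightly even for the same vibration.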
It can be seen from Figure 7(a) that the vibration frequencies of the three targets on the cantilever beam measured by the smartphone-based method are all 11.3166 Hz, while the vibration frequency of target 2 measured by the dynamic strain collector is 11.3333 Hz, as shown in Figure 7(b); the relative error is only 0.15%. This result shows that dynamic measurement with the smartphone-based method can accurately capture the vibration frequency of a structure. Note that the 240 f/s frame rate of the smartphone at a resolution of 2 megapixels far exceeds the frame rate of most conventional industrial cameras (e.g., Point Grey industrial cameras).

Field Bridge Deflection Monitoring Experiment

An image of the field experiment is shown in Figure 8(a). A tripod with a three-way head (Manfrotto, Italy) was used to fix the smartphone (OnePlus 7 Pro, spatial resolution: 3840 × 2160 pixels, frame rate: 30 f/s), which was placed about 60 m from the right pier of the overpass. By adjusting the pitch angle of the smartphone camera, the test overpass was positioned approximately in the center of the recorded images, as shown in Figure 8(b). Video recording started before the subway arrived and ended after it had passed over the overpass. Note that when using a smartphone for outdoor video recording, the tripod should be placed in a stable position to ensure the stability of the smartphone and reduce the adverse impact of ambient vibration on the measurement accuracy.
Three almost equally spaced measurement points were selected on the bridge, as shown in Figures 8(a) and 8(b). Target 1, target 2, and target 3 were located at the left quarter-span, midspan, and right quarter-span, respectively. In this experiment, the smartphone camera was deployed with its optical axis oblique to the test overpass; the pitch angle was measured as 16.8° by the inclination sensor of a laser rangefinder (Leica DISTO D510, Leica Geosystems, Germany; distance measuring range: 200 m, accuracy: ±1 mm; pitch angle measuring range: 360°, accuracy: ±0.2°). The distances from the smartphone to the three targets were measured with the same laser rangefinder as 102.233 m, 91.631 m, and 77.910 m, respectively. Based on the measured pitch angle and distances, the scaling factors of the three measurement targets were determined as 21.008 mm/pixel, 19.084 mm/pixel, and 16.026 mm/pixel, respectively, using the calibration method given in Equation (5).

Figure 9(a) shows three states of the subway on the overpass recorded by the smartphone (i.e., the subway arriving at the left pier, the subway entirely on the overpass, and the subway leaving the right pier). The recorded video images were processed by the deflection measurement software to extract the image displacements of the three targets, which were then converted to deflections using the determined scaling factors, as shown in Figure 9(b). When the subway passes over the overpass, the deflection of target 2 is the largest, with a maximum value of about 15 mm, while the deflections of targets 1 and 3 are almost the same, both about 4 mm. This is because target 2 was at the midspan of the bridge, where deflection is greatest, while targets 1 and 3 have the same deflection owing to their symmetry about midspan.
The deflection-position trend curves of the three states, which intuitively show the overall deformation of the overpass, were fitted approximately using the measured deflections of the three discrete targets and assumed zero deflections at the two solid pier positions, as shown in Figure 9(c). It can be clearly seen that all targets tilt upward to some extent, owing to the uneven loading of the overpass, when the subway is approaching or leaving the overpass. When the subway was entirely on the overpass, the deflection increased toward midspan. These observations are consistent with the actual situation, indicating that the smartphone-based measurement method is practical and reliable. Note that this field experiment is a qualitative measurement that does not consider the effects of temperature variation [25] and ambient vibration [26,27], which must be accounted for if high-accuracy quantitative measurement is required.

Concluding Remarks
We have demonstrated the feasibility of using low-cost and ultraportable smartphone cameras to replace the specialized industrial cameras currently used for structural deflection monitoring. Multipoint static deflection and dynamic frequency measurements of a cantilever beam were realized in the laboratory, and the results agree well with those delivered by conventional contact methods, with relative errors of less than 1%. Preliminary results from outdoor bridge deflection monitoring tests also demonstrate the potential of the present smartphone-based method for practical engineering applications. Furthermore, the field experiment highlights the portability of the proposed smartphone-based method for field deflection measurement: only a smartphone, a laser rangefinder, and a tripod are required. At present, the battery life of most smartphones can cover the duration of a field experiment, so no power supply is needed at the measurement site, which provides a convenient solution where power is unavailable.
However, the following two issues should be addressed to further improve the accuracy and practicality of the proposed smartphone-based vision system for deflection monitoring: (1) the thermal errors of a smartphone camera due to the self-heating effect of the smartphone and ambient temperature variations in outdoor environments should be carefully considered, and corresponding measures should be taken to eliminate or compensate the virtual displacements in video images caused by temperature variations of smartphones; (2) real-time transmission of video images to a computer via a wireless network, real-time processing of these images, and visualization of the measurement results should be explored.

Data Availability
The parameters of the tools used to support the findings of this study are included within the article. The software used to support the findings of this study has not been made available because of copyright and trade secret protection.

Conflicts of Interest
The authors declare no conflict of interest.