A Survey of Industrial Internet of Things Platforms for Establishing Centralized Data-Acquisition Middleware: Categorization, Experiment, and Challenges

The development of industrial Internet of Things (IIoT), big data, and artificial intelligence technologies is leading to a major change in the production system. The change is propagating as a wave of transforming existing systems with a vertical structure into corresponding horizontal platforms or middleware. Accordingly, IIoT data acquisition is shifting from individual systems toward centralized acquisition through integrated middleware on a scalable server or through a large platform. That said, middleware-based IIoT data acquisition must consider multiple factors, such as infrastructure (e.g., operation environment and network), protocol heterogeneity, interoperability (e.g., links with legacy systems), real-time, and security. This manuscript explains these five aspects in detail and provides a taxonomy of eighteen state-of-the-art IIoT data-acquisition middleware systems based on these aspects. To validate one of these aspects (network), we present our evaluation results at a real production site where IIoT data-acquisition loss rates are compared between wireless (long-term evolution) and wired networks. The results suggest that wired communication can be more suitable for centralized IIoT data-acquisition middleware than wireless networks. Finally, we discuss several challenges in establishing the best IIoT data-acquisition middleware in a centralized way.


Introduction
Digital transformation, also known as DT or DX, is an important keyword for modern production systems. The utilization of technologies such as industrial Internet of Things (IIoT), big data, and artificial intelligence (AI) in existing systems enables digital transformation to immediately respond to customers' demands and build a production system that improves the current production efficiency [1,2]. Thus, numerous research institutes and enterprises are conducting research on upgrading production systems that apply new technologies to the industrial environment.
Compared to building a new system from scratch, changing the existing system brings many considerations. One of the most time-consuming and costly processes is to acquire high-quality data. Most of the legacy IT and production systems, including Manufacturing Execution System (MES) and Supervisory Control and Data Acquisition (SCADA), have a vertical structure.
To flatten the vertical structure for better data acquisition, the new system must be able to aggregate the data of each production system. To this end, numerous middleware platforms adopt a horizontal structure that integrates data acquisition [3][4][5][6][7][8][9][10][11][12]. The proposed systems have been applied in actual industrial fields.
To establish centralized data-acquisition middleware, we must determine whether the above middleware platforms meet a set of major functionalities. This manuscript proposes the following functionalities: (i) wired and/or wireless network compatibility, (ii) support for a variety of compatible industrial protocols, (iii) automated real-time data collection, (iv) data integration and external transmission, and (v) security. As the necessary functions and standards have not been well established, the existing systems are based on their own criteria, which do not reflect a consensus.
Therefore, whether we are equipped to build high-quality IIoT data acquisition middleware is difficult to discern. Such ambiguous criteria may cause duplicate development and increased development costs.
To address this problem, we propose and describe a set of functionalities that must be addressed when developing centralized IIoT data acquisition middleware. We then review eighteen cutting-edge IIoT middleware systems and provide a taxonomy of these systems based on clearly motivated functionalities. One of these functionalities (communication type) was assessed in experiments at our real production site. The acquisition percentages of IIoT data under wired and wireless (long-term evolution, LTE) communications were 99.940% and 98.983%, respectively. From this result, we inferred that wired communication is more robust for centralized IIoT data acquisition than wireless communication. This empirical result sheds light on the potential validity of the proposed functionalities.
The main contributions of this manuscript are summarized as follows:
(i) We propose a number of considerations for building a centralized IIoT data-acquisition middleware.
(ii) We elaborate on the distinctions between IoT and IIoT data-acquisition systems.
(iii) We review a rich body of existing IIoT systems and qualitatively analyze them along with well-motivated criteria.
(iv) We present our evaluation results obtained from a real industrial site with respect to IIoT data-acquisition loss between wireless and wired networks.
(v) We draw several challenges for constructing IIoT data-acquisition middleware in a central server.
The remainder of this manuscript is organized as follows. The following section proposes a set of considerations to establish the best IIoT data-acquisition middleware, classifies these considerations into five categories, and provides the key components of each consideration. The subsequent section reviews recent IIoT data-acquisition middleware systems. Thereafter, we present our experiment results showing different data-acquisition performances among IIoT devices (in this case, welding machines). Finally, we suggest the future research directions of our work.

Functionalities for Centralized IIoT Data-Acquisition Middleware
To build the Smart Factory or cyber-physical system (CPS) in a short time, the production data-acquisition system that serves as a backbone should be well architected and well designed. IIoT data-acquisition middleware enables fast and easy development of applications. Most IoT systems develop applications for a new environment without integrating with existing systems. However, building IIoT systems often requires upgrading existing production systems because IIoT data are not only obtained from existing sensors, gateways, and controllers but also fused with other application data. If an upgrade is necessary, modification of the existing system needs to be minimized, and the core of the current production system should remain unchanged, because upgrading the IIoT system incurs a high investment cost.
To the best of our knowledge, data acquisition at industry sites has been little investigated. In this article, we fill this gap by exploring the various factors demanded of a solid and reliable middleware system for IIoT data-acquisition. A taxonomy of these factors is illustrated in Figure 1.
In the illustrated taxonomy, the first consideration is the infrastructure, which is divisible into two subfactors: operation environment and network. The first subfactor is further divided into on-premises, cloud, and hybrid environments. Most industrial sites have applied on-premises systems that satisfy the security and management requirements within the technical limitations. At present, numerous sites have adopted the cloud environment, which allows users to gather and manage their IIoT data for further analysis and development [13]. Within the cloud environment, systems can be built quickly and managed flexibly. However, the cloud incurs a security risk and requires additional hardware or programs for sending data to the cloud. For these reasons, most industry sites still prefer the on-premises environment. Other companies have built hybrid environments that combine the advantages of on-premises and cloud.
The second subfactor is network. The IIoT data-acquisition network environment is broadly divided into wired and wireless networks. Wired communication is classified into analog signal, serial communication, and LAN communication. It has several advantages, such as cost-effectiveness, stability, and low maintenance. However, it is at a disadvantage in mobile environments. Recently, wireless communications have significantly expanded owing to technological advances and reduced system-development costs [14]. Wireless networks can utilize licensed frequency bands, such as 3G, LTE, 5G, and NB-IoT [15,16], but licensed frequency standards and abilities vary among countries and local environments. If a network uses licensed frequency bands, it must use the demilitarized zone (DMZ) for safety purposes. Thus, numerous industrial sites have attempted to use unlicensed frequency bands in their local networks for IIoT data acquisition.
Short-distance local networks such as Wireless Fidelity (Wi-Fi), Bluetooth Low Energy (BLE), and ZigBee are also available. Recently, many industry sites have attempted to apply low-power wide-area networks (LPWAN), including Long Range (LoRa) and Sigfox, which are specialized for IoT and support small data transfer with low-power consumption [17][18][19][20][21][22]. In contrast to wired communication, wireless communication must guarantee stable data acquisition and control.
The second factor that must be considered is heterogeneity (in protocol). This factor can be divided into industrial protocol, communication protocol, and database driver. In general, most of the time and cost of an entire project is spent on setting and developing IIoT protocols and drivers. The first subfactor is industrial protocol. This can be further divided into device, common, and customized protocol levels.
At the device level, the gateway or controller is typically a programmable logic controller (PLC). Some sensors and gateways have manufacturer-specific protocols. Therefore, a variety of PLC drivers, sensor protocols, and gateway protocols are required to obtain data from industrial equipment. Recently, IIoT systems have been used as part of or in place of SCADA or MES (mentioned in the Introduction), so data-acquisition middleware that supports device-level protocols is required.
At the common level, common protocols have recently been adopted at many sites. Standard protocols are being introduced by several manufacturers and research institutes.
The Open Platform Communications (OPC) Foundation developed two protocols, OPC-DA (Data Access) and OPC-UA (Unified Architecture), for real-time monitoring and control systems. Again, it is very challenging to change the existing products and systems. Thus, protocols for existing equipment are still needed. Moreover, because the existing applications including SCADA and MES use traditional industrial protocols such as Modbus and Fieldbus, the existing drivers must also be compatible.
At the customized protocol level, a specialized protocol needs to be developed for purposes such as security and research.
The second subfactor is communication protocol. Two components associated with the communication protocol are IoT protocol and Representational State Transfer (REST). The existing HTTP-based protocol is built for client-server architectures. Therefore, it may have limited ability to acquire real-time IIoT data. One such limitation is the request-response method, which cannot easily receive various IIoT data in real time. Moreover, a number of packets are needed to transmit and receive data. Thus, many institutes, companies, and researchers have developed their own IoT protocols.
Message Queuing Telemetry Transport (MQTT), which originated at IBM in 1999 and was submitted to OASIS for standardization in 2013, is a lightweight protocol using a publish/subscribe messaging model in a TCP/IP environment. MQTT provides a total of three quality of service (QoS) levels: 0 (at most once), 1 (at least once), and 2 (exactly once). When the QoS level is chosen, factors such as network quality and usage conditions should be considered. MQTT is increasingly used in embedded IIoT equipment, which requires a lightweight network environment.
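To make the role of the QoS level concrete, the following minimal sketch builds the bytes of an MQTT 3.1.1 PUBLISH packet by hand, using only the Python standard library. It is an illustration rather than a client implementation: the function names and the example topic are our own assumptions, the DUP and RETAIN flags are left at zero, and a production system would normally rely on an established MQTT client library instead.

```python
import struct


def encode_remaining_length(n: int) -> bytes:
    """Encode the MQTT 'remaining length' field as a variable-length integer."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        n, digit = divmod(n, 128)
        if n > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        out.append(digit)
        if n == 0:
            return bytes(out)


def build_publish(topic: str, payload: bytes, qos: int = 0, packet_id: int = 1) -> bytes:
    """Build an MQTT 3.1.1 PUBLISH packet (DUP and RETAIN flags left at 0)."""
    if qos not in (0, 1, 2):
        raise ValueError("MQTT defines QoS levels 0, 1, and 2 only")
    topic_bytes = topic.encode("utf-8")
    # Variable header: 2-byte topic length followed by the topic name.
    variable_header = struct.pack("!H", len(topic_bytes)) + topic_bytes
    if qos > 0:
        # A packet identifier is present only for QoS 1 and 2,
        # because only those levels are acknowledged by the receiver.
        variable_header += struct.pack("!H", packet_id)
    first_byte = (3 << 4) | (qos << 1)  # packet type 3 = PUBLISH, QoS in bits 2-1
    body = variable_header + payload
    return bytes([first_byte]) + encode_remaining_length(len(body)) + body


if __name__ == "__main__":
    # Hypothetical topic name, used for illustration only.
    packet = build_publish("factory/welder01/voltage", b"24.5", qos=1, packet_id=10)
    print(packet.hex())
```

The only difference between QoS 0 and the higher levels at the packet level is the two-byte packet identifier and the flag bits; the extra reliability comes from the acknowledgment exchange that the identifier makes possible, which is why higher QoS levels cost more traffic on a weak link.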
Another protocol is Constrained Application Protocol (CoAP), a lightweight message-transfer protocol for use among devices on the same constrained network. OMA Lightweight M2M (LwM2M) is a device management protocol designed for sensor networks and machine-to-machine (M2M) environments. LwM2M defines an extensible resource and data model and uses CoAP as an efficient, secure data-transfer standard.
The third subfactor is database driver. Database drivers, such as Java Database Connectivity (JDBC) and Open DataBase Connectivity (ODBC), are required to connect to the database for integrated system monitoring.
The third factor that must be considered is interoperability, which concerns interchanging production data with legacy IT systems. An interface with legacy IT applications is important. REST and MQTT protocols, which are widely used in IT systems, are needed as well.
Real-time is the fourth factor to be considered. This factor refers to real-time equipment control and monitoring. Equipment and machines can be controlled manually or automatically. For manual control, remote control should be available over a wired or wireless network. Automatic and intelligent control should be able to perform real-time monitoring, analyze the current data set, and predict future situations for future systems, such as CPS.
The final factor is security. Security is divided into network, software, and hardware security. Network security aims to minimize the impact of unauthorized external disturbances by utilizing specific communication protocols [23][24][25][26][27]. Software security prevents other systems from accessing IIoT systems including sensors, gateways, and legacy systems. Software security assigns a security ID to each machine and sensor. Some recent security developments are based on blockchain technology [28,29].
Nevertheless, there are many security challenges in the existing IIoT environment. For instance, most systems try to address security through hardware. To prevent physical access from the outside, DMZ installations and local networks are utilized. Companies have differing security policies. Depending on the environment of the production system, appropriate methods should be chosen to ensure security.
Note that to improve the production efficiency through AI and analysis using IIoT data, many industrial sites and research institutes have been actively conducting research on acquiring data quickly at a low cost.
The following section provides detailed descriptions of the discussed factors that need to be considered during data acquisition.

Key Components of IIoT Data-Acquisition Middleware
Recently, production systems have been changing rapidly to meet customers' demands. To make the system more flexible and intelligent, the system needs to collect and integrate information from a variety of IIoT devices. Figure 2 illustrates such a system, with IIoT data-acquisition middleware positioned at its center. The industrial data gathered through this centralized middleware can be used for data-driven decision making. Furthermore, other kinds of systems, such as intelligent and flexible systems as well as simulation systems, can utilize the collected data for further analyses and services. To generate valuable information in an IIoT environment, real-time collection of consistent IIoT data is essential. Accordingly, middleware technology for robust data acquisition is required. Because IIoT data obtained through such acquisition middleware usually come from many applications, building such middleware needs to consider the following key components: network bridge, licensed frequency band, LPWAN, industrial protocols, production IT, and cloud.

Network Bridge and LPWAN.
As mentioned earlier, networks can be classified into two broad categories: wired and wireless (See Figure 1). Many industrial sites adopt wired communication owing to its stability and speed. In a wired communication, data are often received from previously developed serial interfaces, such as RS232 and RS485. In this case, only a short-distance communication is possible.
Thus, a network bridge is required to enable long-distance communication. For example, many production sites rely heavily on network bridges that convert serial communications to transmission control protocol/Internet protocol (TCP/IP).
Recently, with the increased use of IIoT systems, a growing share of data is received through wireless communication because of its cost and deployment time. In wireless communication, BLE, ZigBee, Wi-Fi, etc., can be utilized for short-distance communication (see Figure 1). In this case, a dedicated network bridge extends the range so that the data can be sent to the central server. Furthermore, with the development of telecommunication infrastructures, both licensed frequency bands (e.g., 3G, LTE, NB-IoT, and 5G) and unlicensed frequency bands (e.g., LPWAN) have become widely used by many industrial sites. In the case of a licensed frequency band, usage fees are paid because the licensed frequency is managed by a specialized company or institution. Owing to its superior speed, stable communication, and large bandwidth, the licensed frequency band is used by many industries even though it comes at a high cost.
Battery life, in turn, is a critical factor in wireless communication, particularly in LPWAN, which enables long-distance communication. LPWAN suits IIoT systems that need to transfer small amounts of data with low power consumption, and it comes in three types: Sigfox, LoRaWAN, and NB-IoT (see Figure 1). Its communication distance is in the range of 1-20 km. As described previously, the data-acquisition middleware requires a structure that can acquire data through both wired and wireless communications.

Industrial Protocols.
Industrial devices must offer high reliability, durability, scalability, and ease of maintenance. PC-based controllers are used in complex operations. In fact, the PLC, an industry-specific system that operates independently of the OS, is more widely used, thanks to its high compatibility with industrial protocols such as Fieldbus and Modbus (see Figure 1). Furthermore, a PLC can easily acquire analog signals such as voltage or current and costs less than an industrial PC. Currently, the connection with IT systems has become a hot topic in PLC markets. Along with this wave, most PLCs provide common protocols to obtain and control variables over the TCP/IP environment.
However, upgrading existing PLC programs for the purpose of sending data to other systems is costly in terms of both expense and time. Therefore, it is of paramount importance to support various PLC protocols so that data can be acquired without altering the existing PLC programs. Consequently, a number of commercial programs have been released for obtaining PLC data directly.
In the early development of PLCs, a standard interface, OPC, was established. OPC enables real-time monitoring and links to automation systems, such as Human Machine Interface (HMI) and SCADA. OPC has improved the security and connection speed of the PLC protocol. In 2008, OPC-UA, a vendor-independent and highly secure protocol, was developed by the OPC Foundation [30]. It is widely used in interworking with IT systems. The previous versions of OPC had a client-server architecture, which made it difficult to process multiple messages simultaneously. Conversely, OPC-UA provides publish/subscribe functions to enable 1 : N and N : N communications in real time. Moreover, it has a reliable version for cloud environments. In recent years, many researchers have been conducting research on Time-Sensitive Networking (TSN) linked with OPC to achieve 18 times faster real-time remote control and monitoring.

Production IT.
Many companies have built ERP and MES for managing quality products. Before the emergence of Industry 4.0, the old systems used to operate in a vertical structure. In the ISA-95 standard, the systems operate by sending and receiving data only between adjacent levels. However, the development of IIoT has eliminated the boundaries of data. Data-acquisition middleware is needed to directly obtain data from MES, ERP, and the control process. The middleware requires protocols or drivers to obtain information from legacy IT systems. For instance, because ODBC and JDBC are usually required to connect to the database, the middleware should support these drivers. In the case of a three-tier system, another interface such as REST can be used, particularly in environments where direct access to the data is restricted for security reasons.
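As a rough illustration of how middleware might pull rows from a legacy database through a driver, the sketch below uses Python's built-in sqlite3 module as a stand-in for an ODBC or JDBC connection. The table name, columns, and sample rows are hypothetical and exist only to make the example runnable; a real MES or ERP schema will differ, and the connection would be opened through the vendor's own driver.

```python
import sqlite3


def open_demo_database() -> sqlite3.Connection:
    """Create an in-memory stand-in for a legacy MES database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE work_orders (id INTEGER PRIMARY KEY, product TEXT, quantity INTEGER)"
    )
    conn.executemany(
        "INSERT INTO work_orders (id, product, quantity) VALUES (?, ?, ?)",
        [(1, "hull-block-A", 4), (2, "hull-block-B", 2), (3, "pipe-spool-C", 10)],
    )
    conn.commit()
    return conn


def fetch_new_work_orders(conn: sqlite3.Connection, last_seen_id: int = 0) -> list[dict]:
    """Return rows added after last_seen_id so the middleware can poll incrementally."""
    cursor = conn.execute(
        "SELECT id, product, quantity FROM work_orders WHERE id > ? ORDER BY id",
        (last_seen_id,),
    )
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


if __name__ == "__main__":
    connection = open_demo_database()
    for row in fetch_new_work_orders(connection, last_seen_id=1):
        print(row)
```

Polling by a monotonically increasing key keeps the query read-only, which fits the requirement stated earlier that the core of the existing production system should remain unchanged.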

Cloud vs. On-Premises.
Recently, the cloud has been widely adopted for its efficiency and cost-effectiveness. Many IT companies have customers who wish to use infrastructure and resources in the forms of SaaS, PaaS, and IaaS. For IIoT data acquisition, the cloud system acquires data in a different manner from an on-premises environment.
The IIoT equipment, including sensors, actuators, and gateways, provides data while being located at the industrial site. Most existing equipment is connected to networks such as LAN. This industrial equipment often has its own industrial protocols. In this case, the transfer of IIoT data to the cloud environment is required. The edge consists of a device or a program that converts the sensor's analog signal or serial signals to LAN communication. The edge also uses network bridges called protocol converters or gateways. The configuration of network bridges or the edge can be applied at industrial sites to industrial controllers, such as PLCs, industrial computers, and dedicated converters that change specific signals. Thus, an edge program that can connect to the cloud system needs to be installed on IIoT equipment. For example, an API can be offered over standard protocols, such as MQTT and OPC-UA, or over a vendor's own customized protocols. Because the edge program must be installed, OS-based products are usually needed. In this case, it is necessary to establish an environment where packages or APIs can be used, such as Linux or Windows.
Unlike cloud systems that use the edge, the on-premises system makes it easy to obtain IIoT data in a centralized network management environment. Usually, the on-premises system uses network bridges to extend the communication distance. The central server manages a variety of information, including the IIoT device ID, protocols, acquisition rate, and resources. In the on-premises system, the network bridge has a wider range of configurable choices than in the cloud system. Some cloud systems require equipment to be changed because they depend on protocols such as MQTT, REST, and OPC-UA. However, the on-premises system is more flexible than the cloud and can acquire data directly from IIoT equipment, which makes it easy to use various gateways.
Edge software provided by IT vendors includes cloud-based middleware, such as Azure IoT Hub, AWS Industrial IoT, Oracle Internet of Things Cloud Service, and Predix. These middleware systems provide software packages or APIs for connecting IIoT equipment to their clouds using edge devices. The OPC-UA protocol is applied for real-time control and monitoring, as well as for the interface with industrial systems. In addition, due to the various conditions of industrial sites, IIoT data are acquired in cooperation with specialized partners in the field to suit the site situation. Kepware, PI Collect, AVEVA Edge, and MindSphere Connect, which are strong in the current OT field, can easily connect existing IIoT equipment to their systems. These companies are also increasing the ease of connectivity by providing various industrial protocols, such as the PLC interface, Modbus, and OPC-DA/UA. Moreover, some companies and research institutions use their own technology and thus create systems optimized for specialized environments [46][47][48][49][50][51]. In this case, although a middleware system does not have many functionalities, it offers strong features specialized for its environment of operation.
Every middleware provides real-time monitoring functions, but some middleware services (such as Oracle Internet of Things Cloud Service, ThingPlug, and N-MAS) do not allow the control of IIoT devices in real time. Finally, all middleware systems support security well for communication from IIoT devices to their respective middleware.

Experiment: A Reliability Test of IIoT Data Acquisition at a Real Industrial Site
For convenient operation at industrial sites, IIoT data acquisition needs to be centralized. To minimize investment, we must determine the feasibility of acquiring the IIoT data from the legacy network infrastructure in a centralized way. To this end, we designed an IIoT data-acquisition experiment leveraging the wired and wireless networks used in office work. During this experiment, we measured data-acquisition rates for 24 h during weekdays and analyzed the network loads. Briefly, the results demonstrate that IIoT data can indeed be acquired through the networks used in general office work.

Environment Settings.
We conducted two experiments on central IIoT data acquisition in an on-premises environment via two methods, as shown in Figures 3 and 4. The difference between the two experiments was the communication environment through which the data was acquired.
For the wireless networks, we utilized LTE communication using a licensed band network (KT Corporation) in South Korea. In particular, we used a router with Private-LTE (P-LTE) for security purposes. The external LTE servers checked the router's IP and port number and switched to the designated IP and port number assigned by the customer. Subsequently, the data were sent to the internal DMZ server, which checked the IP and port number for security reasons. Finally, the data were safely sent to the internal server. The total processing time was one second. The first method was to acquire IIoT data via wired communication (Figure 3(a)). The second method was to acquire data centrally via LTE communication (Figure 3(b)). Wired communication is the most widely adopted in the field, while LTE is now prevalent.
The configurations are detailed in Table 2. The test device is a welding machine used in a shipyard where ships and offshore plants are built. We used a total of 14 tags, including ID, voltage, current, temperature, and product information. The protocol is user-defined. When data is requested, the welding machine sends the requested data (Figure 4). The requested data requires a total of 15 tags per second. Because the data interface of the welding machine is RS232, the maximum transmission distance is 15 m. Thus, the machine requires a network bridge to transmit data over a long distance. The data path of the welding machine is divided into two routes. In the first route, the IIoT data acquisition middleware requests tag data through the network bridge via TCP/IP communication, and the network bridge then forwards the request to the welding machine via the RS232 interface. In the second route, the welding machine responds according to the command and then forwards all tag data back to the middleware.
All data were acquired at a rate of once per second. Therefore, 86,400 tag sets were to be acquired per day. The experimental period was 10 days, excluding weekends, when the equipment was not in operation all day.
The network bridge was the NPORT-5610 from MOXA, which has eight ports that convert RS232 to TCP/IP communication. On the NPORT model, we used the TCP/IP server mode to communicate with the middleware. The data acquisition middleware was PTC's KEPServerEX 6.4 with the U-CON driver, which can handle the welding machine's customized protocol. The data acquisition middleware was linked to our IIoT platform to monitor, manage, and store the data being collected.
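The request/response exchange through the network bridge can be pictured with a short sketch: the middleware side writes a command to a TCP socket and reads until a frame terminator arrives. The welding machine's user-defined protocol is not published here, so the command string, the terminator, and the `KEY=VALUE;...` frame layout below are purely hypothetical placeholders, and the function names are our own.

```python
import socket


def request_tags(sock: socket.socket, command: bytes,
                 terminator: bytes = b"\r\n", timeout: float = 1.0) -> bytes:
    """Send one request through the bridge and return one response frame."""
    sock.settimeout(timeout)  # a late answer counts as a missed sample
    sock.sendall(command + terminator)
    buffer = bytearray()
    while not buffer.endswith(terminator):
        chunk = sock.recv(256)
        if not chunk:
            raise ConnectionError("bridge closed the connection")
        buffer.extend(chunk)
    return bytes(buffer[:-len(terminator)])


def parse_tags(frame: bytes) -> dict[str, str]:
    """Parse a hypothetical 'KEY=VALUE;KEY=VALUE' frame into a tag dictionary."""
    text = frame.decode("ascii")
    return dict(item.split("=", 1) for item in text.split(";") if item)


if __name__ == "__main__":
    # A connected socket pair stands in for the TCP link to the bridge.
    middleware_side, bridge_side = socket.socketpair()
    bridge_side.sendall(b"ID=W01;V=24.5;A=180\r\n")  # simulated device answer
    print(parse_tags(request_tags(middleware_side, b"READ")))
    middleware_side.close()
    bridge_side.close()
```

With a one-second polling period, the timeout bounds how long a single missing answer can delay the next request, which is one plausible source of the non-one-second intervals examined in the result analysis.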

Result Analysis.
In this section, we present and discuss our experiment results regarding the network sensitivity of IIoT data-acquisition middleware.
In our experiment, the data-acquisition rates were calculated on a per-second basis. Thus, a device whose total count of received data reaches 864,000 over the 10 days (86,400 per day) has an acquisition rate of 100%.
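The arithmetic behind the acquisition rate can be written out as a small sketch. The function names are our own, and the received count in the usage example is an invented figure for illustration, not a measurement from the experiment.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 one-second samples per day


def expected_samples(days: int, rate_hz: float = 1.0) -> int:
    """Number of tag sets expected when one set is requested every 1/rate_hz s."""
    return int(days * SECONDS_PER_DAY * rate_hz)


def acquisition_rate(received: int, days: int, rate_hz: float = 1.0) -> float:
    """Percentage of expected tag sets that actually arrived."""
    expected = expected_samples(days, rate_hz)
    if expected <= 0:
        raise ValueError("the observation period must contain at least one sample")
    return 100.0 * received / expected


if __name__ == "__main__":
    print(expected_samples(1))    # one day at one sample per second
    print(expected_samples(10))   # the 10-day experimental period
    # Hypothetical count: 86,000 of 86,400 daily samples received.
    print(f"{acquisition_rate(86_000, days=1):.3f}%")
```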
As shown in Figure 5, we compare the data-acquisition ratios of wired and LTE communications for a total of 10 days. In wired communication, the data-acquisition rate ranged from 98.537% to 99.984%. In LTE communication, by contrast, it ranged from 97.739% to 99.984%. Figure 6 illustrates the per-hour data-acquisition rates for 24 h. Business hours run from 08:00 to 20:00 for the day shift and from 20:00 to 06:00 for the night shift. Most employees work during the daytime, so it is possible to check whether use of the internal network affected the network load. According to our results, IIoT data acquisition turned out to be unaffected by the network load. Table 3 presents our results on the interarrival time of data. A one-second interval (the second row below the header of Table 3) is considered normal; any other interval is considered unstable. Because the ratio of the 1 s interval differs by about 1% between wired and LTE communications (99.730% vs. 98.857%), we empirically confirmed that wired communication was more reliable than wireless communication in our IIoT environment, although not by a large margin (and although this observation could be obvious). Table 4 shows the average data-acquisition rate of each of the welding machines used in our experiments. In the table, for all datasets, the averages were 99.940% for wired communication and 98.983% for LTE communication. In the case of wired communication, the hop count was seven, but the LTE path was more complex, as it passed through eight or more hops across the external networks and the internal DMZ server.
Thus, a lower data-acquisition rate was expected. Moreover, wired communication did not acquire 100% of the data due to communication errors in the device, middleware, and timer.
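The interarrival classification used for Table 3 (a one-second gap is normal, anything else is unstable) can be sketched as follows. This is an assumption-laden illustration: the function names are ours, timestamps are taken to be whole seconds, and the sample timestamps are invented rather than drawn from the experiment.

```python
from collections import Counter


def interval_histogram(timestamps: list[int]) -> Counter:
    """Count the gaps (in whole seconds) between consecutive arrivals."""
    return Counter(later - earlier for earlier, later in zip(timestamps, timestamps[1:]))


def normal_ratio(timestamps: list[int], expected_gap: int = 1) -> float:
    """Percentage of gaps equal to the expected polling interval."""
    histogram = interval_histogram(timestamps)
    total = sum(histogram.values())
    if total == 0:
        raise ValueError("need at least two timestamps")
    return 100.0 * histogram[expected_gap] / total


if __name__ == "__main__":
    # Invented arrival times: two samples are delayed or lost.
    arrivals = [0, 1, 2, 4, 5, 8]
    print(dict(interval_histogram(arrivals)))
    print(f"{normal_ratio(arrivals):.1f}% of intervals are normal")
```

Counting gaps rather than samples separates a single long outage from many scattered one-sample losses, which the overall acquisition rate alone cannot distinguish.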
In this experiment, the overall average data-acquisition rate including wired and wireless communication was 99.701% despite centralized acquisition. We also confirmed
To configure the same data acquisition middleware in the cloud, an edge device is required to transfer data from the device to the cloud, and security must be taken into consideration because data is sent to external networks. In the cloud environment, the initial cost of infrastructure configuration is low. Thus, the cloud is advantageous when the number of IIoT devices is small. However, in the case of large-scale facilities, the operating costs increase with the increase in data transmission volume and data processing problems. Therefore, cost, maintenance, and security should be weighed carefully when an operation environment is selected. Currently, numerous hybrid systems that combine on-premises and cloud are being used for this purpose.

Conclusion and Future Work
We conducted an in-depth survey of recent IIoT platforms with potential for horizontal data acquisition. We reviewed various data-acquisition middleware products released by eighteen companies and research institutes. Through our investigation, we derived well-defined criteria by which the systems can be categorized. We also presented the major functionalities for building high-quality centralized IIoT data-acquisition middleware. To justify one of these criteria (network), we empirically evaluated the performance of centralized data acquisition via wired and LTE communications using an actual IIoT device (a welding machine). The overall average rate of 16 welding machines across the wired and wireless networks was 99.701%, supporting the feasibility of centralized IIoT data acquisition. Finally, we identified several challenges that must be resolved to construct the best data acquisition middleware in a centralized environment.
We expect that our work will help to clarify the criteria and the important considerations of high-quality IIoT data acquisition middleware systems. We plan to build our own data acquisition middleware that can fully meet the suggested functionalities. The middleware configuration and operation will be tested in a real production environment.

Data Availability
The experiment data are the property of Daewoo Shipbuilding & Marine Engineering Co., Ltd. (DSME). Therefore, the experiment data are proprietary to DSME.

Conflicts of Interest
The authors declare no conflicts of interest.